Springpath Blog



The HALO core behind Springpath Data Platform

Springpath Data Platform focuses on using the underlying resources of the servers so that applications get the right combination of performance and capacity. Doing this across a wide variety of compute environments, while keeping upgrades simple and non-disruptive, is a different challenge entirely. To meet it, we designed the Hardware Agnostic Log-structured Objects (HALO) architecture to run entirely in user space, which makes it fully portable as the needs of the enterprise change.

Springpath Halo Architecture

One of the core values of a distributed system is parallelism. The Data Distribution layer stripes IO requests coming from virtual machines or containers across all the controllers/nodes in the cluster. This avoids hot spots by spreading the load evenly. HALO also monitors how stored data uses the cluster and tracks resource changes (additions/removals), and its placement algorithms redistribute data uniformly when the system becomes imbalanced.
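
As a rough illustration of striping, the sketch below splits a byte range into fixed-size stripes and maps each stripe to a node by hashing. It is not HALO's actual placement algorithm, which this post does not describe; the function names, the stripe size, and the hash-modulo placement are all assumptions made for the example.

```python
import hashlib


def node_for_stripe(object_id: str, stripe_index: int, nodes: list[str]) -> str:
    """Pick a node for one stripe by hashing (object id, stripe index).

    Hypothetical placement rule: hash modulo node count.
    """
    key = f"{object_id}:{stripe_index}".encode()
    digest = hashlib.sha1(key).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def stripe_write(object_id: str, offset: int, length: int,
                 stripe_size: int, nodes: list[str]) -> list[tuple]:
    """Split [offset, offset + length) into stripe-aligned pieces.

    Returns one (stripe_index, piece_offset, piece_length, node) per piece.
    """
    placements = []
    end = offset + length
    while offset < end:
        index = offset // stripe_size
        piece_end = min((index + 1) * stripe_size, end)
        node = node_for_stripe(object_id, index, nodes)
        placements.append((index, offset, piece_end - offset, node))
        offset = piece_end
    return placements
```

A large write therefore fans out to several nodes instead of landing on one. Note that plain hash-modulo placement moves most stripes whenever a node is added or removed; a system that rebalances on resource changes would need a scheme that limits that movement, such as consistent hashing.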

With storage systems, performance is one of the top considerations. HALO delivers high performance using a combination of memory and SSD-based Data Caching. HALO’s caching architecture employs a two-level cache scheme, and it uses de-dupe to manage the stored data effectively, which is quite novel compared to other caching architectures. HALO caches both reads and writes. Write requests are logged to SSDs and replicated across servers (the number of copies is configurable) for high availability before they are acknowledged. This results in strong consistency.
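
The write path described above (log, replicate, then acknowledge) can be sketched as follows. This is a simplified model, not HALO code: the `Node` class, its in-memory `log` list standing in for an SSD write log, and the `replicated_write` function are hypothetical names introduced for the example.

```python
class Node:
    """Hypothetical server; `log` stands in for an SSD-backed write log."""

    def __init__(self, name: str):
        self.name = name
        self.log = []
        self.online = True

    def append(self, key: str, data: bytes) -> None:
        if not self.online:
            raise ConnectionError(f"{self.name} is offline")
        self.log.append((key, data))


def replicated_write(key: str, data: bytes, replicas: list, copies: int) -> list:
    """Log the write on `copies` nodes, and only then acknowledge it.

    Returns the names of the nodes that hold the write.
    """
    if copies > len(replicas):
        raise ValueError("not enough replicas configured")
    logged = []
    for node in replicas:
        try:
            node.append(key, data)
        except ConnectionError:
            continue  # skip the failed node and try the next one
        logged.append(node.name)
        if len(logged) == copies:
            return logged  # acknowledgement point
    raise RuntimeError(f"only {len(logged)} of {copies} copies were logged")
```

The caller never sees a success for a write that exists on fewer than `copies` nodes, which is the property that lets an acknowledged write survive a server failure. The sketch leaves partial log entries behind when a write fails; a real system would have to clean those up or reconcile them.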

HALO’s Data Persistence layer is responsible for storing data that gets de-staged from the caching tier. This layer provides long-term storage of the data, as opposed to caching, which is mostly transient in nature. The data persistence layer uses capacity-centric media (such as HDDs or cost-effective SSDs) to store data. It also uses all resources within a cluster to store data, thereby avoiding the bin-packing issues that many storage systems are susceptible to.

Both HALO’s caching and persistence layers are stacked on top of Log-structured Objects, which is the backbone of Springpath Data Platform. HALO’s log-structured object layer is a completely distributed object layer that stores <Key, Value> pairs and provides access to objects stored anywhere within a cluster.
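
To make the log-structured idea concrete, here is a minimal single-node sketch of a key-value store that only ever appends to a log and keeps an index pointing at the latest version of each key. It illustrates the general technique only; the class and method names are hypothetical, and the distribution of objects across a cluster is left out.

```python
class LogStructuredStore:
    """Append-only log plus an in-memory index of key -> (offset, length)."""

    def __init__(self):
        self.log = bytearray()  # stand-in for a sequentially written device
        self.index = {}

    def put(self, key: str, value: bytes) -> None:
        # Updates never overwrite in place: the new version is appended
        # and the index is redirected to it.
        offset = len(self.log)
        self.log += value
        self.index[key] = (offset, len(value))

    def get(self, key: str) -> bytes:
        offset, length = self.index[key]
        return bytes(self.log[offset:offset + length])

    def live_bytes(self) -> int:
        """Bytes still referenced by the index; the rest is reclaimable."""
        return sum(length for _, length in self.index.values())
```

Because every write is a sequential append, the layout suits both SSDs and HDDs. The trade-off is that overwritten versions stay in the log until some form of garbage collection reclaims them, which is why the sketch tracks live bytes separately from log size.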

HALO provides Data Optimization capabilities across the entire cluster, for both the caching and persistence tiers. Both in-line de-dupe and built-in compression are offered without incurring a performance penalty. This is often not the case even with the latest hyper-converged solutions on the market. In some cases those solutions require specialized hardware to deliver top performance, which adds to the cost of the overall system.

Many leading storage platforms still lag behind when it comes to delivering Data Services such as native snapshot and cloning capabilities. Snapshots and clones are fundamental building blocks for many IT environments, such as VDI, test/dev and enterprise application production workloads. Most modern systems are limited in scale (number of snapshots/clones), in granularity, and in the CRUD performance of snapshots/clones. HALO implements distributed pointer-based snapshots using ROW (redirect-on-write) to deliver industry-leading, fast and efficient snapshot and cloning capability.
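
The following sketch shows why pointer-based, redirect-on-write snapshots are cheap. It is a toy single-volume model under stated assumptions, not HALO's distributed implementation: the `Volume` class and its fields are hypothetical, and the block map is a plain dictionary.

```python
class Volume:
    """Toy volume with redirect-on-write snapshots."""

    def __init__(self):
        self.blocks = {}      # physical location -> data
        self.block_map = {}   # logical block address -> physical location
        self.snapshots = {}   # snapshot name -> frozen block map
        self.next_location = 0

    def write(self, lba: int, data: bytes) -> None:
        # Redirect on write: new data always goes to a fresh location,
        # so blocks referenced by snapshots are never modified.
        location = self.next_location
        self.next_location += 1
        self.blocks[location] = data
        self.block_map[lba] = location

    def snapshot(self, name: str) -> None:
        # Only pointers are copied here; no data blocks are duplicated.
        self.snapshots[name] = dict(self.block_map)

    def read(self, lba: int, snapshot: str = None) -> bytes:
        mapping = self.block_map if snapshot is None else self.snapshots[snapshot]
        return self.blocks[mapping[lba]]
```

Taking a snapshot costs only a copy of the pointer map, and later writes do not have to copy old data out of the way first, as a copy-on-write scheme would. A production system would typically share the map between snapshots as well, for example with a tree structure, rather than copying it whole.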

HALO provides industry-leading data integrity. It computes a CRC64 checksum for all data written to both the caching and persistence layers. In addition, the entire HALO file system is built using SHA-1 fingerprints, which provide an extra level of data integrity.
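
As a hedged illustration of how checksums and fingerprints can work together, the sketch below stores each block under its SHA-1 fingerprint and verifies a checksum on every read. Two assumptions are worth flagging: Python's standard library has no CRC64, so `zlib.crc32` stands in for it here, and tying the fingerprint to de-dupe is this example's choice rather than something the post states about HALO. The `FingerprintStore` class is a hypothetical name.

```python
import hashlib
import zlib


class FingerprintStore:
    """Content-addressed block store: fingerprint -> (data, checksum)."""

    def __init__(self):
        self.blocks = {}

    def put(self, data: bytes) -> str:
        fingerprint = hashlib.sha1(data).hexdigest()
        if fingerprint not in self.blocks:
            # zlib.crc32 is a stand-in for the CRC64 named in the text.
            self.blocks[fingerprint] = (data, zlib.crc32(data))
        return fingerprint

    def get(self, fingerprint: str) -> bytes:
        data, checksum = self.blocks[fingerprint]
        # Verify both the checksum and the fingerprint before returning.
        if zlib.crc32(data) != checksum:
            raise ValueError("checksum mismatch: block is corrupt")
        if hashlib.sha1(data).hexdigest() != fingerprint:
            raise ValueError("fingerprint mismatch: block is corrupt")
        return data
```

Identical blocks map to the same fingerprint and are stored once, and a block that changes on media after being written fails verification instead of being returned silently.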

The core HALO capabilities described above are state of the art. Together they bring to market an enterprise-grade distributed storage system that uses commodity servers to meet the needs not just of virtualized compute platforms, but also of modern compute platforms such as containers and Hadoop.

In the next blog posts we will discuss each layer of the HALO architecture in detail. Please stay tuned.
