This installment of Smooth Scaling focuses on scale out, and addresses some of the key clustering, configuration, and continuity considerations for selecting a storage solution for higher levels of scalability.

In fast-growing organizations, where near-term capacity growth patterns and performance demands are ambiguous at best, creating a scale-out cluster is an excellent strategy to mitigate those uncertainties while consolidating multiple workloads onto a common storage infrastructure. Furthermore, it offers significant cost benefits, which will be addressed in greater detail in part 3 of this blog.

The upside to scaling out can be substantial, but without the right type of scale-out storage solution, IT teams risk drowning in configuration complexity, excessive performance overhead, or worse: having most of their waking hours consumed by manual data migrations.

Building a storage cluster shouldn’t require the services of a network engineering team; the storage team should be able to easily set up and configure the scale-out cluster. Smooth scaling calls for a solution that requires no dedicated backplane, automates network connections, and simplifies creating and defining storage pools (a pool is a set of arrays in a cluster across which the data of resident volumes is striped and automatically rebalanced). Moreover, because smooth scaling requires flexibility, the right scale-out solution allows arrays from different product families to be mixed and matched within the same cluster. For example, Nimble Storage’s scale-out architecture allows up to four Nimble CS-Series arrays, of any model, to be clustered together.

It is not uncommon for IT organizations to modify clusters periodically. For example, capacity, compute, or cache within individual array nodes may need to be upgraded, or an array might be repurposed as a replication target and replaced with one of higher performance. In either case, it is essential that the applications supported on the cluster continue to run without disruption. Scale-out storage should not only facilitate seamless upgrades to individual nodes; it should also allow arrays to be easily added or removed, with the resulting data migrations handled in a robust, automated fashion.

Again, Nimble Storage is a great example of a scale-out architecture that expertly automates data management across a cluster as it changes. For example, if an array is to be removed from the cluster, Nimble transparently and non-disruptively migrates the data volumes from the outgoing array to the remaining array(s) in the pool.
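To make the idea of draining an outgoing array concrete, here is a minimal Python sketch. It is not Nimble's migration logic, which this post does not describe; the `remove_array` function, the array names, the stripe IDs, and the "send each stripe to the least-loaded remaining array" policy are all assumptions chosen purely for illustration.

```python
def remove_array(pool, outgoing):
    """Return a new placement with `outgoing` drained onto the remaining arrays.

    `pool` maps an array name to the list of stripe IDs it currently holds.
    The input mapping is left untouched. (Hypothetical model, not a vendor API.)
    """
    if outgoing not in pool:
        raise KeyError(f"{outgoing!r} is not in the pool")

    # Copy the placement for every array that stays in the pool.
    remaining = {
        name: list(stripes) for name, stripes in pool.items() if name != outgoing
    }
    if not remaining:
        raise ValueError("cannot remove the last array in a pool")

    for stripe in pool[outgoing]:
        # Assumed policy: move each stripe to the array holding the fewest
        # stripes so far, breaking ties by name to keep the result deterministic.
        target = min(remaining, key=lambda name: (len(remaining[name]), name))
        remaining[target].append(stripe)
    return remaining


# Illustrative three-array pool with nine stripes spread evenly.
pool = {
    "array-a": [0, 3, 6],
    "array-b": [1, 4, 7],
    "array-c": [2, 5, 8],
}
after = remove_array(pool, "array-c")
for name, stripes in after.items():
    print(name, stripes)
```

In a real cluster the data would move in the background while hosts keep reading and writing; the sketch only shows the bookkeeping that leaves every stripe on a surviving array and the load roughly even.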

Some storage clusters are even configured specifically to simplify storage management, taking advantage of the ability to migrate volumes between arrays without disruption. Although automated data migrations can sometimes take hours to complete, they run without manual intervention, which frees up IT teams for more productive endeavors.

Scaling Capacity and Performance Within a Cluster

Here’s what to expect in terms of how performance and capacity scale through clustering.

Naturally, performance and capacity scale simultaneously when configuring multiple arrays in a scale-out cluster. Whereas capacity scales to the sum of the individual arrays’ effective capacities, performance scaling through clustering is less straightforward, as some level of performance overhead is incurred. In many current scale-out storage solutions, data has to be forwarded between arrays in order to fulfill an IO request, primarily because it is not known up front which array in the cluster holds the requested data. Forwarding data between nodes adds considerable latency and unnecessary compute load.
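To put rough numbers on this, here is a back-of-the-envelope sketch in Python. It assumes data is spread evenly across the arrays and that a host with no placement knowledge picks a target array uniformly at random, so a request lands on the wrong array with probability (N - 1) / N. The capacities and latencies below are made-up illustrative values, not measurements of any product.

```python
def cluster_capacity_tb(effective_capacities_tb):
    """Cluster capacity is the sum of each array's effective capacity."""
    return sum(effective_capacities_tb)


def forwarded_fraction(num_arrays):
    """Expected share of requests that must be forwarded to another array.

    Assumes evenly spread data and a uniformly random choice of target array.
    """
    if num_arrays < 1:
        raise ValueError("a cluster needs at least one array")
    return (num_arrays - 1) / num_arrays


def average_latency_ms(local_ms, hop_ms, num_arrays):
    """Average request latency when each forwarded request pays one extra hop."""
    return local_ms + forwarded_fraction(num_arrays) * hop_ms


# Hypothetical four-array cluster.
capacities_tb = [20, 40, 40, 80]
print(cluster_capacity_tb(capacities_tb))   # total effective capacity in TB
print(forwarded_fraction(4))                # share of requests forwarded
print(average_latency_ms(1.0, 0.5, 4))      # average latency in ms
```

Under these assumptions, three out of four requests in a four-array cluster take an extra hop, and the share grows as arrays are added. That is why knowing where data lives before the request is sent matters so much for scale-out performance.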

Nimble Storage’s scale-out architecture scales performance while minimizing scale-out performance overhead. First, this architecture performs fine-grained striping of data across arrays (cluster nodes), allowing the volumes that span those arrays to fully leverage the combined cache and compute resources. Second, Nimble employs an intelligent Multi-Path IO (MPIO) module at the host, which determines which array in the cluster each IO request should be sent to. IO load is dynamically balanced at the host, minimizing the impact on cluster performance. The overall result is performance that scales in a linear fashion.
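The combination of striping and host-side routing can be sketched in a few lines of Python. This is an illustration of the general technique, not Nimble's MPIO module or on-disk layout: the 1 MiB stripe size, the round-robin placement rule, and the function names `owning_array` and `split_io` are all assumptions made for the example.

```python
STRIPE_SIZE_BYTES = 1 << 20  # hypothetical stripe size (1 MiB)


def owning_array(offset_bytes, arrays, stripe_size=STRIPE_SIZE_BYTES):
    """Return the array that holds the stripe containing `offset_bytes`.

    Assumed placement rule: stripes are assigned to arrays round-robin.
    """
    stripe_index = offset_bytes // stripe_size
    return arrays[stripe_index % len(arrays)]


def split_io(offset_bytes, length_bytes, arrays, stripe_size=STRIPE_SIZE_BYTES):
    """Split one volume-level request into (array, offset, length) pieces.

    Each piece stays within a single stripe, so the host can send it
    straight to the owning array with no array-to-array forwarding.
    """
    pieces = []
    position = offset_bytes
    end = offset_bytes + length_bytes
    while position < end:
        # End of the stripe that contains the current position.
        stripe_end = (position // stripe_size + 1) * stripe_size
        piece_end = min(end, stripe_end)
        target = owning_array(position, arrays, stripe_size)
        pieces.append((target, position, piece_end - position))
        position = piece_end
    return pieces


arrays = ["array-a", "array-b", "array-c"]
# A 2 MiB request starting 512 KiB into the volume touches three stripes.
for piece in split_io(512 * 1024, 2 * 1024 * 1024, arrays):
    print(piece)
```

Because the host computes the destination itself, every piece of the request goes directly to the array that owns it, and a large sequential request is naturally spread across several arrays at once.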

Here is a summary of the key scale-out characteristics to consider carefully, along with their benefits:

(Table: Scale-Out Storage Characteristics)


Seven C’s of Scalable Storage Blog

Part 1: Capacity, Cache, and Compute
Part 3: Cost-Efficiency