“The wind and the waves are always on the side of the ablest navigator.”
— Edward Gibbon, Historian
As an IT professional focused on both critical applications and storage, you’re probably sick of warnings about the torrent of bits and bytes predicted to overwhelm your organization. No doubt about it, we’re in the midst of a data deluge. This three-part blog series offers comprehensive guidance on choosing, as well as managing, a storage system with the level of scalability that can keep your organization floating above a churning sea of data.
With storage, you will be faced with navigating the seven C’s of scalability: Capacity, Compute, Cache, Clustering, Configuration, Continuity – and the C most important to those in upper management: Cost Efficiency.
The seven C’s of storage scalability are all interconnected, and with the right storage platform you can navigate them successfully, supporting your organization as it faces uncertain data growth patterns and the changing performance demands of current and future critical applications.
Scaling Methodology and the Seven C’s
No storage project should begin on a low note, but here it is: Even with the best tools and the most comprehensive input from groups across your organization, it’s impossible to accurately predict how much your performance and capacity needs will change over time.
That’s why scalability is so important. A truly scalable storage solution can easily and cost-effectively adapt to a wide range of workloads, accommodate more end users, and house a greater amount of data. The following framework will allow you to determine whether a particular storage product offers the degree of flexibility you need.
The methodology is this: First, scale performance and capacity within a single array, and then scale both simultaneously via a scale-out cluster.
Almost every storage vendor will tell you how big and how quickly their system can grow. But how that scale is achieved varies significantly. The likelihood that you’ll smoothly scale across all C’s increases dramatically once you get a look at what’s under the hood of the array, and at the underlying file system and its set of scale-out capabilities.
The first C’s to consider are Capacity (predominantly disk-based), Cache, and Compute.
Yesterday’s storage architectures (which still underlie many of today’s storage products) offer only limited scalability. The reason is that with legacy storage systems, including those with bolted-on flash, disk speed and spindle count drive performance. Generating more IOPS means connecting more disk drives or shelves to the head unit. This type of scaling is simple, but ultimately wasteful, because capacity also scales whether you need the extra space or not. You’ll also need more power and cooling, which add cost. What started out as a simple upgrade can result in a swell of unused capacity and wasted OPEX.
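To make the coupling concrete, here is a rough back-of-the-envelope sketch in Python. The per-disk figures (150 IOPS and 4 TB per drive) and the function name are illustrative assumptions, not measurements from any particular product; the point is only that buying IOPS by the spindle drags capacity along with it.

```python
import math

def legacy_upgrade(extra_iops, extra_tb_needed=0.0,
                   iops_per_disk=150, tb_per_disk=4.0):
    """Estimate what a spindle-bound upgrade adds.

    iops_per_disk and tb_per_disk are assumed, illustrative values.
    """
    # Performance comes only from spindles, so IOPS dictates the disk count.
    disks = math.ceil(extra_iops / iops_per_disk)
    # Every added disk brings its capacity along, wanted or not.
    added_tb = disks * tb_per_disk
    unused_tb = max(0.0, added_tb - extra_tb_needed)
    return {"disks": disks, "added_tb": added_tb, "unused_tb": unused_tb}

# Hypothetical requirement: 6,000 more IOPS but only 20 TB more space.
result = legacy_upgrade(extra_iops=6000, extra_tb_needed=20.0)
print(result)
```

Under these assumed numbers, most of the capacity purchased to reach the IOPS target is space nobody asked for, and it still has to be powered and cooled.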
The latest generation of storage is built on some combination of disk (Capacity), flash (Cache), and controllers (Compute) driven by multi-core CPUs. Hybrid storage (flash combined with disk) and all-flash storage each take a different approach to leveraging flash, and each approach impacts scalability. The most crucial factor to consider is a product’s ability to scale performance and capacity independently. This helps ensure that storage resources are deployed with a minimum of waste. (Cost efficiency considerations will be addressed in greater detail later on.)
As mentioned earlier, performance and capacity are conjoined in legacy storage architectures. There’s no getting around it: Yesterday’s file systems simply can’t make up for disk’s cripplingly slow random write speeds, or lay out data for better performance.
On the other hand, all-flash arrays deliver massive performance, and offer flash and controller CPU upgrades. However, scaling capacity within an all-flash infrastructure can be expensive, depending on the effectiveness of data footprint reduction methods, such as compression and deduplication.
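As a simple illustration of why data footprint reduction matters so much here, the Python sketch below divides a raw price per terabyte by a reduction ratio. The price and the ratios are hypothetical placeholders, and real-world ratios vary widely by workload.

```python
def cost_per_effective_tb(raw_cost_per_tb, reduction_ratio):
    """Cost of storing one TB of data after compression and deduplication.

    reduction_ratio is logical data size divided by physical space used,
    e.g. 4.0 for a 4:1 reduction.
    """
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return raw_cost_per_tb / reduction_ratio

# Hypothetical raw flash price per TB, in arbitrary currency units.
raw_cost = 2000.0
for ratio in (1.0, 2.0, 5.0):
    cost = cost_per_effective_tb(raw_cost, ratio)
    print(f"{ratio}:1 reduction -> {cost:.0f} per effective TB")
```

If a workload compresses or deduplicates poorly, the ratio stays close to 1:1 and every terabyte of growth is paid for at the raw flash price.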
Nimble Storage’s CASL™ (Cache-Accelerated Sequential Layout) architecture enables independent performance and capacity scaling because it decouples performance from disk. Although it incorporates flash and disk in a single system, CASL is a CPU-driven architecture. Its unique data layout sequentializes random write data and writes it to disk in full stripes, taking full advantage of disk’s excellent sequential write performance. And, it makes efficient use of disk space, which is easily expanded through the addition of disk shelves. CASL leverages flash as a cache for active data to accelerate read operations. It, too, can flexibly scale to accommodate entire working sets across different applications by adding higher-density SSDs or even an all-flash expansion shelf. Controllers can be upgraded without disruption to those with more CPU cores to scale overall IOPS. (See Nimble’s product portfolio.)
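The idea of turning random writes into sequential, full-stripe writes can be sketched with a toy model in Python. This is a generic write-coalescing illustration under simplifying assumptions, not CASL’s actual implementation; the class name, the four-block stripe size, and the in-memory "disk" are all hypothetical.

```python
class StripeCoalescer:
    """Toy model: buffer random block writes and flush them as full stripes."""

    def __init__(self, blocks_per_stripe=4):
        self.blocks_per_stripe = blocks_per_stripe
        self.buffer = []   # pending (logical_block, data) pairs
        self.log = []      # stripes appended sequentially to the "disk"
        self.index = {}    # logical_block -> (stripe_number, slot)

    def write(self, logical_block, data):
        # Random writes only touch the buffer; nothing seeks on disk yet.
        self.buffer.append((logical_block, data))
        if len(self.buffer) == self.blocks_per_stripe:
            self._flush()

    def _flush(self):
        # One sequential append of a full stripe replaces many random writes.
        stripe_number = len(self.log)
        self.log.append([data for _, data in self.buffer])
        for slot, (block, _) in enumerate(self.buffer):
            self.index[block] = (stripe_number, slot)
        self.buffer = []

    def read(self, logical_block):
        # Newest pending write wins; otherwise follow the index to the log.
        for block, data in reversed(self.buffer):
            if block == logical_block:
                return data
        # Raises KeyError for a block that was never written.
        stripe_number, slot = self.index[logical_block]
        return self.log[stripe_number][slot]


coalescer = StripeCoalescer(blocks_per_stripe=4)
for block in (907, 12, 455, 3):          # scattered logical addresses
    coalescer.write(block, f"data-{block}")
coalescer.write(12, "data-12-v2")        # overwrite waits in the buffer
```

In this sketch the four scattered writes land on the simulated disk as a single stripe, and the index keeps track of where each logical block ended up. A real system would also need garbage collection for overwritten blocks, parity, and crash safety, all of which are omitted here.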
A storage solution with best-in-class scalability allows you to upgrade what you need, when you need it.
The table below summarizes the initial set of scalability characteristics that should be carefully considered when evaluating a particular solution’s scalability.
Seven C’s of Scalable Storage Blog