By Stephen Daniel – Product Management
One of the biggest challenges facing IT architects when planning a data center refresh is how to measure and compare the performance of different enterprise data storage arrays, especially when flash storage is involved.
For instance, consider an IT architect who wants to purchase 20 TB of usable storage, capable of running 50,000 low-latency I/O operations per second (IOPS). After researching the top-rated legacy and emergent vendors in the market, they select a vendor, set up a proof-of-concept (POC) trial, and run some tests with Iometer, a widely used storage measurement and characterization tool. At first, Iometer shows that they’re getting well under 50,000 IOPS, so the vendor’s systems engineer suggests half a dozen changes to the Iometer profile, and voilà: it’s now hitting 50,000 IOPS.
This is the quagmire facing the enterprise data storage buyer: all the prospective vendors are quoting IOPS, but the IOPS aren’t necessarily comparable. Does this mean that Iometer is useless for assessing the performance of data storage systems? No, it’s just another example of the difference between marketing performance numbers and a benchmark. Iometer can be configured to run a wide variety of different workloads by tuning the size of I/O operations, the ratio of reads to writes, the size of the data set, the number of worker threads, and the number of I/O operations outstanding, as well as by combining multiple workloads into a single run. In all cases, however, it reports IOPS.
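To make the comparability problem concrete, here is a minimal Python sketch that converts the same headline figure of 50,000 IOPS into data throughput for two different I/O sizes. The `WorkloadProfile` class and its field names are hypothetical, invented for this illustration (they are not Iometer settings or output), and the 70% read ratio and queue depth of 32 are assumed values chosen only as an example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadProfile:
    """Hypothetical description of one test run (not an Iometer format)."""
    name: str
    io_size_bytes: int      # size of each I/O operation
    read_fraction: float    # 1.0 means 100% reads (assumed example values)
    outstanding_ios: int    # I/Os kept in flight (assumed example value)


def throughput_mb_per_s(iops: float, io_size_bytes: int) -> float:
    """Data moved per second, in decimal megabytes, for a given IOPS figure."""
    return iops * io_size_bytes / 1_000_000


profiles = [
    WorkloadProfile("512-byte, 100% read", 512, 1.0, 32),
    WorkloadProfile("8KB, 70% read", 8 * 1024, 0.7, 32),
]

for profile in profiles:
    mb = throughput_mb_per_s(50_000, profile.io_size_bytes)
    print(f"{profile.name}: 50,000 IOPS moves {mb:.1f} MB/s")
```

Both runs would be reported as "50,000 IOPS", yet the 8KB profile moves sixteen times as much data per second as the 512-byte profile, and that is before accounting for the read/write mix.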
All IT architects need benchmarks – well-defined workloads with realistic results, presented in a standard way. That’s what drove the major vendors in the data storage industry to form the Storage Performance Council (SPC), a multi-vendor council that creates and administers rigorous benchmarks of storage system performance.
By standardizing a workload and requiring both full disclosure of how a benchmark is run and complete pricing of the storage system that was tested, the SPC strives to create a body of results that are realistic, comparable, and above all, useful.
For instance, the SPC benchmark neutralizes some of the common ways that vendors might try to “game” the system, such as:
- Overprovisioning storage by using only a few percent of available space. Almost any storage system will perform better when there is lots of free space. Spinning disks will have shorter average seek times, and flash storage will spend less bandwidth on free space management and wear leveling.
- Unrealistic dedupe or compression ratios (e.g. 20:1 space savings). Reducing the data through compression or deduplication allows a storage system to manage larger amounts of data with less bandwidth. Measuring performance with a highly compressible or highly duplicated data set will give unrealistic results.
- Tiny data sets. Most storage arrays have at least one level of cache. By running a benchmark where most of the data fits in the array’s main memory it is possible to demonstrate very high performance, but real-world applications will require data sets much larger than can fit in the array’s main memory.
- Workloads that are 100% reads, or use 512-byte I/O sizes. Typical applications use I/O sizes that are 8KB or larger. By using a very small block size vendors can demonstrate much higher I/O rates. Furthermore, for many storage arrays reads are significantly faster than writes, so by using 100% read tests it is possible to show performance that is higher than most read/write applications will achieve.
- Workloads that are 100% sequential, yet count each block transferred as a separate I/O “operation”. I’ve seen some benchmarks where a vendor did 64KB sequential reads, but counted each 8KB block as a separate read operation. Since a single 64KB read is much less expensive than eight 8KB reads, and since sequential reads cost less than random reads, the resulting IOPS figure overstates what the array would deliver on a random 8KB workload.
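The arithmetic behind that last trick is easy to sketch. The Python snippet below is only an illustration: the function name `reported_ops` is made up for this example, and the rate of 10,000 requests per second is an assumed figure, not a measurement from any real array.

```python
def reported_ops(actual_requests: int,
                 request_size_bytes: int,
                 counting_unit_bytes: int) -> int:
    """Operations claimed when each request is counted in smaller blocks."""
    if request_size_bytes % counting_unit_bytes != 0:
        raise ValueError("request size must be a multiple of the counting unit")
    blocks_per_request = request_size_bytes // counting_unit_bytes
    return actual_requests * blocks_per_request


# Assumed example: the array actually services 10,000 sequential
# 64KB reads per second, but each 8KB block is counted as one "operation".
actual = 10_000
claimed = reported_ops(actual, 64 * 1024, 8 * 1024)

print(f"Requests actually issued per second: {actual}")
print(f"'IOPS' reported per second:          {claimed}")
print(f"Inflation factor:                    {claimed // actual}x")
```

The inflation factor here is purely a counting artifact. It says nothing about how the array would behave if it really had to service eight separate random 8KB reads for each of those requests.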
Nimble Storage is active in industry standards-setting groups, and I’ve personally been a longstanding participant in SPC, working on enhancing the council’s benchmarks for storage arrays with built-in compression and deduplication. I’m honored to have recently been elected chair of the SPC steering committee, where I now focus on making the body of public SPC results broader and more useful.
Naturally, I’d love to see IT architects simplify their purchasing problems by requiring “SPC-IOPS”, not merely “IOPS”, in their requests for proposals (RFPs). But beyond that, for storage benchmarks to be as useful as possible, they must constantly reflect the best ideas contributed by the industry’s smartest storage experts. I invite you to send me an email (firstname.lastname@example.org) and share your insights on what storage benchmarks mean to you, and how we can keep advancing the state of the art.