Are SSD-based Arrays a Bad Idea?

Last week Robin Harris posted a note titled “Are SSD-based arrays a bad idea?”. Here is my take on the topic.

Robin argues that, in storage arrays, using flash in SSD form with SAS/SATA interconnect is only a short-term opportunity, while the real strategy is to move to “raw flash” with closer integration with the motherboard.

This argument may rest on the view that flash is like DRAM. In reality, flash sits somewhere between DRAM and hard disk and has many idiosyncrasies of its own. For example, it is page-addressable, not byte-addressable, and it needs wear leveling to mitigate its limited write endurance.

Let’s consider the five dimensions that were used to compare SSD and raw flash:

  • Latency. The latency added by a SAS HBA is small compared to the rest of the ecosystem in networked storage. The HBA seems to add a few tens of microseconds, while a roundtrip between the application and networked storage alone takes about 100 microseconds. Furthermore, most enterprise applications issue multiple I/O requests in parallel, and are therefore more sensitive to I/O throughput (measured in IOPS) than to the latency of a single outstanding request.
  • SSD bandwidth. This is even less of an issue than latency. Drive interconnects and HBAs provide bandwidths of many GB/s and are highly unlikely to be a bottleneck in the vast majority of storage ecosystems and applications.
  • Reliability. There are good reasons to be able to replace failed flash devices the same way hard disks can be hot-swapped. The raw bit error rate (RBER) of flash is actually worse than that of hard disks, and it degrades as blocks are rewritten. It is also getting worse as manufacturers push density higher. (See this paper from FAST 2012: The Bleak Future of NAND Flash and a related blog post.)
  • Cost. Robin states that an SSD costs 50–100% more than raw flash. This margin comes not from its SAS/SATA interface but from its embedded flash controller. Without this controller, the host would be entirely responsible for operations such as wear leveling, which are data-intensive and can consume significant bus/memory bandwidth. Thus an on-board controller is beneficial, despite the extra cost.
  • Flexibility. The SSD form factor allows hot swaps not just for replacing failed drives, but also for expanding flash capacity.
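To put the latency argument above in rough numbers, here is a back-of-the-envelope sketch. The roundtrip figure comes from the text; the HBA and flash-read values are illustrative assumptions, not measurements:

```python
# Rough latency budget for a read against networked storage.
# network_roundtrip_us is the figure quoted in the text; the other
# two values are illustrative assumptions for this sketch.

network_roundtrip_us = 100   # app <-> networked storage roundtrip
hba_overhead_us = 20         # "a few tens of microseconds" added by a SAS HBA
flash_read_us = 100          # assumed NAND page-read latency

total_us = network_roundtrip_us + hba_overhead_us + flash_read_us
hba_share = hba_overhead_us / total_us
print(f"HBA share of end-to-end latency: {hba_share:.0%}")  # ~9%
```

Under these assumptions the HBA contributes under a tenth of the end-to-end latency, which is why removing it buys little in a networked array.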

It is likely that new attachment technologies will evolve that capture the benefits of the SSD form factor without the concomitant overheads of SAS/SATA. It’s just that those overheads are relatively small today, and the benefits of the SSD form factor outweigh the drawbacks.

Furthermore, as new attachment technologies become cost effective, it will not be difficult for storage vendors to adopt them.

So, while this debate on SSD vs. raw flash is interesting, what matters more over the longer term is how data is laid out. That is a bigger debate, framed by a few important questions:

  • Use only flash or a hybrid combination of flash and disk?
  • If using a hybrid of flash and disk, use flash as an end point of storage or as cache?

Pure flash is like pure gold: dazzling, but unsuitable for the vast majority of practical uses. It needs to be “hardened” with hard disks, for several reasons:

  • Even low-end flash SSDs are about 20x costlier in $/GB than enterprise-class hard disks ($1.6/GB vs. $0.08/GB). Most flash-only systems use flash SSDs that are at least twice as costly. Add the cost of overprovisioning to reduce garbage-collection overhead, and the cost of extra RAID parity to protect against failures, and you are talking 50x costlier than disk.
  • The long-term reliability of flash is unproven, given concerns around its high raw bit error rate, limited write endurance, and limited retention. And these metrics are getting worse as flash manufacturers are increasing density. I would not trust flash with the only copy of my business’s data.
  • Flash shines on random reads, with roughly 100x lower latency than hard disk. However, it is lackluster on random writes: raw flash takes over 10 ms to erase and rewrite a block. In practice, SSD vendors mitigate this problem somewhat using write coalescing and overprovisioning, but even that comes at a price. Finally, the sequential bandwidth of flash SSDs is in the same ballpark as hard disk, and perhaps a little worse on a per-dollar basis. So flash is no panacea for all storage problems.
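The cost multipliers in the first bullet can be reproduced with simple arithmetic. The $/GB figures are the ones quoted above; the combined overhead factor for overprovisioning plus extra parity is an assumed illustration:

```python
# Back-of-the-envelope $/GB comparison using the article's figures.
# The overhead factor is an assumption for illustration only.

hdd_per_gb = 0.08                       # enterprise-class hard disk
low_end_ssd_per_gb = 1.6                # low-end flash SSD
enterprise_ssd_per_gb = 2 * low_end_ssd_per_gb  # "at least twice as costly"

# Assumed combined capacity overhead: overprovisioning for garbage
# collection plus extra RAID parity (e.g. 25% total).
overhead_factor = 1.25

effective_ssd_per_gb = enterprise_ssd_per_gb * overhead_factor

print(f"Low-end SSD vs HDD:   {low_end_ssd_per_gb / hdd_per_gb:.0f}x")   # 20x
print(f"Effective SSD vs HDD: {effective_ssd_per_gb / hdd_per_gb:.0f}x")  # 50x
```

With roughly these assumptions, the headline 20x gap widens to about 50x once enterprise-grade media and capacity overheads are priced in.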

Will flash ever completely replace the hard disk in primary storage? Maybe. Surely, storage scientists will someday develop a solid-state replacement for spinning media that offers the right cost economics and effectively addresses the limitations above.

For today, the answer is hybrid storage. Customers vote with their pockets. And for most customers and their mainstream applications, it is all about getting the best performance at an affordable price point with the right level of data protection.