by Umesh Maheshwari – Co-founder and CTO
If you follow storage, you know that using flash memory in datacenter storage is a hot topic. After all, a flash drive has no moving parts and is a hundred times faster than a hard disk on random reads. All major storage vendors have released products that incorporate flash in some form, and most storage startups showcase flash as a key innovation.
So, it was not surprising that flash had the largest share of papers in FAST 2012 (a research conference on file and storage technologies): 9 out of 26 total. This was ahead of older favorites such as deduplication (4 papers) and cloud storage (also 4 papers).
Most of the flash-related papers were about how to optimize for flash, but one caught everyone’s attention with its title, “The Bleak Future of NAND Flash Memory” [PDF], authored by folks from UC San Diego and Microsoft Research. They note that, as the density of flash increases, its reliability, endurance, and performance are all expected to decline. This is particularly alarming for write endurance and write performance, which are already major causes for concern with flash.
Actually, this trend has been known for some time. For example, SLC (1 bit per cell) flash has a write endurance of about 100K cycles, MLC (2 bits per cell) with 50nm features about 10K cycles, MLC with 30nm features about 5K cycles, MLC with 20nm features about 3K cycles, and TLC (3 bits per cell) only 1K cycles. While this much has been known, the authors corroborated this trend with many more data points, and discussed the fundamental reasons behind it. Then they put their conclusion in blunt words—calling out the emperor’s new clothes.
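To see what these endurance figures mean in practice, here is a back-of-the-envelope sketch of drive lifetime under a sustained write load. The capacity, daily write volume, and write-amplification factor are illustrative assumptions, not figures from the post or from any vendor:

```python
# Illustrative arithmetic only: roughly how long a flash device lasts
# under a sustained write load, given per-cell endurance cycles.
# The workload numbers below are hypothetical, not vendor specs.

def drive_lifetime_years(capacity_gb, pe_cycles, write_gb_per_day,
                         write_amplification=2.0):
    """Total data the cells can absorb, divided by the daily write load."""
    total_writes_gb = capacity_gb * pe_cycles / write_amplification
    return total_writes_gb / write_gb_per_day / 365

# A hypothetical 200 GB drive absorbing 500 GB of writes per day:
slc = drive_lifetime_years(200, 100_000, 500)  # SLC, ~100K cycles
tlc = drive_lifetime_years(200, 1_000, 500)    # TLC, ~1K cycles
print(f"SLC: ~{slc:.0f} years, TLC: ~{tlc:.1f} years")
```

Under these assumptions, the 100x difference in endurance cycles translates directly into a 100x difference in lifetime: decades for SLC versus months for TLC at the same write rate.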
To put this in perspective: storage architects are used to lamenting that hard disks have less than doubled in performance while increasing in capacity fortyfold over the last decade. Current projections say that flash, for a similar increase in capacity, is going to lose a quarter of its performance and much of its endurance.
The other flash-related papers at FAST are titled more optimistically, but, if one digs deeper, one finds that many of them are about overcoming the peculiarities of flash. For example, there is one on “Optimizing NAND Flash-Based SSDs via Retention Relaxation” [PDF]. It reminds us that, besides having limited write endurance, flash has limited read retention. In general, the retention is long enough (1–10 years) that it can be handled through periodic wear leveling. The paper proposes that, in applications that can tolerate lower retention, retention can be traded off for faster writes. It is already known that retention can be traded off for higher endurance.
Collectively, all this leads to a sobering conclusion: there are stark tradeoffs between flash density, endurance, performance, and retention. And none of these seems dispensable in a medium for storing enterprise data.
So, can flash replace the hard disk in the data center? For at least the next 5 years, and for the vast majority of applications, it is unlikely. It is one thing to replace the hard disk in laptops, and another to do the same in datacenters, which generate far more random writes and demand higher reliability.
Will flash ever completely replace the hard disk? Maybe. Surely, storage scientists will someday develop a solid-state replacement for spinning media. We all hope that it happens sooner rather than later. Whether that replacement is flash or some other solid-state technology still remains to be seen.
Where does this leave us? The good news is that there is one design approach that leverages the strengths of flash while working around its limitations and tradeoffs: use disk as the endpoint of storage, and flash as a large read cache, holding a subset of data already stored on disk.
In such a system, the questionable reliability of flash can be overcome by adding a checksum to each object stored on it. If the checksum does not match, the system can just remove the object from the cache and fall back to the disk subsystem. Moreover, because flash serves only as a read cache, the projected declines in flash write performance have no impact on the overall system write performance.
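The read path of such a design can be sketched in a few lines. This is a hypothetical illustration, not an actual product implementation: the class and its in-memory dictionaries stand in for real flash and disk subsystems, and CRC32 stands in for whatever checksum a real system would use.

```python
import zlib

# Hypothetical sketch of the read path described above: disk is the
# authoritative store; flash holds checksummed copies of a subset of
# the data. A checksum mismatch simply evicts the cached object and
# falls back to the disk copy.

class ChecksummedFlashCache:
    def __init__(self, disk):
        self.disk = disk   # authoritative backing store (dict-like here)
        self.flash = {}    # cache: key -> (checksum, data)

    def read(self, key):
        entry = self.flash.get(key)
        if entry is not None:
            checksum, data = entry
            if zlib.crc32(data) == checksum:
                return data              # verified cache hit
            del self.flash[key]          # corruption: drop the cached copy
        data = self.disk[key]            # fall back to disk...
        self.flash[key] = (zlib.crc32(data), data)  # ...and repopulate
        return data

# Usage: a corrupted cache entry is detected and healed from disk.
cache = ChecksummedFlashCache({"obj": b"hello"})
cache.read("obj")                       # populates the flash cache
cache.flash["obj"] = (0, b"garbage")    # simulate flash corruption
assert cache.read("obj") == b"hello"    # checksum fails, disk wins
```

The key property is that flash never holds the only copy of any data, so a flash error can never lose data; it only costs one disk read.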
- Umesh Maheshwari