All Flash: Data Reduction Beyond Dedupe
By Jeff Feierfeil – Product Management

In comparing all flash arrays, one important factor is data deduplication, or dedupe, a technique for eliminating redundant data in a data set so that only one unique instance of the data is retained. But dedupe is only one part of the broader topic of data reduction, and to really compare different storage solutions it’s important to understand all these parts, and how they fit together.

In particular, there are four main components in Nimble’s approach to data reduction:

  • Deduplication
  • Inline compression
  • Zero-pattern elimination
  • Zero-copy clones

Here’s a quick look at each of these four technologies, and how they combine to make the effective capacity of Nimble arrays so much larger than the usable capacity or the raw capacity.
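To make the relationship between raw, usable, and effective capacity concrete, here is a minimal back-of-the-envelope sketch in Python; the capacity figure, overhead fraction, and 4:1 reduction ratio are made-up examples for illustration, not Nimble specifications:

```python
# Illustrative numbers only: the overhead fraction and reduction ratio
# below are assumptions, not Nimble specifications.

def effective_capacity(raw_tb: float, overhead_fraction: float,
                       reduction_ratio: float) -> float:
    """Raw capacity minus RAID/metadata overhead gives usable capacity;
    usable capacity times the data-reduction ratio gives effective capacity."""
    usable_tb = raw_tb * (1.0 - overhead_fraction)
    return usable_tb * reduction_ratio

# Example: 100 TB raw, 25% overhead, 4:1 combined dedupe + compression.
print(effective_capacity(100.0, 0.25, 4.0))  # 300.0
```

The better the data reduction technologies below work together, the larger that final multiplier becomes.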


Deduplication

Nimble’s approach to deduplication is characterized by unparalleled performance, consistency, metadata efficiency, locality awareness, and granular space reporting:

Performance – Nimble has designed inline, always-on, performance-optimized, variable-block dedupe.

  • “Inline” refers to the fact that dedupe happens prior to compression (which is where you want it), and before de-staging to flash.
  • “Always-on” means sustained performance over large file ingests (such as vMotions), without resorting to stopping or deferring deduplication.
  • “Performance optimized” refers to our deduplication algorithm, which has been engineered to use rapid, memory-efficient duplicate detection.
  • “Variable block dedupe” requires less metadata (which is good) by adapting the detection size to the application block size, thereby improving efficiency and speed.
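Nimble’s actual dedupe engine is proprietary, but the general idea behind inline, fingerprint-based duplicate detection can be sketched in a few lines of Python. Fixed-size blocks are used here for simplicity; variable-block sizing and the on-disk layout are omitted:

```python
import hashlib

def write_block(store: dict, data: bytes) -> str:
    """Inline dedupe sketch: fingerprint each block before it is written,
    and store the payload only if the fingerprint has not been seen."""
    fp = hashlib.sha256(data).hexdigest()
    if fp not in store:        # duplicate detection via an in-memory index
        store[fp] = data       # unique block: written exactly once
    return fp                  # the caller keeps only the fingerprint

store = {}
refs = [write_block(store, b) for b in (b"A" * 4096, b"B" * 4096, b"A" * 4096)]
print(len(store), refs[0] == refs[2])  # 2 True
```

Because the duplicate check happens before anything is de-staged, the repeated block never consumes a second copy on flash.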

Consistency – Regardless of how “dedupable” the data may be, Nimble’s architecture maintains absolute performance consistency. The performance of other all flash arrays can vary wildly depending on how well the incoming data dedupes.

Metadata efficiency – Nimble’s lightweight garbage collection process enables us to maintain consistent latency, regardless of how full the system may be. This is important, as there will be far fewer metadata updates with Nimble than with competing architectures, particularly in cases with high dedupe ratios, such as VDI (Virtual Desktop Infrastructure). This approach also minimizes our memory footprint, allowing dramatically higher capacity scalability and lower $/GB compared with other AFA vendors.

Locality awareness – One way Nimble optimizes deduplication is by loading adjacent fingerprints into memory. For instance, in a VDI environment the blocks adjacent to those associated with a specific application are typically more likely than average to dedupe. Other applications, such as databases, generally see little to no dedupe savings, so we can adapt our dedupe heuristics accordingly.

Granular space reporting – Space savings can be reported per application category, providing more insight into how well an individual application’s volume(s) are contributing to the overall deduplication rate. This lets you separate compression and cloned savings from deduplication savings, so it becomes instantly recognizable which data reduction technology is contributing most to overall space savings. Now you have all the data necessary to decide whether to enable or disable dedupe on a per-application basis.
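As a rough illustration of per-category space reporting, here is a minimal Python sketch; the volume records and byte counts are invented examples, and a real report would also break out compression and clone savings separately:

```python
def savings_report(volumes: list) -> dict:
    """Aggregate logical vs. stored bytes per application category and
    report the data-reduction ratio for each category."""
    totals = {}
    for vol in volumes:
        cat = totals.setdefault(vol["category"], {"logical": 0, "stored": 0})
        cat["logical"] += vol["logical_bytes"]
        cat["stored"] += vol["stored_bytes"]
    return {name: t["logical"] / t["stored"] for name, t in totals.items()}

report = savings_report([
    {"category": "VDI", "logical_bytes": 10_000, "stored_bytes": 1_000},
    {"category": "Database", "logical_bytes": 10_000, "stored_bytes": 5_000},
])
print(report)  # {'VDI': 10.0, 'Database': 2.0}
```

A report like this is what makes the per-application enable/disable decision an informed one rather than a guess.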

All dedupe designs come with some memory and performance tax. That hit may not be readily apparent when there’s no way to disable dedupe, but there’s still an impact. Nimble All Flash arrays have dedupe enabled globally, by default, for the entire pool; however, customers wanting the ultimate in performance can disable dedupe at any time on a per-volume or per-application-type basis. This means there is no mandatory dedupe tax on volumes that gain no meaningful dedupe benefit.

Finally, Nimble’s dedupe architecture leverages the real world insight that most meaningful savings come from like data types (e.g. VMDKs vs. VMDKs, files vs. files). Dedupe is faster and more efficient because index updates aren’t scattered as randomly, and thus less dependent on memory-rich and CPU-heavy hardware. The net result is being able to support larger flash capacities than competing architectures, but with less DRAM.

Inline Compression

Nimble’s always-on universal compression offers a field-measured average 2x benefit, according to InfoSight data, on many application workloads, including databases. Our variable-size blocks enable fast inline compression and eliminate the read-modify-write penalty, allowing compression to work on all applications.
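The general pattern of inline compression, keeping the compressed form only when it actually saves space, can be sketched with Python’s standard zlib module. This illustrates the technique in general, not Nimble’s implementation:

```python
import zlib

def compress_block(data: bytes):
    """Inline compression sketch: keep the compressed form only if it is
    actually smaller, so incompressible blocks cost no extra space."""
    packed = zlib.compress(data)
    return (packed, True) if len(packed) < len(data) else (data, False)

block = b"the same row repeated " * 200       # highly compressible payload
stored, was_compressed = compress_block(block)
print(was_compressed, len(stored) < len(block))  # True True
```

Compressing before the write lands on flash is also what avoids the read-modify-write penalty the text mentions: the block is reduced once, in the write path, rather than rewritten later.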

Zero-Pattern Elimination

Quite simply, if there are patterns of zeros, we eliminate having to actually write them, and instead simply refer to them using a lightweight reference. This is a relatively small optimization, but it can make a measurable difference in workloads like databases, which may contain lots of zeros.
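Zero-pattern elimination can be illustrated with a short Python sketch that replaces all-zero blocks with a lightweight marker instead of writing them out; the "ZERO" marker and 4 KB block size here are illustrative choices, not Nimble internals:

```python
ZERO_MARKER = "ZERO"  # lightweight reference standing in for a zero block

def write_with_zero_elimination(data: bytes, block_size: int = 4096) -> list:
    """Split the stream into blocks and substitute a marker for any block
    that is entirely zeros, so no zero data is actually written."""
    zero = bytes(block_size)
    return [ZERO_MARKER if data[i:i + block_size] == zero
            else data[i:i + block_size]
            for i in range(0, len(data), block_size)]

layout = write_with_zero_elimination(b"\x00" * 8192 + b"data" + b"\x00" * 4092)
print(layout.count(ZERO_MARKER), len(layout))  # 2 3
```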

Zero-Copy Clones

Snapshots and clones are existing, core technologies in Nimble’s CASL file system, and do not require heavyweight metadata operations. This delivers space savings and supports a higher number of snapshots and clones without the additional burden of large amounts of costly DRAM to hold that metadata, and without any impact on the garbage collection process.

Zero-copy clones allow instantly cloning a volume without actually copying the data. They are ideal for use in virtualization / VDI and test & development, and make efficient use of storage capacity: since only changes need to be saved, savings of as much as 90% to 95% can be realized. Our customers use them extensively to accelerate the dev / test of database application environments.
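The redirect-on-write mechanics behind a zero-copy clone can be sketched in Python; the Volume class and block-map representation here are illustrative assumptions, not CASL’s actual data structures:

```python
class Volume:
    """Zero-copy clone sketch: a clone shares the parent's block map and
    diverges one block at a time as blocks are overwritten."""
    def __init__(self, block_map: dict):
        self.block_map = block_map        # logical block -> stored block id

    def clone(self) -> "Volume":
        # No data blocks are copied: only the small block map is duplicated.
        return Volume(dict(self.block_map))

    def write(self, lba: int, new_block_id: str) -> None:
        # Redirect-on-write: new data lands in a new stored block.
        self.block_map[lba] = new_block_id

parent = Volume({0: "blk0", 1: "blk1"})
child = parent.clone()                    # instant: no data movement
child.write(1, "blk2")                    # only changed blocks consume space
print(parent.block_map[1], child.block_map[1])  # blk1 blk2
```

Because the clone only records mappings, the 90%–95% savings figure falls out naturally: unchanged blocks are stored once no matter how many clones reference them.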

All in all, when you look closely at the specifications for flash arrays from different vendors, it’s clear that Nimble provides higher effective capacity for the same amount of an array’s raw capacity. That’s a key value differentiator, and a direct result of intelligently integrating all these data reduction technologies into a single system.