A Comparison of Filesystem Architectures

Nimble Storage uses a new filesystem architecture called CASL™ (Cache Accelerated Sequential Layout). As I described in a previous post, CASL was designed from the ground up to provide a powerful combination of capacity and performance optimizations.

There is no dearth of well-designed filesystems. The vast majority of filesystems are “write in place” (WIP). When an application updates a block, the filesystem overwrites the block’s existing location on disk.

CASL belongs to a different class of filesystems that may be called “write in free space” (WIFS). Two well-known examples of this class are NetApp’s WAFL and Sun’s ZFS. In this post, I will describe how CASL provides all the benefits of WIFS while overcoming the shortcomings of existing WIFS filesystems.

A WIFS filesystem does not overwrite blocks in place. Instead, it redirects each write to free space, and updates its index to point to the new location.  This enables the filesystem to coalesce logically random writes into a physically sequential write on disk.
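
To make the redirect-and-reindex idea concrete, here is a minimal sketch in Python. It is a toy model, not how any real filesystem is implemented: the class name `RedirectingStore`, the list used as "disk", and the dictionary used as the index are all assumptions made for illustration.

```python
class RedirectingStore:
    """Toy write-in-free-space store: an append-only log plus an index."""

    def __init__(self):
        self.log = []    # stands in for the disk; list position = physical address
        self.index = {}  # logical block number -> physical address

    def write(self, logical_block, data):
        # Never overwrite in place: append to free space, then repoint the index.
        self.log.append(data)
        self.index[logical_block] = len(self.log) - 1

    def read(self, logical_block):
        return self.log[self.index[logical_block]]


store = RedirectingStore()
for block in (907, 12, 455):      # logically random block numbers
    store.write(block, f"v1 of {block}")
store.write(12, "v2 of 12")       # an update also goes to new free space
```

Although the logical block numbers are scattered, the physical addresses come out as 0, 1, 2, 3, which is the sense in which random writes become one sequential write. The old version of block 12 is still sitting at address 1, untouched.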

Furthermore, because WIFS does not overwrite the old versions of blocks, it provides a simple and efficient method to take snapshots. These snapshots are often called “redirect on write” (ROW). On the other hand, a WIP filesystem generally creates “copy on write” (COW) snapshots, wherein the first write to each block after a snapshot triggers a copy of the old version to a separate location.
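
The difference in write cost between the two snapshot styles can be sketched as follows. This is a simplified model under stated assumptions: the function names `row_write` and `cow_write` are hypothetical, a snapshot in the ROW case is modeled as a plain copy of the index, and only block writes are counted.

```python
def row_write(index, log, block, data):
    """Redirect on write: append the new version and repoint the live index."""
    log.append(data)
    index[block] = len(log) - 1
    return 1  # block writes performed


def cow_write(volume, preserved, block, data):
    """Copy on write: the first overwrite after a snapshot copies the old version out."""
    writes = 0
    if block not in preserved:
        preserved[block] = volume[block]  # extra read and write to save the old version
        writes += 1
    volume[block] = data                  # then overwrite in place
    return writes + 1


# ROW: taking the snapshot just freezes a copy of the index.
log = ["a0", "b0"]
index = {"a": 0, "b": 1}
snapshot = dict(index)
row_cost = row_write(index, log, "a", "a1")

# COW: the snapshot area starts empty and fills on first overwrite.
volume = {"a": "a0", "b": "b0"}
preserved = {}
cow_first_cost = cow_write(volume, preserved, "a", "a1")
cow_second_cost = cow_write(volume, preserved, "a", "a2")
```

In this model the ROW write always costs one block write, while the first COW write to a block after a snapshot costs two. Later COW writes to the same block cost one again.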

However, most existing WIFS filesystems such as WAFL and ZFS are “hole filling” in nature.  When most of the space is free, the filesystem is able to write in full stripes across the disk group, providing good performance.  Over time, as random blocks are overwritten and snapshots are deleted, free space gets fragmented into “holes” of various sizes, resulting in a Swiss cheese pattern. The filesystem redirects writes into these holes, resulting in random writes. Furthermore, even sequential reads of data written in this manner turn into random reads on disk.
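
A small sketch shows why hole filling turns one logical write into several physical ones. The free map, the first-fit placement policy, and the helper names `fill_holes` and `contiguous_runs` are assumptions for illustration. Real allocators are far more sophisticated.

```python
def fill_holes(free_map, n_blocks):
    """Place n_blocks into the first free slots; return the addresses used."""
    free_addrs = [addr for addr, is_free in enumerate(free_map) if is_free]
    return free_addrs[:n_blocks]


def contiguous_runs(addrs):
    """Count separate contiguous runs; each run costs roughly one seek."""
    return sum(1 for i, a in enumerate(addrs) if i == 0 or a != addrs[i - 1] + 1)


fresh = [True] * 16                     # mostly empty filesystem
aged = [i % 3 == 0 for i in range(16)]  # Swiss cheese: every third slot is free

fresh_runs = contiguous_runs(fill_holes(fresh, 4))
aged_runs = contiguous_runs(fill_holes(aged, 4))
```

On the fresh layout the four blocks land in a single run. On the aged layout they land in four separate runs, and reading them back in logical order needs the same four seeks.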

Some WIFS filesystems attempt to overcome these shortcomings, for example by periodically defragmenting the free space. However, the process is heavyweight and does not ensure sequentiality. ZFS attempts to reduce the impact of hole filling on parity updates by fitting a full RAID stripe within the hole. However, this RAID stripe does not span the whole disk group, so it still results in random writes and reads.

CASL is a WIFS filesystem, but it is NOT hole filling.  It always writes in full stripes spanning the whole disk group.  It employs a lightweight sweeping process to consolidate small holes into free full stripes. Its internal data structures are designed ground-up to run sweeping efficiently, and it caches these data structures in flash for additional speed.
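
The general idea of sweeping can be sketched as below. This is not CASL's actual algorithm or data structures, which the post does not describe. The stripe width `STRIPE_SLOTS`, the use of `None` to mark a hole, and the function name `sweep` are all assumptions for illustration.

```python
STRIPE_SLOTS = 4  # assumed stripe width in blocks, for illustration only


def sweep(stripes):
    """Repack live blocks from holey stripes into full stripes.

    stripes: list of lists, where None marks a hole (a dead block).
    Returns (new_layout, free_stripes): the stripes still in use, and how
    many of the original stripes are now wholly free.
    """
    full = [s for s in stripes if None not in s]
    live = [b for s in stripes if None in s for b in s if b is not None]
    packed = [live[i:i + STRIPE_SLOTS] for i in range(0, len(live), STRIPE_SLOTS)]
    for stripe in packed:
        # Toy simplification: pad a short tail stripe with holes. A real system
        # would presumably combine leftover blocks with incoming writes instead.
        stripe.extend([None] * (STRIPE_SLOTS - len(stripe)))
    free_stripes = len(stripes) - len(full) - len(packed)
    return full + packed, free_stripes


layout = [
    ["a", "b", "c", "d"],
    ["e", None, None, "f"],
    [None, "g", None, None],
    [None, None, "h", None],
]
new_layout, free_stripes = sweep(layout)
```

Here three holey stripes holding four live blocks collapse into one full stripe, leaving two stripes completely free for future full-stripe writes.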

An important side-benefit of always writing in full stripes is that CASL can coalesce blocks of different sizes into a stripe. Among other significant benefits, this enables a particularly efficient and elegant form of compression. The resulting layout is shown below.
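
As a rough illustration of packing variable-size blocks into a stripe, the sketch below compresses each block on its own with Python's standard `zlib` module and lays the results back to back. The on-disk format is not described in this post, so the per-stripe index of `(offset, length)` pairs and the function names `pack_stripe` and `read_block` are assumptions.

```python
import zlib


def pack_stripe(blocks):
    """Compress each block independently and lay the results back to back."""
    stripe = bytearray()
    index = {}  # block id -> (offset, compressed length) within the stripe
    for block_id, data in blocks.items():
        compressed = zlib.compress(data)
        index[block_id] = (len(stripe), len(compressed))
        stripe += compressed
    return bytes(stripe), index


def read_block(stripe, index, block_id):
    """Decompress one block without touching its neighbors."""
    offset, length = index[block_id]
    return zlib.decompress(stripe[offset:offset + length])


blocks = {
    1: b"A" * 4096,
    2: bytes(range(256)) * 16,
    3: b"row," * 1024,
}
stripe, index = pack_stripe(blocks)
```

Because each block is compressed separately and takes exactly the space it needs, updating one block later only means writing a new compressed copy into a future stripe. No neighboring block has to be read or recompressed.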

On the other hand, hole-filling filesystems are forced to use less efficient mechanisms.  E.g., since they write in units of blocks and not full stripes, they may try to compress a bunch of successive blocks into a smaller number of slots. Now imagine what would happen if an application updates one of those blocks.  The filesystem would need to do a read-modify-write on the bunch: read the old bunch, decompress it into blocks, update the one block, and re-compress and re-write the bunch.
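
The read-modify-write cycle described above looks roughly like this in code. It is a sketch under stated assumptions: a fixed 4 KB block size, a bunch modeled as a single `zlib` stream, and the hypothetical helper names `compress_bunch` and `update_block_in_bunch`.

```python
import zlib

BLOCK = 4096  # assumed block size in bytes


def compress_bunch(blocks):
    """Compress several successive blocks together as one unit."""
    return zlib.compress(b"".join(blocks))


def update_block_in_bunch(bunch, position, new_data):
    """Update one block; return the new bunch and how many blocks were processed."""
    raw = zlib.decompress(bunch)                                    # read and decompress everything
    blocks = [raw[i:i + BLOCK] for i in range(0, len(raw), BLOCK)]
    blocks[position] = new_data                                     # modify the one block
    return zlib.compress(b"".join(blocks)), len(blocks)             # recompress and rewrite everything


original = [bytes([i]) * BLOCK for i in range(8)]
bunch = compress_bunch(original)
new_bunch, blocks_processed = update_block_in_bunch(bunch, 3, b"z" * BLOCK)
```

Changing one block out of eight forces all eight to be decompressed and recompressed, which is the overhead the paragraph above describes.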

The read-modify-write has a big impact on performance, making compression in hole-filling filesystems unsuitable for workloads with random block updates, such as databases. In contrast, CASL supports compression with little impact on performance. This turns out to be a huge benefit, because databases often compress very well (2x to 4x).

Overall, CASL provides the best of both worlds: big capacity savings and consistently good performance, even for random workloads.