There’s a quiet shift underway in the IT landscape. No, not cloud computing; few would call that a quiet shift. It’s the trend away from traditional backup and DR toward something faster, simpler and lower cost: Extended Snapshots and Replication (ESR). IT practitioners talk about it. Analysts see a trend: for example, ESG found (table below) that small and mid-sized environments already commonly use this approach for VMs. Industry experts take some flak for calling it as they see it. Even folks historically linked with traditional backup acknowledge the shift. Naturally, vendors not best served by this trend argue vehemently against it. When you hear someone argue, “We could have offered this for years, but it’s just not the right approach,” make sure the real reason isn’t an inherent weakness of their underlying technology.
So what’s the fuss about? Let’s review a typical form of traditional backup and DR seen in a mid-sized enterprise, and contrast it with the ESR approach. We’ll skip archiving requirements, which have different solutions, and acknowledge some organizations have more specialized needs.
Traditional Backup and DR – Repeated Copying of Redundant Data
Backup software scans servers nightly for new data and bulk copies changed data to a dedicated backup device, today likely to be disk based (although tape still rules for archives). Scanning and copying are resource hogs, impacting servers, storage and networks, so they’re done during designated backup windows. Because of restore performance and reliability issues, incremental backups are supplemented with massive weekly full copies, which usually consume the weekends. Backup dedupe makes it more affordable to retain the 30-90 days of backups most organizations need. However, the bulky upfront copy means you can’t afford to back up too often, so Recovery Points are sparse: typical RPO is one day. And restores still take hours to reconstitute data from the full and incremental backups. Deduped disk backups do have the benefit of enabling WAN-efficient offsite replication. Once again though, Recovery Points are spread far apart, and restore times are long. Nor is there an option to run an application right off the DR copy; you need to restore to primary storage first.
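To make the restore chain and the one-day RPO concrete, here is a minimal sketch of the arithmetic. It assumes a weekly full backup on Saturday and one incremental on every other night; the function names and the Saturday schedule are illustrative assumptions, not taken from any particular backup product.

```python
from datetime import date, timedelta

SATURDAY = 5  # date.weekday() value; assumed day of the weekly full backup


def restore_chain(last_backup: date, full_weekday: int = SATURDAY) -> list[date]:
    """Backup sets needed for a restore: the latest full, then each later incremental."""
    days_since_full = (last_backup.weekday() - full_weekday) % 7
    last_full = last_backup - timedelta(days=days_since_full)
    return [last_full + timedelta(days=n) for n in range(days_since_full + 1)]


def worst_case_rpo_hours(backups_per_day: float) -> float:
    """Longest gap between recovery points, i.e. the most recent work that can be lost."""
    return 24 / backups_per_day


if __name__ == "__main__":
    chain = restore_chain(date(2024, 1, 5))  # last good backup was a Friday-night incremental
    print(f"{len(chain)} backup sets to replay, starting from the full on {chain[0]}")
    print(f"worst-case RPO: {worst_case_rpo_hours(1):.0f} hours")
```

Under these assumptions, a failure late in the week means replaying the weekend full plus every incremental since, which is one reason restores stretch into hours.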
Extended Snapshots and Replication Approach
The primary storage device captures near-instant, application-consistent snapshots on a predefined schedule (every few minutes, or once an hour) without affecting application performance. Efficient snapshot implementations are “un-duped and compressed” and reside on low cost disk, so you can afford the extended retention you need (say 30-90 days). A subset of these snapshots is replicated (say every hour) using very efficient replication to an offsite DR array, where they are retained for, say, 60 days. When needed, the entire application or a subset can be restored from snapshots within minutes. Applications can also run directly off the backup/DR copies without any format conversion. There are no backup windows to manage.
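As a rough illustration of what such a schedule buys you, the sketch below counts the recovery points retained under a few policies. The `SnapshotPolicy` class and the specific intervals are hypothetical, chosen to match the example figures above (hourly snapshots, 30 days local, 60 days replicated); real arrays expose their own scheduling interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnapshotPolicy:
    """Hypothetical policy: one recovery point every `interval_minutes`, kept for `retention_days`."""
    name: str
    interval_minutes: int
    retention_days: int

    def recovery_points(self) -> int:
        # Number of recovery points on hand once the retention window has filled up.
        return (self.retention_days * 24 * 60) // self.interval_minutes

    def worst_case_rpo_minutes(self) -> int:
        # The most recent data that can be lost is whatever was written since the last point.
        return self.interval_minutes


policies = [
    SnapshotPolicy("nightly backup", interval_minutes=24 * 60, retention_days=30),
    SnapshotPolicy("local hourly snapshots", interval_minutes=60, retention_days=30),
    SnapshotPolicy("replicated hourly snapshots", interval_minutes=60, retention_days=60),
]

if __name__ == "__main__":
    for p in policies:
        print(f"{p.name}: {p.recovery_points()} recovery points, "
              f"worst-case RPO {p.worst_case_rpo_minutes()} min")
```

The count says nothing about capacity; that depends on how efficiently the array stores changed blocks, which the sketch does not model.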
Comparing the Approaches
Here’s how each approach handles common failure scenarios:
Traditional backup has had the advantage of incumbency. IT shops are familiar with it, and backup software has supported this approach longer. However, IT shops hate traditional backup, and many are looking to change. Software vendors are also catching up in terms of managing snapshots. Finally, newer approaches have so dramatically improved the cost and simplicity of ESR that the contrast is more striking than ever:
In one case you have multiple devices juggling data, 3 data copies, and a lot of daily heavy lifting to reach barely acceptable recovery SLAs. With the other approach, you have 2 devices, 2 data copies (unsurprisingly at a lower cost), no daily backup windows or pain, and much faster, better recovery options.
Which would you choose?