By Nimesh Bhagat – Member of Technical Staff, and Shannon Loomis, PhD – Data Scientist
From its inception more than eight years ago, Nimble Storage has focused on the complete data storage solution – primary storage with integrated data protection. Key to this are snapshots (read-only copies of the data set frozen at a point in time) and replication (storing snapshots on multiple devices). But the real test for any data protection system is whether people actually use it.
We do a lot of research with InfoSight, the predictive analytics tool built into every Nimble array, and have gathered some interesting statistics showing that Nimble customers are using snapshots and replication to protect their data far better than most organizations.
Based on data from the arrays of Nimble’s 6,800+ customers:
- Nearly 70% of customers with multiple arrays are using snapshot-based replication as their backup strategy
- More than 95% of them replicate at least once per hour
- Close to 20% of them replicate every 15 minutes or less
- On average, our customers replicate over 60% of their volumes
Compared with previous industry-wide surveys on corporate backup practices, these numbers are above average, in some cases well above. So a good place to start is to ask whether your IT department is currently replicating its data four or more times per hour.
There are a number of reasons why Nimble customers have so aggressively implemented replication-based data protection:
- Speed – Snapshots are created almost instantly, with no reduction in app performance while they are being stored and very little capacity overhead.
- Ease of Use – You don’t need a PhD in storage. The setup wizard really speeds implementation, especially for customers with multiple arrays.
- Value – Assign performance-sensitive workloads to all-flash service levels, assign routine business apps to a combination of flash and disk, and still have practically unlimited disk capacity for backup. All with a single platform, a single operating system. Plus, there’s no need to invest in special appliances or software for backup.
- Efficiency – Our snapshot implementation is efficient enough that protecting all the data in a volume takes only the data that changed, about 3% of the volume per day (a daily change rate of 3%). This keeps replication times so short that customers can protect hundreds of volumes every day with a very small RPO (recovery point objective).
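To put the 3% daily change rate in perspective, here is a rough back-of-the-envelope sketch. The function names, the 1 TiB volume, the hourly schedule and the 100 Mbps link are all illustrative assumptions, not figures from InfoSight, and the estimate ignores compression and protocol overhead.

```python
def daily_changed_gib(volume_gib: float, daily_change_rate: float = 0.03) -> float:
    """Data (GiB) that changes in one day at the given daily change rate."""
    return volume_gib * daily_change_rate


def replication_seconds(changed_gib: float, link_mbps: float) -> float:
    """Seconds needed to send changed_gib over a link of link_mbps megabits/s.

    Assumes the link is fully available and nothing is compressed.
    """
    bits_to_send = changed_gib * 1024**3 * 8
    return bits_to_send / (link_mbps * 1_000_000)


# Hypothetical example: a 1 TiB (1024 GiB) volume replicated hourly
# over a 100 Mbps WAN link, with changes spread evenly through the day.
daily = daily_changed_gib(1024)           # GiB changed per day
per_cycle = daily / 24                    # GiB changed per hourly cycle
seconds = replication_seconds(per_cycle, 100)
print(f"{daily:.2f} GiB/day, {per_cycle:.2f} GiB per cycle, {seconds:.0f} s per cycle")
```

Under these assumptions, the volume changes by about 30.72 GiB a day, and each hourly replication moves roughly 1.28 GiB in just under two minutes.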
A blog post from Ajay Singh a few years ago nicely summarized the benefits of replication-based data protection:
“The primary storage device captures (app consistent) near-instant snapshots based on a predefined schedule (every few minutes, or once an hour) without affecting application performance. Efficient snapshot implementations are ‘unduped and compressed’ and reside on low-cost disk, so you can afford the extended retention you need (say 30 – 90 days). Another subset is replicated (say every hour) using very efficient replication to an offsite DR array, where they are retained for say 60 days. When needed, the entire application or a subset can be restored from snapshots within minutes. Applications can also run directly off the backup/DR copies without any format conversion. There are no backup windows to manage.”
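The schedules in that quote translate directly into how many recovery points are on hand at any time. A minimal sketch, assuming a fixed interval and a fixed retention window (the function name is ours, not part of any Nimble tool):

```python
from datetime import timedelta


def snapshots_retained(interval: timedelta, retention: timedelta) -> int:
    """Number of snapshots kept when one is taken every `interval`
    and each is held for `retention`."""
    return retention // interval


# Hourly snapshots kept for 30 or 90 days on the primary array
print(snapshots_retained(timedelta(hours=1), timedelta(days=30)))
print(snapshots_retained(timedelta(hours=1), timedelta(days=90)))

# Hourly replicas kept for 60 days on the offsite DR array
print(snapshots_retained(timedelta(hours=1), timedelta(days=60)))
```

An hourly schedule with 30-day retention works out to 720 recovery points per volume, which is why low capacity overhead per snapshot matters so much.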
As the statistics above show, this approach is clearly working. Nimble customers with offices all around the world are replicating between their sites. For more than 40% of customers it’s bi-directional replication – the array in California replicates to the array in Singapore, and the one in Singapore replicates back to that same array in California. In other cases, customers with multiple branch offices have each remote Nimble array replicating to a single main office array.
But what if you only have a single Nimble array? Many cloud service providers now offer Nimble-based DR-as-a-service, a trend that’s likely to grow as corporate IT shops look to leverage the best of on-premises and cloud-based storage technologies. We’ll have more on this in a future blog post.