After many years of stagnation, data protection has seen incredible innovation in the past few years.
For example, snapshot technologies have come a long way. Instead of relying on a few copy-on-write (COW) snapshots, we can now employ redirect-on-write (ROW) snapshots that don't impact application performance nearly as much. Snapshots can be taken at much shorter intervals, significantly improving recovery point objectives.
Another emerging innovation is becoming known as "flat backup." The term is still new, and IT is just starting to recognize its significance. Basically, flat backup means backing up snapshots directly to another, generally lower-cost, storage system without the use of traditional backup software.
The primary storage array's operational snapshots provide the fastest way to recover. But when something goes wrong with the primary storage and those snapshots are unavailable (or corrupted), you can "recover" from protection storage, the secondary system that holds the snapshot copies. Unlike recovery with traditional backup software, this "recovery" requires no format change. It is simply a matter of moving the snapshot to primary storage or mounting the appropriate snapshot directly to the application. Recovery of a file or an object is performed in a similar manner. These benefits all accrue from the fact that the protection storage understands the format of the primary storage.
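The fallback order described above can be sketched in a few lines. This is an illustrative model only: the `Snapshot` record, its fields, and `pick_recovery_snapshot` are hypothetical names invented for this example, not part of any vendor's product or API.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical record for one snapshot copy."""
    taken_at: int          # minutes since an arbitrary start time
    location: str          # "primary" or "protection"
    healthy: bool = True   # False if the copy is missing or corrupted


def pick_recovery_snapshot(
    snapshots: Sequence[Snapshot], recover_to: int
) -> Optional[Snapshot]:
    """Prefer the newest usable snapshot on primary storage.

    Fall back to protection storage only when no healthy primary
    snapshot exists at or before the requested point in time.
    """
    for location in ("primary", "protection"):
        usable = [
            s for s in snapshots
            if s.location == location and s.healthy and s.taken_at <= recover_to
        ]
        if usable:
            return max(usable, key=lambda s: s.taken_at)
    return None


# The primary copy is corrupted, so recovery falls back to protection storage.
snapshots = [
    Snapshot(taken_at=50, location="primary", healthy=False),
    Snapshot(taken_at=50, location="protection"),
    Snapshot(taken_at=40, location="protection"),
]
chosen = pick_recovery_snapshot(snapshots, recover_to=55)
```

In either case the chosen snapshot is mounted or moved as-is; the sketch only decides where it comes from.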
Flat backup isn't a new concept
This concept is not entirely new. NetApp has allowed its customers to do this for many years. But the technologies involved in flat backup have advanced recently. Now, at least two more major array vendors (HP and EMC) have thrown their weight behind this technology. HP allows flat backup between 3PAR StoreServ and StoreOnce, and EMC between VMAX and Data Domain.
The question is: How effective is this technology and when should one deploy it? The advantages are fairly obvious. It simplifies the environment, reduces license fees and improves RPO and RTO. Sound too good to be true? There must be a catch somewhere, you say. Well, maybe not anymore. Here's why:
First, COW-based snapshots simply impacted application performance too much, because each changed block requires one read and two writes. That meant one could not realistically take a snapshot more often than, say, once every hour. Today, with ROW-based snapshots, it is possible to take a snapshot every minute if one chooses to, since each change requires only a single write of the new data. This fundamentally changes what snapshots can deliver as recovery points.
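The I/O difference between the two approaches can be made concrete with a small counting model. This is a simplified sketch, not a description of any particular array: it assumes one snapshot is active, that COW pays its extra read and write only the first time a block changes after the snapshot, and that ROW metadata updates are free. The function names are made up for illustration.

```python
from typing import Iterable, Tuple


def cow_io(changed_blocks: Iterable[int]) -> Tuple[int, int]:
    """Count (reads, writes) for copy-on-write.

    The first change to a block after the snapshot reads the old data,
    writes it to the snapshot area, then writes the new data in place.
    """
    preserved = set()
    reads = writes = 0
    for block in changed_blocks:
        if block not in preserved:
            reads += 1    # read the original block
            writes += 1   # copy the original into the snapshot area
            preserved.add(block)
        writes += 1       # write the new data in place
    return reads, writes


def row_io(changed_blocks: Iterable[int]) -> Tuple[int, int]:
    """Count (reads, writes) for redirect-on-write.

    New data goes to a fresh location; the old block is left untouched
    and simply remains part of the snapshot.
    """
    writes = sum(1 for _ in changed_blocks)
    return 0, writes


changes = range(1000)    # 1,000 distinct blocks changed after the snapshot
print(cow_io(changes))   # (1000, 2000)
print(row_io(changes))   # (0, 1000)
```

Under these assumptions COW performs three I/Os per changed block against one for ROW, which is the overhead that kept COW snapshot intervals long.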
Second, using snapshots for data protection was ill-advised in the past, because snapshots reside on the same storage as the primary data. What would happen if the primary storage got hosed? This was a valid objection, but with flat backup, snapshots are replicated to protection storage. And the advances in protection storage have been nothing short of spectacular. The efficiency of inline data deduplication, exemplified by products such as HP StoreOnce and EMC Data Domain, is well accepted.
Used with typical backup software, these devices deliver great results. But with flat backup, the data bypasses the application server and the media server and transfers directly into these devices. That reduces the impact of backup on applications. It also means less bandwidth is necessary to move data. Elimination of the media server and associated software also means greater simplicity and lower cost. Protection storage may be in the same data center as the primary storage, in a remote site, or both. When both are used, the remote site also becomes the DR site, and replication happens between the two protection storage systems across the WAN.
Third, application-consistent snapshots are now easy to implement. Microsoft VSS-based snapshot technology is commonplace -- most arrays support it. Oracle RMAN support is also available. HP, for instance, now supports SAP on Oracle and SAP HANA with its 3PAR StoreServ snapshots. Lack of application-level support was a common complaint against using snapshots before, but that is becoming less of an issue. Granted, it isn't practical to take application-consistent snapshots every minute. However, taking crash-consistent snapshots every minute, supplemented with application-consistent snapshots every five or 10 minutes, may be sufficient. Even consistency groups are supported.
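The mixed schedule suggested above is easy to express as a plan. The sketch below is purely illustrative: `snapshot_plan` and its parameters are hypothetical, and a real deployment would rely on the array vendor's own scheduler rather than anything like this.

```python
from typing import List, Tuple


def snapshot_plan(minutes: int, app_consistent_every: int = 5) -> List[Tuple[int, str]]:
    """Return one (minute, kind) entry per minute.

    Every minute gets a crash-consistent snapshot, except every
    `app_consistent_every`-th minute, which gets an
    application-consistent snapshot instead.
    """
    plan = []
    for minute in range(1, minutes + 1):
        if minute % app_consistent_every == 0:
            kind = "application-consistent"
        else:
            kind = "crash-consistent"
        plan.append((minute, kind))
    return plan


plan = snapshot_plan(10, app_consistent_every=5)
for minute, kind in plan:
    print(f"minute {minute:2d}: {kind}")
```

With these settings, a recovery point is never more than a minute old, and an application-consistent one is never more than five minutes old.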
Fourth, many vendors offer seasoned software to manage flat backup and recovery. For example, NetApp has SnapProtect, HP has StoreOnce Recovery Manager Central (RMC) and EMC has ProtectPoint. NetApp is a leader in snapshot technology (they invented it) and the company's SnapProtect technology is very mature. In many cases, using software of this caliber can enable the application administrator or the virtualization administrator to manage data protection and recovery, end to end.
Fifth, application recovery is incredibly fast with flat backup. Unlike traditional backup software, which changes the format of the backed-up data, snapshot-based backups keep the disk-based format. This means the concept of "recovery" changes dramatically. Data simply needs to be moved to the primary storage (or another storage system), or the snapshot can simply be mounted and used. This reduces RTOs to seconds or minutes.
This applies equally to applications running in a physical environment or as VMs. Both VMware and Hyper-V are supported by the vendors mentioned above. Since there is no format change, searching for the right snapshot is a piece of cake. The file hierarchy is retained, so it is just a matter of picking the right snapshot. Snapshots can even be searched using keywords. In the past, snapshot-based backups did not provide cataloging and indexing, but that is no longer an objection. Many products even allow non-disruptive recovery verification as often as one would like.
Flat backup worth a look
I believe flat backup is worthy of consideration for users of primary storage products from the vendors mentioned above. Other major vendors will likely offer flat backup in the near future. IBM has ProtecTIER and Dell has the DR family of protection storage devices, either of which could underpin a flat backup offering. If you have been watching the hyper-convergence trend recently, you will find that these vendors are telling IT there is no need for external backup software anymore and that all data protection is "built in." They don't call it flat backup, but in essence, that is exactly what it is.
I do not believe for a second that flat backup will replace backup software anytime soon -- if ever. Backup software has way too many other positives in its favor.
For large installations where thousands of servers need to be protected, tape support is necessary, and other regulatory and compliance issues are involved, backup software is the way to go. Backup software is also required if you are dealing with a heterogeneous storage environment, as all flat-backup products today are homogeneous. But in simpler environments, it might be worth considering (or reconsidering) snapshot-based backup. Snapshots have come a long way, along with associated technologies, making flat backup a serious contender in data protection.