Published: 06 Sep 2018
Organizations expect IT to recover from any disaster quickly and without data loss. Traditional backup software and hardware can't meet these expectations, so IT has been forced to look elsewhere. Enter backup snapshots and replication, an alternative to standard backups. The snapshot and replication combination should enable IT to provide both the rapid data protection and recovery desired by higher-ups -- at least in theory. The problem is that holes in this strategy keep organizations from completely replacing traditional backups.
Let's delve into the shortcomings and then explore what IT can do about them to effectively incorporate backup snapshots and replication technology into a comprehensive data protection and disaster recovery plan.
The problems with backup snapshots
Just about every storage system, operating system and hypervisor has some form of snapshot technology built into it. However, IT professionals considering using the technology for data protection must understand the capabilities of their particular snapshot product and the shortcomings of the technology in general to ensure these tools do the job.
These are the top five problems with backup snapshots.
Lack of global snapshot standard. While storage systems, OSes and hypervisors usually have some form of snapshot technology built in, those capabilities are unique to the environment the product is used in. Snapshots from vendor A won't typically work on vendor B's product. As a result, enterprises will have to manage multiple snapshot tools in their data centers -- at a minimum one for each storage system. The lack of a common snapshot standard means each platform -- operating system, hypervisor, storage system -- has to have its snapshot schedule separately managed and monitored.
Lack of indexing. Another snapshot challenge is that there's generally no indexing of snapshots. That means there's no simple way to search across backup snapshots to find every version of a file. This forces IT to "look into" each snapshot to find data within it. In most cases, this means separately mounting and searching an image of every snapshot that may contain the file you need.
Lack of depth. While most storage platforms claim that thousands of snapshots can be taken without affecting performance, lack of indexing makes the value of taking thousands of snapshots suspect. When using only snapshots for data retention, this inability to find a particular version of a file quickly and easily is a problem, and as time goes by, the ability to remember what's in a particular snapshot diminishes.
Lack of reliability. Snapshots as your sole data protection technology really aren't that reliable. If the storage system or even the volume a snapshot is based on fails, it renders all backup snapshots associated with that system or volume useless. One way around that is to replicate snapshotted data to a second system. But, in most cases, the second system must be similar to the first and usually from the same vendor. Obviously, buying a near-identical system doubles the price of the platform. And, in most cases, the second system is off site at a disaster recovery location, which means you must retrieve the data across the WAN to use it.
Lack of granularity. Most snapshots operate at a volume level. To gain insight into anything inside the snapshot version of the volume requires mounting that volume, again making searching for and finding a file more difficult. And because most replication features are built on snapshot technology, admins must replicate an entire volume even if only a portion of it is critical enough to be available for disaster recovery.
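To make the indexing and granularity problems concrete, here is a minimal sketch of what a cross-snapshot index has to do. It assumes each snapshot has already been mounted as an ordinary directory; the function names and the dictionary layout are hypothetical and not taken from any particular product.

```python
import os
from collections import defaultdict


def build_snapshot_index(snapshot_mounts):
    """Map each relative file path to the snapshots that contain it.

    snapshot_mounts: dict of snapshot name -> directory where that
    snapshot is mounted (an assumed, illustrative layout).
    """
    index = defaultdict(list)
    for snap_name, mount_dir in snapshot_mounts.items():
        # Every snapshot has to be walked in full: this is the manual
        # "look into each snapshot" work that an index does once up front.
        for dirpath, _dirnames, filenames in os.walk(mount_dir):
            for filename in filenames:
                full_path = os.path.join(dirpath, filename)
                rel_path = os.path.relpath(full_path, mount_dir)
                stat = os.stat(full_path)
                index[rel_path].append(
                    {"snapshot": snap_name, "size": stat.st_size, "mtime": stat.st_mtime}
                )
    return index


def find_versions(index, rel_path):
    """Return the names of every snapshot holding a copy of rel_path."""
    return sorted(entry["snapshot"] for entry in index.get(rel_path, []))
```

Once the index exists, finding every version of a file is a dictionary lookup rather than a mount-and-search pass over each snapshot.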
Despite these shortcomings, snapshot technology is a powerful capability an organization shouldn't ignore. Admins should, however, limit the role of snapshots to recovering only the most recent versions of data, which is -- of course -- the most frequent type of recovery. It's necessary to couple snapshots with traditional backup or archives for IT to provide complete and fully functional data protection.
Snapshots as a complement to backup
Snapshots and backup can work together to deliver high-quality and high-fidelity data protection. In the worst case, IT can run the two data protection tools independently of each other. An administrator can schedule the snapshot to trigger right before the backup runs and point the backup at the snapshot version of the volume. This provides the backup with an inactive file system, which makes for an ideal backup. Once the backup is complete, IT must remember to delete the snapshot.
There is opportunity here for integration. The first level of integration is a series of scripts, typically called by the backup application. For example, the backup software makes a call to the storage system to execute a snapshot of the volume requiring backup and mounts that snapshot version of the volume to a specified location. The backup application then backs up the snapshot version and, when complete, deletes the snapshotted version of the volume.
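The scripted sequence described here (snapshot, mount, back up, delete) can be sketched as follows. This is an illustration under stated assumptions, not any vendor's interface: `InMemoryStorage` is a hypothetical stand-in for a storage system's snapshot API, and `run_backup` is whatever callable the backup application would supply.

```python
class InMemoryStorage:
    """Hypothetical stand-in for a storage system's snapshot interface."""

    def __init__(self):
        self.log = []        # ordered record of operations, for inspection
        self.active = set()  # snapshots that currently exist

    def create_snapshot(self, volume):
        snap_id = f"{volume}-snap{len(self.log)}"
        self.active.add(snap_id)
        self.log.append(("create", snap_id))
        return snap_id

    def mount_snapshot(self, snap_id, mount_point):
        self.log.append(("mount", snap_id))

    def unmount_snapshot(self, snap_id):
        self.log.append(("unmount", snap_id))

    def delete_snapshot(self, snap_id):
        self.active.discard(snap_id)
        self.log.append(("delete", snap_id))


def backup_via_snapshot(storage, volume, mount_point, run_backup):
    """Snapshot a volume, back up the mounted snapshot, then clean up."""
    snap_id = storage.create_snapshot(volume)
    try:
        storage.mount_snapshot(snap_id, mount_point)
        try:
            # The backup reads an inactive, point-in-time file system.
            run_backup(mount_point)
        finally:
            storage.unmount_snapshot(snap_id)
    finally:
        # Delete the snapshot even if mounting or the backup failed.
        storage.delete_snapshot(snap_id)
```

The nested `try`/`finally` blocks are a design choice that takes the "remember to delete the snapshot" step out of human hands: cleanup runs whether or not the backup succeeds.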
The challenge with this scripting approach is that each storage system, again, has its own way of interfacing with its snapshot feature. That means IT will need to develop a script for each brand of storage system in the data center. It will also be necessary to maintain and update each of these scripts in conjunction with updates to both the backup software and the storage system.
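One common way to contain that maintenance burden is to hide each storage system behind a shared interface, so only the vendor-specific piece changes when a product is updated. The sketch below assumes two fictional vendors; the command names `vendor-a-cli` and `vendor-b-tool` and their arguments are invented for illustration and do not correspond to real tools.

```python
from abc import ABC, abstractmethod


class SnapshotDriver(ABC):
    """Common interface the backup script calls, whatever the vendor."""

    @abstractmethod
    def create_command(self, volume, snap_name):
        """Return the command line that creates a snapshot."""

    @abstractmethod
    def delete_command(self, volume, snap_name):
        """Return the command line that deletes a snapshot."""


class VendorADriver(SnapshotDriver):
    # Hypothetical syntax: volume and snapshot joined as "volume@name".
    def create_command(self, volume, snap_name):
        return ["vendor-a-cli", "snapshot", "create", f"{volume}@{snap_name}"]

    def delete_command(self, volume, snap_name):
        return ["vendor-a-cli", "snapshot", "delete", f"{volume}@{snap_name}"]


class VendorBDriver(SnapshotDriver):
    # Hypothetical syntax: separate flags for volume and snapshot name.
    def create_command(self, volume, snap_name):
        return ["vendor-b-tool", "snap-add", "--vol", volume, "--name", snap_name]

    def delete_command(self, volume, snap_name):
        return ["vendor-b-tool", "snap-remove", "--vol", volume, "--name", snap_name]


DRIVERS = {"vendor_a": VendorADriver, "vendor_b": VendorBDriver}


def get_driver(vendor):
    """Look up the driver for a storage brand, failing loudly if unknown."""
    try:
        return DRIVERS[vendor]()
    except KeyError:
        raise ValueError(f"no snapshot driver for vendor {vendor!r}") from None
```

Each new brand of storage system still needs its own driver, which is exactly the per-vendor effort the article describes; the interface only keeps that effort in one place.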
A second layer of integration is when a backup vendor directly integrates its software with a particular primary storage vendor's storage system and backup snapshot technology. It doesn't replace the snapshot technology; it centralizes snapshot management. This type of integration eliminates the need for managing each storage system's snapshots and enables backup administrators to set up all these jobs while not having to be an expert in each storage system. One caveat: As the backup software and storage system evolve, it's the responsibility of the backup software vendor to ensure its snapshot interface still works.
The copy data management alternative
Another option to help IT meet management's rapid and comprehensive data recovery goals is to use copy data management. There are two types of CDM available. The first type completely replaces the snapshot function by splitting writes between primary storage and secondary storage and then taking its own snapshots, using its own technology, on secondary storage to provide point-in-time recoveries. Indexing the snapshots enables admins to locate required data with simple search commands. In most cases, the CDM software can replicate to another CDM instance in a branch location for disaster recovery preparation.
The second type of CDM integrates with a storage system's snapshot technology and provides an index of the storage system's snapshotted data to make it possible to find that data. Some of these CDM products can also manage the snapshot process, such as when snapshots are triggered and when and if they're replicated.
In both cases, CDM provides an alternative to traditional snapshot and replication as well as snapshots integrated with backup. With these CDM methods, IT can search for data across snapshots and centralize the snapshot process -- both taking and releasing -- across storage systems. The value of the first type is that data protection storage can come from almost any vendor, including cloud storage, significantly lowering the cost of storage. The value of the second type is that CDM extends existing snapshot technology instead of replacing it. The second type, however, requires using the same vendor's hardware in both primary and disaster recovery locations.
Long-term archiving is a potential challenge for CDM tools, as most of them don't provide tape support. A few CDM products do provide cloud support, however. They can use the cloud to drive down long-term storage costs and to either spin up instances of applications or run analytics on the data.
Beyond scripting and direct vendor integration, a third layer of backup integration enables backup applications to use storage system snapshots for recoveries. When a backup application receives a request for a particular file or set of files, it can determine if one of the backup snapshots has that data. If so, it can restore the data from the snapshot instead of backup storage. The value of complete backup integration with snapshots of storage systems is that it allows the backup application to become the command center for all data protection operations. Admins get a single interface into all of the data center's storage systems and complete oversight into the protection process.
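The restore decision described here can be sketched as a simple lookup: check the snapshots first, then fall back to backup storage. The catalog structure and the function name below are assumptions made for illustration, not how any specific backup application stores its metadata.

```python
def choose_restore_source(path, snapshot_catalog, backup_catalog):
    """Pick where to restore a file from, preferring snapshots.

    Both catalogs are lists of (identifier, set_of_paths) pairs,
    ordered newest first (an assumed, illustrative format).
    Returns ("snapshot", id), ("backup", id), or None if not found.
    """
    # Snapshots sit on primary storage, so they are checked first.
    for snap_id, paths in snapshot_catalog:
        if path in paths:
            return ("snapshot", snap_id)
    # Otherwise fall back to the copy held on backup storage.
    for backup_id, paths in backup_catalog:
        if path in paths:
            return ("backup", backup_id)
    return None
```

For example, with a contract that exists only in last year's backup, the function skips the recent snapshots and returns the backup entry, while a file changed yesterday is served from the newest snapshot that contains it.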
Bottom line, don't skip snapshots
Backup snapshots are an excellent and useful technology for meeting organizational expectations for rapid data protection and recovery. The technology is a little raw, however, and lacks the sophisticated search interface IT administrators are accustomed to in traditional backup tools. While lack of search is seldom a problem for finding yesterday's version of a critical file, it becomes a major problem for finding the second-to-last modified version of a contract that was signed last year.
Integrating snapshots with backup or replacing them with copy data management (see "The copy data management alternative") to build an index of protected data is a critical data protection step. Don't skip it.