Most organizations consider backup their insurance policy. If all else fails, backup software and hardware can copy data faithfully back into place, albeit slowly and with some changes lost. However, products from backup software and hardware vendors have evolved so they can answer a broader array of challenges such as high-availability functionality, copy data management, archiving and data classification. While corporate data backup still serves as the recovery of last resort, it can be more than a safety net.
Corporate data backup as an availability choice
Most organizations can't wait hours to recover mission-critical systems, nor can they afford to lose 12 or more hours of data once the recovery is complete. Organizations looking to increase recovery performance while decreasing data loss will first explore high-availability choices like replication and clustering. The problem with these choices is their high cost and complexity. Corporate data backup products are responding with the following two features that work together and provide close-to-comparable availability without the high cost:
- Changed block tracking (CBT). CBT backups transfer only those blocks of data that have changed since the last backup. This more granular transfer allows backups to complete sooner and lowers the impact on the network. As a result, CBT backups can run much more frequently than regular incremental data backups, such as every few hours. CBT backups can't run continuously against most applications and data sets the way replication can, but protection every few hours is acceptable for many workloads, and it reduces the data loss exposure when compared to traditional, once-a-night backups.
- In-place recovery (IPR). In-place recovery allows a backup product to mount an application's data set directly on the backup storage device. The application can then map to this data set and start serving users. IPR avoids the wait for data to be transferred across the network to the application's primary storage device before the application can restart. It also spares the organization from keeping a standby storage system at the ready in case of failure.
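The changed block tracking idea can be illustrated with a small simulation. This is a minimal sketch, not how any particular backup product or hypervisor implements CBT: the `TrackedVolume` class, the `apply_to_backup` helper and the 4-byte block size are all hypothetical names and values chosen for illustration. The volume records which blocks each write touches, and the backup pass copies only those blocks.

```python
BLOCK_SIZE = 4  # deliberately tiny so the example is easy to follow (assumption)


class TrackedVolume:
    """Hypothetical in-memory volume that remembers which blocks were written."""

    def __init__(self, size_in_blocks):
        self.data = bytearray(size_in_blocks * BLOCK_SIZE)
        self.dirty = set()  # indexes of blocks changed since the last backup

    def write(self, offset, payload):
        if offset < 0 or offset + len(payload) > len(self.data):
            raise ValueError("write falls outside the volume")
        self.data[offset:offset + len(payload)] = payload
        first = offset // BLOCK_SIZE
        last = (offset + len(payload) - 1) // BLOCK_SIZE
        self.dirty.update(range(first, last + 1))

    def changed_blocks(self):
        """Return {block_index: bytes} for changed blocks, then reset tracking."""
        changed = {
            i: bytes(self.data[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE])
            for i in sorted(self.dirty)
        }
        self.dirty.clear()
        return changed


def apply_to_backup(backup, changed):
    """Copy only the changed blocks into the backup image."""
    for index, block in changed.items():
        backup[index * BLOCK_SIZE:(index + 1) * BLOCK_SIZE] = block


volume = TrackedVolume(size_in_blocks=8)
backup = bytearray(len(volume.data))  # stands in for the previous full backup

volume.write(5, b"hello")             # touches two of the eight blocks
changed = volume.changed_blocks()
apply_to_backup(backup, changed)

print(f"transferred {len(changed) * BLOCK_SIZE} of {len(volume.data)} bytes")
```

Because the cost of each pass depends on how much data changed rather than on the size of the volume, passes like this can be scheduled every few hours instead of once a night.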
Corporate data backup as copy data management
As the ability to instantiate copies of data directly on secondary storage evolves, backup vendors are offering more than just the ability to recover an application rapidly. Some vendors have added a "virtual lab" capability that can start several virtual machines (VMs) and place them on a virtual private network. Administrators can use these systems to verify recoveries, run reports or perform application testing.
Some products can provide multiple snapshot instances of data they have under protection. These copies can feed other processes in the data center such as analytics and big data. The combination of rapid CBT backups and snapshot in-place recoveries can dramatically reduce the extra copies of data that sprawl across an enterprise.
Backup as an archive replacement
Backup and archive have coexisted since the beginning of the data center. While a best practice is to manage these two processes separately, some organizations may be able to combine them or at least share components between them. One reason to separate corporate data backup from archive is that backup software historically had very fragile databases. The database tracked files, and as their number and age increased, so did the likelihood of corruption. Backup administrators were advised to keep the size of their backup history to a minimum. But modern backup offerings have much more sophisticated databases that can track an almost unlimited number of files.
From a storage perspective, backup hardware is seeing significant improvement. It can scale to very large capacities, hold a variety of data types, and handle both the large block transfers typical of backup and the smaller file transfers common to archive. Today, secondary storage products can even be the target for replication functions.
Backup for data classification
Modern corporate data backup offerings also have the ability to provide deep-level analysis of the data they are protecting. This analysis can be used to categorize data for its value and retention requirements. These products have the ability to provide context-level search of files, finding embedded information like Social Security and credit card numbers.
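A content search of this kind can be sketched in a few lines. This is an illustrative assumption about how such a scan might work, not a description of any vendor's classifier: the patterns, the `classify` function and the label names are hypothetical. It flags text that contains a US Social Security number pattern or a card-like digit run that passes the Luhn checksum, which filters out most random digit strings.

```python
import re

# Illustrative patterns only; real classifiers use far richer rules.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")  # 13 to 19 digits


def luhn_valid(candidate):
    """Return True if the digits in candidate pass the Luhn checksum."""
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    total = 0
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def classify(text):
    """Return the set of sensitive-data labels found in text."""
    labels = set()
    if SSN_PATTERN.search(text):
        labels.add("ssn")
    for match in CARD_PATTERN.finditer(text):
        if luhn_valid(match.group()):
            labels.add("credit_card")
    return labels


print(classify("Customer SSN 123-45-6789, card 4111 1111 1111 1111"))
```

A backup product could attach labels like these to the files it already indexes and use them to drive retention rules.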
The future: Open backup and RESTful APIs
The next step for corporate data backup is to open itself up, both exposing the data it contains and allowing for external automation. Backup products, because they store every file in the environment, are a wealth of information. The problem is that most of these products store their data in a proprietary format, and the only way to automate them is through a proprietary command-line interface (CLI). An open backup product will either store its data in an open format or at least allow other applications to access its format.
Automation of the backup process needs to evolve from scripting the CLI to fully supporting a RESTful API set that would enable data center operators to integrate the backup process into the rest of the operation. For example, a script that automatically deletes a VM could direct the backup process to make a copy of that VM to a particular type of backup media.
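The VM deletion example might look like the following sketch. Everything specific here is an assumption made for illustration: the `/v1/backups` endpoint, the JSON field names, the `backup.example.com` host and the helper names do not belong to any real backup product. The point is the ordering: the script asks the backup service for a final copy and deletes the VM only if that request is accepted.

```python
import json
import urllib.request


def build_backup_request(base_url, vm_id, media_type):
    """Build (but do not send) a POST asking for a copy of one VM."""
    payload = json.dumps({"vm_id": vm_id, "target_media": media_type})
    return urllib.request.Request(
        url=f"{base_url}/v1/backups",          # hypothetical endpoint
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def decommission_vm(vm_id, submit, delete_vm,
                    base_url="https://backup.example.com/api",
                    media_type="tape"):
    """Request a final backup, then delete the VM only if it was accepted.

    submit:    callable that sends the request and returns an HTTP status code
    delete_vm: callable that removes the VM from the hypervisor
    """
    request = build_backup_request(base_url, vm_id, media_type)
    status = submit(request)
    if status not in (200, 201, 202):
        raise RuntimeError(f"backup request rejected with status {status}")
    delete_vm(vm_id)
    return status


# Dry run with stand-in callables, so nothing touches the network.
events = []
decommission_vm(
    "vm-42",
    submit=lambda req: events.append(("backup", req.full_url)) or 202,
    delete_vm=lambda vm: events.append(("delete", vm)),
)
print(events)
```

In a real integration, `submit` would wrap an HTTP client call such as `urllib.request.urlopen`, and `delete_vm` would call the virtualization platform's own API.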
The requirements to recover and retain information are stricter than ever. Backup offerings, both hardware and software, have evolved right along with user demands. Today's products can not only meet the new recovery needs of an enterprise, but also reduce data copy sprawl, consolidate the archive process and, in some cases, eliminate the need for a separate data classification product.