In my previous article on virtual machine (VM) backup, I discussed using VMware Consolidated Backup to back up virtual machines. In this next part, I'll discuss "point products" that address virtual machine data backup, and using data deduplication and continuous data protection (CDP) for VM backup.
There are a few point products designed specifically to address virtual machine backup that can be incorporated into the data backup process. Vizioncore Inc. was early to the market with its vRanger Pro product, and has been doing VMware backups longer than anyone else. Another popular alternative is esXpress from PHD Virtual Technologies. Both products are able to do VMDK-level full and incremental backups, and file-level restores with or without VMware Consolidated Backup. The two products take very different approaches, however, so make sure you find the best match for your environment. Note that VMDK-level backups with both products still require reading the entire VMDK file, even if they only write a portion of it in an incremental backup.
You can also use source data deduplication backup software, such as Asigra Inc.'s Asigra, EMC Corp.'s Avamar or Symantec's NetBackup PureDisk. The first way to use source deduplication backup software is to install it on the virtual machine, where it runs backups just as it would on any other server. Unlike a regular backup (even an incremental one), however, a source deduplication backup requires fewer CPU cycles and is less I/O-intensive, so it significantly reduces the impact on the ESX server. Doing backups this way also lets you use any database/application agents that the products may offer. The downside is that you're not usually able to do a "bare-metal" restore of a VM if this is the only backup you do.
Some products take this approach a bit further by running a backup inside the ESX server itself, capturing the extra blocks necessary to restore the virtual machine. But this method requires the backup app to read all the blocks in all of the VMDK files to figure out which ones have changed. That could significantly impact I/O, as well as the CPU as it calculates and looks up all those hashes.
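To see why this kind of scan is expensive, here is a minimal Python sketch of hash-based change detection. It is an illustration only, not how any of the products named above actually work: the fixed 4 KB block size, the choice of SHA-256 and the function names are all assumptions.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def block_hashes(image: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Hash every fixed-size block of a disk image. This reads the whole image."""
    return [
        hashlib.sha256(image[off:off + block_size]).hexdigest()
        for off in range(0, len(image), block_size)
    ]


def changed_blocks(previous: list[str], image: bytes,
                   block_size: int = BLOCK_SIZE):
    """Return (indexes of changed blocks, new hash list, bytes read)."""
    current = block_hashes(image, block_size)
    changed = [
        i for i, digest in enumerate(current)
        if i >= len(previous) or previous[i] != digest
    ]
    # Every byte had to be read and hashed, however few blocks changed.
    return changed, current, len(image)


# Hypothetical 8-block "VMDK" with a single modified block.
disk = bytearray(BLOCK_SIZE * 8)
baseline = block_hashes(bytes(disk))
disk[BLOCK_SIZE * 3] = 0xFF
changed, _, bytes_read = changed_blocks(baseline, bytes(disk))
print(changed, bytes_read)
```

Only one block would be written to the backup here, yet all eight blocks were read and hashed to find it. That read-and-hash cost is the I/O and CPU impact described above.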
Continuous data protection and near-CDP approaches
Continuous data protection (CDP) and near-CDP backup products are used in much the same way that deduplication software is used. They're installed on your virtual machines and back them up as they would any physical server. The CPU and I/O impact of such a backup is very low. Most CDP software won't allow you to recover the entire machine, so you'll need to have an alternative if your VM is damaged or deleted.
So far, all of the methods covered have as many disadvantages as advantages -- if not more. But there's a completely different solution that merits serious consideration: Use a storage system that has VMware-aware near-CDP backup already built into it. (Keep in mind that near-CDP is just a fancy name for snapshots and replication.) Dell EqualLogic, FalconStor Software Inc. and NetApp all have this ability. Other storage vendors are developing similar capabilities, so check with your storage vendor.
The concept is relatively simple. Your VMDKs are stored on the vendor's storage system, and each vendor provides a tool designed for VMware that you run to start a backup. VMware then performs a snapshot similar to what it does for VCB, allowing your storage box to then perform its own snapshot of the VMware snapshot. Replicate that snapshot to another box and you have yourself a backup.
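The ordering of those steps can be sketched in Python. Everything here is a hypothetical stand-in: the `Hypervisor` and `StorageArray` classes, their method names and the volume and target names are invented for illustration and don't correspond to any vendor's tool. The sketch also assumes the VMware snapshot is removed once the array snapshot exists, which the description above doesn't spell out.

```python
from contextlib import contextmanager


class Hypervisor:
    """Hypothetical stand-in for the hypervisor's snapshot interface."""

    def __init__(self):
        self.log = []

    @contextmanager
    def vm_snapshot(self, vm_name):
        self.log.append(f"vm snapshot created: {vm_name}")
        try:
            yield
        finally:
            # Assumption: always release the VM snapshot, even on failure.
            self.log.append(f"vm snapshot removed: {vm_name}")


class StorageArray:
    """Hypothetical stand-in for a VMware-aware storage system."""

    def __init__(self, log):
        self.log = log

    def snapshot_volume(self, volume):
        self.log.append(f"array snapshot taken: {volume}")
        return f"{volume}@snap"

    def replicate(self, snap, target):
        self.log.append(f"replicated {snap} to {target}")


def backup_vm(hv, array, vm_name, volume, target):
    # 1. Hypervisor snapshot gives the array a consistent image to capture.
    with hv.vm_snapshot(vm_name):
        # 2. The array snapshots the volume holding the VMDKs.
        snap = array.snapshot_volume(volume)
    # 3. Replicating the array snapshot to a second box makes it a backup.
    array.replicate(snap, target)
    return snap


hv = Hypervisor()
array = StorageArray(hv.log)
backup_vm(hv, array, "vm01", "vol_vmdk", "dr-array")
for entry in hv.log:
    print(entry)
```

The context manager keeps the VMware snapshot open only for as long as the array needs it, which is one plausible way to keep the load on the ESX server small.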
The CPU hit on the ESX server is minimal. And the I/O hit on the storage is also minimal, as all it has to do is take a snapshot and then perform a smart, block-level incremental of today's new blocks by replicating them to another system. (Note that this block-level incremental is being done by the storage that already knows which blocks need to be copied, so the I/O impact is as low as it can be.) Vendors that offer these capabilities have their own ways of providing file-level restores from these backups as well.
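The reason the storage can do this cheaply is that it records writes as they happen, so it never has to scan for changes. Here is a toy Python model of that idea; the `Volume` class, its dirty-block set and the `replicate` helper are assumptions made for illustration, not a description of any vendor's implementation.

```python
class Volume:
    """Toy block volume that records which blocks were written
    since the last snapshot."""

    def __init__(self, num_blocks, block_size=4096):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]
        self.dirty = set()  # indexes written since the last snapshot

    def write(self, index, data):
        # Pad or truncate to exactly one block, then note the change.
        padded = data.ljust(self.block_size, b"\0")[:self.block_size]
        self.blocks[index] = padded
        self.dirty.add(index)

    def snapshot(self):
        """Return the blocks changed since the previous snapshot
        and reset the change tracking."""
        changed = {i: self.blocks[i] for i in sorted(self.dirty)}
        self.dirty.clear()
        return changed


def replicate(changed, replica):
    """Copy only the changed blocks to the replica; return how many."""
    for index, data in changed.items():
        replica[index] = data
    return len(changed)


primary = Volume(num_blocks=8)
replica = list(primary.blocks)      # replica starts as an identical copy
primary.write(2, b"monday")
primary.write(5, b"tuesday")
primary.write(2, b"wednesday")      # rewriting a block doesn't add work
copied = replicate(primary.snapshot(), replica)
print(copied, replica == primary.blocks)
```

Two blocks are copied and no unchanged block is ever read, in contrast to a backup application that has to read an entire VMDK to find what changed.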
Dell EqualLogic systems, because they're iSCSI, can communicate directly with the virtual machines via IP to coordinate the snapshots. FalconStor has agents that run in all your VMs to coordinate snapshots and do the "right thing" for a number of applications. NetApp uses VMware tools to do snapshots; however, NetApp's truly unique trait is that it can dedupe VMware data -- even live data. Think of all of the redundant blocks of data you can get rid of by using the deduplication tool included with NetApp's Data ONTAP operating system.
Bottom line for VM backup
There are a number of technologies you can deploy today to make VMware backups better. However, many of them are still saddled with disadvantages, especially when compared to traditional backup processes. Perhaps the best current alternative is to move your VMware instances to VMware-aware near-CDP-capable storage. Or maybe VMware will solve some of these backup problems with vSphere.
This article was previously published in Storage magazine.
About this author: W. Curtis Preston (a.k.a. "Mr. Backup"), Executive Editor and Independent Backup Expert, has been singularly focused on data backup and recovery for more than 15 years. From starting as a backup admin at a $35 billion credit card company to being one of the most sought-after consultants, writers and speakers in this space, it's hard to find someone more focused on recovering lost data. He is the webmaster of BackupCentral.com, the author of hundreds of articles, and the books "Backup and Recovery" and "Using SANs and NAS."
This was first published in October 2009