It's time for users to pull their heads out of the sand regarding virtual machine (VM) backups. Server virtualization software such as VMware or Microsoft Hyper-V may solve a lot of problems for system administrators, but it creates a lot of problems for storage and backup administrators. The core problem is physics. While it may be possible for 10 to 20 Exchange or SQL Server VMs to successfully share a single physical server, two or three backup applications running at full speed will completely consume all CPU, RAM and I/O resources of that same server. Two key backup methodologies attempt to address this issue: image-level backups, and continuous (or near-continuous) incremental backups.
Image-level backups attempt to solve the virtual machine backup problem in a couple of different ways. For example, they often run the backup on a server other than the virtual server being backed up. These other physical (proxy) servers are given access to the storage where the virtual machines are stored, allowing them to back up the "image" of the VM via a path that does not involve the virtual server. For this to work, the backup software and the virtual server software must cooperate so that the backup software backs up a stable snapshot rather than an actively used volume.
This cooperation is also necessary in order for any applications running inside the VMs to know that they were properly backed up. For Windows VMs, this cooperation is typically provided via the Windows VSS service; other operating systems use a variety of methods. In addition to moving the physical I/O out of the virtual server, more recent image-level backup products can also perform block-level incremental backups of the images, further reducing the I/O requirements for the storage. Readers interested in image-level backups should examine products that fully leverage the VMware vStorage Data Protection APIs or the Hyper-V VSS Writers.
Block-level incremental backup technologies
A move to virtual machines is also a great time to investigate block-level incremental backup technologies such as continuous data protection (CDP) or near-CDP. CDP is replication with a change log that allows you to restore a system to any previous point in time (i.e., sub-second granularity). Near-CDP is replicated snapshots: you can restore a system only to a point in time at which a snapshot was taken, and snapshots are typically taken once an hour.
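The difference in restore granularity between the two approaches can be sketched in a few lines of Python. This is only a toy model, not how any particular product works: the class names CdpVolume and NearCdpVolume, the in-memory dictionaries standing in for disk blocks, and the timestamps in seconds are all assumptions made for illustration.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class CdpVolume:
    """Toy CDP target (hypothetical): every write is journaled with its timestamp."""
    # Entries are (timestamp, block_no, data) and are assumed to arrive in time order.
    journal: list = field(default_factory=list)

    def write(self, timestamp, block_no, data):
        self.journal.append((timestamp, block_no, data))

    def restore(self, timestamp):
        """Replay every journaled write up to and including `timestamp`."""
        blocks = {}
        for ts, block_no, data in self.journal:
            if ts > timestamp:
                break  # safe only because the journal is in time order
            blocks[block_no] = data
        return blocks


@dataclass
class NearCdpVolume:
    """Toy near-CDP target (hypothetical): only periodic snapshots are kept."""
    live: dict = field(default_factory=dict)
    snapshot_times: list = field(default_factory=list)
    snapshots: list = field(default_factory=list)

    def write(self, block_no, data):
        self.live[block_no] = data

    def take_snapshot(self, timestamp):
        self.snapshot_times.append(timestamp)
        self.snapshots.append(dict(self.live))  # copy of the state at this moment

    def restore(self, timestamp):
        """Return the most recent snapshot taken at or before `timestamp`."""
        idx = bisect.bisect_right(self.snapshot_times, timestamp) - 1
        if idx < 0:
            raise LookupError("no snapshot exists at or before that time")
        return self.snapshots[idx]


if __name__ == "__main__":
    cdp, near = CdpVolume(), NearCdpVolume()
    near.take_snapshot(0)  # "hourly" snapshot at t=0
    for ts, block_no, data in [(10, 0, b"a"), (20, 1, b"b"), (3599.5, 0, b"c")]:
        cdp.write(ts, block_no, data)
        near.write(block_no, data)
    near.take_snapshot(3600)  # next snapshot, one hour later

    print(cdp.restore(20))    # state as of t=20, between snapshots
    print(near.restore(20))   # falls back to the t=0 snapshot
```

In this model, a restore request for a moment between snapshots is answered exactly by the journal, while the snapshot-based volume can only return the last snapshot before that moment.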
Depending on the product, these technologies are available as software that runs inside each virtual machine, as an appliance in the storage network, or as functionality provided by your storage system. The idea is to fundamentally change the way data backup and recovery is performed by never again doing a full backup, and by continually and incrementally copying changed bytes from the backed-up system to the backup system.
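To make "copy only what changed" concrete, here is a minimal sketch in Python. It rests on several assumptions: the 4 KB block size is arbitrary, the function names are made up for this example, and comparing block hashes is a stand-in for change detection. Real products generally track changes as writes happen rather than rescanning the volume, so treat this as an illustration of the idea, not of any vendor's implementation.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for illustration


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Hash each fixed-size block of a volume image."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(source: bytes, known_hashes: list,
                   block_size: int = BLOCK_SIZE) -> dict:
    """Return {block_index: block_bytes} for blocks that differ from the last backup."""
    changes = {}
    for index, digest in enumerate(block_hashes(source, block_size)):
        if index >= len(known_hashes) or known_hashes[index] != digest:
            start = index * block_size
            changes[index] = source[start:start + block_size]
    return changes


def apply_changes(backup: bytearray, changes: dict, source_len: int,
                  block_size: int = BLOCK_SIZE) -> None:
    """Write the changed blocks into the backup copy, in ascending block order."""
    for index in sorted(changes):
        start = index * block_size
        block = changes[index]
        backup[start:start + len(block)] = block
    del backup[source_len:]  # drop any tail if the source shrank


if __name__ == "__main__":
    volume = bytearray(BLOCK_SIZE * 1000)   # pretend 1,000-block volume
    backup = bytearray(volume)              # the one and only full copy
    hashes = block_hashes(bytes(backup))

    volume[5000] = 0xFF                     # a single write inside block 1
    changes = changed_blocks(bytes(volume), hashes)
    apply_changes(backup, changes, len(volume))
    print(f"blocks sent: {len(changes)} of {len(hashes)}")
```

After the initial copy, each pass moves only the blocks that differ, which is why the load on the virtual server stays small compared with a nightly full backup.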
By changing the backup process from a nightly, bulk transfer of data to a continuous, incremental transfer of changed blocks (that runs throughout the day), you fundamentally change the impact that the backup system has on virtual servers. CDP and near-CDP technologies also fundamentally change the way restores are performed. Instead of requiring a bulk copy of data from the backup server (i.e., a restore), both technologies support using the backup system as the primary system during an outage. The only downtime is the amount of time it takes to point the virtual server at the new storage location. Compare this to the hours required to do a typical restore, and you will understand the popularity of these technologies for this application. Readers interested in these technologies should ask backup software providers and storage providers about their support for CDP and near-CDP. Check out this brief list in my article on continuous data protection products.
The way we back up our virtual machines has to change. Customers looking to make things incrementally better should examine image-level backup tools; customers looking to fundamentally change the way backups and recoveries work should examine CDP and near-CDP backup tools.
W. Curtis Preston (a.k.a. "Mr. Backup"), Executive Editor and Independent Backup Expert, has been singularly focused on data backup and recovery for more than 15 years. From starting as a backup admin at a $35 billion credit card company to being one of the most sought-after consultants, writers and speakers in this space, it's hard to find someone more focused on recovering lost data. He is the webmaster of BackupCentral.com and the author of hundreds of articles and the books "Backup and Recovery" and "Using SANs and NAS."