Best practices, tips and tools for VMware virtual recovery and backup
Virtual server backups were once a kludgy and network-choking process, but backup applications have evolved to handle the special needs of virtualized servers.
Server virtualization is without question one of the most significant technologies introduced into the data center in the last five years. It has changed almost every aspect of how architectures are designed, including networks, storage and the servers themselves. Data protection is one of the operations most affected by the shift to a virtual environment. Gaps in data protection for virtualized infrastructures gave rise to startup vendors focused solely on virtual machine-specific backup and recovery.
The impact of virtualization on backups
Prior to virtualization, applications ran on dedicated servers with access to all the resources (storage, memory, CPU, network) available to that server. When a backup was triggered for that application, it could, for the most part, use all of those resources to copy data from the server to a backup destination.
Virtualization changed things. Resources are now shared across multiple virtual machines (VMs), each running its own application. If the backup process doesn't adjust to this new reality, all the VMs on a single server could start sending their data at the same time. That could crash the server as the hypervisor runs out of memory, or at least degrade performance as it runs out of CPU and network resources.
Early attempts to fix VM backup
In the "early" days of VM backup, most data centers protected VMs as if they were standalone servers, and administrators would balance backup schedules so that only one or two VMs were backed up at the same time. This meant IT managers could continue to use their legacy backup applications. But as virtualization continued to grow and VM density increased, the scheduling balancing act became untenable and alternatives were needed.
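The balancing act described above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's scheduler: the function name `run_backups`, the `backup_fn` callback and the default limit of two concurrent backups per host are all assumptions made for the example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Assumed limit, mirroring the "one or two VMs at a time" rule of thumb.
MAX_CONCURRENT_PER_HOST = 2


def run_backups(vms_by_host, backup_fn, max_per_host=MAX_CONCURRENT_PER_HOST):
    """Back up every VM while capping simultaneous backups on each host.

    vms_by_host maps a host name to a list of VM names (hypothetical data).
    backup_fn(host, vm) is a caller-supplied function that does the real work.
    Returns the peak number of concurrent backups observed per host.
    """
    # One semaphore per host enforces the per-host concurrency cap.
    limits = {host: threading.Semaphore(max_per_host) for host in vms_by_host}
    lock = threading.Lock()
    active = {host: 0 for host in vms_by_host}
    peak = {host: 0 for host in vms_by_host}

    def worker(host, vm):
        with limits[host]:  # blocks while the host is already at its limit
            with lock:
                active[host] += 1
                peak[host] = max(peak[host], active[host])
            try:
                backup_fn(host, vm)
            finally:
                with lock:
                    active[host] -= 1

    jobs = [(host, vm) for host, vms in vms_by_host.items() for vm in vms]
    with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as pool:
        futures = [pool.submit(worker, host, vm) for host, vm in jobs]
        for future in futures:
            future.result()  # re-raise any backup error
    return peak
```

The cap is easy to enforce for a handful of VMs, but every job still waits its turn, which is why the approach stopped scaling as VM density grew.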
The virtualization backup advantage
Despite the negative impact of virtualization on data protection performance, it did bring its own set of advantages. A "server" was now encapsulated as a single large file instead of thousands or potentially millions of small files. And that file is accessible by multiple servers via the virtualized cluster put in place to enable features like live migration of VMs between hosts and automated resource balancing.
That made it fairly easy for an alternative server to access and back up the "file" (the server). In addition, most hypervisors had snapshot capabilities built into their clustered file systems, so VMs could be snapshotted and protected by the alternative server without impacting the primary host server's resources and performance. Essentially, off-host backup was born.
This led to the rise of companies like Nakivo Inc., PHD Virtual Technologies, Veeam Software and Vizioncore Inc. (bought by Quest Software, which was later acquired by Dell). They leveraged the above capabilities and expanded them to include granular recoveries from virtual server systems.
In the early days of virtual server backups there were a limited number of ways in which the backup software could interface with the hypervisor to perform the task at hand. As a result, when hypervisors were altered or upgraded, compatibility issues with the backup application sometimes arose. While this was an acceptable risk for smaller backup vendors, larger enterprise software vendors were more conservative in providing VM-specific backup capabilities. With traditional backup apps lumbering along, startups were able to capture an early lead in VMware data protection.
Today, hypervisor vendors are providing API sets that backup software companies can leverage as part of their code bases. In theory, at least, this means their backup applications should keep working across revisions to the hypervisor code, with minimal rewriting of backup application code.
The changing role of the backup disk
Thanks to features like Changed Block Tracking, cloud-based recovery and in-place recovery, the design of disk backup devices needs to evolve. In the past, transfers to the disk backup appliance were bandwidth-intensive (large files, a lot of data all at once); now, they're much more random in nature (small block changes transferred throughout the day). In addition, since VMs may now be executed directly from the device, the performance of the disk backup appliance matters. We may soon see disk backup appliances that have some solid-state storage installed for the execution of VMs.
VM backup today
With the availability of an API set, most vendors, whether legacy or VM-specific, can provide off-host VM backup, something that should now be considered a basic requirement for VM data protection. But there are specific features beyond off-host backup that IT planners should consider.
Agent versus agent-free backup. An agent is software that's installed in the VM to assist in the backup process. Even though the above-mentioned APIs allow for off-host backups, some vendors still rely on agents installed in VMs. The agents may be used to help with application-awareness (allowing for granular backup and recovery of databases or email stores) and, in some cases, may accelerate raw backup performance.
While agent-free backup offerings don't install code into the virtual machine, granular recovery of application data is still available; however, the backed up VM image may have to be mounted as a separate VM and have that data copied out of it. Some agent-free backup products have developed "helper" applications that will allow for scanning, searching and extracting granular data components from well-known data types such as Microsoft Exchange, SQL Server and Oracle without having to mount the VM image.
Changed block backup. Hypervisor APIs have increasingly added capabilities such as VMware's Changed Block Tracking (CBT), which allows the backup software to understand which parts of a virtual machine's image file have changed since the last backup. This is a key feature because it minimizes the amount of data transferred, which allows backups to occur more frequently and should reduce data loss in the event of VM corruption.
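To make the idea concrete, here is a toy model of changed block tracking in Python. It is a sketch of the general technique only and does not use or imitate VMware's actual API: the `TrackedDisk` class, the `incremental_backup` function and the four-byte block size are all invented for illustration.

```python
BLOCK_SIZE = 4  # bytes per block; unrealistically small so the example is readable


class TrackedDisk:
    """Toy virtual disk that records which blocks were written since the last backup."""

    def __init__(self, num_blocks):
        self.data = bytearray(num_blocks * BLOCK_SIZE)
        self.changed = set()  # indexes of blocks written since the last backup

    def write(self, offset, payload):
        if offset < 0 or offset + len(payload) > len(self.data):
            raise ValueError("write falls outside the disk")
        if not payload:
            return
        self.data[offset:offset + len(payload)] = payload
        # Mark every block the write touched, including partially written ones.
        first = offset // BLOCK_SIZE
        last = (offset + len(payload) - 1) // BLOCK_SIZE
        self.changed.update(range(first, last + 1))

    def read_block(self, index):
        start = index * BLOCK_SIZE
        return bytes(self.data[start:start + BLOCK_SIZE])


def incremental_backup(disk):
    """Copy only the blocks changed since the last backup, then reset tracking."""
    delta = {index: disk.read_block(index) for index in sorted(disk.changed)}
    disk.changed.clear()
    return delta
```

In this model a write that spans two blocks produces a two-block delta, and a second backup with no intervening writes transfers nothing. That is the property that makes frequent backups affordable.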
Enhanced restores. Restores have also been significantly improved in virtualized environments. First, instead of having to restore the entire VM image, most off-host backup products can now tap into the hypervisor's API to recover a single file or set of files when a recovery is needed. Some vendors also leverage CBT to provide changed block restores. For example, if a large database is corrupted, a changed block recovery would just restore the parts of the database that changed since the last backup was made.
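A changed block restore can be sketched the same way. The Python function below is a simplified illustration under stated assumptions: `changed_block_restore`, its parameters and the four-byte block size are hypothetical, and the list of changed block indexes is assumed to come from whatever tracking mechanism the backup product uses.

```python
BLOCK_SIZE = 4  # bytes per block; unrealistically small so the example is readable


def changed_block_restore(live, backup, changed_blocks, block_size=BLOCK_SIZE):
    """Roll `live` back to `backup` by rewriting only the listed blocks.

    live is a bytearray holding the current (corrupted) image, backup is the
    last good copy, and changed_blocks lists the block indexes written since
    that backup was taken. Returns the number of bytes copied.
    """
    if len(live) != len(backup):
        raise ValueError("live image and backup must be the same size")
    copied = 0
    for index in sorted(set(changed_blocks)):
        start = index * block_size
        if index < 0 or start >= len(backup):
            raise IndexError(f"block {index} is outside the image")
        end = min(start + block_size, len(backup))
        live[start:end] = backup[start:end]  # overwrite just this block
        copied += end - start
    return copied
```

If two of four blocks changed, only half of the image crosses the network. On a large database where a small fraction of blocks changed, the savings over a full image restore are proportionally larger.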
Restores can be further enhanced in products that allow for the execution of the VM directly from the recovery device, often called "in-place recovery." With an in-place recovery scenario, no data needs to be transferred across the network and the VM and its data can be returned to operation in a matter of minutes. For many organizations, this capability combined with hourly CBT backups can eliminate the need for separate business continuity software.
Some vendors are extending this capability to the cloud, so that the "in-place" part of the recovery actually occurs in the remote data center. In those architectures, data is typically backed up locally, then replicated to the cloud and placed in position for recovery in the event of a site disaster. This not only solves the local protection and availability issues, but also provides DR readiness.
There is a tradeoff between in-place recovery and changed block restores. With in-place recovery there will come a time when the VM needs to be moved back to its primary storage destination. Also, it's unlikely the backup device has the performance and redundancy of the primary storage device, something that's especially true with the cloud recovery model described earlier. CBT recovery, on the other hand, incurs downtime up front, but eliminates the more prolonged downtime required to move the whole VM into place. Ideally, IT planners should look for a product that offers both methods.
Tape support. It may seem surprising that tape support is working its way into VM-specific applications that were originally disk-only, but tape is inexpensive, portable and ideal for long-term storage of VMs. It's a perfect addition to the rapid backup and restore capabilities of disk because it allows the disk investment to stay small and be used for the most immediate of recoveries. Tape support should be given strong consideration, even in disk-only environments. The long-term storage capacity savings plus the ability to "overnight" a VM on tape can pay big dividends.
Physical server support. A major differentiator among backup applications is their ability to back up physical servers. Many of the new VM-specific backup apps are VM only. While many data centers are striving for 100% server virtualization, the majority of them aren't even close. That means that if a VM-specific application is selected, you must be prepared to deal with at least two separate backup and recovery processes.
Most legacy enterprise solutions support physical and virtual server protection, but tend to be behind in some of the VM-specific features described earlier. You may need to choose between the convenience of a single data protection product and running two products to capture best-of-breed functionality. In general, the choice comes down to how much mission-critical data resides on physical systems.
VM backup bottom line
The state of virtual server backups has improved significantly over the past few years, thanks in large part to vendors like VMware that established a robust API set that allowed for innovation and integration. The capabilities aren't only improving backup, they're helping to eliminate the need for separate business continuity and disaster recovery applications.
About the author:
George Crump is president of Storage Switzerland, an IT analyst firm focused on storage and virtualization.