What you will learn in this tip: Creating backups in virtual environments isn’t as straightforward as it is in physical environments. While there are a variety of approaches for backing up virtual machines (VMs), there are just as many pitfalls you may encounter due to the unique nature of virtual environments. In this tip, learn how to efficiently create VM backups and avoid common mistakes.
Don’t back up through the guest OS
Backing up through the guest operating system (OS) is probably the most common mistake made when backing up VMs. You can use traditional backup methods that rely on agents installed on the guest OS to back up VMs. While this works, it is inefficient because the virtualization layer sits between the guest OS layer and the physical hardware layer. The guest OS no longer has direct access to the physical hardware where the data resides, so a backup agent inside the guest OS must go through the virtualization layer to get to the virtual machine data. This method also causes unnecessary resource usage on the host, and if multiple backups are running simultaneously, it can cause performance bottlenecks.
Instead of using guest OS backup agents, backup servers should go directly to the virtualization layer and not involve the guest OS. With this method, the guest OS is not aware of the backup process, nor does it waste host resources. It is also much more efficient because the backup server can mount the VM’s virtual disk directly from the host datastore. This type of backup is known as an image-level backup because the VM’s disk is backed up at the block level and not at the file level, as it is with traditional guest OS agents. To properly do an image-level backup at the virtualization layer, you need to use backup applications that are virtualization-aware and can leverage the APIs of the virtualization layer to access virtual disk files.
You should never try to back up virtual disk files directly at the physical storage device and bypass the virtualization layer. The guest OS and virtual disk need to be prepared so they are in a proper state to be backed up, and if you bypass the hypervisor, this does not happen.
Virtual machine snapshots are not backups
Virtual machine snapshots preserve the state of a VM from the point in time when the snapshot was taken. Additionally, multiple snapshots can be created to provide multiple restore points to choose from. While this can be useful in certain situations, snapshots should never be used as a primary backup method for your VMs. One problem with VM snapshots is that once you revert to a previous snapshot, you can’t go back to the present. The current state of your VM is lost and you can only revert to previous snapshots. Snapshots are not useful for restoring individual files because they only bring a whole VM image back to a previous state. Snapshots can also cause other problems because they grow in 16 MB increments: The entire LUN that a VM is on has to be locked when they grow in size, which prevents other hosts from writing to the LUN.
This locking is known as a SCSI reservation, and too many reservations can decrease the performance of your VMs as they wait for LUNs to be unlocked. Each snapshot is an individual file that grows as data is written to it, and having a lot of snapshots running can cause your datastores to run out of disk space. Snapshots are useful as a secondary backup method for short-term or ad hoc backups in case you need to revert to a previous state, such as when applying patches or upgrading applications.
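Because snapshot files keep growing and consume datastore space, it can help to audit them regularly. The following is a minimal sketch in Python, not a VMware API: the `SnapshotInfo` record, the `find_stale_snapshots` function, the sample inventory and the age and size limits are all hypothetical and would need to be fed from your own inventory tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class SnapshotInfo:
    """Hypothetical record describing one VM snapshot (not a VMware API type)."""
    vm_name: str
    snapshot_name: str
    created: datetime
    size_gb: float


def find_stale_snapshots(snapshots, now, max_age_days=3, max_size_gb=20.0):
    """Return snapshots that are older or larger than the given limits."""
    cutoff = now - timedelta(days=max_age_days)
    return [
        s for s in snapshots
        if s.created < cutoff or s.size_gb > max_size_gb
    ]


# Made-up sample data for illustration only.
now = datetime(2024, 1, 10, 12, 0)
inventory = [
    SnapshotInfo("web01", "pre-patch", datetime(2024, 1, 9, 8, 0), 2.5),
    SnapshotInfo("db01", "pre-upgrade", datetime(2023, 12, 1, 8, 0), 48.0),
    SnapshotInfo("app01", "test", datetime(2024, 1, 10, 9, 0), 35.0),
]
stale = find_stale_snapshots(inventory, now)
for s in stale:
    print(f"{s.vm_name}/{s.snapshot_name}: {s.size_gb} GB, created {s.created:%Y-%m-%d}")
```

In this sample, db01 is flagged for both age and size, and app01 for size alone; the thresholds are arbitrary and should reflect how long you actually need a short-term restore point.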
Make sure you are quiescing properly
Most virtualization backup applications back up at the image level and are not aware of what is going on inside the guest OS. Before you back up VMs, you need to ensure they are quiesced so they are in a consistent state to be backed up. If you don’t quiesce them, you risk having data that is not in a state to be restored properly. The quiesce operation is handled inside the guest OS, and for Windows VMs, the Volume Shadow Copy Service (VSS) handles this. Since the backup server is backing up the VMs at the virtualization layer—and not inside the guest OS—it requires another application to tell the guest OS to quiesce the VM.
In vSphere, that application is VMware Tools, which tells the VSS service to quiesce the guest OS. The application installs on the guest OS to serve as a conduit between the guest OS and the hypervisor.
For VMs running Linux operating systems with no native services like VSS, VMware Tools also provides a special vmsync driver that can provide the same functionality as VSS. This makes it very important that VMware Tools be installed and kept up to date on all your VMs. There are also instances where VMware Tools may not support certain guest OS versions, so always check to see if your version is supported by the application.
Many backup vendors supply their own special agent that will handle the quiesce process if VMware Tools doesn’t offer support.
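The order of operations matters here, so a small sketch may help. It assumes the common pattern in which a backup application takes a temporary quiesced snapshot, copies the disk, and then removes the snapshot. `DemoVM` and its methods are hypothetical stand-ins written for this illustration, not calls from any real hypervisor SDK.

```python
class DemoVM:
    """Stand-in for a hypervisor VM handle; all methods are hypothetical."""

    def __init__(self, name, tools_running=True):
        self.name = name
        self.tools_running = tools_running
        self.snapshots = []
        self.log = []

    def create_snapshot(self, label, quiesce):
        # Quiescing needs an in-guest helper (e.g. VMware Tools) to be running.
        if quiesce and not self.tools_running:
            raise RuntimeError(f"{self.name}: guest tools not running, cannot quiesce")
        self.snapshots.append(label)
        self.log.append(f"snapshot {label} (quiesce={quiesce})")

    def remove_snapshot(self, label):
        self.snapshots.remove(label)
        self.log.append(f"removed {label}")


def backup_vm(vm, copy_disk):
    """Quiesce and snapshot the VM, copy its disk, and always clean up."""
    label = f"backup-{vm.name}"
    vm.create_snapshot(label, quiesce=True)
    try:
        copy_disk(vm)
        vm.log.append("disk copied")
    finally:
        # Remove the temporary snapshot even if the copy fails.
        vm.remove_snapshot(label)


vm = DemoVM("web01")
backup_vm(vm, copy_disk=lambda v: None)
print(vm.log)
```

Failing loudly when the guest cannot be quiesced is a deliberate choice in this sketch: a backup that silently falls back to an unquiesced state may not restore properly.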
Schedule backups carefully
VMs share the resources of a host and hosts share storage devices, and creating backups is a resource-intensive operation. In a virtual environment, creating a backup can cause resource starvation among your hosts and VMs. While backing up at the virtualization layer reduces resource usage on your VMs when backups occur, resource usage will still be high on your hosts and storage devices when backups are running.
To avoid too much concentrated I/O, which can affect the performance of your VMs, you should schedule your backups to limit the number of concurrent VM backups on a host and on shared datastores. Hosts typically share the same datastores in virtual environments, and bottlenecks caused by too many simultaneous VM backups on a single datastore will affect all hosts that have VMs running on that datastore.
Likewise, if too many VMs on the same host are being backed up at the same time, it will create bottlenecks for all the VMs on that host.
You should plan backup schedules carefully to ensure that backups occur in a balanced manner that does not cause resource problems for your VMs. And don’t rely on sluggish VMs to tell you that you have a problem while your backups are running. Instead, look at performance statistics taken at the virtualization layer to learn whether you have a problem. This allows you to monitor the I/O and make adjustments as needed to balance it out.
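One way to picture a balanced schedule is to group VMs into sequential "waves" so that no host or datastore exceeds a concurrency cap within a wave. The sketch below is a simple greedy grouping written for illustration; the function name `plan_backup_waves`, the caps and the sample inventory are assumptions, and real backup products expose their own scheduling controls.

```python
from collections import Counter


def plan_backup_waves(vms, max_per_host=2, max_per_datastore=4):
    """Group VMs into sequential waves that respect per-host and per-datastore caps.

    `vms` is a list of (vm_name, host, datastore) tuples.
    """
    waves = []  # each entry: (vm names, per-host counts, per-datastore counts)
    for name, host, datastore in vms:
        for names, hosts, stores in waves:
            # Place the VM in the first wave that still has room on both resources.
            if hosts[host] < max_per_host and stores[datastore] < max_per_datastore:
                names.append(name)
                hosts[host] += 1
                stores[datastore] += 1
                break
        else:
            # No existing wave has room, so start a new one.
            waves.append(([name], Counter({host: 1}), Counter({datastore: 1})))
    return [names for names, _, _ in waves]


# Made-up inventory: (VM, host, datastore).
sample_vms = [
    ("vm1", "esx1", "ds1"),
    ("vm2", "esx1", "ds1"),
    ("vm3", "esx1", "ds2"),
    ("vm4", "esx2", "ds1"),
    ("vm5", "esx2", "ds1"),
    ("vm6", "esx2", "ds1"),
]
for i, wave in enumerate(plan_backup_waves(sample_vms, max_per_host=2, max_per_datastore=3), 1):
    print(f"wave {i}: {wave}")
```

With these caps, vm3 moves to the second wave because esx1 already has two backups in the first, and vm5 and vm6 follow it because ds1 has reached its limit of three.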
Don’t resource starve your backup server
Backup servers are basically like pumps: Data is read from a source and goes into the backup server, and then the data is sent from the backup server to the target device. The volume that a backup server can handle is determined by the resources assigned to it, and the more resources available, the faster it can pump data. Backups can heavily tax network and storage resources, but there is more to backups than just moving data from point A to point B. Backup servers handle advanced functions like deduplication, compression and determining which disk blocks need to be backed up.
For your backup server to achieve maximum throughput, it needs to have sufficient resources to avoid creating a bottleneck in any one resource area.
You should monitor the resource usage of the backup server. If its resources are maxed out, chances are the backup server needs more. In practice, it’s better for a backup server to have too many resources than too few. By ensuring that your backup server has the resources it needs, you can ensure that it pumps data at maximum speed and shorten your backup windows.
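The pump analogy can be made concrete with some rough arithmetic: end-to-end throughput is limited by the slowest stage, so the backup window is roughly the data volume divided by that stage's rate. The sketch below uses made-up stage names and throughput figures purely for illustration; measure your own environment before drawing conclusions.

```python
def backup_window_hours(data_gb, stage_mb_per_s):
    """Estimate backup duration from the slowest stage in the pipeline."""
    bottleneck, rate = min(stage_mb_per_s.items(), key=lambda kv: kv[1])
    seconds = data_gb * 1024 / rate  # GB -> MB, then divide by MB/s
    return bottleneck, seconds / 3600


# Hypothetical sustained rates in MB/s for each stage of the "pump".
stages = {
    "source read": 400,
    "network": 110,
    "dedupe/compress CPU": 250,
    "target write": 300,
}
bottleneck, hours = backup_window_hours(2000, stages)
print(f"bottleneck: {bottleneck}, window: {hours:.1f} h")
# prints: bottleneck: network, window: 5.2 h
```

In this example, adding CPU or faster disks to the backup server would not shorten the window at all, because the network is the limiting resource.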
Take advantage of virtualization features
The virtualization architecture introduces a lot of unique and creative ways to back up your VMs when compared to traditional physical environments. Backup applications that integrate with virtualization can take advantage of these features and leverage them to increase the efficiency of backups. VMware has developed specific APIs that benefit backup applications, such as the vStorage APIs for Data Protection (VADP), which allow backup applications to interface directly with hosts and storage devices. VADP offers more efficient access to virtual disk files and contains features, such as Changed Block Tracking (CBT), which can greatly reduce the time it takes to perform incremental backups.
A big part of an incremental backup is figuring out what changed since the last backup. With CBT, a backup application can query the host’s VMkernel, which keeps track of disk block changes, to quickly determine which blocks of a VM’s virtual disk have changed since a specific point in time.
Without CBT, backup applications normally have to figure this out on their own, so making this information instantly available can mean faster completion of the incremental backup process.
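To illustrate why change tracking speeds up incrementals, here is a toy model in Python. It is not the VADP or CBT API: `TrackedDisk`, `query_changed_blocks` and `incremental_backup` are hypothetical names, and the real mechanism identifies points in time differently rather than simply resetting a set. The idea it shows is the same, though: the layer that handles writes records which blocks changed, so the backup only has to copy those.

```python
class TrackedDisk:
    """Toy virtual disk that records which blocks were written."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks
        self.changed = set()

    def write(self, index, data):
        self.blocks[index] = data
        self.changed.add(index)  # tracking happens at write time

    def query_changed_blocks(self):
        """Return changed block indexes since the last query, then reset tracking."""
        changed, self.changed = sorted(self.changed), set()
        return changed


def incremental_backup(disk, backup_copy):
    """Copy only the blocks reported as changed into the backup copy."""
    changed = disk.query_changed_blocks()
    for i in changed:
        backup_copy[i] = disk.blocks[i]
    return changed


disk = TrackedDisk(8)
backup = {}
disk.write(0, b"boot")
disk.write(5, b"data")
first = incremental_backup(disk, backup)   # copies blocks 0 and 5
disk.write(5, b"data2")
second = incremental_backup(disk, backup)  # copies only block 5
print(first, second)
```

Without such tracking, the backup application would have to read and compare every block on the disk to find the one that changed in the second run.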
In order to achieve the most efficient backups possible, always make sure your backup application takes advantage of the many benefits provided by the virtualization architecture.
About this author: Eric Siebert is an IT industry veteran with more than 25 years of experience covering many different areas but focusing on server administration and virtualization. He is a very active member of the VMware VMTN support forums and has obtained the elite Guru status by helping others with their own problems and challenges. He is also a VMTN user moderator and maintains his own VMware VI3 information website, vSphere-land. In addition, he is a regular contributor to TechTarget's SearchServerVirtualization and SearchVMware websites.