In spite of an administrator's best efforts, virtual machine backups sometimes fail. When this happens, you must quickly determine the cause of the failure. Here are some of the most common conditions that cause a failed backup.
1. Catalogs are corrupt
Most backup applications maintain a catalog or an index of the data that has been backed up. If this catalog becomes corrupt, a backup may fail as a result. Because of a (recently corrected) bug, this problem is especially common with VMware Data Recovery (vDR) backups, but a corrupt catalog or index can cause problems with just about any backup application. There is no single universal symptom of catalog corruption, but the logs generally show write errors or "failed to update" errors. The solution is usually to use the event logs to verify the problem and then rebuild the catalogs.
2. Insufficient permissions
Some backup applications require each protected host server to have a service account that can be used to facilitate the backup process. These types of backup applications are prone to backup errors related to insufficient permissions. For example, a backup may fail if the account policy forces a password change, but the backup application itself is not made aware of the password change. When this occurs, the backup usually fails outright before any data at all can be processed, and the logs usually reflect a security error or a read failure.
3. Unsupported OS versions
Unsupported guest operating systems are another cause of virtual machine backup failures. For example, a backup application that fully supports backing up virtual machines that run Windows Server 2012 might view Windows Server 2012 R2 as an unsupported operating system. This cause is more difficult to spot. An unsupported OS might lead to an agent failure. More often, though, the backup software falls back to a file-level backup rather than a hypervisor- or application-level backup. The problem can easily be avoided by verifying backup support before upgrading virtual machines to a new operating system.
4. File-size issues
Some older backup applications have trouble with virtual machines that have excessively large virtual hard disks attached. This occurs because these backup applications limit the size of the largest file that they can back up. If the backup application treats the virtual machine's virtual hard disk as a file, then there is a good chance that the virtual hard disk may exceed the backup application's maximum supported file size.
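One way to rule this cause in or out is to compare the size of each virtual hard disk file against the backup application's documented limit. The following Python sketch illustrates the comparison. The function names, the file names and the 2 TiB limit are all assumptions made for illustration; the real limit has to come from your backup vendor's documentation.

```python
import os


def sizes_on_disk(paths):
    """Map each virtual disk file path to its size in bytes."""
    return {path: os.path.getsize(path) for path in paths}


def oversized_disks(disk_sizes, max_file_bytes):
    """Return the names of disks whose size exceeds the limit.

    disk_sizes maps a file name to its size in bytes. max_file_bytes is
    the largest file the backup application can handle (an assumed value
    here, taken from vendor documentation in practice).
    """
    return sorted(
        name for name, size in disk_sizes.items() if size > max_file_bytes
    )


if __name__ == "__main__":
    TIB = 2 ** 40
    GIB = 2 ** 30
    # Hypothetical disk files and a hypothetical 2 TiB limit.
    sizes = {"sql01.vhdx": 3 * TIB, "web01.vhdx": 500 * GIB}
    print(oversized_disks(sizes, 2 * TIB))
```

In practice you would build the size dictionary with sizes_on_disk() from the paths of the actual virtual hard disk files on the host.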
5. The host can't handle the backup load
Another factor that can lead to failed backups is an overstressed host server. If a virtual machine resides on a disk that is already I/O bound, then the disk may not be able to deliver sufficient performance to keep the backup from timing out. The solution to this problem is to correct the storage bottleneck.
6. Virtual hard disk corruption
Just as a physical hard disk can become corrupt, so too can a virtual hard disk. If corruption exists within a virtual hard disk, then your backup application may have trouble backing up the corresponding virtual machine.
Typically, when this occurs, the backup application's logs will contain either read errors or data integrity errors. These errors can be a clue that corruption might exist within a virtual hard disk.
7. Volume Shadow Copy Service failure
Backups of virtual machines that are running Windows Server as a guest OS generally depend on the Volume Shadow Copy Service (VSS). The Volume Shadow Copy Service uses a collection of VSS writers to facilitate the backup of various applications and operating system components (such as Active Directory). If any of the VSS writers required by the backup process fails, then the entire backup can potentially fail as a result.
If you suspect that a VSS failure might be to blame for a virtual machine backup failure, then check the state of the VSS writers within the virtual machine. Running the vssadmin list writers command from an elevated command prompt displays the state of each VSS writer.
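On a server with many writers, the output of that command is long enough that a failed writer is easy to miss. The Python sketch below parses saved output and lists any writer that is not in the Stable state or that reports a last error. It assumes the English-language output layout (Writer name, State and Last error lines); the function names are hypothetical, and writers that are legitimately busy during a running backup will also be flagged.

```python
import re


def parse_writers(text):
    """Parse 'vssadmin list writers' output into a list of dicts."""
    writers = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = re.match(r"Writer name: '(.*)'$", line)
        if match:
            current = {"name": match.group(1), "state": None, "last_error": None}
            writers.append(current)
        elif current is not None and line.startswith("State:"):
            # "State: [1] Stable" becomes "Stable"
            current["state"] = line.split("]", 1)[-1].strip()
        elif current is not None and line.startswith("Last error:"):
            current["last_error"] = line.split(":", 1)[1].strip()
    return writers


def failed_writers(text):
    """Return writers that are not Stable or that report an error."""
    return [
        w for w in parse_writers(text)
        if w["state"] != "Stable" or w["last_error"] != "No error"
    ]


if __name__ == "__main__":
    # Shortened, illustrative sample of the command's output.
    sample = """
Writer name: 'System Writer'
   State: [1] Stable
   Last error: No error

Writer name: 'NTDS'
   State: [7] Failed
   Last error: Timed out
"""
    for writer in failed_writers(sample):
        print(writer["name"], writer["state"], writer["last_error"])
```

To use it against a live guest, redirect the command's output to a text file and pass the file's contents to failed_writers().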
8. The backup agent becomes unregistered
Occasionally a Windows patch or a backup application patch can cause a backup agent to become unregistered from the guest operating system. When a backup agent is based on DLL files or OLE controls, those components must generally be linked to the Windows registry.
The problem of OLE controls and DLL files accidentally being removed from the registry is common enough that Microsoft includes a command-line utility in Windows you can use to manually register these components. The utility is called Regsvr32.exe. It is important to understand that 64-bit Windows servers include both the 32-bit and the 64-bit versions of this tool. You must use the version that matches the component being registered, or else Windows will not be able to use the registration.
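On 64-bit Windows, the 64-bit version of Regsvr32.exe lives in the System32 folder and the 32-bit version lives in the SysWOW64 folder. The Python sketch below builds the matching command line for a component. The function names, the default C:\Windows system root and the agent DLL path are assumptions for illustration, and the sketch assumes the resulting command is launched from a 64-bit process, since Windows redirects System32 for 32-bit processes.

```python
import ntpath  # Windows path rules, usable on any platform


def regsvr32_path(component_is_64bit, system_root=r"C:\Windows"):
    """Return the Regsvr32.exe path that matches the component's bitness."""
    folder = "System32" if component_is_64bit else "SysWOW64"
    return ntpath.join(system_root, folder, "regsvr32.exe")


def register_command(dll_path, component_is_64bit, system_root=r"C:\Windows"):
    """Build an argument list that registers dll_path silently (/s)."""
    return [regsvr32_path(component_is_64bit, system_root), "/s", dll_path]


if __name__ == "__main__":
    # Hypothetical 32-bit backup agent component.
    print(register_command(r"C:\BackupAgent\agent.dll", component_is_64bit=False))
```

The argument list can be passed to subprocess.run() on the guest from an elevated session. Whether a given agent DLL is 32-bit or 64-bit is something to confirm with the backup vendor.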
9. Buggy applications
Virtual machine backups can fail because an application that is running on a VM is buggy. Just recently, in fact, Microsoft released Cumulative Update 3 for Exchange Server 2013. Among other things, this update contains a fix for a bug that randomly causes Exchange Server backups to fail. The same bug can also cause restoration failures.
If you are experiencing inconsistent problems with backing up a VM, check to see if there are any known bugs with applications running on the VM.
10. Security software configuration issues
Every once in a while, security software might keep a backup from completing properly. For example, there have been plenty of documented instances of anti-malware software interfering with certain backup applications. Similarly, some backup applications may require exceptions to be added to the Windows Firewall.
As you can see, any number of conditions can cause virtual machine backups to fail. When a backup does fail, it is always a good idea to start the troubleshooting process by reviewing the event logs. That way, you can begin ruling out potential causes that do not correspond with the errors that have been logged.
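The log symptoms mentioned throughout this article can be turned into a rough first-pass triage. The Python sketch below scans log lines for those phrases and lists the causes worth checking first. The phrase list is drawn only from the symptoms described above, the function name and sample log lines are hypothetical, and real backup products word their errors differently, so treat the result as a starting point rather than a diagnosis.

```python
# Phrase-to-cause hints, taken from the symptoms described in this article.
HINTS = [
    ("failed to update", "corrupt catalog or index"),
    ("write error", "corrupt catalog or index"),
    ("security error", "insufficient permissions"),
    ("read failure", "insufficient permissions"),
    ("read error", "virtual hard disk corruption"),
    ("data integrity", "virtual hard disk corruption"),
]


def likely_causes(log_lines):
    """Return candidate causes in the order their symptoms first appear."""
    causes = []
    for line in log_lines:
        lowered = line.lower()
        for phrase, cause in HINTS:
            if phrase in lowered and cause not in causes:
                causes.append(cause)
    return causes


if __name__ == "__main__":
    # Made-up log lines for illustration only.
    sample_log = [
        "Job 12: Security error while opening volume",
        "Job 12: Data integrity error on disk0.vhdx",
    ]
    print(likely_causes(sample_log))
```

Any cause that does not appear in the result is a reasonable candidate to rule out early, although an empty result only means that none of these particular phrases were logged.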
About the author:
Brien M. Posey, MCSE, has received Microsoft's MVP award for Exchange Server, Windows Server and Internet Information Server. Brien has served as CIO for a nationwide chain of hospitals and has been responsible for the department of information management at Fort Knox. You can visit Brien's personal website at www.brienposey.com.