Learn from large-scale organizations
The problem of virtual server sprawl isn't unique to large organizations; even small and midsize businesses (SMBs) can experience problems related to VM sprawl. Regardless of an organization's size, however, there are lessons to be learned from the way that enterprise-class organizations handle their virtual machines.
In a large enterprise, VMs are frequently moved around. Technologies such as Live Migration or vMotion are used to relocate VMs in response to changes in host workloads, hardware maintenance requirements, and a number of other factors. As a result, the backup administrator has no way of knowing which VM will reside on which host at any given moment.
The biggest lesson that can be learned from large organizations is that it is not practical to focus your backup efforts on individual VMs. Sure, you need to think about individual VMs with regard to your ability to perform granular restoration, but it is a mistake to build your backup strategy around backing up individual virtual machines. Doing so would be a losing battle: VMs are constantly being created and deleted, and reconfiguring backup jobs every time a change is made is not practical.
Host vs. VM backup planning
It is better to focus on backing up host servers rather than backing up individual virtual machines. By doing so, you can be sure to capture every VM that resides on a particular host. However, even this approach can be somewhat short-sighted.
Blindly backing up virtualization host servers in a highly dynamic environment makes backup capacity planning difficult. Most medium to large organizations tie virtualization hosts to SAN storage, iSCSI NAS storage, or some other form of centralized storage. Because the VMs do not reside on the hosts' internal disks, backup operators cannot base their planning efforts around a host server's internal storage hardware.
The VM provisioning process
One of the best ways to make VM backups more practical in a highly dynamic (and sprawling) environment is to use the VM provisioning process to your advantage. Earlier, I said that there are lessons to be learned from enterprise-class organizations. One of those lessons is that most enterprise-class organizations do not create VMs manually. Instead, they tend to rely on provisioning tools that create virtual machines based on predefined templates.
There are a number of VM provisioning tools available; one example is Microsoft's System Center Virtual Machine Manager. These tools allow the administrator to define various classes of virtual machines or various classes of storage. For instance, some organizations define silver, gold, and platinum storage tiers: silver might be low-end JBOD storage, gold may be high-performance HDDs, and platinum could be solid-state storage.
Storage classes or virtual machine classes are used because some VMs are more important than others. Storage classes allow higher-end storage to be reserved for more important virtual machines. It is worth noting that some VMs may not need to be backed up at all, so it might be possible to tie your backup policy to your storage classes or to your VM classes. In doing so, you could ensure that only specific classes of storage or VMs are backed up, by placing the VMs that do not need protection on hosts or storage arrays that are not targeted for backup. If this concept is also tied into a chargeback system, then there will be a financial incentive for those who create VMs to think about whether or not those VMs need to be protected.
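To make the idea of tying backup policy to storage classes more concrete, here is a minimal Python sketch. It is only an illustration: the policy table, the VM names, and the choice of which tiers get backed up are all assumptions made for the example, and nothing here reflects the API of System Center Virtual Machine Manager or any other provisioning tool.

```python
from dataclasses import dataclass

# Hypothetical policy table: whether each storage tier is targeted for backup.
# The tier names come from the example above; the True/False choices are assumed.
BACKUP_POLICY = {
    "platinum": True,
    "gold": True,
    "silver": False,
}


@dataclass(frozen=True)
class VM:
    name: str
    storage_tier: str


def vms_to_back_up(vms, policy=BACKUP_POLICY):
    """Return the names of VMs whose storage tier is targeted for backup.

    A VM on a tier that is missing from the policy table is backed up
    anyway, so a mislabeled VM is protected rather than silently skipped.
    """
    return [vm.name for vm in vms if policy.get(vm.storage_tier, True)]


# Hypothetical inventory, as a provisioning tool might report it.
inventory = [
    VM("sql-prod-01", "platinum"),
    VM("web-02", "gold"),
    VM("test-scratch-07", "silver"),
]

protected = vms_to_back_up(inventory)
print(protected)
```

The point of the sketch is that the backup decision is made once, at the class level, rather than once per VM. A newly provisioned VM inherits its protection level from the tier it was created on, so no backup job has to be reconfigured.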
Another solution that enterprise-class organizations commonly use is the allocation of resource pools. In situations in which multiple people need to create VMs, it has become common practice for an administrator to grant each person a pool of resources. The recipient is free to create VMs on an as-needed basis until the resources are exhausted. This approach prevents any one person from consuming an excessive amount of hardware resources. More importantly, it makes the backup capacity planning process a lot easier because the backup administrator knows ahead of time the maximum amount of storage that could potentially be consumed. The admin will know that the backup could be as large as the size of the pool, which is very different from just knowing that the backup could be as large as the amount of physical storage that is available.
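The capacity-planning arithmetic behind resource pools can be sketched in a few lines of Python. The pool names, quota sizes, and physical storage figure below are hypothetical numbers chosen for illustration. The calculation also assumes a single full backup with no deduplication, compression, or retention multiplier, so treat it as a rough upper bound for one backup pass rather than a sizing formula.

```python
# Hypothetical storage quotas granted to each resource pool, in GB.
POOL_QUOTAS_GB = {
    "dev-team": 2000,
    "qa-team": 1500,
    "finance": 500,
}

# Hypothetical total capacity of the shared storage array, in GB.
PHYSICAL_STORAGE_GB = 20000


def worst_case_backup_gb(pool_quotas):
    """Upper bound on the data one full backup would have to hold.

    No pool can grow past its quota, so the sum of the quotas caps the
    total amount of VM data that can exist across all pools.
    """
    return sum(pool_quotas.values())


worst_case = worst_case_backup_gb(POOL_QUOTAS_GB)
print(f"Plan for at most {worst_case} GB, not {PHYSICAL_STORAGE_GB} GB")
```

With these assumed figures the planning ceiling is 4,000 GB rather than the 20,000 GB of physical storage, which is the difference the paragraph above describes.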
This was first published in November 2013