How much are virtualized server environments affecting backup processes? Mark Bowker, analyst with the Enterprise Strategy Group, answers the most common questions about virtual server backup from storage administrators. His answers are also available below as an MP3 file to download.
Download the Virtual server backup FAQ podcast.
Table of contents:
>>How virtualizing servers impacts backups
>>Virtual machine images
>>Traditional file-level backup and virtual machine backup
>>Backup window and backup capacity
>>Live virtual machine backups
>>The restore process
>>Innovations to this space
With server virtualization, things are definitely different. There are a lot of advantages, but things are very different when you are backing up virtual servers. What we are finding is the amount of data that needs to be backed up has increased significantly.
However, in some cases server virtualization can reduce the number of backup licenses. This is a nice benefit. You don't have to run as many agents as in the past, and you can save some money there.
In some environments, server virtualization has prompted the use of a secondary storage system, for example, a disk-to-disk backup system to maintain multiple copies of images.
This is a big difference. Typically, with traditional backup, you put the agent on there, we're backing up the files, and everyone's happy. If we need to restore a file or an email, it's fairly simple. We go back into the agent and recover it.
In a virtual server environment, the actual virtual machine image is stored as a single file. So, that's very different. The operating system, the applications, and the data itself are all stored within a single file.
In VMware, this is called a VMDK file. With Microsoft it's called a VHD file.
Most traditional client/server architecture remains the same in the server virtualization environment where the client has an agent installed. However, this can become cumbersome. We still have to manage agents. There's also processor overhead, which can be an issue with agent-based backup in virtual machines. This is especially true if multiple virtual machines are running on a single physical server, which is very much the case in a virtualized world. If those virtual machines all kick off the backup process at the same time, that CPU can be hit pretty hard and affect the applications.
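To make the contention point concrete, here is a minimal sketch that counts how many agent-based backup jobs overlap on one physical host. The function name, the job tuples and the 30-minute durations are all hypothetical values chosen for illustration; they do not come from any backup product.

```python
from typing import List, Tuple

def peak_concurrent_backups(jobs: List[Tuple[int, int]]) -> int:
    """Return the largest number of backup jobs running at once.

    Each job is (start_minute, duration_minutes) for one virtual machine
    on the same physical host. All values are illustrative.
    """
    events = []
    for start, duration in jobs:
        events.append((start, 1))              # a job begins
        events.append((start + duration, -1))  # a job ends
    # Tuples sort by time first; at equal times an end (-1) sorts before a start (+1).
    events.sort()
    running = peak = 0
    for _, delta in events:
        running += delta
        peak = max(peak, running)
    return peak

# Four VMs on one host, all kicking off at minute 0 for 30 minutes each.
same_time = [(0, 30), (0, 30), (0, 30), (0, 30)]
# The same four VMs with start times spaced 30 minutes apart.
staggered = [(0, 30), (30, 30), (60, 30), (90, 30)]

print(peak_concurrent_backups(same_time))  # 4
print(peak_concurrent_backups(staggered))  # 1
```

With identical start times, all four agents compete for the same host CPU at once; spacing the start times keeps only one job active at a time in this toy schedule.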
So, an alternative is to use a different technology to perform the backup of the virtual machine disk image directly. This requires the virtual machine to be suspended so that a consistent capture of the virtual machine can be performed. Once the machine is suspended, the backup process can take place and then it can be restarted again. This works well, but suspending the virtual machine is an issue because taking an application offline is not acceptable in some environments. But, there are other ways to integrate with the virtualization solution and perform a snapshot of the virtual machine. Then, you can back up that snapshot.
Another thing you should be aware of is that with this image-level approach you can recover the entire virtual machine, but you can't directly recover a single file within that virtual machine. So, you have to restore the entire virtual machine, mount the restored virtual machine and then recover the file.
The amount of capacity required has increased significantly, simply because of what's being backed up. We just conducted a survey of people who have recently adopted server virtualization, and 37% say that the amount of data they back up increased after deploying server virtualization. It's significant. You're backing up the operating system, you're backing up applications, and you're backing up data.
Then, there is the proliferation of virtual machines or VM sprawl. Once you get the infrastructure set up and in place, actually deploying the virtual machine is pretty simple. It's really just a matter of clicking a couple of buttons, and you have a new machine set up. But after you do that, is the proper backup process in place?
So, there's more data to be backed up and there are more virtual machines, but you still have to back up the data within the same window. It's important to be aware of that.
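The window problem comes down to simple arithmetic: data volume divided by sustained throughput has to fit in the time available. The sketch below works through it with made-up numbers; the VM count, per-VM sizes, 50 MB/s throughput and 4-hour window are all assumptions for illustration, not survey figures.

```python
def backup_hours(data_gb: float, throughput_mb_per_s: float) -> float:
    """Hours needed to move data_gb at a sustained throughput (1 GB = 1024 MB)."""
    seconds = (data_gb * 1024) / throughput_mb_per_s
    return seconds / 3600

def fits_window(data_gb: float, throughput_mb_per_s: float, window_hours: float) -> bool:
    """True if the backup finishes inside the backup window."""
    return backup_hours(data_gb, throughput_mb_per_s) <= window_hours

# Hypothetical environment: 20 virtual machines, 4-hour window, 50 MB/s sustained.
vm_count = 20
file_level_gb = vm_count * 10    # assume 10 GB of application data per VM
image_level_gb = vm_count * 40   # assume 40 GB per image (OS + applications + data)

print(round(backup_hours(file_level_gb, 50), 2))   # 1.14
print(round(backup_hours(image_level_gb, 50), 2))  # 4.55
print(fits_window(file_level_gb, 50, 4))           # True
print(fits_window(image_level_gb, 50, 4))          # False
```

Under these assumed numbers, the same 20 machines fit comfortably in the window when only data files are backed up, but overrun it once whole images are included.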
There are a couple of different data backup methods depending on what your restore goals are. You can take a virtual machine image, and you can take a snapshot or backup of that while the virtual machine is still writing to that image. But, that's like walking up to a server and pulling the plug, then plugging it back in, hitting the power button and hoping it turns back on. Oftentimes it works; the application may take care of itself and recover.
Another alternative to powering it down is to use the snapshot utility in the server virtualization software itself. This will quiesce the virtual machine image, freezing it temporarily while it performs the snapshot. Once the snapshot takes place, you can back up the snapshot. That gives you the complete system state of the machine. So, you can actually take that and bring it to a secondary data center. You could bring it to another environment for testing. It's a true point-in-time copy of the virtual machine image.
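The quiesce, snapshot, resume, back-up sequence described above can be sketched with a toy in-memory model. The `VirtualMachine` class and `snapshot_backup` function are hypothetical stand-ins, not a real hypervisor API, and a real quiesce flushes pending writes rather than rejecting them.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class VirtualMachine:
    """Toy stand-in for a virtual machine; not a real hypervisor API."""
    name: str
    disk: Dict[str, str] = field(default_factory=dict)
    quiesced: bool = False

    def write(self, path: str, content: str) -> None:
        if self.quiesced:
            raise RuntimeError("writes are held while the VM is quiesced")
        self.disk[path] = content

def snapshot_backup(vm: VirtualMachine, backup_store: List[Dict[str, str]]) -> Dict[str, str]:
    """Quiesce the VM, copy its disk state, resume it, then back up the copy."""
    vm.quiesced = True                 # freeze the image temporarily
    try:
        snapshot = dict(vm.disk)       # point-in-time copy of the disk state
    finally:
        vm.quiesced = False            # resume the VM even if the copy fails
    backup_store.append(snapshot)      # back up the snapshot, not the live disk
    return snapshot

vm = VirtualMachine("app01")
vm.write("/data/orders.db", "v1")

store: List[Dict[str, str]] = []
snap = snapshot_backup(vm, store)

vm.write("/data/orders.db", "v2")      # the VM keeps running after the snapshot

print(snap["/data/orders.db"])         # v1
print(vm.disk["/data/orders.db"])      # v2
```

The snapshot still holds the earlier state after the machine resumes writing, which is what makes it usable as a point-in-time copy.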
I'm hearing more and more questions about data deduplication and virtualization. And, it's not just server virtualization; another area is desktop virtualization. With server virtualization, you have lots of virtual machine images out there and they have the same operating system. For example, Windows Server 2003 might be installed multiple times. I'm backing up multiple copies of something that I essentially only need one copy of. Whether it's an operating system file, a patch, an application, a device driver, whatever it may be, I only really need one copy. So, the benefit of data deduplication can be enormous.
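A minimal sketch of why this works: if each block is stored once under a hash of its content, the operating system blocks shared by several images are kept only once. The block contents and VM names below are invented for illustration, and real deduplication products use their own chunking and indexing schemes.

```python
import hashlib
from typing import Dict, List

def dedup_store(images: Dict[str, List[bytes]]) -> Dict[str, bytes]:
    """Store each unique block once, keyed by its SHA-256 digest."""
    store: Dict[str, bytes] = {}
    for blocks in images.values():
        for block in blocks:
            digest = hashlib.sha256(block).hexdigest()
            store.setdefault(digest, block)  # keep only the first copy
    return store

# Hypothetical blocks shared by every image (same operating system install).
os_blocks = [b"kernel", b"drivers", b"patch-001"]

images = {
    "vm1": os_blocks + [b"vm1-data"],
    "vm2": os_blocks + [b"vm2-data"],
    "vm3": os_blocks + [b"vm3-data"],
}

total_blocks = sum(len(blocks) for blocks in images.values())
unique_blocks = len(dedup_store(images))

print(total_blocks, unique_blocks)  # 12 6
```

In this toy case, three images totaling 12 blocks shrink to 6 stored blocks, because the three operating system blocks are shared and only the per-machine data differs.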
In physical environments, adoption of next-generation backup has been somewhat slow and marginalized, maybe solving niche problems on the fringe of the data center. This is most likely due to cost. Implementing server virtualization really requires a refresh of the data center, so it's an opportunity to refresh the backup approach as well.
The restore process has changed. When you're looking at virtual machines, there are a few things you need to consider. The restore process can be just as it has been traditionally. If you have an agent-based backup system, you can restore a single file back into the virtual machine just as you have in the past.
But, if you are doing system-level backups where you are backing up the entire virtual machine image, you have to restore the entire virtual machine image. This can be very time consuming. Once you recover it, you have to mount it somewhere in a virtualized environment. Often, this is not the production environment because you can have conflicts with something else that's running there. Once you mount it, you can recover files from that point and transfer them back to where they need to be.
In a recent survey, we asked the question, "Are you using the same backup tools for your virtual environment as you are for your non-virtualized environment?" Seventy-five percent are using the same tools. Of the 25% implementing new tools, we're seeing that people want management tools similar to storage system management tools, integrated into the server virtualization management platform.
Take something like VMware Consolidated Backup (VCB). VCB is a kind of framework for the backup process that snapshots the network storage environment and allows the backup to take place without affecting the production environment. Using VCB you can actually take virtual machine-level backups of those machines, but you can also provide file-level recovery. So, something that had been a two-step process can be performed in one step. You're also removing the backup traffic from the production LAN. And, you don't need the backup agent installed in the virtual machine.
Mark Bowker is an analyst with the Enterprise Strategy Group.