Learn how to back up virtual machines


George Crump

What you will learn: Two approaches to backing up virtual machines.

Virtual machine disk format (VMDK) files created for virtual machines (VMs) reside in a VMware file system called VMFS. Each VMDK file represents a physical hard drive that VMFS presents to your virtual machine. All user data and configuration information about the virtual server is stored in the VMDK file.

VMDK files tend to be quite large; files as large as 2 TB are not uncommon. Because of their size, they are characterized by large-block I/O patterns. The VMDK file is updated for any user data change or virtual server configuration change. Since the VMDK has no built-in incremental data capture functionality, any change to this file means that the whole file needs to be backed up again.
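To make the cost of whole-file backups concrete, here is a minimal Python sketch of the arithmetic. The function name and the sizes are illustrative assumptions, not measurements; the "incremental" branch models a hypothetical changed-data capture that, as noted above, the VMDK itself does not provide.

```python
def gb_to_back_up(vmdk_size_gb: float, changed_gb: float, incremental: bool) -> float:
    """Return how many GB one backup run has to copy.

    incremental=False models the behavior described in the text: any change
    forces the whole VMDK to be copied again.
    incremental=True models a hypothetical changed-data capture (assumption).
    """
    if changed_gb <= 0:
        return 0.0  # nothing changed, nothing to copy
    if incremental:
        return min(changed_gb, vmdk_size_gb)
    return vmdk_size_gb


# Illustrative numbers only: a 2 TB VMDK in which 1 GB of user data changed.
whole_file = gb_to_back_up(2048.0, 1.0, incremental=False)
changed_only = gb_to_back_up(2048.0, 1.0, incremental=True)
print(f"whole-file backup copies {whole_file:.0f} GB; "
      f"a changed-data backup would copy {changed_only:.0f} GB")
```

With these assumed numbers, a 1 GB change triggers a 2,048 GB copy, which is why the choice of backup method and backup target matters so much for VMDK files.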

How you back up VMDK files depends on what version of VMware ESX you are using.

Backup approaches for ESX 3.0 (or earlier)

If you are running a pre-3.0 version of VMware, or if your ESX 3.0 server is not connected to network storage, you might choose to install backup agents on each virtual machine to get file-level backup, effectively treating the virtual machines as physical systems. You would then use your traditional backup process to back up the data. The challenge with this approach is that the backup agent puts a significant I/O and CPU load on the virtual server being backed up. If many virtual machines are backed up at the same time, the multiple active agents are likely to overload the ESX host server.

Alternatively, you might choose to install an agent on the ESX Service Console. This delivers a disaster recovery (DR) backup capability by grabbing the entire set of virtual machines, redo files, service consoles and host states.

The ESX Service Console backup does not give you the ability to do file-level recoveries, so you still need the agent backup on each virtual machine if that is your requirement. If you are doing both, there is a lot of redundant data in these two backup types. Not only is each virtual machine instance essentially being backed up twice (once by each process), but there is also typically a lot of similarity between the VMDK files. You may have 14 Windows virtual machines, each with its own application, but all 14 OS installations are very similar.

This type of backup is an ideal scenario for a data deduplication device and is likely to produce unusually high deduplication ratios. Efficiencies of more than 40:1 for VMware backups are possible. Without data deduplication, you are less likely to do frequent backups from the ESX Service Console because of their impact on backup capacity. Performance is also an issue, as backup windows are finite in a production virtualized environment.
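The effect of that redundancy can be illustrated with a minimal Python sketch of segment-level deduplication. This is an assumption-laden toy, not how any particular deduplication product works: it uses fixed-size segments and SHA-256 fingerprints, and the "images" are synthetic byte strings standing in for 14 similar VMDK files. The names `SEGMENT_SIZE` and `dedup_ratio` are hypothetical.

```python
import hashlib
from typing import Iterable

SEGMENT_SIZE = 4096  # assumed fixed segment size; real systems vary


def dedup_ratio(images: Iterable[bytes], segment_size: int = SEGMENT_SIZE) -> float:
    """Return logical bytes divided by bytes actually stored after dedup."""
    seen = set()   # fingerprints of segments already stored
    logical = 0    # bytes presented to the backup target
    stored = 0     # bytes the target really has to keep
    for image in images:
        for offset in range(0, len(image), segment_size):
            segment = image[offset:offset + segment_size]
            logical += len(segment)
            fingerprint = hashlib.sha256(segment).digest()
            if fingerprint not in seen:
                seen.add(fingerprint)
                stored += len(segment)
    return logical / stored if stored else 0.0


# Synthetic data: a shared "OS" of 10 distinct segments, plus one unique
# "application" segment per virtual machine, for 14 virtual machines.
shared_os = b"".join(bytes([i]) * SEGMENT_SIZE for i in range(10))
images = [shared_os + bytes([100 + vm]) * SEGMENT_SIZE for vm in range(14)]

ratio = dedup_ratio(images)
print(f"deduplication ratio: {ratio:.1f}:1")
```

In this toy data set, 154 segments are presented but only 24 are stored, a ratio of about 6.4:1. Real ratios depend on how similar the images are and on how many backup generations are retained, so the figure here says nothing about what a production system will achieve.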

VMware DR backups are particularly hard to replicate to another site. Since each console's backup is a large net new file (or image set), replication across a WAN segment is problematic. Again, this is where a data deduplication system capable of optimized replication shines. Even though you are backing up the entire image to disk, only the segment-level differences between the new image and the existing backup of the image are stored, and only those deltas need to be replicated to the remote site.
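The delta-only replication described here can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions (fixed-size segments, SHA-256 fingerprints, and a known set of fingerprints already held at the remote site); the function names `split_segments` and `plan_replication` are hypothetical and do not correspond to any vendor's API.

```python
import hashlib

SEGMENT_SIZE = 4096  # assumed fixed segment size


def split_segments(image: bytes, size: int = SEGMENT_SIZE) -> list:
    """Cut an image into fixed-size segments (the last one may be shorter)."""
    return [image[i:i + size] for i in range(0, len(image), size)]


def plan_replication(new_image: bytes, remote_fingerprints: set):
    """Work out what must cross the WAN for the remote site to rebuild new_image.

    Returns (recipe, to_send):
      recipe  - ordered fingerprints that describe the whole image
      to_send - fingerprint -> segment bytes the remote site does not hold yet
    """
    recipe = []
    to_send = {}
    for segment in split_segments(new_image):
        fingerprint = hashlib.sha256(segment).hexdigest()
        recipe.append(fingerprint)
        if fingerprint not in remote_fingerprints and fingerprint not in to_send:
            to_send[fingerprint] = segment
    return recipe, to_send


# Synthetic example: yesterday's image has 8 segments; today one of them changed.
yesterday = b"".join(bytes([i]) * SEGMENT_SIZE for i in range(8))
today = yesterday[:3 * SEGMENT_SIZE] + bytes([200]) * SEGMENT_SIZE + yesterday[4 * SEGMENT_SIZE:]

remote = {hashlib.sha256(s).hexdigest() for s in split_segments(yesterday)}
recipe, to_send = plan_replication(today, remote)
print(f"image has {len(recipe)} segments; {len(to_send)} must be sent over the WAN")
```

The full image is still backed up locally, but only the one changed segment, plus the small recipe of fingerprints, needs to travel to the remote site.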

SAN suppliers will suggest that you use their built-in replication to move your virtual server content to a remote site, again replicating at a block level. The problem with that strategy is that first you have to have a SAN and that SAN has to be used for all virtual machines and images. Second, the disk at the DR site has to be from the same SAN supplier, and third, the capacity in the remote site must equal the capacity in the primary site. All three of these issues drive costs up. In addition, there is a fair amount of complexity in getting SAN-based replication working and it still does not solve the core backup issue. This also means big dollars for high bandwidth because there's no optimization happening here.

Backing up to a deduplication storage system, on the other hand, helps solve the backup issue while driving down costs. Multiple generations of the local VMDK files can be stored for months, and they can be replicated to a DR site with the same segment-level efficiency as SAN-based replication. In addition, both sites benefit from data deduplication and, as stated earlier, the resulting reduction in data storage and associated costs could be significant.

Backup approaches for VMware 3.1 users

If you are using VMware 3.1 (the latest version) and your ESX server is on a SAN, VMware 3.1 makes the process substantially easier with VMware Consolidated Backup (VCB). With VCB, you can get centralized file-level backups without installing agents on each guest VM. VCB moves the backup process out of the virtual machine and into the infrastructure. Essentially, the ESX server takes a file system-consistent live snapshot of a selected virtual machine. That snapshot can then be mounted on a backup server attached to the SAN, which can direct the data to a backup target.
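The sequence just described (snapshot, mount on a proxy, copy to a target, release) can be modeled with a short Python sketch. Everything here is a hypothetical in-memory stand-in: `FakeEsxHost`, `mounted_snapshot` and `proxy_backup` are illustrative names, not VMware commands or APIs, and the sketch only shows the ordering of the steps and the need to release the snapshot even if the backup fails.

```python
from contextlib import contextmanager


class FakeEsxHost:
    """Hypothetical stand-in for an ESX host; this is not a VMware API."""

    def __init__(self, vms):
        self.vms = vms                # VM name -> {file path: file contents}
        self.open_snapshots = set()   # VMs that currently have a snapshot open

    def create_snapshot(self, vm):
        self.open_snapshots.add(vm)
        return dict(self.vms[vm])     # point-in-time copy of the file view

    def remove_snapshot(self, vm):
        self.open_snapshots.discard(vm)


@contextmanager
def mounted_snapshot(host, vm):
    """Take a snapshot, expose it to the proxy, and always release it afterwards."""
    snapshot = host.create_snapshot(vm)
    try:
        yield snapshot
    finally:
        host.remove_snapshot(vm)


def proxy_backup(host, vm, target):
    """Run on the backup proxy: read from the snapshot, write to the backup target."""
    with mounted_snapshot(host, vm) as files:
        target[vm] = dict(files)
    return target[vm]


host = FakeEsxHost({"web01": {"C:/inetpub/index.html": b"<html></html>"}})
backup_target = {}
proxy_backup(host, "web01", backup_target)
print(sorted(backup_target["web01"]), "snapshots still open:", len(host.open_snapshots))
```

The point of the design is that the guest VM does no backup work itself; the proxy reads from the snapshot, and the snapshot is released as soon as the copy finishes.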

Again, a data deduplication system is the ideal target for VCB. Your primary goal with VCB is to get the proxy to mount the image and back it up rapidly to relieve system resources. The advantage of using a disk target with data deduplication is that, as stated earlier, there is highly redundant data within the image being backed up, and that image is highly redundant with the images already backed up on disk.

From a DR perspective, VMware discusses two options: replicating the VMFS disks using the replication capability that probably came with the SAN, or backing up the VMs to tape using an enterprise backup application and then recovering at the DR hot site. For the same reasons discussed above, SAN replication is less than ideal. The problems with tape are numerous, and recovering from tape at the hot site is too time-consuming.

As with non-VCB VMware backups, a data deduplication system's ability to replicate at a block level provides the ability to have up-to-date DR server farms across the country or around the world. In some cases, this can provide for long distance business continuance by replicating servers via backups, multiple times per day, and having those servers in a stand-by mode at the DR site by restoring them periodically to a remote ESX host.

Backing up VMware environments has created storage management challenges and increased backup costs for IT staffs. Using a deduplication storage system can actually reduce the backup costs and improve the quality of your DR site.

About the author: George Crump, founder of Storage Switzerland, is an independent storage consultant with over 20 years of experience.

