W. Curtis Preston, Executive Editor at TechTarget and independent backup expert, discusses the challenges of backing up Hyper-V virtualized servers, the differences between backing up Hyper-V and VMware servers, and more in this Q&A.
Table of contents:
>> What are the challenges associated with backing up Hyper-V virtualized servers?
>> What are the different approaches for backing up Hyper-V?
>> Are there specific things you need to enable in Hyper-V to ensure data backup performance/effectiveness?
>> How does backing up Hyper-V differ from backing up VMware servers?
>> What are the benefits or drawbacks of using Microsoft Data Protection Manager (DPM) to back up Hyper-V virtualized servers instead of a third party backup software product?
>> What are the challenges with using a third-party backup tool?
What are the challenges associated with backing up Hyper-V virtualized servers?
I'd say the biggest challenge is the predominance of VMware. Generally you run into a blank stare when you ask someone from your backup software company about Microsoft Hyper-V. Hyper-V is certainly increasing its market share, and data backup and recovery is at least a small reason as to why that's the case. The other challenges associated with Hyper-V are similar to the ones you run into with VMware, which are basically the laws of physics. For example, you've taken 20 physical servers and you've put them inside one physical server. So you have all of this data that needs to be moved around for the purposes of data backup and recovery, and it can only go through one server. That's pretty much the main challenge that you have to deal with.
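To put rough numbers on that bottleneck, here is a back-of-the-envelope sketch. The VM sizes and the host throughput are hypothetical figures chosen only for illustration; they do not come from the interview.

```python
def backup_window_hours(vm_sizes_gb, host_throughput_mb_s):
    """Hours needed to move every VM's data through a single host."""
    total_mb = sum(vm_sizes_gb) * 1024
    return total_mb / host_throughput_mb_s / 3600


# Hypothetical workload: 20 consolidated servers holding 200 GB each,
# all funnelled through one host that sustains 100 MB/s.
vm_sizes_gb = [200] * 20
window = backup_window_hours(vm_sizes_gb, host_throughput_mb_s=100)
print(f"Full backup of all VMs: {window:.1f} hours")
```

Spread across 20 physical servers, each machine would only have had to move its own 200 GB; behind one host, the whole 4 TB queues up on the same I/O path.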
What are the different approaches for backing up Hyper-V?
Agent-based is the most common method for backing up virtual servers. Basically, you put an agent in each of the virtual machines (VMs), and then you kind of put your head in the sand and pretend that everything's physical. And that gives you a lot of benefits where you're able to back up a lot of databases and things using the same agents that you're used to. So the biggest advantage to that approach is familiarity.
The other approach is host-based, which means backing up from the actual Hyper-V server. In this case, that's just a Windows server. It's not quite the same as VMware, where the hypervisor runs in this whole other world and you don't have a host to connect to. With Hyper-V, it is a Windows host and you can talk to it like any other Windows host. But if you want to back up those virtual machines from outside the VMs, you need to interface with the infrastructure Microsoft has put in place, mainly Volume Shadow Copy Service (VSS), so that you can get good backups of the VMs. Any data backup product that interfaces with Hyper-V should be able to do that.
Are there specific things you need to enable in Hyper-V to ensure data backup performance/effectiveness?
Volume Shadow Copy Service is the biggest thing that needs to be enabled because it is the overall infrastructure that allows a backup app to quiesce two things: first, an application such as Exchange or SQL Server, and second, the file system on which that application stores its data. So when a backup app wants to back up a virtual machine inside Microsoft Hyper-V, it needs to connect to the Hyper-V integration services, which talk to VSS. VSS then talks to the VSS writers. Each application has its own VSS writer, and once all the writers have communicated that they've done their job, VSS creates a snapshot, or shadow copy, of the volumes that are part of that VM. At this point, VSS tells the backup app that it can back up that snapshot. Once that backup has been completed, the snapshot can be released. If the backup app used the right backup type, the applications will be notified via their writers that a backup was just taken and that they can do the right thing after the backup, which typically means truncating their transaction logs so that you have a nice clean system after the backup.
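The ordering of that hand-off is easier to see written out. The following is a toy model of the sequence just described, not the real VSS API: the function name, the step labels, and the idea of passing writers as a name-to-ready mapping are all assumptions made for illustration.

```python
def host_level_backup(writers, backup_type="full"):
    """Return the ordered steps of one simulated host-level VM backup.

    `writers` maps a writer name to True/False (whether it reports ready).
    This is a toy model of the flow described above, not the real VSS API.
    """
    steps = ["request-backup"]  # backup app -> integration services -> VSS
    for name, reports_ready in writers.items():
        steps.append(f"quiesce:{name}")
        if not reports_ready:
            raise RuntimeError(f"writer {name!r} did not report ready; no snapshot taken")
    # Only after every writer has reported in is the shadow copy created.
    steps.append("create-shadow-copy")
    steps.append("back-up-shadow-copy")
    steps.append("release-shadow-copy")
    # Writers hear that a backup happened only for the "full" backup type.
    if backup_type == "full":
        for name in writers:
            steps.append(f"notify-backup-complete:{name}")
    return steps


for step in host_level_backup({"Exchange": True, "SQL Server": True}):
    print(step)
```

The point of the sketch is the ordering: no shadow copy exists until every writer has answered, and the post-backup notification depends on the backup type the app requested.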
How does backing up Hyper-V differ from backing up VMware servers?
Everything I just described about VSS needs to happen in VMware as well. Unfortunately, it does not. VMware only talks to the application VSS writers in Windows 2003. And remember how I said that if they use the proper backup type, the applications will do things like truncate their logs? Well, VMware chose not to use the VSS full backup type. It uses VSS copy. So basically it's telling VSS and the applications that it's creating just a snapshot of the data. It's not creating a backup, but rather a copy. And because it's telling VSS that it's just creating a copy, the applications do not truncate their logs, because you wouldn't do that if you're just making a copy of the data; you would only do that if you're making a backup. So the concept is the same, but it's not as full an implementation as what Microsoft Hyper-V does.
So overall, VMware only talks to the applications in Windows 2003; it does not talk to them at all in Windows 2008, which means you're creating a crash-consistent copy. With a crash-consistent copy, when you restore the VM from that backup, the application has to go through a crash recovery process to bring itself to a consistent point. That's why it's called crash consistent: it's as consistent as a crash, which frankly is a phrase that scares me. So if you're not running Windows 2003, you're only getting a crash-consistent copy of the apps. Even if you are running Windows 2003, you're not doing the step of truncating the logs. So overall, Hyper-V's implementation of VSS is much more advanced than VMware's.
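The practical effect of the backup type shows up over time in the transaction logs. Here is a minimal simulation, assuming a hypothetical application that writes 500 MB of log per day and truncates its log only when it is told a full backup was taken. The numbers and the function name are illustrative, not measurements.

```python
def simulate_log_growth(backup_type, days, daily_log_mb):
    """Return the transaction log size (MB) after each nightly backup."""
    log_mb = 0
    sizes = []
    for _ in range(days):
        log_mb += daily_log_mb      # one day's worth of transactions
        if backup_type == "full":   # application is told a backup was taken
            log_mb = 0              # so it truncates its log
        # "copy": the application is told only a copy was made; nothing is truncated
        sizes.append(log_mb)
    return sizes


full_sizes = simulate_log_growth("full", days=5, daily_log_mb=500)
copy_sizes = simulate_log_growth("copy", days=5, daily_log_mb=500)
print("full:", full_sizes)  # full: [0, 0, 0, 0, 0]
print("copy:", copy_sizes)  # copy: [500, 1000, 1500, 2000, 2500]
```

With the copy type, something else (an in-guest agent or a scheduled maintenance job, for example) has to take over log truncation, or the logs keep growing.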
What are the benefits or drawbacks of using Microsoft Data Protection Manager (DPM) to back up Hyper-V virtualized servers instead of a third party backup software product?
Microsoft Data Protection Manager is a near-CDP product. Near-CDP is close to continuous data protection (CDP), but it's not truly continuous; it takes a snapshot something like once an hour. First off, DPM fully integrates everything I described about the VSS implementation, for obvious reasons: it's put out by Microsoft. That's the first advantage. You can be assured of complete integration and that it's going to talk to all the VSS writers.
The second thing is that the traditional third-party backup apps you'd compare it to are full and incremental backup apps. Even IBM Corp.'s Tivoli Storage Manager (TSM), which offers a progressive incremental feature, does that only for file systems. When we look at doing data backups of databases and applications inside VMs, it also does full and incremental backups. So when you compare Data Protection Manager to the typical backup app, the typical backup app is going to create full and incremental backups. Each time DPM creates a snapshot, the only thing it transfers is the bytes that have changed since the previous snapshot. So it's an incremental-forever, block-level technology. So there are two things to remember about using DPM with Hyper-V. First, it should have very little impact on the performance of Hyper-V, and second, it should have very tight integration with Windows because it's made by Microsoft.
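To make "incremental-forever, block-level" concrete, here is a small sketch of the general idea: compare the current data with the previous snapshot block by block and send only the blocks that differ. This is not a description of how DPM is implemented; the hash comparison, the tiny block size, and the function names are assumptions chosen to keep the example readable.

```python
import hashlib

BLOCK_SIZE = 4  # deliberately tiny so the example is easy to follow


def block_hashes(data, block_size=BLOCK_SIZE):
    """Hash each fixed-size block of the data."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]


def changed_blocks(previous, current, block_size=BLOCK_SIZE):
    """Return {block_index: block_bytes} for blocks that are new or differ
    from the previous snapshot. Shrinking data is not handled in this sketch."""
    old = block_hashes(previous, block_size)
    changes = {}
    for index, digest in enumerate(block_hashes(current, block_size)):
        if index >= len(old) or old[index] != digest:
            start = index * block_size
            changes[index] = current[start:start + block_size]
    return changes


previous = b"AAAABBBBCCCCDDDD"
current = b"AAAABxBBCCCCDDDDEEEE"
delta = changed_blocks(previous, current)
print(delta)  # {1: b'BxBB', 4: b'EEEE'}
```

Only two of the five blocks travel to the backup server here, whereas a traditional full backup would send all of them again.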
What are the challenges with using a third-party backup tool?
The biggest thing to remember when using a third-party data backup tool is to make sure that it fully integrates with VSS. Also, make sure that it's talking to all the VSS writers. Everyone's going to want to talk to Exchange and SQL Server, but what about Active Directory and Oracle? There are several other smaller VSS writers that aren't as popular. Are there other applications present in the virtual machines, and do they have a VSS writer? And is my backup app talking to all the VSS writers? The proper way for a backup app to behave is to do a metadata query of VSS. VSS should give you a list of all the VSS writers that it has. The backup app should then talk to each of the VSS writers that it discovers in the process. So hopefully it's able to discover all of them, and hopefully it's able to talk to them so each of the VSS writers can do the right thing. These are all things that you simply need to verify with documentation and testing.
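One way to start that verification is to list the writers registered inside each VM (on Windows, `vssadmin list writers` prints them) and compare the list against the writers your backup product documents support for. The sketch below parses an abbreviated, illustrative sample of that output; the supported list is hypothetical and would come from your vendor's documentation.

```python
import re

# Abbreviated, illustrative sample in the style of `vssadmin list writers`.
SAMPLE_OUTPUT = """\
Writer name: 'System Writer'
   State: [1] Stable
   Last error: No error

Writer name: 'SqlServerWriter'
   State: [1] Stable
   Last error: No error

Writer name: 'NTDS'
   State: [1] Stable
   Last error: No error
"""


def parse_writer_names(text):
    """Extract the writer names from the listing."""
    return re.findall(r"^Writer name: '([^']+)'", text, flags=re.MULTILINE)


def uncovered_writers(discovered, supported):
    """Writers present on the system that the backup app is not known to handle."""
    supported_set = set(supported)
    return [name for name in discovered if name not in supported_set]


discovered = parse_writer_names(SAMPLE_OUTPUT)
supported = ["System Writer", "SqlServerWriter"]  # hypothetical vendor list
missing = uncovered_writers(discovered, supported)
print(missing)  # ['NTDS']
```

In this sample the Active Directory writer (NTDS) would go uncovered, which is exactly the kind of gap to raise with the vendor and then confirm with a test restore.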
This was first published in March 2010