This article first appeared in "Storage" magazine in the April issue. For more articles of this type, please visit...
What you will learn from this tip: For disk-to-disk backup, virtual tape libraries (VTLs) treat disk as tape and offer many advantages compared to disk-as-disk backup targets. But VTLs aren't perfect, and there are some caveats about the technology that you need to know before implementing a VTL.
VTLs, which treat disk as tape, offer two main advantages over disk-as-disk backup targets: ease of management and better performance. A disk-as-disk target requires all of the usual provisioning steps of standard shared storage arrays. In contrast, if you tell a VTL how many virtual tape drives and virtual cartridges it should emulate, the VTL software automatically handles all of the provisioning and allocates the appropriate amount of disk to each virtual cartridge.
If the VTL needs to be expanded (not all VTLs are expandable), you simply connect the additional storage, tell the VTL it's there and the VTL will automatically begin using the new storage. There's no volume manager to run and no RAID groups to administer.
Another important management advantage of VTLs is how easy it is to share VTLs among multiple servers and applications. To share a VTL among multiple backup servers running the same software, use the built-in library sharing capability that most commercial backup products have. To share a VTL among multiple servers running different applications, partition the VTL into multiple smaller VTLs, assign a certain number of virtual cartridges to each VTL and associate each VTL with a different backup server. Both of these scenarios are much easier than what's required to share a disk-as-disk target among multiple backup servers.
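The partitioning scenario can be pictured with a short Python sketch. It is only an illustration of the idea, not any vendor's interface: the names VirtualLibrary and partition_vtl, the barcode format and the server names are all hypothetical. It splits one VTL's drives and cartridges into non-overlapping smaller libraries, each associated with a single backup server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualLibrary:
    # Hypothetical record for one partition of the VTL.
    backup_server: str   # the one backup server this partition is presented to
    drives: int          # virtual tape drives emulated for that server
    cartridges: tuple    # barcodes of the virtual cartridges it owns


def partition_vtl(total_drives, cartridges, requests):
    """Split one VTL into smaller, non-overlapping virtual libraries.

    requests maps a backup server name to (drive_count, cartridge_count).
    """
    want_drives = sum(d for d, _ in requests.values())
    want_carts = sum(c for _, c in requests.values())
    if want_drives > total_drives or want_carts > len(cartridges):
        raise ValueError("requests exceed what the VTL emulates")

    partitions = {}
    cursor = 0
    for server, (drive_count, cart_count) in requests.items():
        # Hand out the next unused slice so no cartridge is shared.
        owned = tuple(cartridges[cursor:cursor + cart_count])
        partitions[server] = VirtualLibrary(server, drive_count, owned)
        cursor += cart_count
    return partitions


if __name__ == "__main__":
    barcodes = [f"VT{n:04d}" for n in range(10)]
    layout = partition_vtl(
        total_drives=8,
        cartridges=barcodes,
        requests={"server_a": (4, 6), "server_b": (2, 4)},
    )
    for library in layout.values():
        print(library)
```

Because each server sees only its own cartridges and drives, two different backup applications never compete for the same virtual tape.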
To understand the performance advantages of VTLs, think of how backup applications write data to tape. A backup application typically continues writing to a tape until it hits the physical end of tape (PEOT). It will append to a tape, even if some of the previously written data has expired. Once the backup application hits PEOT, the tape is considered full. Most backup applications leave everything on the tape until all of the backups on that tape have expired; then they expire the whole tape and write to it from the beginning. Other backup applications wait until a certain percentage of the backups on a tape have expired before "reclaiming" that tape by migrating the non-expired backups to a second tape. The first tape is then expired and ready to be overwritten. The bottom line is that portions of a tape can't be overwritten.
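The tape life cycle described above can be modeled in a few lines of Python. This is a simplified sketch, not how any particular backup product is implemented: the class and function names are hypothetical, sizes are in arbitrary units, and a backup that doesn't fit simply marks the tape full rather than spanning onto a second tape.

```python
from dataclasses import dataclass, field


@dataclass
class Backup:
    name: str
    size: int              # arbitrary units, e.g. GB
    expired: bool = False


@dataclass
class Tape:
    capacity: int          # where PEOT sits, in the same units
    backups: list = field(default_factory=list)
    full: bool = False

    def used(self) -> int:
        # Expired backups still occupy tape; space is never reused piecemeal.
        return sum(b.size for b in self.backups)

    def append(self, backup: Backup) -> bool:
        """Write after the last backup; mark the tape full on hitting PEOT."""
        if self.full or self.used() + backup.size > self.capacity:
            self.full = True
            return False
        self.backups.append(backup)
        return True

    def expired_fraction(self) -> float:
        used = self.used()
        if used == 0:
            return 0.0
        return sum(b.size for b in self.backups if b.expired) / used

    def try_expire(self) -> bool:
        """Whole-tape expiry: only once every backup on the tape has expired."""
        if self.backups and all(b.expired for b in self.backups):
            self.backups.clear()
            self.full = False
            return True
        return False


def reclaim(source: Tape, target: Tape, threshold: float) -> bool:
    """Migrate non-expired backups to a second tape, then expire the first."""
    if source.expired_fraction() < threshold:
        return False
    live = [b for b in source.backups if not b.expired]
    room = target.capacity - target.used()
    if target.full or sum(b.size for b in live) > room:
        return False
    for b in live:
        target.append(b)
    source.backups.clear()   # the first tape can now be written from the start
    source.full = False
    return True
```

Note that nothing in the model ever frees the middle of a tape: space comes back only through try_expire or reclaim, which is the point of the paragraph above.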
This differs from how backup applications write to a file system. The application tells the OS that it wants to write to a certain file name and then begins writing data to that file. Each backup gets its own file, and when that backup expires, the file is deleted. The backup application has no knowledge of how this data is actually written to disk. Under the covers, as files are created and deleted over time, the blocks of any given file can end up scattered all over the disk, and this fragmentation degrades backup performance.
Because a VTL treats disk like tape, it eliminates fragmentation by writing backups to contiguous sections of disk. The blocks allocated to a tape stay allocated to that tape until the backup application starts overwriting that tape, at which point the VTL can once again write to contiguous sections of disk -- just as data is written to tape. Because VTL vendors control the RAID volumes, they can ensure that a given RAID group is written to by only a single virtual tape at a time. A disk performs much better if it's reading or writing for a single application using contiguous sections of disk. This key difference explains why the fastest file systems write at hundreds of megabytes per second, while the fastest VTLs write at thousands of megabytes per second.
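A toy block allocator in Python makes the contrast concrete. Everything here is an assumption made for illustration: FirstFreeAllocator stands in for a file system that reuses the lowest free blocks, and VirtualTapeAllocator reserves one fixed contiguous region per virtual tape, which is a simplification of whatever allocation scheme a real VTL uses.

```python
def count_extents(blocks):
    """Number of contiguous runs in a collection of block numbers."""
    blocks = sorted(blocks)
    if not blocks:
        return 0
    return 1 + sum(1 for prev, cur in zip(blocks, blocks[1:]) if cur != prev + 1)


class FirstFreeAllocator:
    """File-system-like: each new file takes the lowest-numbered free blocks."""

    def __init__(self, total_blocks):
        self.free = set(range(total_blocks))
        self.files = {}

    def write(self, name, nblocks):
        if nblocks > len(self.free):
            raise ValueError("disk full")
        chosen = sorted(self.free)[:nblocks]
        self.free -= set(chosen)
        self.files[name] = chosen
        return chosen

    def delete(self, name):
        # Deleting an expired backup file leaves a hole for later files to fill.
        self.free |= set(self.files.pop(name))


class VirtualTapeAllocator:
    """VTL-like: every virtual tape owns one contiguous region of disk."""

    def __init__(self, total_blocks, tape_blocks):
        self.tape_blocks = tape_blocks
        self.max_tapes = total_blocks // tape_blocks
        self.tapes = {}   # name -> [start_block, write_position]

    def create_tape(self, name):
        if len(self.tapes) >= self.max_tapes:
            raise ValueError("no room for another virtual tape")
        self.tapes[name] = [len(self.tapes) * self.tape_blocks, 0]

    def append(self, name, nblocks):
        start, pos = self.tapes[name]
        if pos + nblocks > self.tape_blocks:
            raise ValueError("end of virtual tape")
        self.tapes[name][1] = pos + nblocks
        return list(range(start + pos, start + pos + nblocks))

    def rewind(self, name):
        # Overwriting the tape starts again at the beginning of its region.
        self.tapes[name][1] = 0
```

On a 12-block disk, writing four two-block files, deleting the first and third, and then writing a six-block file with FirstFreeAllocator spreads the new file across three separate extents. The same six blocks appended to a virtual tape always form a single extent, before and after the tape is overwritten.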
VTLs offer other advantages, as well. With one exception (see the next section), VTLs work with all existing backup software, processes and procedures (see "NetBackup's inline tape copy," "Do IBM Tivoli Storage Manager users need a VTL?" and "EMC/Legato's NetWorker understands disk, too"). In other words, everything works exactly as it would with a physical tape library (PTL). That isn't the case with disk-as-disk targets, where backup software can behave quite differently.
Read the rest of this tip in Storage magazine.
About the author: W. Curtis Preston is the vice president of GlassHouse Technologies, Framingham, Mass. He is also the author of "Using SANs and NAS," "Unix Backup & Recovery" and the "Storage Security Handbook."