Virtual tape libraries (VTLs) are dead, right? Weren't they supposed to be temporary solutions that would be long forgotten once everyone started backing up to "real" disk? While that might be what the VTL naysayers had in mind, we're more than a few years into the VTL "fad" and many of the products are doing just fine.
What happened was that an industry segment morphed to encompass both VTLs and intelligent disk targets (IDTs), a segment that was ultimately validated when EMC Corp. acquired Data Domain for $2.4 billion.
The VTL/IDT market has become so overshadowed by the data deduplication craze that some people may have forgotten why the industry developed virtual tape libraries in the first place. Virtual tape libraries were developed because tape was too fast: most backup streams couldn't feed modern tape drives quickly enough to keep them streaming. VTLs also made the unfamiliar familiar (for many backup administrators, backing up to disk was unfamiliar). In addition, virtual tape libraries were scalable and sharable, and they avoided the fragmentation issues associated with backing up to file systems. They solved the fragmentation problem by using proprietary file systems that wrote data contiguously.
Let's look at how VTLs progressed in those areas they were supposed to fix.
Virtual tape library scalability. Scalability isn't just an issue for big enterprises; it's also necessary to meet the needs of small- and medium-sized businesses (SMBs). When the VTL market was in its early days, there were very few products that could scale well for either of these segments. But times have changed, and there are now several products that scale both up and down. With some notable exceptions -- Copan Systems Inc., IBM Corp., NEC Corp. of America and Sepaton Inc. -- all VTL/IDT vendors offer products for SMBs. Companies with less than 20 TB of data to back up each night can choose from a number of products -- some less than $5,000 -- that offer a lot of the same functionality available in high-end products. Offering products to the SMB market before they're deemed bulletproof typically spells failure, so the arrival of these SMB virtual tape libraries and intelligent disk targets is a sign that vendors have done a good job of working out any kinks in their products.
Midsized enterprises with 20 TB to 40 TB to back up each night can choose from almost every vendor. To back up that kind of data you need a system capable of handling 500 MBps to 1,000 MBps. Almost every vendor listed in the "Product sampler: VTLs and IDTs" has a product with that capability.
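The throughput range quoted above can be checked with simple arithmetic. The sketch below is illustrative only: it assumes a 12-hour nightly backup window and decimal units (1 TB = 1,000,000 MB), neither of which the article states, and the function name `required_mbps` is made up for this example.

```python
def required_mbps(nightly_tb: float, window_hours: float = 12.0) -> float:
    """Sustained throughput (MB/s) needed to move nightly_tb within the window.

    Assumptions (not from the article): a 12-hour window by default and
    decimal units, so 1 TB = 1,000,000 MB.
    """
    megabytes = nightly_tb * 1_000_000
    seconds = window_hours * 3600
    return megabytes / seconds


for tb in (20, 40):
    print(f"{tb} TB in 12 hours needs about {required_mbps(tb):.0f} MBps")
```

Under those assumptions, 20 TB works out to roughly 463 MBps and 40 TB to roughly 926 MBps, which is in line with the 500 MBps to 1,000 MBps range. A shorter window raises the requirement proportionally.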
The high end of the enterprise (companies with 40 TB or more to back up every night) has fewer products to choose from. Users with that much data to back up connect large servers to a Fibre Channel storage-area network (FC SAN) and back them up using local-area network (LAN)-free backups. The last thing those users want to do is send those backups over IP; therefore, a product targeting this market segment must have FC as a transport.
Another reason why there are only a few products appropriate for this market is the lack of global data deduplication in some products. A user with 100 TB to back up each night needs 2,300 MBps of throughput. They won't want to (nor should they have to) create and maintain three separate 33 TB backup collections that back up to three devices that can handle only 40 TB per night each. They need a single system that can handle this load over FC without splitting it into multiple backup collections. Only a few companies have products capable of doing that: FalconStor Software Inc. and Sepaton, along with their respective OEM partners (Copan Systems and Sun Microsystems Inc. for FalconStor, and Hewlett-Packard [HP] Co. for Sepaton). The aggregate throughput of NEC's Hydrastor is actually much higher than 2,300 MBps, but it doesn't yet offer Fibre Channel as a transport. If you need this kind of throughput over FC, but don't need deduplication, EMC, Fujitsu and Tributary Systems Inc. (which acquired Storage Director from Gresham Storage Solutions Inc.) have products that can help.
Noticeably absent from the list is EMC/Data Domain. Their fastest FC-based VTL runs at 900 MBps. Data Domain's DDX "array" boasts a number much higher than that, but it's actually 16 separate DDR units in the same rack that aren't integrated as far as deduplication goes. Data Domain doesn't support global deduplication, although the company has said it's on its roadmap. However, there's been no indication as to when this feature may become available.
VTL and IDT products range from "ridiculously easy to use" to "so hard you can't believe it passed any kind of functionality testing." Most are relatively easy to use, but ease of use varies considerably, so you should definitely test any products you're considering.
Integration with backup software. All VTLs and IDTs can be backup targets for just about any backup software product on the planet, and most can also replicate their data to another VTL/IDT. But few products today integrate with the backup software so that it knows about replicated copies and can use them for restores and copies to tape.
Symantec's NetBackup OpenStorage (OST) API offers one solution to this problem. With this API, the disk target isn't addressed as a virtual tape or a file system; the backup job is named and passed to the target, and the target stores it however it wants to. Once the backup is stored on the target, NetBackup can tell the IDT to replicate the data; when the replication is done, the IDT tells NetBackup. So, NetBackup is aware of the replicated data and the replication process, and can use it to create a tape copy. The process yields an onsite copy, an offsite disk copy and an offsite tape copy without anyone ever touching a tape. Today, only Data Domain, FalconStor and Quantum Corp. support this API -- and only FalconStor supports it via Fibre Channel; Data Domain and Quantum use IP as their transport.
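The flow described above can be pictured with a toy model. This is a conceptual sketch only and does not reflect the actual OpenStorage interface: every class, method and name here (`DiskTarget`, `BackupServer`, `request_replication` and so on) is hypothetical and exists purely to show how a catalog stays aware of replicated copies.

```python
class DiskTarget:
    """Hypothetical disk target that stores named backup images."""

    def __init__(self, name):
        self.name = name
        self.images = {}  # image name -> data, stored however the target likes

    def store(self, image_name, data):
        self.images[image_name] = data

    def replicate(self, image_name, remote, on_done):
        # Copy the image to the remote target, then notify the backup server.
        remote.store(image_name, self.images[image_name])
        on_done(image_name, remote.name)


class BackupServer:
    """Hypothetical backup server whose catalog tracks every copy."""

    def __init__(self):
        self.catalog = {}  # image name -> list of locations holding a copy

    def backup(self, image_name, data, target):
        target.store(image_name, data)
        self.catalog.setdefault(image_name, []).append(target.name)

    def request_replication(self, image_name, source, remote):
        source.replicate(image_name, remote, self._replication_done)

    def _replication_done(self, image_name, location):
        # The target reports completion, so the catalog learns of the new copy.
        self.catalog[image_name].append(location)

    def copy_to_tape(self, image_name, source):
        # Only possible because the catalog knows this target holds the image.
        if source.name not in self.catalog.get(image_name, []):
            raise LookupError(f"{image_name} is not known to be on {source.name}")
        self.catalog[image_name].append(f"tape@{source.name}")


onsite = DiskTarget("onsite-idt")
offsite = DiskTarget("offsite-idt")
server = BackupServer()

server.backup("db-full-monday", b"backup data", onsite)
server.request_replication("db-full-monday", onsite, offsite)
server.copy_to_tape("db-full-monday", offsite)
print(server.catalog["db-full-monday"])
```

The point of the model is the callback: because the target reports back when replication finishes, the backup server's catalog lists the onsite copy, the offsite disk copy and the offsite tape copy, and any of them can be used for a restore.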
CommVault Systems Inc. has a similar feature that works with network-attached storage (NAS)-based IDTs (but not VTLs). A media agent watches a directory that you're replicating to and looks for changes. It communicates with the CommServe (the main backup server) and tells it about the other copy, resulting in both copies being available for restores. If this other media agent were located offsite, you could then use that replicated copy to create an offsite tape copy of your replicated backup.
HP also offers this capability for its Data Protector software and the HP Virtual Library System (VLS). The product is similar to CommVault's, except it uses a completely separate Data Protector backup server (with its own catalog) to watch for newly replicated virtual tapes. Once those tapes are detected, it asks the other Data Protector server for its catalog information. Both servers can then use those virtual tapes, which would allow creation of a tape copy of the replicated backup.
Because all appliances are just servers running software, the difference between a software VTL and an appliance is more a matter of packaging than a technical issue. It comes down to preferences: prepackaged or build your own. Most VTLs and IDTs are prepackaged, but there are some exceptions, such as the software-only versions of FalconStor's and Tributary Systems' products.
You may also opt to buy a virtual tape library/intelligent disk target with its disk already attached or choose to add your own. In the latter case, options include software-only products or gateway products such as those offered by Data Domain and IBM.
Interoperability with tape libraries. A VTL may provide a direct connection to and integration with a physical tape library. The appeal of this feature has diminished with the increased interest in data deduplication. VTL-tape library integration made it easier to stage data from disk to tape to save space on expensive disk. But with deduplication, there's less need to do this. Products that integrate with physical tape are available from FalconStor, Fujitsu, HP, Quantum and Tributary Systems.
IDTs vs. VTLs. Whether you should back up to a file system device or a virtual tape library truly boils down to personal preference. If you want FC as a transport, your choice is easy; if you want a scalable, deduplicated system, only VTLs offer that today.
File system-based devices have two advantages over virtual tape libraries: automatic space reclamation when your backup software expires a backup, and support for simultaneous reads and writes.
When the backup software deletes an expired backup file from an IDT, the IDT automatically reclaims the space. But a VTL has no idea that the tape it's holding has expired. A workaround is to manually re-label tapes when they expire. When the VTL sees a new label being written to the tape, it knows it can throw away the rest of the data on that tape.
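The re-labeling workaround can be illustrated with a small model. This is a sketch under stated assumptions, not any vendor's implementation: the `VirtualTape` class, its fields and the label strings are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualTape:
    """Hypothetical virtual tape: a label followed by appended backup data."""

    barcode: str
    label: str = ""
    blocks: list = field(default_factory=list)  # data written after the label

    def append(self, data: bytes) -> None:
        self.blocks.append(data)

    def used_bytes(self) -> int:
        return sum(len(block) for block in self.blocks)

    def write_label(self, label: str) -> int:
        """Write a new label at the start of the tape.

        Anything after the label is now unreachable, so the VTL can discard
        it. Returns the number of bytes reclaimed.
        """
        reclaimed = self.used_bytes()
        self.label = label
        self.blocks.clear()
        return reclaimed


tape = VirtualTape("VT0001")
tape.write_label("first-use")
tape.append(b"x" * 1024)

# The backup software expires the backups on this tape. The VTL is not told,
# so the space is still in use.
still_used = tape.used_bytes()

# Re-labeling is the signal the VTL can act on.
freed = tape.write_label("re-labeled")
print(still_used, freed, tape.used_bytes())
```

In this model, expiration in the backup catalog changes nothing on the virtual tape; only the label write frees the 1,024 bytes.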
File system devices support simultaneous read and write, but VTLs don't. If a backup is writing to one virtual tape, another process can't read that tape to do a restore or copy. But this only happens if you're backing up and restoring/copying at the same time -- probably a rare occurrence that can be made even less likely by using smaller virtual tapes.
This article was previously published in Storage magazine.
W. Curtis Preston (a.k.a. "Mr. Backup"), Executive Editor and Independent Backup Expert, has been singularly focused on data backup and recovery for more than 15 years. From starting as a backup admin at a $35 billion credit card company to being one of the most sought-after consultants, writers and speakers in this space, it's hard to find someone more focused on recovering lost data. He is the webmaster of BackupCentral.com, the author of hundreds of articles, and the books "Backup and Recovery" and "Using SANs and NAS."