How to implement virtual tape libraries

This section of our Backup Best Practices guide offers a series of best practices that can help avoid common implementation mistakes and get the most from VTL technology.

Virtual tape libraries (VTLs) overcome the traditional speed and reliability issues that have plagued tape for decades. A VTL is a conventional disk array that emulates a tape library, which allows backup software to operate normally, recognizing virtual tape cartridges and performing full backup processes with the enhanced speed and integrity of disk systems. VTLs are basically drop-in replacements for tape systems -- unlike snapshot, replication, continuous data protection (CDP) or other disk-based storage systems, no substantial changes to the backup software or established backup processes are required.

Still, VTL systems are limited by their internal storage capacity, and only a finite number of backup volumes can be stored. Data reduction technologies such as data deduplication can ease this limitation, dramatically extending the effective storage capacity and retention period. Because disk isn't portable like tape, VTL data must be replicated offsite or periodically offloaded to a remote storage system for long-term protection.

Implement data reduction techniques

Tape offers almost infinite storage potential -- when tape cartridges are full, just rotate in blank cartridges. This is not the case with a VTL, where capacity is fixed by the quantity and size of installed disk drives. To extend the storage capacity of a VTL, a storage administrator will need to add more physical capacity, offload old or unused backups or implement data reduction techniques such as data deduplication, sometimes called single-instance storage (SIS) or single-instance repository (SIR). Deduplication works by eliminating redundant data, saving only unique iterations of files, blocks or bytes to disk. When used over a period of time, deduplication can reduce storage requirements by as much as 50-to-1 (though 30-to-1 is more common). At a 30-to-1 ratio, a 750 GB hard drive would have an effective storage capacity of 22,500 GB (22.5 TB). This kind of data reduction allows far more backup jobs to be retained for much longer periods. Today, data deduplication is almost a standard feature on VTL platforms, and it should be used to the fullest extent.
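
To make the idea of "saving only unique iterations" concrete, here is a minimal sketch of block-level deduplication in Python. It is an illustration only, not how any particular VTL implements the feature: the fixed 4 KB block size, the DedupStore class and the effective_capacity_gb helper are all assumptions made for the example.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed block size; real products vary


class DedupStore:
    """Toy block-level deduplicating store (illustrative only)."""

    def __init__(self):
        self.blocks = {}        # digest -> unique block bytes
        self.logical_bytes = 0  # bytes the backup software has written

    def write(self, data: bytes) -> list:
        """Store data and return the ordered list of block digests."""
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            block = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)  # keep only the first copy
            recipe.append(digest)
        self.logical_bytes += len(data)
        return recipe

    def read(self, recipe: list) -> bytes:
        """Rebuild the original data from its list of digests."""
        return b"".join(self.blocks[d] for d in recipe)

    def physical_bytes(self) -> int:
        return sum(len(b) for b in self.blocks.values())

    def ratio(self) -> float:
        """Logical bytes written divided by bytes actually kept on disk."""
        return self.logical_bytes / self.physical_bytes()


def effective_capacity_gb(raw_gb: float, dedup_ratio: float) -> float:
    """Effective capacity at a given reduction ratio, e.g. 750 GB at 30-to-1."""
    return raw_gb * dedup_ratio


if __name__ == "__main__":
    store = DedupStore()
    full_backup = b"".join(bytes([i]) * CHUNK_SIZE for i in range(10))
    for _ in range(4):  # four identical weekly full backups
        store.write(full_backup)
    print(store.ratio())                   # 4.0
    print(effective_capacity_gb(750, 30))  # 22500
```

Four identical full backups reduce 4-to-1 here because only the first copy of each block is kept. Ratios in the field depend on how much the data changes between backups.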

Be selective about what to back up

Another way to reduce storage demands and support speedier recovery is to eliminate nonessential files or data from the backup set -- not every file or folder requires VTL backup. For example, music files downloaded from iTunes or employees' vacation video clips have no purpose in a corporate data backup. An organization with data classification tools or backup software with exclusion filters for certain file types can keep unwanted data out of the backup job, reducing the backup window while shrinking the recovery point objective (RPO)/recovery time objective (RTO). Be sure to consult with other departments and key stakeholders before making any changes to the backup process. Because a VTL has a finite amount of disk storage space, weeding out unnecessary or unwanted files matters more than it does with tape.
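
As a rough illustration of an exclusion filter, the sketch below drops files whose names match a list of patterns before they reach the backup job. The pattern list and function names are hypothetical; real backup software exposes its own exclusion syntax, so treat this only as a model of the logic.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical exclusion list; agree on the real one with stakeholders first.
EXCLUDE_PATTERNS = ["*.mp3", "*.m4a", "*.mov", "*.mp4"]


def is_excluded(path: str, patterns=EXCLUDE_PATTERNS) -> bool:
    """True if the file name matches any exclusion pattern (case-insensitive)."""
    name = PurePosixPath(path).name.lower()
    return any(fnmatchcase(name, pattern) for pattern in patterns)


def select_for_backup(paths, patterns=EXCLUDE_PATTERNS) -> list:
    """Return only the paths that should go into the backup set."""
    return [p for p in paths if not is_excluded(p, patterns)]


if __name__ == "__main__":
    candidates = [
        "/home/a/report.docx",
        "/home/a/Music/song.MP3",
        "/home/a/trip.mov",
        "/srv/db/dump.sql",
    ]
    for path in select_for_backup(candidates):
        print(path)
```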

Provide sufficient storage to meet retention requirements

It's easiest to keep backup jobs right on the VTL itself, so consider the retention requirements for your backup data once data deduplication is enabled. For example, a VTL without deduplication might only be able to save one month's worth of weekly backups. With deduplication enabled, that same amount of disk space would probably provide enough effective storage capacity to retain backup jobs for many months or even years. VTLs can also make restores much faster and easier, so it is important to determine how much data you should retain on the VTL to satisfy most restore requests. While industry estimates vary, the vast majority of restores are done on the most recently backed up data, so retaining a few weeks or months of backups should suffice. Verify that there is enough disk space to meet your backup retention needs, and upgrade disk space on the VTL as required. Remember to consult with your legal counsel or corporate compliance officer to determine the appropriate retention period for VTL backup jobs.
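
The sizing arithmetic behind this advice can be sketched as follows. The backup size, disk size and 30-to-1 ratio are made-up inputs, and the model assumes a constant deduplication ratio, which is a simplification; actual ratios shift with change rate and retention length.

```python
def required_disk_gb(full_backup_gb, backups_per_week, retention_weeks,
                     dedup_ratio=1.0):
    """Physical disk needed to hold the retained backups."""
    logical_gb = full_backup_gb * backups_per_week * retention_weeks
    return logical_gb / dedup_ratio


def retention_weeks_supported(disk_gb, full_backup_gb, backups_per_week,
                              dedup_ratio=1.0):
    """Weeks of backups that fit on the given amount of disk."""
    return (disk_gb * dedup_ratio) / (full_backup_gb * backups_per_week)


if __name__ == "__main__":
    # Hypothetical site: one 2,000 GB full backup per week, 8,000 GB of disk.
    print(retention_weeks_supported(8000, 2000, 1))                  # 4.0
    print(retention_weeks_supported(8000, 2000, 1, dedup_ratio=30))  # 120.0
```

In this example the same disk grows from about one month of weekly backups to more than two years once deduplication is factored in, which matches the pattern described above.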

Extend data protection beyond the VTL

VTL systems retain their data locally, and this can be a disadvantage to users who rely on offsite data storage with tape. If offsite storage is an important attribute of the backup process, VTL systems should be coupled with a secondary means of data protection. For example, some VTL systems support disk-to-disk-to-tape (D2D2T) data handling through a supplemental tape library connection, allowing backup jobs on VTL platforms to be systematically moved to tape for offsite storage. In other cases, the VTL system is replicated remotely to a second VTL or storage system, or the VTL itself is placed offsite. Some backup software can track and catalog tapes created from VTL backup data sets. This feature can reduce time and effort for the user, and it reduces the possibility of human error.

Avoid expensive storage in VTL platforms

The key to a VTL is storage volume rather than performance, so most users will opt for high-capacity, low-cost SATA drives, even when smaller, better-performing SAS drives are available. When deploying SATA drives, be sure to implement RAID 6 (dual parity or DP) disk protection, which allows the array to survive two simultaneous disk failures. Enable VTL features such as preemptive rebuilds so marginal disks will begin the rebuild process before an actual failure occurs.
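
When budgeting for RAID 6, remember that two disks' worth of capacity in each group goes to parity. A minimal sketch of that arithmetic, with a hypothetical function name and drive counts, and ignoring hot spares and file system overhead:

```python
def raid6_usable_tb(disk_count: int, disk_tb: float) -> float:
    """Usable capacity of one RAID 6 group: two disks' worth holds parity."""
    if disk_count < 4:
        raise ValueError("RAID 6 needs at least four disks")
    return (disk_count - 2) * disk_tb


if __name__ == "__main__":
    # Twelve 750 GB SATA drives in a single RAID 6 group.
    print(raid6_usable_tb(12, 0.75))  # 7.5
```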

Implement single-file restores if available, but use security

Traditional tape backups are often complete data packages that must be restored before individual files can be retrieved. However, VTL systems like the S2100 VTL from Sepaton Inc. can change this formula, allowing individual files to be recovered from a backup without restoring the entire backup volume first. This type of functionality can allow users to recover lost or damaged files without intervention from a storage administrator. This frees the administrator to deal with more complicated tasks, so allow single-file restores wherever possible. However, it's important to prevent users from recovering files that they're not authorized to access. If your VTL system offers user recoverability, take the time to implement the appropriate security measures. This will take a bit of planning and tweaking from IT.
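
One way to picture the security check is an authorization test that runs before any user-initiated restore. The sketch below is purely illustrative: CatalogEntry, can_restore and restore_file are invented names, and a real VTL or backup application would rely on its own catalog and directory-service permissions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical record of one backed-up file and who may read it."""
    path: str
    owner: str
    readers: frozenset = frozenset()


def can_restore(user: str, entry: CatalogEntry, admins=frozenset()) -> bool:
    """Allow the owner, explicitly listed readers and administrators."""
    return user == entry.owner or user in entry.readers or user in admins


def restore_file(user: str, entry: CatalogEntry, admins=frozenset()) -> str:
    """Refuse the restore unless the requesting user is authorized."""
    if not can_restore(user, entry, admins):
        raise PermissionError(f"{user} may not restore {entry.path}")
    # A real system would copy the file out of the backup volume here.
    return f"restored {entry.path} for {user}"


if __name__ == "__main__":
    entry = CatalogEntry("/finance/q3.xlsx", owner="alice",
                         readers=frozenset({"bob"}))
    print(restore_file("bob", entry))
```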

Use encryption to maintain security in backup sets

Virtual tapes may enjoy a bit more physical security because they can't be removed from the VTL for offsite transport. However, backup sets should still implement security measures such as AES 256-bit encryption to prevent unauthorized access -- encryption may also be required to meet corporate compliance obligations. And if you eventually "spin off" your backup data to tape, you should encrypt those tapes. Use the VTL's encryption features and be sure to add key management procedures to your regular IT activities. If the VTL does not include encryption, be sure to enable encryption through the backup software. It is important to note that deduplication and compression must be performed before encryption. Encrypted data looks random and contains few repeated patterns, so deduplication or compression applied after encryption will have almost no effect.
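
The ordering rule is easy to demonstrate. The sketch below compresses the same repetitive data before and after encryption and compares the results. Because Python's standard library has no AES, toy_encrypt is a deliberately simplified stand-in (a SHA-256-based keystream XORed with the data); it is an assumption made for this demonstration and is not secure, so never use it to protect real backups.

```python
import hashlib
import zlib


def toy_encrypt(data: bytes, key: bytes) -> bytes:
    """XOR data with a SHA-256 keystream. Demo stand-in only, NOT secure.

    Applying the function twice with the same key returns the original data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        keystream = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        block = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(block, keystream))
    return bytes(out)


def compress_then_encrypt(data: bytes, key: bytes) -> bytes:
    """Recommended order: shrink the data first, then encrypt the result."""
    return toy_encrypt(zlib.compress(data), key)


def encrypt_then_compress(data: bytes, key: bytes) -> bytes:
    """Wrong order: the compressor sees random-looking ciphertext."""
    return zlib.compress(toy_encrypt(data, key))


if __name__ == "__main__":
    key = b"demo key"
    data = b"backup record 0001\n" * 5000  # highly repetitive sample data
    print("original:             ", len(data))
    print("compress then encrypt:", len(compress_then_encrypt(data, key)))
    print("encrypt then compress:", len(encrypt_then_compress(data, key)))
```

With the recommended order the output is a small fraction of the original size, while the reversed order yields output about as large as the input. Hash-based deduplication suffers in the same way, because identical plaintext blocks no longer produce identical stored blocks once they are encrypted.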

Optimize the VTL for best performance

VTL deployment usually involves some amount of performance tuning in order to accomplish backups in the minimum amount of time. Storage administrators should not only pursue the fastest data transfers (e.g., in terabytes per hour), but also tune the VTL with features like dynamic load balancing to achieve peak data streams across various disks within the system. The VTL vendor is a primary source for tuning and optimization information. Keep in mind that some optimizations may require the use of optional software. For example, FalconStor Software provides HyperTrac Backup software to accelerate backups through users' existing third-party backup software.
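
To see why balancing matters, consider a simple model in which backup jobs are spread across parallel streams and the backup window is set by the busiest stream. The greedy longest-job-first assignment below is a generic scheduling heuristic chosen for illustration, not a description of any vendor's load balancer, and the job sizes and throughput figure are hypothetical.

```python
import heapq


def balance_jobs(job_sizes_tb, stream_count):
    """Assign each job, largest first, to the currently least-loaded stream."""
    heap = [(0.0, i) for i in range(stream_count)]  # (load in TB, stream index)
    assignment = [[] for _ in range(stream_count)]
    for size in sorted(job_sizes_tb, reverse=True):
        load, idx = heapq.heappop(heap)
        assignment[idx].append(size)
        heapq.heappush(heap, (load + size, idx))
    return assignment


def backup_window_hours(assignment, stream_tb_per_hour):
    """The window ends when the most heavily loaded stream finishes."""
    return max(sum(jobs) for jobs in assignment) / stream_tb_per_hour


if __name__ == "__main__":
    jobs = [4, 3, 3, 2]  # job sizes in TB
    plan = balance_jobs(jobs, stream_count=2)
    print(plan)                            # [[4, 2], [3, 3]]
    print(backup_window_hours(plan, 1.5))  # 4.0
```

Running all four jobs on a single stream at the same rate would take 8 hours, so an even split across two streams halves the window in this example.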

