Part of what's going on is that data backup has been segmented into three different tasks. The most immediate is file recovery: getting back files and folders that have been lost due to a system or (most commonly) user error. This requires fast access to the stored copy and the ability to restore quickly (as in seconds). The next task is short-term storage: keeping copies of data for 90 days or fewer, with the ability to recover anything from individual files to entire storage system images in anywhere from a minute to hours. The third category is long-term storage: keeping data that has to be stored for anywhere from a few months to forever. This is usually archival storage and typically involves much smaller amounts of data per data set (although the total amount of data may become extremely large).
Generally speaking, the longer you want to keep data, the more attractive tape becomes. The first two categories of stored backup data are increasingly the province of disk-based systems. In the third category, long-term storage, tape's advantages make it the most common choice. Long-term storage has traditionally accounted for only part of the backup market, so tape's overall share is shrinking as disk becomes more popular for file recovery, short-term storage and near-line storage. Tape still has several advantages when it comes to long-term storage and a few other jobs, such as mass data transfer.
The cost of tape
The biggest advantage tape has is its cost. Tape still has a lower cost per gigabyte of storage than any of its competitors. While the cost of short-term disk storage, such as virtual tape libraries (VTLs), has fallen to a level nearly equivalent to the cost of tape libraries, the tape libraries offer much greater total storage capacity because tapes can be swapped out.
Never underestimate the bandwidth of a station wagon full of tapes. You can move your data from Point A to Point B over a WAN, or even over the Internet, but when it comes to moving really large quantities of data where time isn't critical, tape is still the most cost-effective method.
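To put rough numbers on the station wagon argument, here is a minimal Python sketch that compares the effective throughput of a box of tapes in transit with the time a network link would need for the same data. The function names and every input value (tape count, cartridge capacity, transit time, link speed) are illustrative assumptions, not measurements, and the sketch ignores the time needed to write and read the tapes at each end.

```python
def shipment_throughput_mbps(tape_count, tape_capacity_gb, transit_hours):
    """Effective throughput, in megabits per second, of physically shipping tapes."""
    total_bits = tape_count * tape_capacity_gb * 1e9 * 8  # decimal gigabytes to bits
    return total_bits / (transit_hours * 3600) / 1e6


def network_transfer_hours(total_gb, link_mbps):
    """Hours needed to push the same data over a link running at full rated speed."""
    total_bits = total_gb * 1e9 * 8
    return total_bits / (link_mbps * 1e6) / 3600


if __name__ == "__main__":
    # Hypothetical scenario: all values below are assumptions for illustration.
    tapes = 100            # cartridges in the shipment
    capacity_gb = 800      # assumed capacity per cartridge
    transit_hours = 24     # assumed door-to-door delivery time
    link_mbps = 100        # assumed sustained WAN throughput

    shipped = shipment_throughput_mbps(tapes, capacity_gb, transit_hours)
    wan_hours = network_transfer_hours(tapes * capacity_gb, link_mbps)
    print(f"Shipping tapes: about {shipped:,.0f} Mbit/s effective")
    print(f"WAN transfer:   about {wan_hours:,.0f} hours")
```

Under these assumed values the shipment works out to roughly 7,400 Mbit/s effective, while the 100 Mbit/s link would need about 1,778 hours (around 74 days). Different assumptions will shift the numbers, but the gap is why tape remains attractive for bulk moves where latency doesn't matter.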
In theory, a tape drive or a tape library is infinitely scalable. All you have to do is buy more tapes. By contrast, disks typically require buying more hardware to get a major capacity increase. Adding disk arrays is expensive, not to mention the additional power and cooling costs, floor space requirements and other items.
Of course, this only works if you can swap out tapes, a restriction that seldom matters for long-term storage. If you need to keep the data constantly available in the tape library, tape's scalability is less of a factor.
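The scalability argument can be sketched as a simple cost model: growing tape capacity means buying cartridges, while growing disk capacity means buying arrays and paying to power and cool them. The Python sketch below is only an illustration; the function names and all prices and capacities are hypothetical placeholders, and it assumes tapes can be swapped out of the library and shelved, as discussed above.

```python
import math


def tape_expansion_cost(extra_gb, cartridge_gb, cartridge_price):
    """Cost of adding capacity to tape: media only, assuming tapes can be shelved."""
    cartridges = math.ceil(extra_gb / cartridge_gb)
    return cartridges * cartridge_price


def disk_expansion_cost(extra_gb, array_gb, array_price, yearly_power_cost, years):
    """Cost of adding capacity to disk: new arrays plus power and cooling over time."""
    arrays = math.ceil(extra_gb / array_gb)
    return arrays * (array_price + yearly_power_cost * years)


if __name__ == "__main__":
    # Hypothetical figures, chosen only to show the shape of the comparison.
    extra_gb = 100_000
    tape = tape_expansion_cost(extra_gb, cartridge_gb=800, cartridge_price=50)
    disk = disk_expansion_cost(extra_gb, array_gb=20_000, array_price=30_000,
                               yearly_power_cost=2_000, years=3)
    print(f"Tape media:  {tape:,}")
    print(f"Disk arrays: {disk:,}")
```

If the data has to stay online inside the library, the tape side of the model would also need more slots or another library, which is the case where tape's scalability advantage narrows.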
Tapes consume less power
Disk drives have to run to be used. Except for MAID (massive array of idle disks) systems or other arrays that can be powered down intermittently, disk storage is a major energy consumer in the data center, no matter how infrequently the data is needed. Tape drives only run when data is being read or written.
In many cases, tape survives simply because of the enormous amount of legacy data on tape. It's easier and cheaper just to keep using tape than it is to move to a new medium. Tape is also a thoroughly understood and well-supported technology. Nearly every kind of medium- and long-term backup software supports tape, and the tape interface is (relatively) easy to manage for that reason. That is also why VTLs are so popular: they provide a disk-based substitute for tape that plugs into existing architectures.
About the author:
Rick Cook specializes in writing about issues related to storage and storage management.
This was first published in July 2008