The new generation of ATA disk arrays uses either virtualization software provided by the tape library vendor or third-party software. Storage administrators should pay particular attention to front-end virtualization products. While every vendor's disk array offers disk virtualization software, not every product provides front-end tape virtualization software.
For instance, EMC's Clariion CX array is available in an assortment of configurations that can be packaged with different front-end disk virtualization software. If purchased from EMC, the DL700 ATA disk arrays will ship with FalconStor Software's IPStor virtualization software. However, if purchased from ADIC, the same back-end EMC disk array will ship with ADIC's virtualization software, obtained as part of its acquisition of V-Stor in 2002, and ADIC's I/O controllers.
Other virtualization products from vendors such as Candera Inc. or IBM present virtual disks or LUNs carved from pools of storage rather than virtual tape drives or cartridges, so backup software can't see them as tape.
While users can obviously implement backup software or normal operating system utilities to store data on this type of disk, they shouldn't expect the same level of functionality that they'll get from arrays that have front-end tape virtualization software.
For example, ADIC, IBM and StorageTek use software that migrates data transparently between the disk array and the tape library. This provides two important benefits. First, the data transfer from the virtual tape on disk to a physical tape can occur at any time. Second, the identifier assigned by the backup software to the virtual tape moves with it to the physical tape, so when the tape is ejected, the backup software isn't aware there was a virtual tape involved at any point in the process.
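The identifier handoff described above can be pictured with a small model. This is only an illustrative sketch, not any vendor's implementation: the `Cartridge` class, the `migrate_to_tape` function, the label `A00001` and the catalog layout are all hypothetical names invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Cartridge:
    label: str   # identifier the backup software assigned (hypothetical format)
    medium: str  # "virtual" (held on disk) or "physical" (real tape)
    data: list = field(default_factory=list)


def migrate_to_tape(virtual: Cartridge) -> Cartridge:
    """Copy a virtual cartridge to a physical one, keeping the same label."""
    if virtual.medium != "virtual":
        raise ValueError("only virtual cartridges can be migrated")
    return Cartridge(label=virtual.label, medium="physical", data=list(virtual.data))


# The backup catalog is keyed by label only, so it needs no update
# when the data moves from disk to tape.
catalog = {"A00001": ["nightly-full"]}
vtape = Cartridge("A00001", "virtual", ["nightly-full"])
ptape = migrate_to_tape(vtape)
print(ptape.label in catalog, ptape.medium)
```

Because the catalog refers to the cartridge only by its label, the migration can run at any time without the backup software being involved.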
Users also need to weigh variables such as reliability, support and warranties on arrays with ATA disks. Remember that disk drives, power supplies and internal components fail, or worse, have intermittent problems. ATA disk is cheap for a reason: Qualstar Corp. finds that ATA disks experience failures at rates six times greater than SCSI disks and have a mean time between failures (MTBF) of 600,000 hours. While this exceeds the 400,000-hour rating of LTO, SAIT and SDLT tape drives, the loss of a tape drive won't result in the loss of data, while the loss of a disk drive could. In fact, Dell Inc. finds that a standard RAID-5 configuration in a Dell PowerVault 660F/224F array with 14 SCSI drives carries a 38% chance of data loss over a three-year period. If you factor in the greater likelihood of ATA drives failing vs. SCSI drives, the probability of data loss using ATA drives in some RAID-5 configurations increases substantially.
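To see why a higher drive failure rate matters so much in RAID-5, consider a rough sketch. It uses the textbook mean-time-to-data-loss (MTTDL) approximation for a single-parity group with independent failures. This is not the method behind the Dell figure and will not reproduce the 38% number. The baseline MTBF and the rebuild time below are assumptions chosen only for illustration; only the 14-drive count, the three-year period and the sixfold failure rate come from the text.

```python
import math

HOURS_PER_YEAR = 8760


def raid5_loss_probability(drives, mtbf_hours, rebuild_hours, years):
    """Approximate chance of losing a RAID-5 group over the period.

    Textbook model: data is lost when a second drive fails while the
    first failed drive is still being rebuilt.
    """
    mttdl = mtbf_hours ** 2 / (drives * (drives - 1) * rebuild_hours)
    return 1.0 - math.exp(-years * HOURS_PER_YEAR / mttdl)


ASSUMED_MTBF = 1_200_000  # hours; hypothetical baseline, not from the article
ASSUMED_REBUILD = 24      # hours; hypothetical rebuild window

baseline = raid5_loss_probability(14, ASSUMED_MTBF, ASSUMED_REBUILD, 3)
six_times_worse = raid5_loss_probability(14, ASSUMED_MTBF / 6, ASSUMED_REBUILD, 3)

print(f"baseline drives:      {baseline:.4%}")
print(f"6x the failure rate:  {six_times_worse:.4%}")
print(f"relative increase:    {six_times_worse / baseline:.1f}x")
```

In this simplified model the chance of data loss grows roughly with the square of the drive failure rate, so a sixfold increase in failures raises the risk by a factor of about 36 while the probabilities are small. Real arrays also face unrecoverable read errors and correlated failures, which this sketch ignores.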
[Table: Virtual tape subsystems]
Tying disk and tape together
In most cases, vendors get their disk/tape arrays to work by reworking their own storage resource management software or by setting up customized policies in the backup software users already own. However, there are exceptions. Tricia Jiang, an IBM Tivoli technical representative, says, "TSM views disk, tape and virtualized disk as a storage pool on which it can store data based upon policies set by the administrator. It's agnostic in its view of the storage it manages, so if a user has EMC Symmetrix and IBM FAStT disk systems with a Spectra Logic tape library and an ADIC Pathlight VX virtual tape subsystem, it does not care."
The integration of disk into tape libraries combines the best qualities of both disk and tape in a single backup system. These new disk/tape libraries let users meet immediate backup and recovery needs and address their longer-term compliance requirements, while setting the stage for off-site data replication and vaulting. Users should begin to identify ways to incorporate integrated disk/tape libraries into their environments. Organizations that fail to do so will ultimately pay the price in decreased efficiency and higher costs, both to maintain the availability of their data and to recover it.