Published: 01 Jul 2011
Disk and tape-based technologies can and should be used in concert to meet the spectrum of data protection requirements. Find out why tape may have a second life in the cloud.
Cloud storage discussions are typically centered on cost savings and keeping everything online and always available somewhere in the ether, but without taking up valuable space the way tape does. So, does the advent of cloud-based infrastructure and applications mean tape is finally going away? Surveying today's tape landscape, the answer is a definitive no . . . no matter how often we hear the death knell toll for tape.
In recent years, tape has been concentrated in the data center and the middle-to-high end of the small- to medium-sized business (SMB) market, with decreasing penetration below this level. Disk has been edging into tape's traditional backup and recovery role through faster disk arrays that appear as tape libraries -- virtual tape libraries (VTLs) -- and disk-based deduplication that reduces the amount of storage needed. These days, many businesses use tape for low-access applications. The technology is positioning itself to compete for fast-growing tier 3 storage opportunities, which include fixed content and data kept for compliance reasons.
Tape remains an economically viable part of the data storage hierarchy due in part to its lower cost per GB, decreased operating expenses and reduced energy costs. Tape technology has also added security features such as encryption and WORM, as well as a greatly enhanced media life (more than 30 years in some cases). The notion that "tape is dead" ignores all this. As we learned in early 2011 when Google admitted to restoring 40,000 cloud-based email accounts from tape, these advantages hold true even in the cloud.
Protecting data in the cloud
When evaluating data protection methodologies, IT organizations need to consider both physical and logical data protection. Physical protection ensures data is protected in the event of a disk, array or site failure. Logical data protection guards against the most common contributors to data loss: machine or human error (data corruption or accidental deletion). A good backup plan provides both sorts of protection, which is why good backup is so expensive.
With the cloud, it's likely your cloud storage service provider handles physical data protection (disk failures, array failures, site failures) via some sort of mirroring or remote replication capabilities that are built into the per-GB price you pay. But chances are the provider doesn't handle logical data protection (user and machine errors), which is a key shortcoming of the cloud.
In many cases, disk and tape-based technologies can and should be used in concert to meet the spectrum of data protection requirements (see "Data protection methods and levels," below). Synchronous remote mirroring may be suitable for your organization's most mission-critical data and for full-site, near-zero data loss disaster recovery. Disk backup is for data that's a little less mission critical and can tolerate a little more data loss, while tape backup is typically for still less critical data or for affordably keeping long-term offline copies of data.
DATA PROTECTION METHODS AND LEVELS
Why keep tape in the cloud?
Most organizations employ traditional backup software where the backup server software communicates with backup client software (which resides on the system where the data lives). That data is transferred from the client to the storage device across the local-area network (LAN) or from a client directly to the storage device across a storage-area network (SAN).
The data captured by the backup software might be a complete copy of an organization's production data (a full backup), or just the changes made since the last full or incremental backup (an incremental backup) or since the last full backup (a differential backup). Best practices suggest keeping multiple copies of backup sets (one stored offsite and one kept on site), with copies retained for a specified period of time. Backup can be performed directly to tape (D2T) or to disk first, with a copy made to tape (D2D2T) outside the regular "backup window."
Tape is a popular storage medium with a good shelf life and, over time, its Capex and Opex cost per GB declines substantially. Tape is scalable and provides true "capacity on demand" because you buy tapes as needed, rather than investing in spinning media that's filled over time. Tape is also low cost and portable, making it an ideal medium for seeding cloud storage or migrating between cloud storage providers. Advancements in tape verification technology have made tape a more reliable restore medium, and tape is searchable. Perhaps most importantly, tape is a good multitenant platform -- it supports partitioning and encryption and has strong key management -- so it addresses users' top concerns about cloud storage: security and privacy.
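For readers who want to see how the full, incremental and differential backup types described above differ in practice, here is a minimal Python sketch. It is an illustration only, not how any particular backup product works: the `BackupRecord` type, the `select_files` function, the file names and the dates are all hypothetical, and the sketch assumes files are selected purely by last-modified time.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List


@dataclass
class BackupRecord:
    """One completed backup job (hypothetical structure for illustration)."""
    kind: str            # "full", "incremental" or "differential"
    taken_at: datetime


def select_files(kind: str,
                 mtimes: Dict[str, datetime],
                 history: List[BackupRecord]) -> List[str]:
    """Return the sorted file paths a backup of the given kind would copy.

    mtimes maps each file path to its last-modified time.
    """
    if kind not in ("full", "incremental", "differential"):
        raise ValueError(f"unknown backup kind: {kind}")

    fulls = [r.taken_at for r in history if r.kind == "full"]
    if kind == "full" or not fulls:
        # A full backup copies everything; so does any backup
        # that has no earlier full backup to build on.
        return sorted(mtimes)

    if kind == "differential":
        # Everything changed since the last full backup.
        cutoff = max(fulls)
    else:
        # Incremental: everything changed since the last full
        # or incremental backup, whichever is more recent.
        cutoff = max(r.taken_at for r in history
                     if r.kind in ("full", "incremental"))

    return sorted(path for path, mtime in mtimes.items() if mtime > cutoff)


# Hypothetical example: a full backup early Sunday, an incremental early Monday.
history = [
    BackupRecord("full", datetime(2011, 6, 26, 1, 0)),
    BackupRecord("incremental", datetime(2011, 6, 27, 1, 0)),
]
mtimes = {
    "a.doc": datetime(2011, 6, 25, 9, 0),   # unchanged since before the full
    "b.xls": datetime(2011, 6, 26, 12, 0),  # changed Sunday afternoon
    "c.db": datetime(2011, 6, 27, 12, 0),   # changed Monday afternoon
}

print(select_files("full", mtimes, history))          # all three files
print(select_files("differential", mtimes, history))  # b.xls and c.db
print(select_files("incremental", mtimes, history))   # c.db only
```

The tradeoff the sketch makes visible: an incremental backup copies the least data each night, but a restore needs the last full backup plus every incremental since, while a differential copies more each night but needs only the last full backup and the latest differential.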
But a number of well-known issues remain with backup to tape. If a tape is stored offsite, retrieving it can easily take hours or even days. Because backup is typically a once-a-day event, the recovery point is whenever the last backup was performed, so up to 24 hours of changes could be lost. Tape media can also be rendered unreadable for a number of reasons, such as exposure to a magnetic field or damage to the case. Poor or insufficient media management and verification testing can exacerbate the situation. And tapes can be lost.
But data still needs to be protected from logical errors like accidental deletions and software bugs. As shown in "Data protection methods and levels," above, tape covers all the bases. Despite tape's shortcomings, and in light of cloud security concerns and today's prices for power, cooling and floor space, it's still difficult to beat the long-term value proposition of having backup copies of data on a medium that's offline and encrypted. That's why tape may have a second life in the cloud. Just ask Google.
BIO: Terri McClure is a senior storage analyst at Enterprise Strategy Group, Milford, Mass.