While many in the data storage field tend to use the terms data backup and data archive interchangeably, they actually are distinct functions, said Enterprise Strategy Group analyst Brian Babineau. Data backup refers to data needed regularly for production, and data archive refers to data accessed rarely and held primarily for compliance reasons.
The Storage Networking Industry Association (SNIA) defines an archive as "A collection of data objects, perhaps with associated metadata, in a storage system whose primary purpose is the long-term preservation and retention of that data." That definition hints at a further distinction. Data that is archived is not usually expected to be readily searchable. So if you suddenly need, say, a series of emails from five years ago, you will have to think about how you would locate and read that information.
Karen Grost, the business contingency/security plan administrator for NuUnion Credit Union, Lansing, Michigan, understands the differences between backup and archiving. Her organization recently started using the STORServer Backup Appliance to handle backup and to also help manage archiving.
From a backup perspective, "We chose tape because it is easier and less expensive for us to move data to an off site facility that way," she said. Similarly, the STORServer appliance and the IBM Corp. Tivoli Storage Manager software it uses help manage archiving, which is also based on tape. She said the focus of archiving is to maintain data for seven years "in case someone subpoenas us." And, she noted, tape again offers lower costs. "We set the tapes to expire after seven years and Tivoli Storage Manager tracks that for us," she added. So far, she said, "we haven't come close to running out of space because we already had so much tape."
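The seven-year expiry Grost describes comes down to simple date arithmetic. The following is a minimal sketch of that calculation only; it is not how Tivoli Storage Manager tracks expiry, and the function names and the leap-day handling are assumptions made for illustration.

```python
from datetime import date

# Retention period taken from the article; everything else is hypothetical.
RETENTION_YEARS = 7


def expiry_date(written_on: date, years: int = RETENTION_YEARS) -> date:
    """Return the date on which a tape written on `written_on` may expire."""
    try:
        return written_on.replace(year=written_on.year + years)
    except ValueError:
        # Feb. 29 has no counterpart in a non-leap year; assume March 1 instead.
        return written_on.replace(year=written_on.year + years, month=3, day=1)


def is_expired(written_on: date, today: date) -> bool:
    """True once the retention period for the tape has fully elapsed."""
    return today >= expiry_date(written_on)


if __name__ == "__main__":
    tape_written = date(2010, 3, 15)
    print(expiry_date(tape_written))
    print(is_expired(tape_written, date.today()))
```

In practice the backup software keeps this bookkeeping for every tape, which is the tracking Grost refers to.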
Tape vs. disk for data archiving
Meanwhile, at the Dallas County Community College District (DCCCD), Joe Gremillion, a network support specialist with responsibilities for SANs and storage management, also takes care of archiving data. In addition to regular backups, Gremillion said he performs weekly and monthly archiving. He has also wrestled with the tape versus disk question, and in his case has landed mostly on the side of tape for archiving. But getting to that decision point required help from disk, he explained. Most of his servers are virtualized on VMware. And, he said, his legacy backup solution, Symantec Corp.'s Backup Exec 12.5, wasn't working very well. It often couldn't finish the backups to tape by morning. So Gremillion adopted Veeam, a disk-based backup product developed specifically for virtual environments.
That transition has helped speed up backup, but when it comes to data archiving, Gremillion said tape still rules. In particular, he said, the move to Veeam and disk has freed up a lot of tape. Some of it is still used to back up the organization's remaining physical servers, but most is now available for archiving. "We perform archiving once a week or monthly using tape," said Gremillion. And for that purpose, Gremillion said tape is still a good choice.
"When you make decisions like this, you need to know your environment," he said. Clearly, he said, for production environments, where backup windows and user expectations are drivers, disk makes more and more sense. But not for archiving, he added.
A similar set of circumstances faced Charles Braffett, who handles storage operations at the Foundation for Blind Children (FBC), which has three locations in Arizona. Braffett recently looked at how best to support backup and archiving for his organization. FBC's existing storage infrastructure consisted of direct-attached storage (DAS) on its seven physical servers (including two servers hosting 24 virtual servers), two network-attached storage (NAS) devices totaling 5 TB of capacity, and an LTO-1 tape backup system. Last August, however, Braffett was able to upgrade to a new Tandberg system, consisting of a DPS1100 virtual tape library (VTL) and a StorageLoader LTO-4 tape autoloader. The new equipment has dramatically reduced the weekly backup window, which, in turn, has helped make archiving simpler.
"I can now write to tape for archiving directly from the backup system without going over the network," he said. Data archiving is built into the weekly backup and the tapes are stored at secondary and tertiary sites for additional safety, he said.
Braffett said he expects to stay with tape for archiving for the foreseeable future, based in part on its cost-effectiveness. He also noted that tape reliability is better than in the past. "I used to have up to a 68% failure rate recovering data from LTO-1 and LTO-2, but with LTO-4 I haven't had any problems so far," he explained. Braffett said he also sees potential in cloud data storage and is testing a beta solution from a company called Nasuni Corp. "They provide a gateway to cloud storage resources -- you pay a fixed fee for unlimited storage," so it could be used for regular backup or even archiving, he said. At a planned price of $250 a month, however, Braffett said it may still be a bit expensive for his organization.
Tips on choosing the right data archiving medium
Analyst Babineau said customers trying to pick the right data archive medium should start by evaluating how accessible their archived information needs to be. If they are archiving an application where the data will be regularly accessed, disk is the better alternative. "This often happens in email archives where employees are constantly interacting with them and attorneys are searching the repositories in response to electronic discovery requests," he said. Alternatively, if an organization is archiving data that will rarely be accessed -- such as very old financial data stored within a database that must be kept for compliance purposes -- then tape is the more logical and economical choice.
However, noted Babineau, most organizations have situations that are complex enough that they should consider using both disk and tape. "They can store active archive data on disk and move it to tape as it ages and becomes less active," he said. For example, said Babineau, a company may archive emails for five years -- the first two could be on disk and the last three on tape. Babineau said this can enable companies to balance archive storage costs and accessibility requirements.
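Babineau's approach of keeping active archive data on disk and moving it to tape as it ages can be expressed as an age-based rule. The sketch below is illustrative only: the two-year disk window, the five-year retention period, the 365-day year and the function name are all assumptions, not settings from any particular archiving product.

```python
from datetime import date

# Hypothetical policy values chosen for illustration.
DISK_YEARS = 2        # how long archive data stays on disk
RETENTION_YEARS = 5   # total time the data must be kept
DAYS_PER_YEAR = 365   # simplification: leap days are ignored


def archive_tier(archived_on: date, today: date) -> str:
    """Return 'disk', 'tape' or 'expired' based on the age of the item."""
    age_days = (today - archived_on).days
    if age_days < 0:
        raise ValueError("archive date is in the future")
    if age_days < DISK_YEARS * DAYS_PER_YEAR:
        return "disk"      # still active, keep it quickly accessible
    if age_days < RETENTION_YEARS * DAYS_PER_YEAR:
        return "tape"      # rarely accessed, move to cheaper media
    return "expired"       # past retention, eligible for deletion


if __name__ == "__main__":
    print(archive_tier(date(2024, 1, 1), date(2024, 6, 1)))
    print(archive_tier(date(2020, 1, 1), date(2023, 1, 1)))
```

Where to set the disk window depends on how often older items are actually searched, which is the accessibility question Babineau raises.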
About this author: Alan Earls is a Boston-area freelance writer focused on business and technology, particularly data storage.