Published: 09 Feb 2005
About the survey:
Storage surveyed its readers and SearchStorage.com members in November 2004 to determine how often unreliable tapes were at the heart of a backup snafu. When asked to describe the tape failure situation in their shops, nearly a third of the respondents (31.2%) said it was either a significant problem that often disrupts backups or a problem that sometimes disrupts backups.
Tape by its very nature is imprecise--it's essentially a ribbon of plastic that can stretch and shrink, and that's tugged across heads where it can slide up and down. "There's a lot of wear and tear," says Michael Passe, senior storage engineer at CareGroup Healthcare System/Beth Israel Deaconess Medical Center in Boston. "You're pulling this thin, long piece of iron oxide-coated plastic across a head about a couple of million times a year." And as opposed to sealed, fixed-placement disks, tapes are handled frequently and often transported.
Over the years, tape manufacturers have gone to extraordinary lengths to engineer products that can stand up to these harsh environments. New materials and built-in intelligence to monitor the health of the tape have made tapes more reliable. But it's a battle that's far from over. One respondent cited tape's "huge hidden management costs," while another added that "tape drives and media are the biggest problems in our shop."
Pinpointing tape-related problems can be tough. "Sometimes we're not quite sure whether we have a drive issue or a problem with the media," notes George Rogers, senior systems programmer for storage administration at CareFirst BlueCross BlueShield in Washington, D.C. Rogers says the error messages from his firm's backup applications give little indication of whether the media or the drive is at fault. And the problem may be compounded if a single bad tape affects multiple drives.
Despite all the horror stories, tape offers two compelling benefits: the lowest media price and portability. With ever-rising capacities and transfer rates, tape technology isn't sitting still. "For reliability, speed, ease of storage and dollar per gigabyte, nothing beats tape," commented one satisfied tape user.
Shouldering the backup burden
The 378 storage professionals who completed our survey come from a broad cross-section of industries, regions and storage shop sizes. On average, respondents' tape libraries were equipped with approximately 31 drives and more than 2,200 slots.
The amount of data respondents back up varied widely--a third back up more than 4TB per week (see Figure 1, this page), and nearly a fifth (18.8%) say they use more than 200 tapes a week (see Figure 2, this page).
The most popular tape format--by a wide margin--is DLT; more than 50% of respondents use this format (see Figure 3, this page). "DLT tapes have been extremely reliable," said one respondent, noting that his group has seen "only two or three media failures within the last 10 years." Hard on the heels of DLT are LTO-1 and LTO-2, which netted approximately 21% and 30% usage, respectively. Among the tape brands used most often by respondents were Fuji Photo Film U.S.A. Inc. (Fujifilm) (32.3%), Imation Corp. (28.6%) and Sony Electronics Inc. (24.1%). These brands were followed by a clutch of manufacturers garnering 17% to 19% of respondents' answers (see Figure 4).
Some users have experimented with different brands and found variations in tape quality. "Everybody's oxide is not the same," says CareFirst's Rogers. "Some break down quicker than others, and some of it doesn't adhere to the substrate as well as the others--there's definitely a difference in quality." But Rogers also found that trying different brands may not sit well with drive vendors; he ran into problems using media that his drive vendor said wasn't certified for its libraries. Even though CareFirst had used the brand before and was using it in other libraries from the same vendor, this didn't help to resolve the problem.
Is tape the problem?
In a complex process such as backup that involves a mix of servers, software and storage devices, it's sometimes hard to pin a failed backup on one particular component. "The problems aren't necessarily with the tape vendor," remarked a respondent, "but have to do with the complicated environment--backup client, API client, DB client, tape sharing software [and so on]."
Rich Gadomski, vice president of marketing for the recording media division at Fujifilm, heartily agrees. "Media failure is almost like a default setting; it doesn't really indicate that there's a problem with the tape itself," he says. Gadomski adds that most media errors are due to "software, hardware, firmware, power supply issues or improperly maintained equipment."
Nearly 60% of survey respondents said their backups fail less than once a week on average, but approximately one-quarter (25.6%) reported two or more failed backups weekly. When asked to rate the most frequent causes of failed backups (see Figure 5, this page), respondents most often rated tape cartridges (52.9%) and human error (51.6%) as sometimes, often or always the cause. Clearly, tape reliability is perceived as a key factor in the success--or failure--of a backup operation.
Respondents were asked to indicate the number of tape failures they experience each month with the tape formats primarily used in their shops (see Figure 6, this page). While the results for most formats fell into the range of 1.5 to 2.5 failures per month, 9840 and 9940 cartridge users reported 4.2 and 3.2 monthly failures, respectively. Interestingly, 31.5% of respondents said they use Storage Technology Corp. (StorageTek) libraries, which almost tied them with Hewlett-Packard Co. (HP) libraries for first place in this category. StorageTek also notched a very high score for library quality, with 91.4% of respondents rating their libraries as good, very good or excellent. StorageTek's tape drives earned a similarly high score of 92.2%, putting it at the top of the charts for that category.
Overall, there doesn't seem to be any correlation between tape problems and the tape format used. The number of respondents who said tape failure was a significant problem or a problem, and the number who characterized it as just an annoyance or not really a problem, were roughly equal across all tape formats. Similarly, no smoking gun emerged when the same criteria were applied to tape brands used by respondents; there was no discernible correlation between the perceived seriousness of the tape failure problem and any particular tape brand.
When good tapes go bad
Generally, the number of tapes that failed on initial use was low--averaging less than one per month among all responses. Still, bad tapes out of the box were a bugaboo that haunted some respondents. "We recently had a bad batch of tapes that placed our entire enterprise tape backup at risk," noted one respondent who requested anonymity. "Going forward, tape will be used for long-term archive only, and backup will be to disk."
The causes of tape failures varied, but nearly 53% of respondents said media errors were sometimes, often or always the cause of their tape failures (see Figure 7, this page). Storage administrators admit that tape mishandling can often be the cause of a failure. "If a tape goes bad, I find it's usually a user error," said one candid respondent who indicated that dropping a tape is the most common user slipup. "It's very rare for LTO-1 or LTO-2 HP tapes to go bad on their own."
"I've had a lot of problems in the 9900 space with tape leaders being ripped off," says CareGroup's Passe. It was a disruptive enough matter that his group did a visual inspection of all 500 of their 9940 cartridges. They found about five cartridges that looked OK at first but had broken leaders. Passe hasn't run into this issue with 9840 tapes. "The 9840 tape technology is spoke-to-spoke with midmount heads so there's no leader--those definitely seem to be more reliable," he says.
Tape drives shared in the wrath users aimed at tape media, with approximately 40% of respondents fingering these devices as sometimes, often or always the cause of their backup failures (see Figure 7, this page). "Tape drives are the most unreliable form of backup there is, with the exception of no backup," claimed one respondent.
The cost of failure
Taken individually, a tape cartridge is a modest expense--perhaps the cheapest item a storage manager will ever budget to buy. But as failed tapes pile up, the expense can become significant. Our survey finds that about half of respondents purchase fewer than 10 tapes per month, but more than a fifth (21.4%) buy more than 50 cartridges each month. At approximately $50 each, buying 50 tapes per month totals approximately $30,000 a year--a sum large enough to merit line-item status on most storage shop budgets.
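The annual figure quoted above is simple arithmetic: monthly volume times unit price times 12 months. A minimal sketch follows, using only the article's round numbers (50 tapes a month at roughly $50 each). The function name `annual_media_spend` is a hypothetical helper for illustration, not part of any backup product.

```python
def annual_media_spend(tapes_per_month: int, price_per_tape: float) -> float:
    """Return yearly cartridge spend for a given monthly purchase volume."""
    return tapes_per_month * price_per_tape * 12


# Figures from the article: 50 tapes a month at approximately $50 each.
print(annual_media_spend(50, 50.0))  # 30000.0
```

Plugging in a shop's own purchase volume and negotiated price gives a quick check on whether media deserves its own budget line.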
Perhaps because the cumulative cost of failed tapes isn't apparent, 61% of respondents said they don't even bother trying to recoup the cost of cartridges gone awry. In some cases, this could mean writing off a fairly good chunk of change: 20.4% of respondents said failed tapes cost their organizations more than $300 a month (see Figure 8).
While the cost of failed tapes may not be an overriding issue, the potential for a failure to compromise data protection certainly is. Some respondents are resigned to the possibility of at least an occasional bad tape disrupting the backup process and take appropriate precautions. "Tape reliability can be a problem; but if you have a comprehensive, multitier backup plan in place, the potential for problems can be negligible," noted one respondent. Others are actively pursuing disk-based and even optical alternatives to reduce their reliance on tape for backup.
Generally, users appear to be satisfied with their tape vendors. Among the major tape brands, the one that garnered the most users, Fujifilm, also snared some of the highest marks, with slightly more than 50% of respondents rating it very good or excellent. Imation ranked second, less than two percentage points behind, followed by Sony and HP. A few brands received poor or not very good ratings from more than 10% of respondents, with Certance LLC and Exabyte Corp. garnering the highest negative ratings at 16.7% and 16.2%, respectively (see Figure 9, this page).
Tape vendors boast of their quality control efforts. While they acknowledge occasional lapses, they say such lapses are few and far between and generally affect a small number of tapes. Vendors instead point to mishandling of tape cartridges by users, as well as to distributors who sometimes pack their wares inappropriately before shipping them to customers. "Our overall rate of return is about .1%," says Fujifilm's Gadomski. This figure covers defective tapes as well as cartridges customers return when they've ordered the wrong product.
Overuse may also be at the root of some tape reliability issues. Vendors provide two durability ratings for their products. Shelf life is measured in years--typically 15 to 30 years--and reflects how long you can expect the media to retain its data and stay in good enough physical condition to be loaded into a drive and read. The second rating is essentially how long you should use the tape before retiring it to an archive. The way vendors define that rating varies, and may be expressed in terms such as uses or head passes.
"I have about 2,000 or 2,500 tapes under management in the library, most of which are in the 200 to 300 mounts category and some are below that," says CareGroup's Passe. "The upper end is in the 500 to 800 mount category."
Chris Caprio, technical service manager at Imation, notes that tape longevity depends on the tape technology in use. He adds that handling of tapes is "the biggest issue with determining media longevity and media performance." But Caprio doesn't see overuse as a major problem. "For the most part, customers understand that tape is really not a commodity" given the data it holds, so they tend to observe usage specifications.
A little more than 57% of respondents indicate that they monitor how often a tape is used. But vendor usage recommendations may not always be heeded. Twenty-two percent of respondents said they use a tape for more than 500 mounts before retiring it, while 26.5% said they err on the side of caution and retire tapes after fewer than 100 mounts. "I replace tapes every year," said one user, "and each one only gets used 26 times."
Using a tape 26 times may be overly prudent, but that kind of talk is music to tape vendors' ears. Mark Eastman, product marketing manager for Quantum Corp.'s storage division, says his firm recommends 200 uses for its DLT and SDLT cartridges; a use is defined as "a full write of the cartridge and a full read of the cartridge," according to Eastman. Despite that recommendation, Eastman says Quantum "typically finds that users just use the media until it dies." Many users may rely on the tape's sensors to signal when a tape is approaching the end of its road, he notes, adding that applications such as Quantum's DLTSage can help to track tape use and determine when a tape is approaching the end of its usefulness.
For mainframe environments, Fujifilm's Gadomski says system software does a good job of monitoring the useful life of a tape; when it records an unusual number of read/write errors, it ejects the tape. For open systems, Fujifilm recommends about 5,000 head passes for DDS cartridges before retiring a tape, but Gadomski says tapes aren't typically used that long. Because they use linear scanning technology, LTO and DLT cartridges can typically handle a million or more passes, according to Gadomski.
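For shops that do track usage, comparing each cartridge's recorded use count against the vendor's rating is straightforward. The sketch below is an illustration only and does not describe how DLTSage or any backup application works. The `Tape` record, the `RATED_USES` table and the `retirement_status` function are hypothetical names. The only rating taken from the article is Quantum's recommendation of 200 uses for DLT and SDLT, where a use is a full write plus a full read of the cartridge. Ratings for other formats would have to come from the vendor, in units that match what your software actually counts.

```python
from dataclasses import dataclass


@dataclass
class Tape:
    label: str  # hypothetical barcode label
    fmt: str    # media format, e.g. "DLT"
    uses: int   # full write-plus-read cycles recorded so far


# Rated uses per format. Only the DLT/SDLT value (200) comes from the
# article; add other formats from your own vendor's specifications.
RATED_USES = {"DLT": 200, "SDLT": 200}


def retirement_status(tape: Tape) -> str:
    """Classify a tape as 'retire', 'ok' or 'unknown' by its use count."""
    rated = RATED_USES.get(tape.fmt)
    if rated is None:
        return "unknown"  # no rating on file for this format
    return "retire" if tape.uses >= rated else "ok"


# Example inventory with made-up labels and counts.
inventory = [Tape("A00001", "DLT", 212), Tape("A00002", "SDLT", 26)]
for tape in inventory:
    print(tape.label, retirement_status(tape))
```

Returning "unknown" rather than "ok" for an unrated format keeps a missing specification from being mistaken for a clean bill of health.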
Some tape vendors will analyze returned tapes to determine the precise cause of failure. Users are often willing to comply, but sometimes there will be a snag in the process. For example, as part of a healthcare organization, Passe's group needs to ensure that any data on returned tape is destroyed. That's often enough to deter a tape vendor from seeking the tape in question.
"They don't want to go through the hassle that it takes to certify that the data was destroyed on those tapes," says CareGroup's Passe. Quantum's Eastman says the company receives a "small percentage" of returned tapes that will go through a failure analysis process. Analyzing failures is a challenge, he adds, because you don't know how the cartridge was used and handled.
"We get the tapes back from the customer because we want to analyze the tape and understand why it failed in their environment," says Imation's Caprio. He adds that they'll provide a document saying they haven't read the data and certifying the destruction of the tape.
Ounce of prevention
While the survey didn't reveal any obvious links between tape failures and media format, brand, or library and drive types, the problem persists. Many of the respondents' comments related to the amount of physical handling that tape requires. Best practices, however, suggest a number of ways to reduce tape failures and possible damage to drives:
- Handle tapes as little as possible. Tape vendors say dropping a tape is the most frequent cause of damage. Even if exterior damage isn't apparent, the fall may have caused the tape to push up against interior flanges.
- Read and follow vendor usage specs. Don't confuse shelf life with useful life; if the vendor's ratings use measurements that are difficult to track (passes vs. mounts, for instance), ask the vendor to put the ratings in terms that make sense in your environment. And note that different media formats have different usage specs.
- Store tapes properly. Extreme temperatures and humidity can damage tapes. Make sure they're not exposed to extreme conditions when in transit.
- Match applications to tape technologies. Certain tape technologies are more appropriate to streaming data, while others perform best in start-and-stop conditions.
- Avoid recertified tapes. Used--or recertified--tapes are cheaper than new ones, but are often of inferior quality.
Tape is here to stay
Despite tape reliability issues that range from minor annoyances to serious problems, there's little question among respondents that tape will remain part of the backup picture for the foreseeable future.
The growing volume of data that needs to be saved, compounded by new exigencies such as regulatory compliance, will ensure that tape remains the medium of choice for long-term storage. Many respondents noted that they were actively involved in researching or implementing disk-based backup systems but, in most cases, the plans involve using disk for backup staging or for daily backups before archiving to tape.
But tape's key asset--portability--also figures prominently in the problems associated with the medium. And while tape may still maintain a cost advantage over disk, CareFirst's Rogers sees tape's hidden costs. "There are still a lot of misconceptions about the cost of tape vs. the cost of disk," says Rogers. He feels that when people say "tape is cheaper, they're going byte for byte what it costs to get in the box, but not the hands-on management."