
Think before you invest in disk-to-disk backup

Bites & Bytes: Jon Toigo explores moving from tape to disk-to-disk. Is the "old jalopy" better than the "shiny new Porsche"? Find out what Jon Toigo thinks.

Driving along the road, you pull up in traffic behind a beaten-up jalopy with greasy smoke plumes spewing from its tailpipe. On its rusted chrome bumper is a sticker proclaiming, "My other car is a Porsche." You:

1. Chuckle at the wittiness of the driver, then nervously check your rear view mirror for telltale smoke. 

2. Flip out your cell phone and call the Department of Motor Vehicles to register a fuel emissions nuisance complaint. 

3. Reflect on the age and implied prestige of your own vehicle and resolve to do better so you too can afford a new speedster. 

4. Content yourself with the knowledge that your car runs just fine and that you don't need some speed buggy with extreme RPM capabilities just to sit in traffic.

Odd though it may seem, this stream of consciousness is similar to what goes through the minds of many storage managers when they hear one of their peers talking about his new disk-to-disk backup or mirroring solution.

You listen attentively to the story of how the manager left behind the tired old world of tape backup, embraced the bright shiny future of disk-to-disk, and in one fell swoop made all of his issues with "backup windows" and "recovery timeframes" disappear. You might wonder nervously whether you are behind the curve with your aging tape solution, or whether your golf course banter or cocktail party conversation is less sexy because you still use tape.

Your insecurities drive you to ask questions about the total cost of the solution, or to press for details about the other guy's infrastructure that will reassure you that his solution would not work for your budget or your environment. You might even console yourself that, for a mirror site to deliver real disaster protection, it would have to be placed a significant distance -- in most cases several hundred miles -- from the primary site. Only then would mirrored data be well out of range of whatever disaster consumes the main facility. With speed-of-light-induced propagation delay over such a distance, that requirement creates a hornet's nest of mirroring issues that makes good old tape-based restores look downright pleasant.
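
To put a rough number on that propagation delay, here is a back-of-the-envelope sketch in Python. It assumes a signal travels through optical fiber at about two-thirds of the speed of light in a vacuum (roughly 200,000 km per second), that the cable runs in a straight line, and that switches and protocols add nothing. The distances and function names are purely illustrative. A synchronously mirrored write has to wait for at least one round trip before it is acknowledged.

```python
# Best-case delay a synchronous remote mirror adds to each acknowledged write.
# Assumptions (illustrative, not from any vendor): signal speed in fiber is
# about 200,000 km/s, the cable run is straight, and there is no switch,
# router, or protocol overhead.

KM_PER_MILE = 1.609344
FIBER_KM_PER_SEC = 200_000.0  # about two-thirds of the speed of light in a vacuum


def round_trip_ms(distance_miles: float) -> float:
    """Return the best-case round-trip propagation delay in milliseconds."""
    distance_km = distance_miles * KM_PER_MILE
    one_way_sec = distance_km / FIBER_KM_PER_SEC
    return 2 * one_way_sec * 1000.0


def max_serial_writes_per_sec(distance_miles: float) -> float:
    """Upper bound on strictly serialized, acknowledged writes per second."""
    return 1000.0 / round_trip_ms(distance_miles)


if __name__ == "__main__":
    for miles in (10, 100, 300, 1000):
        print(f"{miles:>5} miles: {round_trip_ms(miles):6.2f} ms round trip, "
              f"at most {max_serial_writes_per_sec(miles):8.0f} serialized writes/sec")
```

At 300 miles the best case works out to just under 5 milliseconds per round trip, which caps a strictly serialized write stream at a couple of hundred writes per second before any real-world overhead is counted.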

Whatever opinion you form about the continued efficacy of tape as a backup medium, expect to find your position continually under assault. After a while, with enough spam email from vendors of mirroring or disk-to-disk solutions, enough trade press articles hyping disk-to-disk and enough pundits espousing the new wave of disk-based backup solutions, your resolve to stay the course with tape may begin to erode. Vendor marketeers are making it easier for you to reach the conclusion that you are driving a tape jalopy, that smoke is beginning to billow out of the tailpipe of your data protection strategy, and that you need a disk-to-disk Porsche.

My message to you is simple: Stop and think before you act.

No one-size-fits-all disk-to-disk solution

The appeal of disk-to-disk for data protection applications is profound, especially as inexpensive ATA or Serial ATA disk platforms become more readily available on the market. Disk mirroring promises near-instantaneous failover for data required by mission-critical applications. If primary arrays become damaged, traffic is switched to secondary or mirrored backup arrays. Cool. Ultimately, quite pricey, but cool.

However, disk-to-disk isn't just about volume mirroring. Other D2D configurations use secondary disk arrays as "tape emulators" rather than true disk drives. In these configurations, you use the same backup software that you have always used for tape, but change the target device so that data is written to disk drives instead of tape media.

In still other cases, secondary storage arrays become repositories of snapshots of changed data from the primary array. This process creates a bunch of incremental backups that must be reapplied in sequential order to a full volume backup of data (stored somewhere safe) in order to bring the data back to a usable form in a recovery situation. This is a dicey proposition, especially since, in many cases, restoring snapshots will not produce an operationally sound environment. Many managers have learned the hard way that, because the snaps were taken of files and databases while they were in an open or active state, aggregating all of the snapshots doesn't mean that the resultant data pool can actually be used. Check your software carefully.
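
The replay order matters, and a minimal Python sketch shows why. The data model here is hypothetical: a full backup is a dictionary of file names to contents, and each incremental is a dictionary of changes, with None marking a deletion. No real backup product's format or API is implied. The sketch also says nothing about whether the restored files are application-consistent, which is the harder problem described above.

```python
from typing import Dict, List, Optional

# Hypothetical data model, for illustration only.
Volume = Dict[str, str]                  # file name -> contents
Incremental = Dict[str, Optional[str]]   # file name -> new contents, or None if deleted


def restore(full_backup: Volume, incrementals: List[Incremental]) -> Volume:
    """Rebuild a volume by replaying incrementals, in the order given, over a full backup."""
    volume = dict(full_backup)  # copy so the full backup itself is never modified
    for changes in incrementals:
        for name, contents in changes.items():
            if contents is None:
                volume.pop(name, None)   # file was deleted in this increment
            else:
                volume[name] = contents  # file was created or overwritten
    return volume


if __name__ == "__main__":
    full = {"orders.db": "v1", "notes.txt": "draft"}
    monday = {"orders.db": "v2"}
    tuesday = {"orders.db": "v3", "notes.txt": None}

    print("oldest first:", restore(full, [monday, tuesday]))
    print("out of order:", restore(full, [tuesday, monday]))
```

Replayed oldest first, the volume ends up with Tuesday's version of orders.db. Replayed out of order, Monday's older version silently overwrites it, and nothing in the restore itself flags the mistake.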

Finally, disk-to-disk sometimes implies a "bare metal" backup to disk, a block image of disk bits that may lack granularity for a single file restore and instead must be restored in toto before you can do anything with the data. This approach can actually turn little disasters into big disasters: the kind that happen when recovering a single deleted file requires several hours of full volume recovery, rather than several minutes.

The above is just a cursory survey of the "mainstream" disk-to-disk solutions that are being introduced to market at present. Following on their heels are a number of new techniques, including commonality factoring from Avamar Technologies, time machine technology from Revivio and other secret sauce approaches from other up-and-coming players that promise to leverage the random access properties of rotating disk (or even solid-state disk) to reinvent how we think about data protection altogether.

So, one of the first problems most storage folks confront when seeking to trade up their tape for disk-to-disk solutions is that there are a lot of them and no easy taxonomy for sorting the products into categories. Moreover, there are no clear best practices for associating a given D2D "solution" with any particular requirements set. Even once you break down so-called D2D products by their differentiators, you still haven't got a clue whether replacing tape with another solution would be advantageous from a cost or efficiency standpoint.

Cost of ownership concerns

No one in the industry is prepared to tell you that now is the time to abandon tape for D2D. In the final analysis, all that the vendors and analysts are saying is that disk is getting cheaper and that it should eventually become as cheap as tape.

From the perspective of media cost, this may be true, but not from the perspective of system costs. At a systemic level, tape is still far less costly than disk-based data protection -- especially as the amount of data to be protected exceeds roughly what would fit on 200 tapes. Even "ghetto RAID" boxes can't compete with tape for large volume data protection requirements, according to smart guy Fred Moore of Horison Information Strategies, whose analysis on this point is compelling.

EMC and the other mirroring advocates have finessed this important point by de-emphasizing equipment costs and instead focusing on recovery timeframes. Switching over from failed disk to working disk and keeping applications constantly in service is the value of disk-to-disk mirroring. Restoring from tape requires a minimum of one hour per terabyte, and that's assuming you bring along the right tapes. They also point to the people costs of a tape-based system and note that mirroring doesn't require as many monks.

However, they often fail to note that their disk mirroring scheme requires seven mirror disk sets in the array for every one primary disk set as an internal safeguard against failed disk (for point-in-time recovery). That configuration then needs to be replicated on two additional arrays (one local and one remote) to establish a "multi-hop mirror" that insulates production apps against write latency and establishes a remote (albeit asynchronous) copy of the data for off-site disaster protection purposes. The hardware costs for this solution, together with software licenses and network tolls, raise the cost of the configuration to well in excess of $1 million for a terabyte of protected data.

Some of the newer ATA and Serial ATA players claim they can carve a chunk out of this price with arrays based on cheaper disk drives, but they gloss over an important point. As Steve Sicola at Seagate Technology is fond of observing, ATA and SATA drives are not designed for the workload of Fibre Channel or SCSI drives, so how can you trust these arrays to handle application loading in the wake of a primary disk array interruption?

The bottom line

The decision to go with disk-to-tape or disk-to-disk comes down to a careful review of several factors. First and foremost, you must understand your applications, their criticality and the nature of the data store that supports them in the production setting. Application criticality will determine whether you can wait for tape restore or must adopt disk-to-disk out of necessity. However, the nature of the data store may provide some additional runway for tape. Since most data is unchanging once it has been written, you might be able to pre-stage or pre-position some of the data on arrays in your backup facility. That done, a 1-TB-per-hour tape restore rate may be plenty fast for recovering operations post disaster.
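
The arithmetic behind that last point can be sketched in a few lines of Python. The 1 TB-per-hour rate is the figure quoted in this article. The 10 TB data store and the 80 percent pre-staged share are made-up inputs for illustration, not measurements, and the function name is hypothetical.

```python
def tape_restore_hours(total_tb: float, prestaged_fraction: float = 0.0,
                       rate_tb_per_hour: float = 1.0) -> float:
    """Hours needed to restore from tape whatever was not pre-staged on disk."""
    if not 0.0 <= prestaged_fraction <= 1.0:
        raise ValueError("prestaged_fraction must be between 0 and 1")
    if rate_tb_per_hour <= 0:
        raise ValueError("rate_tb_per_hour must be positive")
    remaining_tb = total_tb * (1.0 - prestaged_fraction)
    return remaining_tb / rate_tb_per_hour


if __name__ == "__main__":
    total_tb = 10.0  # illustrative size of the production data store
    for fraction in (0.0, 0.5, 0.8):
        hours = tape_restore_hours(total_tb, prestaged_fraction=fraction)
        print(f"{fraction:.0%} pre-staged: about {hours:.1f} hours to restore from tape")
```

The more of the static data that already sits on disk at the recovery site, the shorter the tape restore becomes. Whether the remaining window is acceptable still depends on how long the application can be down.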

If the application's data cannot be unavailable for any length of time, then a disk-to-disk solution may be in order. But you need to evaluate carefully the I/O performance obtained from production systems to determine whether you need identical equipment at the recovery site or can make do with ghetto ATA or SATA RAID boxes.

Finally, you may need to consider an entirely new technology for data protection, one that not only provides resiliency, but also culls the junk and the duplicate files from your backups, cleans up any viruses or logic problems, and enables instantaneous recovery of the storage infrastructure to any point in time. Over the long haul, these burgeoning solutions from newbies in the storage industry are probably the real replacements for tape.

In the end, you may find that your old jalopy is still the best solution for your driving needs. If you are concerned about appearances, remember that some automobiles become timeless classics, while most others -- including the sports cars -- tend to be bound by "trendiness." Before you buy the latest model, be sure to check your reasons and the warranty.

About the author: Jon William Toigo has authored hundreds of articles on storage and technology along with his monthly "Toigo's Take on Storage" expert column and backup/recovery feature. He is also a frequent site contributor on the subjects of storage management, disaster recovery and enterprise storage. Toigo has authored a number of storage books, including "Disaster recovery planning: Preparing for the unthinkable, 3/e".
