
The problems backing up big databases

This tip discusses some issues of database backup and offers insight into what one company plans to do about it through reference data segregation and a pre-staging methodology.

According to the University of California at Berkeley, the fastest-growing subset of business data is not files, but block data contained in relational database management systems. Anyone who has ever worked with backup/restore knows the hassles of backing up databases to tape and restoring them from it, especially big databases. So Berkeley's insight is not exactly cause for celebration, and it raises some issues with database backup that need exploring.

Here are some of the larger issues: Do bigger (say, multi-terabyte) databases spell death for tape, which chugs along at only 2 TB per hour under ideal laboratory conditions? Do such grand data constructs force companies into a disk-to-disk or mirroring strategy, and perhaps into a SAN topology, as some vendors would suggest? Does a big database shatter the concept of "backup windows" once and for all, since you need to quiesce a database before you copy its data to tape or disk, and copying all of the data in a huge database necessitates a fairly lengthy quiescence period, perhaps a lengthier one than your business can tolerate?
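To put rough numbers on the backup-window question, here is a minimal Python sketch. It assumes the copy runs at a single sustained rate (the 2 TB per hour figure quoted above, which is an ideal-case number) and that the database stays quiesced for the whole copy. The function names, the 10 TB database and the 4-hour window are hypothetical, chosen only for illustration.

```python
def backup_hours(db_size_tb: float, throughput_tb_per_hour: float = 2.0) -> float:
    """Hours needed to copy the whole database at a sustained rate.

    The 2.0 TB/hour default is the ideal-case tape figure cited in the
    article; real-world throughput is an assumption you must supply.
    """
    if throughput_tb_per_hour <= 0:
        raise ValueError("throughput must be positive")
    return db_size_tb / throughput_tb_per_hour


def fits_window(db_size_tb: float, window_hours: float,
                throughput_tb_per_hour: float = 2.0) -> bool:
    """True if a full copy finishes inside the allowed quiescence window."""
    return backup_hours(db_size_tb, throughput_tb_per_hour) <= window_hours


if __name__ == "__main__":
    size_tb = 10.0       # hypothetical multi-terabyte database
    window_hours = 4.0   # hypothetical tolerable quiescence period
    print(f"Full copy takes {backup_hours(size_tb):.1f} hours")
    print(f"Fits the window: {fits_window(size_tb, window_hours)}")
```

Even at the laboratory rate, the hypothetical 10 TB database needs longer than the hypothetical 4-hour window allows, which is the squeeze the questions above describe.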

These are all good questions that are finally getting some attention as storage vendors jockey for position in the burgeoning "Information Lifecycle Management" space. In December, and again in late January, EMC Corporation made some much-covered moves to ally first with Campbell, CA-based OuterBay Technologies and then with Oracle itself, to obtain tools and skills for sorting through the contents of huge databases, ostensibly to migrate older, non-changing data in the DB to second-tier disk platforms.


These new friendships make sense, of course, within the context of EMC's "reference data" philosophy. Says EMC, the world is full of often-accessed but rarely modified data that needs to stay online for reference purposes. But it is not cost-effective to host such data on your most expensive, highest-performance gear. Seems like a sound observation.

EMC is seeking to apply this philosophy to big databases and to develop an enabling strategy that disaster recovery and business continuity planners have been seeking for years. The strategy is simple: confronted by a really big database, might it not be possible to "pre-stage" the lion's share of the DB (the non-changing part) at the recovery center? There, in the event of an interruption, the "pre-staged" data could be loaded from tape to disk in the time it takes the IT guys to travel to the emergency recovery center or hot site. With a viable data segregation and pre-staging methodology, recovery personnel could carry only backups of the changing data components of the DB to the hot site, then load them on top of the already restored non-changing, or reference, DB components. In short order, you would be ready for processing.
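The timing argument behind pre-staging can be sketched in a few lines of Python. This is an illustrative model under stated assumptions, not EMC's method: it assumes a single restore rate, that the reference-data restore starts at the hot site the moment the interruption is declared, and that the changing data can only be loaded once staff have arrived and the reference restore has finished. The `RecoveryPlan` class, its fields and all the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RecoveryPlan:
    db_size_tb: float           # total database size (hypothetical)
    reference_fraction: float   # assumed share of the DB that never changes
    restore_tb_per_hour: float  # assumed tape-to-disk restore rate
    travel_hours: float         # assumed staff travel time to the hot site


def full_restore_hours(plan: RecoveryPlan) -> float:
    """No pre-staging: staff carry all tapes, then restore everything."""
    return plan.travel_hours + plan.db_size_tb / plan.restore_tb_per_hour


def prestaged_restore_hours(plan: RecoveryPlan) -> float:
    """Pre-staging: reference data restores while staff are in transit."""
    reference_tb = plan.db_size_tb * plan.reference_fraction
    changing_tb = plan.db_size_tb - reference_tb
    reference_done = reference_tb / plan.restore_tb_per_hour
    # The changing data is loaded on top of the reference data, so it must
    # wait for both the staff to arrive and the reference restore to finish.
    start_changing = max(plan.travel_hours, reference_done)
    return start_changing + changing_tb / plan.restore_tb_per_hour


if __name__ == "__main__":
    plan = RecoveryPlan(db_size_tb=10.0, reference_fraction=0.8,
                        restore_tb_per_hour=2.0, travel_hours=4.0)
    print(f"Without pre-staging: {full_restore_hours(plan):.1f} hours")
    print(f"With pre-staging:    {prestaged_restore_hours(plan):.1f} hours")
```

The saving depends entirely on how much of the database really is non-changing and on whether it can be carved out cleanly.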

The scenario has appeal for the preponderance of firms that already have investments in tape technology and for whom the cost of mirroring is too great to justify. Plus, to the delight of StorageTek, Quantum, ADIC, Overland, Sony, Breece Hill, Spectra Logic, and many others, it has the additional value of keeping tape library vendors in profit.

The question is whether the enabling technology that EMC and others are exploring to carve "reference data" out of databases is feasible given the diversity and uniqueness of databases in play today. The answer is maybe.

Next week: The root of the problem

For more information:

Tip: Try concurrent Exchange backups

Tip: Get top performance from database storage

Tip: Treat databases the SAME

About the author: Jon William Toigo heads up an international storage consulting group, Toigo Partners International, and has authored hundreds of articles on storage and technology, along with his monthly "Toigo's Take on Storage" expert column and backup/recovery feature. He is a frequent site contributor on the subjects of storage management, disaster recovery and enterprise storage. Toigo has authored a number of storage books, including Disaster recovery planning: Preparing for the unthinkable, 3/e. For detailed information on the nine parts of a full-fledged DR plan, see Jon's web site at
