Published: 17 Nov 2002
When it comes to backing up databases, there are a number of imperfect options. Picking the right one, though, is critical to success
Databases may house an organization's most critical information, but to a backup administrator, they are a major headache. Appearing as monolithic containers, they are often backed up full, increasing backup time and consuming CPU, network and storage resources. Recovery is always an issue as well. No wonder that the quest for improved database backup and recovery capability is at the heart of many storage hardware and software purchase decisions.
Often, people don't fully understand the multitude of options open to them. They may overbuy software from backup and storage vendors that have overlapping or redundant functionality, and end up never using one or the other of their purchases. Or they select an approach that isn't the best fit for their environment. Once you understand your options, you can pick the appropriate solution.
|How database backup modules work|
Backup software and databases can be integrated to perform open database backups through the use of modules-basically agents-that map the database vendor's backup APIs to the backup software vendors' application APIs. This allows backup without freezing the database-the approach, however, may degrade performance in high-volume transaction environments.
The problem with databases
Today's databases are enormous and keep growing. Simple math relates the time available to back up the database to the resources available to perform the backup. The math can be ugly: One company complains that it is now taking more than 20 hours to back up the Oracle database hosting their SAP R3 environment.
Then there's integrity. To the operating system, a database is typically one huge open file. If the file changes during the backup, the backup may be corrupted without a mechanism for managing these changes.
But restorability is the leading cause of nightmares for database administrators (DBAs). Restoring a database is a multistage process, and restoring the files from tape is only one component. Depending on the reason for failure (e.g., disk failure, database corruption, etc.), different actions may need to be taken. Corrupt databases, for example, may require a significant amount of time to simply discover the exact point of corruption. The restoration process can also include a significant amount of time reapplying database logs after the actual file recovery. The total downtime is the sum of all of these activities.
Evaluating your options
Different products address some or all of those issues, often resulting in considerable overlap in functionality. "Database backup options" summarizes the most typical approaches in use today for database backup. Let's look at some of the key decision criteria.
Cold backup. This is the old-fashioned method: You shut down the database and back up the files or volumes using standard backup software. No added software is needed and it's easy to manage. But the database is unavailable during the entire backup process, unacceptable in many environments.
When to use: When you can't use anything else-such as with older database technology where no other backup mechanism is provided. Cold backups are also sometimes used in conjunction with some other approach, on a weekly or monthly basis, assuming the downtime is available.
Logical backup. Using the export utility of your database software, you dump the contents of the database to a file. That gets backed up as part of normal filesystem backup. For smaller databases, this works effectively, requiring no additional software or hardware beyond disk space for the files. But export takes too much time with larger databases. Logical backup also usually requires quiescing the database during export.
When to use: With smaller databases, and in environments where the database outages during backup and restore periods can be tolerated.
Hot backup. Most modern database products include a mechanism to back up a live running database. A database command informs the application that a backup is being performed. Then the database does a checkpoint and all subsequent transactions are maintained in one or more log files. The database containers and the log files form a coherent entity that you can use to do a restore. Utilities such as Oracle's RMAN manage this process.
What such utilities don't do is manage tape volumes or manage schedules. That's why backup software vendors have created database-specific backup modules to provide a conduit between the database vendors' backup API and their own backup software (see "How database backup modules work").
The good news is that the database is always available during backup. Also, after the initial configuration, management of backup and recovery processes become one-step operations, because the backup utility is aware of the physical objects and logs required for a particular data set. The bad news is that it can still be timeconsuming to perform backups and restores.
When to use: In moderate to low transaction rate environments, where the database needs to be accessed throughout the day and the downtime associated with tape recovery is acceptable.
Split mirror/snapshot. The database is placed into hot backup mode, the split command is issued and the database is removed from hot backup mode. This happens in a matter of seconds. The volume copy is mounted and backed up independently of the database. The logs can also be backed up for point-in-time backup.
The database is always available during backup, and is only placed in a degraded mode momentarily. Split mirrors are exportable to other servers, enabling backups independently of the production database environment. If the split mirror is still available, restores can be done by simply reversing the direction of the synchronization process.
However, the added cost of hardware and software can be significant. Also, the initial scripting of the split mirror process and the ensuing change control process can become quite complex. Additionally, if a restore from tape is ever required, it is a multistage process that is often not understood and can be complex.
When to use: Large, high-transaction databases, and in other environments where the fastest recoverability is needed, and where performance impact of backups must be minimized.
A hybrid approach used in some environments is to do hot backups daily, but also to periodically perform a cold or logical backup to add an additional measure of recoverability. You can also employ a variety of strategies using multiple levels of split mirrors or snapshots with hot or cold backups. Incorporating replication technologies with one or more backup methods is also becoming a more common approach to addressing database backup/recovery issues.
As technology evolves, new approaches will emerge to address this vital data management issue. I'll be looking forward to evaluating and reporting on them.