Performing consistent, regular backups of critical business data is a vitally important part of any recovery strategy. When treated as an afterthought or merely as a checkbox item on an annual IT audit, the risks of losing critical data are significantly elevated. For these reasons, it is important to establish a disciplined regimen of data protection defined by a set of clear backup policies that can be closely followed and monitored by IT and business stakeholders alike.
What is a backup policy?
A backup policy is a pre-defined schedule whereby information from business applications such as Oracle, Microsoft SQL Server, email server databases and user files is copied to disk and/or tape to ensure data recoverability in the event of accidental data deletion, corrupted information or a system outage. The policies will typically have a default protection scheme for most of the servers in the environment, with additional policies for certain critical applications or data.
For example, a default backup policy for all application data may be a nightly backup to tape from Monday through Friday whereby one set of tapes is kept on-site to facilitate local recovery, and a second, duplicated set is sent off-site for storage in a secure location. Critical business data may be further protected by a super-set of policies. This might specify that, in addition to nightly
In general, backup policies consist of capturing an initial full backup of data onto disk and/or tape, followed by a series of daily incremental or differential backups.
Regardless of which method is used, at a minimum, two backup copies should be maintained -- one to enable on-site recovery and a second copy for vaulting to a secure off-site facility. That way, if the data center were to be destroyed by a flood, fire or some other disaster, the off-site copy becomes the recovery copy of last resort.
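The two-copy rule above lends itself to a simple automated check. The following is a minimal sketch in Python, assuming a policy records where each copy is kept using the hypothetical labels "on-site" and "off-site"; it is an illustration, not a feature of any particular backup product.

```python
def meets_two_copy_rule(copy_locations):
    """Return True if a policy keeps at least one on-site and one off-site copy.

    copy_locations is an iterable of location labels; the labels
    "on-site" and "off-site" are assumptions made for this sketch.
    """
    locations = set(copy_locations)
    return "on-site" in locations and "off-site" in locations


# A policy with a local copy and a vaulted copy passes the check.
default_policy_ok = meets_two_copy_rule(["on-site", "off-site"])
# A policy that keeps both copies in the data center does not.
local_only_ok = meets_two_copy_rule(["on-site", "on-site"])
```

A check like this could run whenever a policy is created or changed, so that a policy without an off-site copy is flagged before a disaster exposes the gap.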
For the purposes of this article, terms such as "incremental" and "differential" backup are being used generically. Be mindful that some vendors use these terms to describe entirely different backup methodologies.
The most obvious place to start building a policy is with a full backup. A full data backup consists of taking a complete copy of all the data on a particular host or set of hosts. If a data loss event occurs, the more recent the full backup is, the easier it will be to recover information. For this reason, some IT shops will run full backup jobs each night. In some larger environments, however, full backup jobs may take more than 24 hours to complete and will consume a lot of tape resources. Consequently, many data centers run a full backup over the weekend and either incremental or differential backups during the week, both to shorten the nightly backup window and to economize on tape media.
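A weekly-full schedule of this kind can be written down as a small rule. The sketch below is in Python and assumes the full backup runs on Sunday and every other night runs the lighter job; the function name and the choice of Sunday are illustrative assumptions, and a real schedule would live in the backup application's own scheduler.

```python
from datetime import date


def backup_type_for(day: date, weekday_mode: str = "incremental") -> str:
    """Return the job type for a calendar day under a weekly-full policy.

    Assumes the full backup runs on Sunday and all other nights run
    either an incremental or a differential job (weekday_mode).
    """
    if weekday_mode not in ("incremental", "differential"):
        raise ValueError("weekday_mode must be 'incremental' or 'differential'")
    # date.weekday() numbers Monday as 0 and Sunday as 6.
    return "full" if day.weekday() == 6 else weekday_mode


sunday_job = backup_type_for(date(2013, 7, 7))  # 7 July 2013 was a Sunday
monday_job = backup_type_for(date(2013, 7, 8))
```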
Incremental backups only back up the data that has changed since the last backup job. For example, a Monday incremental backup following a Sunday full backup will only back up the data that has changed since the Sunday full backup was completed. Likewise, Tuesday's incremental backup will only back up the data that changed since Monday's incremental was performed. If a full system, tape-based data recovery had to be performed on Thursday, it would require loading the Sunday full backup tape(s), along with all the incrementals from Monday through Wednesday, in order to obtain the most recent version of the information.
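The restore chain in that Thursday scenario can be made concrete with a short sketch. The Python below assumes a Sunday full backup followed by one incremental per night; the day labels and function name are hypothetical and chosen only for illustration.

```python
WEEK = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]


def incremental_restore_chain(last_backup_day):
    """List the backup sets needed to restore up to last_backup_day.

    Assumes a full backup on Sunday and an incremental every other night,
    so every incremental since the full must be applied in order.
    """
    last_index = WEEK.index(last_backup_day)
    chain = ["Sun full"]
    for day in WEEK[1:last_index + 1]:
        chain.append(day + " incremental")
    return chain


# A recovery on Thursday uses backups up to and including Wednesday night.
thursday_recovery = incremental_restore_chain("Wed")
# -> ["Sun full", "Mon incremental", "Tue incremental", "Wed incremental"]
```

The chain grows by one backup set for every night since the full backup, which is why a late-week tape recovery involves the most media handling.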
A best practice is to use separate, unique tapes for each nightly incremental backup job. This ensures some measure of local redundancy if a tape media cartridge happens to be defective or is broken in transport.
Differential backups, on the other hand, will back up all the data that has changed since the last full backup. For example, Wednesday night's differential backup would back up all the data that changed on Monday, Tuesday and Wednesday. In the Thursday recovery scenario described above, the Sunday full backup tape(s) and the Wednesday differential backup tape set(s) are all that would be required to begin recovering the data.
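To illustrate what a differential captures, the Python sketch below tracks which files changed on each day and compares Wednesday's differential with what an incremental would have picked up the same night. The file names and the per-day change sets are invented for the example.

```python
def differential_contents(changes_by_day, day_order, target_day):
    """Return everything changed since the full backup, up to target_day.

    changes_by_day maps a day label to the set of items changed that day;
    day_order lists the days after the full backup in sequence.
    """
    if target_day not in day_order:
        raise ValueError("target_day must appear in day_order")
    captured = set()
    for day in day_order:
        captured |= changes_by_day.get(day, set())
        if day == target_day:
            break
    return captured


# Hypothetical change log for the days following a Sunday full backup.
changes = {
    "Mon": {"orders.db"},
    "Tue": {"mail.edb"},
    "Wed": {"orders.db", "report.xlsx"},
}
weekdays = ["Mon", "Tue", "Wed", "Thu", "Fri"]

wed_differential = differential_contents(changes, weekdays, "Wed")
wed_incremental = changes["Wed"]  # an incremental holds only that day's changes
```

Here the Wednesday differential holds all three files, while the Wednesday incremental holds two. The restore needs only the full backup plus that one differential.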
Pros and cons
As with anything else, there are pros and cons to each approach. Incremental backups can be completed fairly rapidly and consume only a small amount of backup space compared with either full or differential backups. This helps reduce backup windows and cuts down on disk or tape consumption. On the other hand, if it is an all-tape-based recovery, the process becomes more complicated and time-consuming, because more tapes have to be loaded and scanned to process the recovery.
As described above, differential backups remove some of the recovery burdens that can occur when restoring from an incremental backup. However, if the application environment is subject to frequent data change on a daily basis, the backup window could become elongated. In addition, differentials will consume more backup resources, since each differential backup copy moves and stores all the changed data since the prior full backup.
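The difference in resource consumption can be estimated with simple arithmetic. The Python sketch below assumes a worst case in which the same amount of previously unchanged data changes every day, so a differential on night k holds k days of changes; the 50 GB daily change rate and six nightly jobs are illustrative figures, not measurements.

```python
def weekly_backup_volume(daily_change_gb, nights):
    """Estimate data written between full backups for each method.

    Assumes each day changes daily_change_gb of previously unchanged data,
    so differentials grow by that amount every night (worst case).
    """
    # Each incremental stores one day of changes.
    incremental_total = daily_change_gb * nights
    # Differentials store 1, 2, ..., nights days of changes: a triangular sum.
    differential_total = daily_change_gb * nights * (nights + 1) // 2
    return {
        "incremental_gb": incremental_total,
        "differential_gb": differential_total,
    }


estimate = weekly_backup_volume(daily_change_gb=50, nights=6)
# incremental: 6 * 50 = 300 GB; differential: 50 * (1+2+3+4+5+6) = 1050 GB
```

If the same data changes repeatedly during the week, the differentials grow more slowly than this estimate suggests, so the figures are an upper bound under the stated assumption.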
Integrating disk into a backup architecture is an ideal way to remove some of the complexities from the backup-and-restore process, especially if data deduplication (dedupe) is embedded on the backup disk array. In this instance, there is no practical reason not to adopt a weekly full and daily incremental backup policy. For example, restoring data from incremental backups saved on disk completely eliminates the need to swap multiple tape cartridges to process the recovery.
A good backup policy is to back up data to disk first and then move data off to tape as it ages. Purpose-built deduplicating disk appliances are ideal disk backup targets because they can be used with a number of different backup applications. For example, some appliances support traditional backup applications, as well as Oracle Recovery Manager. Most deduplicating backup appliances also support tape-out processes.
Most deduplication appliances can efficiently store more than 30 days of backup data on disk, enabling end users to perform the vast majority of recoveries directly from disk. Data that needs to be stored for long-term compliance purposes can then be gradually moved off to tape.
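The disk-first, tape-later approach amounts to an age-based rule. The following Python sketch assumes a 30-day disk retention period and hypothetical backup dates; actual retention periods depend on appliance capacity and compliance requirements.

```python
from datetime import date, timedelta


def split_by_age(backup_dates, today, disk_retention_days=30):
    """Separate backups that stay on disk from those due to move to tape.

    disk_retention_days is an assumed policy setting; backups older than
    that many days are candidates for tape-out.
    """
    cutoff = today - timedelta(days=disk_retention_days)
    keep_on_disk = [d for d in backup_dates if d >= cutoff]
    move_to_tape = [d for d in backup_dates if d < cutoff]
    return keep_on_disk, move_to_tape


backups = [date(2013, 7, 30), date(2013, 7, 1), date(2013, 6, 15)]
on_disk, to_tape = split_by_age(backups, today=date(2013, 7, 31))
# The June backup is more than 30 days old and is the only one due for tape.
```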
Reliable backup policies
The purpose of backup policies is to ensure that there is a consistent and reliable method for recovering data. Ad hoc backup policies, such as providing a network file share for end users to copy their data to, can be a hit-or-miss (more likely a miss) proposition. Therefore, it is best for IT to take ownership of backing up all data. Otherwise, there is a very high probability that critical business data will be lost at some point and, most likely, IT will be held accountable.
Regularly scheduled backups and well-defined, clearly documented backup policies bring more predictability to the recovery process: backup administrators and their successors will know where to recover the data from and the steps required to restore it.
Regardless of the backup architecture deployed, establishing regimented and clearly defined backup policies is a good first step toward ensuring the consistent protection of business data.
This was first published in July 2013