Most discussions about data backup focus on data files, but when entire systems must be restored from scratch, much more is needed than just the data files. The data required for a complete server recovery includes system configuration, operating system files and any application binaries as well as the data files. A system-level restore of this kind is sometimes called a bare-metal restore (BMR).
Why is this such a problem?
The basic process to restore a system from scratch is to first load the operating system and system configuration information, then the application binaries, and then restore the data files. Doing this manually can take hours, and must be done for each system that's being recovered. For each of these servers, patches may have been applied to the operating system or applications and system configuration information may have been tweaked -- all changes that may need to be redone, provided you even have a record of them, thereby lengthening the restore process. If these changes aren't taken into account, the server you end up with after the rebuild may not accurately reflect the performance, functionality and configuration settings of the server it's meant to replace.
One way to address this problem is to regularly and reliably record all of this metadata so that if a bare-metal restore is required, the information necessary to accurately re-create the original server is available. Manually maintained spreadsheets that list the patches applied and configuration changes made to each server are one way to do this, but they are error-prone and in use in more shops than they should be. Moving to an automated way to collect and retain this information is a significant improvement, but it is complicated by two factors.
First, the information necessary to re-create the baseline server configuration is often not visible to file-based backup products, so they can't copy it and package it into a usable format. Second, because a different technology must be used to collect the pertinent information, it can't be stored in the backup catalog and must be kept in a separate database. Because of this, most server-level restores today require at least a two-step process: one step to lay down the operating system and system configuration information, and a second to restore the application binaries and data files.
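As a rough illustration of what automated collection could look like, the following Python sketch records a small configuration manifest for one server. It is a minimal example under stated assumptions, not how any particular BMR product works: the function names, the manifest layout and the idea of passing in a list of patch identifiers are all hypothetical, and gathering the patch list itself is platform-specific and left to the caller.

```python
import hashlib
import json
import platform
import socket
from datetime import datetime, timezone
from pathlib import Path


def file_sha256(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def collect_manifest(config_paths, patch_ids=()):
    """Build a dictionary describing one server's baseline configuration.

    config_paths: files that define the configuration (hypothetical list).
    patch_ids: patch identifiers supplied by the caller (assumption).
    """
    files = {}
    for raw in config_paths:
        path = Path(raw)
        # Record missing files explicitly instead of skipping them silently.
        files[str(path)] = file_sha256(path) if path.is_file() else None
    return {
        "hostname": socket.gethostname(),
        "os": platform.system(),
        "os_release": platform.release(),
        "collected_at": datetime.now(timezone.utc).isoformat(),
        "patches": sorted(patch_ids),
        "config_files": files,
    }


def save_manifest(manifest, destination):
    """Write the manifest as JSON so it can be kept alongside the backups."""
    Path(destination).write_text(json.dumps(manifest, indent=2, sort_keys=True))
```

Run on a schedule, a script like this replaces the hand-maintained spreadsheet with a dated record per server. It does not remove the second complication described above: the manifest still lives outside the backup catalog.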
Key requirements for bare-metal restore solutions
First and foremost, a BMR solution should provide a way to quickly, easily and accurately restore a given server image to another server. Second, it should provide options for restoring a server image to another server that doesn't match the original server's hardware configuration in every respect, giving end users cost-saving flexibility in performing restores to dissimilar hardware. Third, it should provide an automated way to collect the necessary information on a regular basis without impacting application availability on any production servers.
Because it's rare that any company has only one operating system environment, it would be nice if a single tool supported heterogeneous operating systems. This is not listed as a key requirement, however, because a BMR can be performed, and in fact has historically been performed, with platform-specific tools, but the availability of a cross-platform tool does make BMR administration potentially easier.
Bare-metal restore approaches in use today
Available BMR tools range from operating system utilities that capture the information needed to create a bootable image to automated tools that offer some level of integration with backup software. Most operating systems ship with a special utility that facilitates the creation of bootable images; e.g., mksysb for AIX, Ignite-UX for HP-UX, Kickstart for Linux, JumpStart for Solaris and Windows System Recovery for Windows. Symantec Corp.'s Ghost is another popular tool in Windows environments, although it must be purchased separately.
Most file-based backup software products offer some sort of BMR capability that allows for the creation of bootable images, although it is generally only minimally integrated with the file-based backup product itself and may have to be licensed separately. These products can offer a number of useful features, including greater automation, recovery to dissimilar hardware platforms, cross-platform support and recovery server pooling. Like the operating system utilities, they require a process separate from file-based backup to create bootable images, but in general they offer the highest levels of BMR functionality. Representative vendors with these offerings include BakBone Software Inc., CA, CommVault, EMC Corp., IBM Corp. and Symantec, among others.
Image-based backup products address the "integration" issue by offering the option to capture all data, whether it is operating system, system configuration or file-based, at the block level. The advantage of this approach is that all of this information can be collected in a single pass, providing an updated BMR capability with each backup. Products in this class, such as those from Acronis Inc., StorageCraft Technology Corp. and UltraBac Software, can perform these backups online, perform dissimilar hardware restores, enable file-level restores from their image-based backups, recover servers remotely across WANs or LANs, allow backup images to be saved to a variety of different media, and often support encryption. Most offerings in this space support Windows only and are targeted for use by small and medium-sized businesses (SMBs), although Acronis also supports Linux. These products can generally also be used in conjunction with other file-based backup products just to provide the BMR capability.
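To make the block-level idea concrete, here is a minimal Python sketch that copies a source to an image file in fixed-size blocks and computes a checksum in the same pass. It is only an illustration: `capture_image` and the 4 KB block size are hypothetical choices, an ordinary file stands in for a disk volume, and the sketch makes no attempt to obtain a consistent point-in-time view of a live system, which a real online backup would need.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative block size (assumption, not a product setting)


def capture_image(source_path, image_path, block_size=BLOCK_SIZE):
    """Copy source to image block by block; return (block_count, sha256 hex).

    The copy never interprets file names or directory structure. It only
    moves raw blocks, so everything stored on the source is captured in
    a single pass.
    """
    digest = hashlib.sha256()
    blocks = 0
    with open(source_path, "rb") as source, open(image_path, "wb") as image:
        while True:
            block = source.read(block_size)
            if not block:
                break
            image.write(block)
            digest.update(block)
            blocks += 1
    return blocks, digest.hexdigest()
```

Because the loop treats the source as an opaque run of blocks, operating system files, configuration and data all end up in the same image, which is the property that lets these products refresh the BMR image as part of every backup.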
The utility and product options discussed so far all help companies maintain current BMR images for restore purposes, on the assumption that server configurations change over time. But another approach in use by large enterprises -- such as the New York Stock Exchange (NYSE) -- is to define "golden images," bootable images that have been thoroughly tested and represent the corporate standard for a particular system type, and commit only to providing system-level restore capabilities built around those images. This form of strict change management encourages compliance because any "rogue" changes made to operating system or system configuration information on individual servers will be lost on recovery.
Rich Llewellyn, a system administrator at NYSE, explains: "For our production environments, we use third-party validation software to validate system configurations automatically on a regular basis to ensure that no unauthorized changes are being made to any of our servers. Any system-level recoveries are done using the golden image."
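The validation step described in that quote can be pictured as a comparison between a server's current state and the golden image's expected state. The Python sketch below is a simplified illustration of that idea, not a description of the software NYSE uses: it assumes each state has already been reduced to a mapping of file path to checksum, and the names `find_drift` and `is_compliant` are hypothetical.

```python
def find_drift(golden, current):
    """Compare two {path: checksum} mappings and report the differences.

    Returns a dict of three sorted lists: files whose contents changed,
    files missing from the server, and files on the server that are not
    part of the golden image.
    """
    changed = sorted(p for p in golden if p in current and golden[p] != current[p])
    missing = sorted(p for p in golden if p not in current)
    unexpected = sorted(p for p in current if p not in golden)
    return {"changed": changed, "missing": missing, "unexpected": unexpected}


def is_compliant(golden, current):
    """A server is compliant only if it matches the golden image exactly."""
    return not any(find_drift(golden, current).values())
```

For example, if the golden mapping is `{"/etc/app.conf": "aaa"}` and a server reports `{"/etc/app.conf": "bbb"}`, `find_drift` lists `/etc/app.conf` under `changed` and `is_compliant` returns `False`, flagging exactly the kind of unauthorized change that would be lost on recovery.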
While BMR software products are in use in almost every enterprise, they tend to be deployed only for certain servers. Despite the fact that they generally offer less functionality, operating system utilities are much more widely used to create and maintain bootable images. Enterprises' mission-critical servers are often clustered for high availability, so being able to rapidly perform a system-level recovery isn't as critical as one might expect for key server resources. Still, these products do make it easier to create bootable images and perform system-level restores, and as they become more integrated with enterprise backup software products we may see increased adoption in the industry.
About this author: Eric Burgener is a senior analyst with The Taneja Group. His areas of focus include data protection, disaster recovery, storage capacity optimization and archiving.
This was first published in September 2008