Published: 10 Mar 2003
The millions of dollars thrown at backup solutions haven't cured the main problems of slow or failed restores, shrinking backup windows and backups running too slow or too long. Backup challenges are growing daily. New data retention, archiving and vaulting requirements go well beyond the midnight start time of backups some organizations are accustomed to.
|Analyzing your environment|
To start identifying the right backup software product for your organization, implement a two-step process:
First, take a business focus. Start by identifying the requirements of your applications. For instance, a midtier organization with 500 end users and 30 to 40 servers may conclude a midtier backup software product would meet its needs. But if the mix of applications includes an online transaction processing (OLTP) application with an Oracle database backend that has no backup window, that midtier backup software solution by itself may or may not have the right features to accommodate that requirement. Conversely, a midtier environment that does primarily file services and observes business hours may find a low-end, low-cost backup software solution a perfect fit.
Second, inventory your current environment. Inventorying the existing backup software and hardware components comprises the other half of the equation to get a handle on the problem.
As challenging as this discovery process may seem for midtier and small groups, this process gets much more complicated in enterprises. These organizations frequently find themselves mired in an environment with competing backup solutions, no central backup administrator and management hoping the problem will cure itself without their intervention. Unfortunately, trying to gain consensus at the grass-roots level rarely works in these environments. Usually only decisive, informed recommendations coupled with a mandate at the executive level can straighten the backup software situation out.
Software vendors are well aware of this problem and backup software solutions exist for all organizational levels (see "Sizing up the software"). The major players in the backup software space (Computer Associates, EMC, Hewlett-Packard, IBM, Legato Systems Inc., and Veritas) each bring different philosophies and techniques to the market on how to tackle backups. In addition, hardware vendors such as Hitachi Data Systems (HDS) and Network Appliance have products that complement or enhance the offerings from the major backup vendors.
Some of the major differentiators between the players include how each company manages data and how its backup software performs backups. The major question for anyone choosing a solution, of course, is how to pick the product that best fits the existing storage infrastructure (see "Analyzing your environment," this page). What follows is a short rundown of the major backup programs, along with a detailed matrix that lists the pros and cons of product features (see "Major product features").
IBM Tivoli Storage Manager
IBM's Tivoli Storage Manager (TSM) provides a backup software solution that most completely addresses multiple OS environments. TSM runs on the mainframe and open systems platforms, and its ability to manage different OS platforms through a single product will appeal to companies wishing to consolidate all of their backup software products.
The design of TSM differs from the other major products on the market. When TSM does an initial backup of a server, it does a full backup of the data. Once that initial full backup completes, TSM does what Patricia Jiang, technical attaché for Tivoli Software, calls progressive backups. Progressive backup is another way of saying: "First do a full backup, then incrementals forever." TSM tracks all new and changed data from the initial full backup and backs up only the changes to either disk or tape.
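The "full once, then incrementals forever" idea can be sketched in a few lines. This is a toy model, not TSM's implementation: the `ProgressiveBackup` class, its `catalog` field and the integer modification stamps are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ProgressiveBackup:
    """Toy model of 'first do a full backup, then incrementals forever'."""

    # Hypothetical catalog: file path -> modification stamp at its last backup.
    catalog: dict = field(default_factory=dict)

    def run(self, current_files: dict) -> list:
        """Return the paths that must be copied in this backup run."""
        changed = sorted(
            path
            for path, stamp in current_files.items()
            if self.catalog.get(path) != stamp
        )
        # Record the new state so the next run only sees later changes.
        self.catalog.update({path: current_files[path] for path in changed})
        return changed


backup = ProgressiveBackup()
# The catalog starts empty, so the first run behaves like a full backup.
first = backup.run({"a.txt": 1, "b.txt": 1})
# Later runs pick up only new or changed files.
second = backup.run({"a.txt": 1, "b.txt": 2, "c.txt": 2})
print(first)
print(second)
```

Because the catalog starts empty, the first run copies everything. Every later run copies only what is new or changed, which is why no second full backup is ever scheduled in this model.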
Jiang contends TSM provides two distinct advantages for end users: it uses substantially fewer tape resources, because TSM reclaims disk or tape space by deleting data when it expires, and it reuses the reclaimed space for new data. Despite these advantages, TSM requires a solid understanding of how the software works as well as a good grasp of the environment into which it will be deployed. Restores can be gruesome in TSM, but they can be gruesome with any backup software product.
The performance of any restore--TSM or otherwise--will largely hinge on the age and size of the file, the size of the tape library and even the number of tape drives in the tape library. All of these factors contribute to the time of the restore. But for enterprises that can implement and deploy it properly, TSM should generate significant savings.
|Sizing up the software|
Almost without exception, every major backup software vendor refers to its featured product as enterprise class backup software and refers to other products that complement it as midtier or entry-level backup software. Veritas provides an excellent example of this. For what it classifies as its enterprise class, it offers its NetBackup product. In the midtier market, it offers Backup Exec, and for the user running software on an individual workstation, it provides a link to Sonic Solutions' Backup MyPC product.
Yet the definition and scope of each of these products vary from company to company. Computer Associates' David Liff says these different definitions of tiers of backup software are largely irrelevant to CA. The only definition of enterprise that matters to CA in the field is the one the customer uses--whether that means one server or 10,000 servers--and that's the one CA uses for that account. IBM takes yet a different view, saying that Tivoli Storage Manager can meet the needs of any size organization.
Unfortunately, these definitions do more to obfuscate the terms than clarify them. While no clear or industry standard definitions for enterprise, midtier or entry level exist, here are some general guidelines:
Entry Level: Will generally only back up the workstation on which it resides. Usually intended for the individual Windows or Linux desktop. Sample products include Sonic Solutions' Backup MyPC and Peer Software's PeerSync.
Midtier: Characteristics of this class include the ability to do backups of multiple servers of varying OS platforms supporting different vendors' databases. This tier usually provides a central management console permitting administrators to connect to the individual servers to configure the backup software. Products in this tier generally feature the ability to back up Unix and Windows platforms as well as providing agents to do hot backups of the databases residing on the individual servers. Products that typify this class include Veritas' Backup Exec and CA's ARCserve.
Enterprise: Products in this tier generally support the features of the other two tiers and add a variety of advanced functions that may vary significantly from vendor to vendor. Advanced functions include vendor-dependent or independent snapshots, advanced tape media management functionality, the ability to integrate with SAN or NAS storage devices and the ability to do policy-based storage management. Products that typify this category include Tivoli Storage Manager, Veritas NetBackup, and CA's BrightStor Enterprise Backup.
Veritas NetBackup
According to Jerry Hoetger, Veritas' senior manager of product marketing, NetBackup differentiates itself from its competitors in four important ways. First, NetBackup can stream data from multiple backup jobs to a single tape drive, or in the case of large backup jobs from a single server, spread the backup job over a number of tape drives, increasing the speed of the backup.
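The two streaming modes described here, several jobs sharing one drive and one large job spread over several drives, can be illustrated with a small round-robin sketch. It is only a conceptual model under simple assumptions. The function names `multiplex` and `stripe` and the block labels are hypothetical, and it does not claim to reflect how NetBackup actually schedules data.

```python
from itertools import zip_longest


def multiplex(*streams):
    """Interleave blocks from several backup jobs onto one tape, round robin."""
    tape = []
    for blocks in zip_longest(*streams):
        # Jobs that have already finished contribute None; skip those slots.
        tape.extend(block for block in blocks if block is not None)
    return tape


def stripe(blocks, drive_count):
    """Spread one large job's blocks across several drives, round robin."""
    return [blocks[i::drive_count] for i in range(drive_count)]


# Three small jobs feeding a single drive.
tape = multiplex(["A1", "A2", "A3"], ["B1"], ["C1", "C2"])
# One large job split over two drives.
drives = stripe([0, 1, 2, 3, 4, 5], 2)
print(tape)
print(drives)
```

Interleaving keeps a single drive busy when no one client can supply data fast enough, while striping shortens a large job by letting several drives write at once.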
Second, NetBackup delivers a three-tier architecture. The first tier is called Master Server; it acts as the operations center for the product and schedules and tracks client backup operations. The second tier permits an organization with large databases to back them up on the server where they reside, while also enabling them to back up other client systems on the network. The third tier is the client agents that back up servers and workstations.
Third, NetBackup differentiates itself by offering options to utilize the latest storage technologies. In the storage area network (SAN) space, for example, it offers a shared storage option (SSO), which keeps backup traffic on the SAN and reduces the backup traffic that normally would be introduced into the LAN environment. In the network-attached storage (NAS) space, it offers a network data management protocol (NDMP) option that controls backup and recovery functions for NAS systems supporting NDMP.
NetBackup 4.5--the latest edition--offers a fourth differentiator: Global Data Manager. This provides a GUI that shows a single view of the entire NetBackup backup and recovery environment, provides real-time reports and lets an administrator drill down to a specific location anywhere in the world.
CA's BrightStor Enterprise Backup
CA's BrightStor Enterprise Backup also offers key features such as data staging. Though some other vendors offer this feature, data staging is becoming increasingly important because it gives administrators the option to move data to a secondary location before moving it off to tape. Unlike NetBackup's similar option, this feature works independently of a Unix or Veritas file system. Data copies can be scheduled to minimize server and application performance hits while increasing the availability of data before offloading it to tape.
In SAN environments that include Windows and Unix platforms, BrightStor Enterprise Backup enables cross-platform device and media sharing. This allows administrators to store both Unix and Windows data on the same tape media. It also allows the sharing of tape libraries and other SAN storage devices without the need to dedicate these devices to one OS.
Nipping at the heels of these three vendors with their backup software products are new and existing players such as CommVault, Innovation Data Processing (IDP), and Legato. CommVault's Galaxy 4.1 recently grabbed a Gold award in Storage magazine's "Best Storage Products of 2002" with two cutting-edge features. One was its ability to present all backup media in a logical view, as opposed to the physical device view that many of its competitors offer. The other was how it backed up to disk. Unlike many of its competitors that back up data sequentially, and consequently run slower when backing up to disk, it backs up data to disk randomly, thereby capitalizing on the inherent strength of disk.
While a disk copes well with data that is randomly scattered across it, tape does not: it performs far better when the data is laid out sequentially. For example, take the following two number sequences: 2 4 1 7 9 3 6 5 8 and 1 2 3 4 5 6 7 8 9. A disk would read the first set of numbers faster than tape would.
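The two number sequences can be turned into a rough worked example. The sketch below assumes a simplified tape model in which the cost of reaching a block is the distance wound from the previous position, starting at position 0. The function name `tape_travel` and the cost model are illustrative assumptions, not measurements of any real drive.

```python
def tape_travel(order, start=0):
    """Total distance a tape must wind to visit blocks in the given order."""
    travel, position = 0, start
    for block in order:
        travel += abs(block - position)
        position = block
    return travel


scattered = [2, 4, 1, 7, 9, 3, 6, 5, 8]
sequential = [1, 2, 3, 4, 5, 6, 7, 8, 9]

print(tape_travel(scattered))   # 28 positions wound back and forth
print(tape_travel(sequential))  # 9 positions, always moving forward
```

Under this model the scattered order makes the tape travel roughly three times as far as the sequential one. A disk pays a much smaller penalty for jumping between positions, which is the point of the comparison.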
IDP offers the FDR/Upstream product. It's one of the few other products on the market that offers organizations the ability to manage their open systems and mainframe backups with a single product. It includes extensive support for Windows, Novell, Unix and Linux platforms and it may even be configured to use ESCON and FICON connected network channel cards that provide a mechanism for connecting open systems to IBM mainframes.
Legato also offers a variety of other options in addition to its core Networker product to ensure uptime and availability. Its add-on Octopus product provides real-time data replication for Windows servers without the use of specialized or proprietary hardware. Its SnapImage module provides the ability for Sun Solaris and HP-UX operating systems to use NDMP to perform block-level image backups for large file servers.
|Major Product Features|
Any effective medicine comes with its side effects, and backup software is no exception. Large organizations usually own a smattering of backup software products: a little Veritas NetBackup, some Tivoli Storage Manager and a dose of Legato Networker. As a result, only the most basic backup and restore features get utilized. Because it takes a long time to get to know the ins and outs of complicated software, the more advanced features and the corresponding savings they offer end up being overlooked or underutilized by the administrators responsible for managing this array of products.
David Liff, VP of storage solutions at CA, points out that the quality of data varies greatly from platform to platform. CA found that 85% of the data in the mainframe environment and 65% in the Unix environment meet its definition of quality data. Yet in the Windows environment, only 10% to 15% of data meet this definition. While obviously these percentages won't hold true in all environments, they highlight an important point: Not all data is created equal and the degree of effort and money spent on backing up these different qualities of data should not be viewed the same either. To address these data management challenges, new tools and techniques are appearing on the scene to meet users' needs in these environments.
CA took a stab at making its products easier to use by borrowing an idea popularized by Microsoft, the time-tested wizard. CA includes wizards in its Enterprise Backup product that let administrators set up and schedule backups and restores, check job status and perform device management operations easily and quickly.
Other software companies are making products that help manage a mixed environment of software products. For example, Fujitsu Softek's Storage Manager provides a single policy-based interface permitting the administrator to define the backup needs for each application or server. Once defined, Storage Manager generates the scripts to start the backup process, regardless of the backup software product used.
These scripts enable functions ranging from setting up a simple midnight start time backup to more complex operations such as pausing a database, taking a volume level snapshot, resyncing the database and then starting the backup. This software alleviates one of the difficulties existing in backup environments today: the requirement for administrators to write and test specialized scripts to support these different backup software functions.
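The more complex sequence described above can be sketched as follows. This is not output from Storage Manager: the helper names `paused` and `snapshot_backup`, the database and volume names, and the use of a plain list as a log are all hypothetical, and "resume" stands in for the resync step.

```python
from contextlib import contextmanager

# Records each step so the ordering can be inspected afterwards.
log = []


@contextmanager
def paused(database):
    """Pause the database and always resume it, even if the snapshot fails."""
    log.append(f"pause {database}")
    try:
        yield
    finally:
        log.append(f"resume {database}")


def snapshot_backup(database, volume):
    """Pause, snapshot, resume, then back up from the snapshot."""
    with paused(database):
        # The database stays paused only for the duration of the snapshot.
        log.append(f"snapshot {volume}")
    # The backup reads the snapshot, so the live database is already running.
    log.append(f"backup {volume}-snapshot")


snapshot_backup("orders_db", "vol01")
print("\n".join(log))
```

The ordering is the part administrators have traditionally had to script and test by hand. The database is held only while the snapshot is taken, and the long-running backup works from the snapshot copy.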
Currently, Storage Manager interacts with Legato's Networker--which it OEMs--but its next release, due in mid-2003, is scheduled to provide this functionality for Veritas' NetBackup, Tivoli's Storage Manager and CA's BrightStor Enterprise Backup from one interface.
|Why the pain?|
Companies today are struggling to manage data associated with a slew of new applications that in many cases didn't exist even five years ago. These applications range from 24x7 online transaction processing (OLTP) applications to spreadsheets to the growing use of imaging. This diverse bag of applications creates a complex environment that requires at least a general understanding of each application to match it with the right backup solution.
Going hand in hand with backup comes the necessity to restore data in the event of error or data loss. The requirements of restores vary as widely as those of backups in that they may range from recovering the data from a snapshot to a best-faith effort to recover the data from tapes archived off site.
Backup reporting and performance monitoring remain an often neglected component in backup software administration. This stems from a lack of tools, time and information about the backup environment. While companies may assume their administrators monitor backups closely, reality paints a much different picture.
Today's quick fixes
To address the growing need to eliminate backup and restore windows, technologies such as windowless backup and instant restores have arisen. Hardware and software storage vendors are cooperating to enable this functionality. Vendors such as EMC, HP, HDS, and IBM now provide the APIs of their storage arrays to vendors such as CA, Veritas, and Legato, which lets these products snap the data to the same vendor's disk arrays and restore it again from the snapshot.
Backup software vendors use these hardware vendors' APIs and automate the functionality through their management consoles. This eliminates the need for the end user to learn how to use and program these specialized functions into the software. Learning the ins and outs of a major backup product is a time-consuming task.
In the area of reporting on backup failures and successes, startups like Bocada have answered the call. They provide a central interface to monitor the success and failure of backups of multiple vendors' products without putting an agent on every device on the network (see "Bocada clarifies backup picture").
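The kind of cross-product view described here can be approximated with a small aggregation sketch. The job records below are made-up sample data, and the helper names `summarize` and `failed_hosts` are hypothetical. The sketch says nothing about how Bocada's product collects or stores its information.

```python
from collections import Counter

# Hypothetical job records: (backup product, host, status).
jobs = [
    ("NetBackup", "mail01", "success"),
    ("NetBackup", "db01", "failed"),
    ("TSM", "file01", "success"),
    ("Networker", "web01", "failed"),
    ("TSM", "file02", "success"),
]


def summarize(records):
    """Count job outcomes per backup product."""
    summary = {}
    for product, _host, status in records:
        summary.setdefault(product, Counter())[status] += 1
    return summary


def failed_hosts(records):
    """List hosts whose most recent job did not succeed, across all products."""
    return sorted(host for _product, host, status in records if status != "success")


summary = summarize(jobs)
failed = failed_hosts(jobs)
for product, counts in summary.items():
    print(product, dict(counts))
print("needs attention:", failed)
```

Even a summary this simple answers the question administrators often cannot: which hosts went unprotected last night, regardless of which product was supposed to back them up.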
Another challenge administrators face is documenting and understanding the environment they need to manage. Here's where more advanced SAN management tools such as AppIQ's Manager, EMC's Control Center, or Veritas' SANPoint Control can help users see what data lies where in the SAN as well as visualize and identify problems within the SAN environment that the backup traffic traverses.
These products help pinpoint problem areas in LAN and SAN environments, such as data traffic contention and performance issues. They gather and analyze data from the disparate storage devices, operating systems, applications and switches or directors, and report their findings in easy-to-read formats.