Managing and protecting all enterprise data


Big three apps adjust to disk-based backup

EMC's NetWorker, IBM's Tivoli Storage Manager and Symantec's Veritas NetBackup are still the leaders for enterprise backup. But as more and more shops back up to disk, the big three have had to adapt to the new requirements of disk-based backup.

With disk playing a bigger role in backup, the three major enterprise backup programs--EMC's NetWorker, IBM's Tivoli Storage Manager and Symantec's Veritas NetBackup--are undergoing radical changes.

As disk rapidly becomes the preferred initial backup target, vendors of the three big backup programs--EMC Corp.'s NetWorker, IBM Corp.'s Tivoli Storage Manager (TSM) and Symantec Corp.'s Veritas NetBackup--are scrambling to enhance and change the focus of their programs. Never before has a shift of such titanic proportions affected the product development of these three dominant players, which until now have been slow to change.

Of course, the most widely used backup software products have always provided some disk support, but vendors recognize the need for significant product upgrades to take advantage of disk's lower costs and unique restore capabilities (see "Product roadmaps"). EMC's forthcoming NetWorker PowerSnap RecoverPoint module enables central management of EMC's continuous data protection (CDP) product; IBM's TSM advanced copy services for Exchange allows users to tap into Microsoft's Volume Shadow Copy Service (VSS) in Microsoft Exchange environments; while NetBackup's new PureDisk technology adds single-instance storage for remote-office protection (see "Noteworthy new features").

There's no question there will be some major bumps in the road for users as the movement from tape to disk accelerates. And they'll have reason to be wary. Some Symantec NetBackup users have been reluctant to upgrade to Version 6.0 because of the major code revisions. EMC's acquisition of Legato led it to provide more snapshot integration with EMC's storage product lines, but left existing NetWorker users with heterogeneous storage environments out in the cold. And IBM is showing little evidence it will support other vendors' disk storage products.

For example, Steve Shim, director of technical services at Health First, Rockledge, FL, was forced to look beyond IBM's TSM because he found its 24-hour recovery time unacceptable.

And it took three weeks for Arun Sondhi, the storage management group lead at a Milwaukee manufacturer, to integrate NetWorker with Sun Microsystems Inc.'s StorageTek ACSLS Manager. Backing up his servers behind a corporate firewall required the purchase of another NetWorker server because opening 25,000 TCP/IP ports on the firewall posed a huge security risk to the organization. His "reliance on backup software has become so big that if it sneezes, the CIO hears it," says Sondhi.

In response to user requests for new features, EMC, IBM and Symantec are:

  • Increasing their support for faster backups and recoveries
  • Enhancing support for encrypted data
  • Offering better ways to protect remote offices

Disk-based backup
Backing up to disk dramatically shortens backup times, usually by 30% to 50% or more. Aaron Mathes, chief operations officer for information services at Liberty University in Lynchburg, VA, finds that disk provides him with a higher degree of confidence that his backups are completing successfully. "Disk has had an exponential impact," he says, adding that he backs up 90% of Liberty University's 4TB of data to a disk cache.

The EMC, IBM and Symantec products all include the ability to manage a disk cache, a disk volume where data is initially parked before being moved to tape (it's an optional feature for NetWorker). Disk caches can be shared drives on an Ethernet network or a Fibre Channel (FC) SAN drive owned by the backup server. When using a disk cache it's important to:

  • Select volumes large enough for each server to keep a week's to a month's worth of backups online to expedite recoveries. NetWorker 7.3 has a wizard that assesses the size of each server's backup and required retention period, and helps administrators select a disk volume large enough to meet those requirements.

  • Determine what copies of the disk backup need to be written to tape and when. For instance, if a server does incremental daily backups and full weekly backups to the disk cache, you may opt to copy only the full weekly backups to tape to conserve tape and schedule the copies during low-traffic backup times to lessen their impact.

  • Use a watermark, which deletes or transfers backups from the disk cache when a certain level is reached, for example, 90% of disk capacity.
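The sizing and watermark guidelines above can be sketched in a few lines of Python. This is only an illustration, not how NetWorker, TSM or NetBackup implement their disk caches: the `Backup` record, both function names, the rough sizing formula and the 70% low watermark are all assumptions made for the example. Only the 90% high watermark comes from the text.

```python
from dataclasses import dataclass


@dataclass
class Backup:
    name: str
    size_gb: float
    age_days: int
    on_tape: bool  # True once a copy has been written to tape


def required_cache_gb(full_gb, daily_incr_gb, retention_days):
    """Rough upper bound (assumed formula): one full backup per week
    plus one incremental per day for the whole retention period."""
    weeks = -(-retention_days // 7)  # ceiling division
    return weeks * full_gb + retention_days * daily_incr_gb


def apply_high_watermark(backups, capacity_gb, high=0.90, low=0.70):
    """When usage exceeds `high`, evict the oldest backups that already
    have a tape copy until usage falls to `low` (hypothetical value)
    or no eligible backup remains. Returns (kept, evicted)."""
    used = sum(b.size_gb for b in backups)
    if used <= high * capacity_gb:
        return list(backups), []
    kept, evicted = [], []
    # Walk from oldest to newest so the most recent backups stay on disk.
    for b in sorted(backups, key=lambda b: b.age_days, reverse=True):
        if used > low * capacity_gb and b.on_tape:
            evicted.append(b)
            used -= b.size_gb
        else:
            kept.append(b)
    return kept, evicted
```

Refusing to evict a backup that has no tape copy is a design choice in this sketch. A real product may instead transfer the backup to tape first, as the watermark description allows.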

Vendors let users manage disk cache threshold levels in three basic ways--high watermarks, time based and manual--although not every program offers all three.

Another major way traditional backup programs are changing is in their support for virtual tape libraries (VTLs). A VTL presents its disks as virtual tapes, and its disk array FC or iSCSI ports as virtual SAIT, SDLT or LTO tape drive images to the backup software. VTLs are easier to implement because backup software treats them like physical tape libraries--it detects and manages virtual tapes and tape ports the same way it does with physical tape libraries. But users may pay extra for VTLs; they cost more on a per-megabyte basis than similarly configured generic disk arrays, and EMC and Symantec charge an additional software license fee to manage VTLs.

Backup software vendors license their VTL software by virtual tape ports or VTL capacity. Symantec initially licensed NetBackup for VTLs the same way it did for a tape drive--by each virtual tape drive. Because a company may use tens--if not hundreds--of virtual tape drives, this licensing approach quickly becomes cost prohibitive. For instance, using Symantec's old model, licensing for 12 virtual tape drives on a 20TB VTL cost $60,000. Symantec now offers NetBackup licensing based on total VTL storage capacity. With the new model, the licensing cost for the 20TB VTL is only $20,000 ($1,000 per VTL terabyte); however, users still need to monitor VTL capacity growth to control future costs.
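The arithmetic behind the two licensing models is easy to check. The short Python sketch below uses only the figures quoted above. The $5,000 per-drive price is not stated in the article but is implied by $60,000 for 12 drives, and the break-even calculation assumes the $1,000-per-terabyte price stays linear as capacity grows, which may not hold for real price lists.

```python
def per_drive_cost(drives, price_per_drive):
    """Old model: one license per virtual tape drive."""
    return drives * price_per_drive


def per_capacity_cost(capacity_tb, price_per_tb):
    """New model: license by total VTL capacity."""
    return capacity_tb * price_per_tb


PRICE_PER_DRIVE = 60_000 / 12  # implied by the article's example
PRICE_PER_TB = 1_000           # quoted in the article

old_cost = per_drive_cost(12, PRICE_PER_DRIVE)
new_cost = per_capacity_cost(20, PRICE_PER_TB)

# Capacity at which the new model would cost as much as the old one
# for the same 12 drives (assumes linear per-terabyte pricing).
breakeven_tb = old_cost / PRICE_PER_TB

print(f"old: ${old_cost:,.0f}  new: ${new_cost:,.0f}  "
      f"break-even at {breakeven_tb:.0f}TB")
```

Under these assumptions the capacity model stays cheaper until the VTL triples in size, which is why capacity growth is the number to watch.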

In addition to having to adjust to different licensing costs, some VTL users are finding that their backup bottlenecks are moving from the tape device to the server. As the speed of backups increases, the VTL puts more demands on the server to ship data to it faster. Ian McLeavy, manager of global enterprise storage at Black & Decker in Baltimore, implemented five EMC VTLs that lowered his NetWorker backup times from 11 hours to seven hours. But because of the increased CPU activity, says McLeavy, "the backup bottleneck has moved to the server."
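A quick calculation shows why the server feels the change. The Python sketch below uses only the 11-hour and seven-hour figures from McLeavy's example and assumes the amount of data backed up stayed the same. Both function names are made up for illustration.

```python
def percent_reduction(before_hours, after_hours):
    """Percentage by which the backup window shrank."""
    return (before_hours - after_hours) / before_hours * 100


def throughput_multiplier(before_hours, after_hours):
    """How much faster the server must ship the same amount of data."""
    return before_hours / after_hours


reduction = percent_reduction(11, 7)
multiplier = throughput_multiplier(11, 7)
print(f"window shrinks by {reduction:.1f}%, "
      f"server must push data {multiplier:.2f}x as fast")
```

A window that shrinks by roughly a third means the backup server has to sustain a data rate more than one and a half times as high, which fits with the bottleneck moving to the server.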

Product roadmaps
EMC Corp. NetWorker: NetWorker 7.4, to be released within the next year, will include the following new features:
  • NetWorker will manage all TCP/IP ports opened through a firewall. These ports will be locked and controlled by NetWorker to prevent other programs from using these ports and compromising the firewall.
  • The PowerSnap module for the EMC RecoverPoint continuous data protection product will be released in April.
  • Over the next year or two, all of the different NetWorker modules will be able to be managed from a central console.
Symantec Corp. Veritas NetBackup: The NetBackup 6.5 release is planned for the second half of this year. Some features Symantec plans to include or enhance are:
  • Enhanced support for virtual tape libraries.
  • A new shared disk option that allows SAN-attached disk volumes controlled by the master server to be assigned to client servers, which is similar to how Symantec's shared tape option now works.
  • Improve the ability to restore from file-system snapshots so that only a single file is restored rather than the entire file-system snapshot.
  • Integrate the PureDisk data de-duplication technology into NetBackup.
  • Over the next year or two, integrate the data classification engine in Veritas Enterprise Vault with NetBackup to allow users to classify and categorize data on new backups, as well as classify data on old backups.
IBM Corp. Tivoli Storage Manager: IBM declined to provide information about the features it plans to offer in the next year or two.

Instant backup and restore
An offshoot of the growth of disk-based backup is the increased interest in backup and recovery technologies such as CDP and snapshots. NetWorker and TSM offer snapshot and replication options that support primarily array and virtualization technologies sold by their respective companies. TSM lets users execute instant restores and backups using IBM's TotalStorage DS6000 and DS8000 storage arrays or SAN Volume Controller (SVC). Similarly, EMC's NetWorker PowerSnap modules integrate predominantly with EMC's Symmetrix and Clariion storage arrays. Prior to EMC's acquisition of Legato, NetWorker offered PowerSnap modules that supported older IBM and Sun storage array models; going forward, the PowerSnap modules won't be upgraded to support non-EMC storage arrays.

With NetBackup 6.0, Symantec introduced Advanced Client, which allows the central management console to identify the host-, network- or array-based snapshot options available to the client server. It also grants the storage administrator the ability to remotely configure snapshots on that server.

But don't assume a backup program containing a wizard-like snapshot option will work out of the box. There are a number of tasks required to get some wizards to work in complicated storage environments.

For instance, NetWorker PowerSnap modules are licensed by specific storage devices, so a future storage array change requires changing the host software and licensing. If you're using a Clariion, you must first verify that it has up-to-date firmware; because Clariion supports two snapshot methods, you must then choose the type of snapshot to use--copy on write or split mirror. The next step is to verify that the Clariion contains sufficient storage space for the desired type of snapshot. Finally, you must install EMC's Navisphere Host Agent, Navisphere CLI, PowerPath and NetWorker client software on the host before a snapshot is created. While these steps vary in complexity according to the backup software product, both TSM's and NetBackup's snapshot modules require similar steps.
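The space check in that list depends on the snapshot type. A split mirror is a full copy of the volume, while a copy-on-write snapshot only has to hold the original contents of blocks that change while the snapshot exists. The Python sketch below is a rough estimate under that general definition. The function name and the `change_rate` parameter are hypothetical and do not come from Clariion or Navisphere documentation.

```python
def snapshot_space_gb(volume_gb, method, change_rate=0.25):
    """Estimate the array space one snapshot needs.

    change_rate is the assumed fraction of the volume that is
    overwritten while a copy-on-write snapshot is active.
    """
    if method == "split_mirror":
        return volume_gb                # full second copy
    if method == "copy_on_write":
        return volume_gb * change_rate  # only preserved originals
    raise ValueError(f"unknown snapshot method: {method}")


for method in ("split_mirror", "copy_on_write"):
    print(method, snapshot_space_gb(400, method), "GB")
```

Actual reserve requirements depend on the array and on how long the snapshot is kept, so treat the output as a planning figure only.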

McLeavy decided not to use snapshot modules. Instead, McLeavy scripts snapshots using SYMCLI commands because NetWorker didn't offer a PowerSnap module for the Tru64 operating systems he used at the time. While McLeavy is now moving from Tru64 to AIX, he still has no plans to purchase the PowerSnap Symmetrix modules because "the modules are a little pricey, considering we already have a working configuration."

CDP and virtualization
CDP and network-based virtualization technologies provide specific benefits that array- and host-based options don't. EMC's RecoverPoint CDP product is based on Mendocino Software's technology and enables users to recover a server to any point in time. Unlike snapshots, users can pick any previous point in time, down to the second, for their recovery point.
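The difference between CDP and snapshots is easier to see in a small model. The Python sketch below keeps a timestamped journal of every write and rebuilds a volume as of any chosen moment by replaying the journal up to that time. It illustrates the general idea only and is not RecoverPoint's design. The `WriteJournal` class, its methods and the dictionary used as a volume image are all assumptions for the example.

```python
import bisect


class WriteJournal:
    """Toy CDP journal: every write is recorded with its timestamp."""

    def __init__(self):
        self._times = []   # timestamps, assumed non-decreasing
        self._writes = []  # (block_no, data) in the same order

    def record(self, timestamp, block_no, data):
        self._times.append(timestamp)
        self._writes.append((block_no, data))

    def recover(self, base_image, point_in_time):
        """Return the volume contents as of point_in_time by replaying
        all journaled writes up to and including that moment."""
        image = dict(base_image)
        upto = bisect.bisect_right(self._times, point_in_time)
        for block_no, data in self._writes[:upto]:
            image[block_no] = data
        return image


journal = WriteJournal()
journal.record(10, 0, b"A1")
journal.record(20, 1, b"B1")
journal.record(30, 0, b"A2")
base = {0: b"A0", 1: b"B0"}
print(journal.recover(base, 25))
```

A snapshot schedule would offer only the states captured at fixed moments. With a journal, any timestamp between writes is a valid recovery point.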

Backup software:
Core features
Click here for a comprehensive list of core features for backup software (PDF).

EMC's forthcoming NetWorker PowerSnap module for RecoverPoint will integrate with EMC's RecoverPoint technology and give users a central management interface to manage regular backups and RecoverPoint images. Even though the ability to create application-consistent views for instantly recoverable databases at user-defined points in time is already available in RecoverPoint, this new module lets administrators manage these images and backups under the same interface with the same user logins.

TSM and NetBackup allow end users to capture desktop and laptop data, with TSM also extending its support to the server level. Tivoli's Continuous Data Protection for Files and Symantec's NetBackup Desktop and Laptop Option use agents that create an initial image of the data on each PC. Copied writes are then sent immediately to the central NetBackup or TSM server. If the network connection to the server, desktop or laptop is offline, the writes are held in cache and then sent when a network connection is established.
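The hold-and-forward behavior described above can be sketched as a small queue. The Python example below is a hypothetical agent, not the Tivoli or Symantec client: the `CdpAgent` class, its `send` callback and the `online` flag are invented for illustration.

```python
from collections import deque


class CdpAgent:
    """Toy desktop/laptop agent that forwards each write to a server,
    holding writes in a local cache while the network is down."""

    def __init__(self, send):
        self._send = send        # callable(path, data) that ships one write
        self._pending = deque()  # writes not yet acknowledged
        self.online = True

    def on_write(self, path, data):
        self._pending.append((path, data))
        if self.online:
            self.flush()

    def flush(self):
        # Remove a write from the cache only after send() returns,
        # so a failed send leaves it queued for the next attempt.
        while self._pending:
            self._send(*self._pending[0])
            self._pending.popleft()

    def reconnect(self):
        self.online = True
        self.flush()


received = []
agent = CdpAgent(lambda path, data: received.append((path, data)))
agent.on_write("report.doc", b"v1")
agent.online = False
agent.on_write("report.doc", b"v2")
agent.on_write("notes.txt", b"v1")
sent_while_offline = len(received)
agent.reconnect()
```

Sending cached writes in their original order matters here, because replaying them out of order could leave an older version of a file as the latest copy on the server.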

Backup software:
Disk backup options
Click here for a comprehensive list of disk backup options (PDF).

However, CDP backup technologies create some issues. Because each write I/O must be copied, server performance could be affected. Secondly, CDP products don't by default create application-consistent images for databases, so administrators must make sure checkpoints are taken from time to time to ensure application-consistent images. And vendors are just beginning to integrate CDP into their main backup software management console; until the integration is complete, you must manage CDP through a separate interface and perhaps pay a separate licensing fee.

IBM's SVC, its network-based virtualization appliance, gives users another way to support a multivendor storage environment with only one snapshot technology. Snapshot modules from NetBackup and TSM can be used with IBM's SVC. It works in the same way that array-based integration works except that SVC supports different vendors' storage arrays on the back end, eliminating the need to buy all storage from one vendor to gain snapshot functionality. This gives you the flexibility to deploy any vendor's storage array to host snapshots. While the initial setup and configuration process for EMC's RecoverPoint and IBM's SVC appliances can be labor intensive, once the initial setup is completed and documented, the configurations on the initial host can be replicated more easily to other hosts in the storage environment.

Noteworthy new features
While integration with disk is a huge focus for the major backup software vendors, other new product improvements include:

EMC Corp. NetWorker 7.3
Directed recoveries. Allows the backup server to restore backed up data to a different SAN-attached client than the one from which the backup was created--useful in the case of server failures or for disaster recovery.

Clone-retention policies. A clone is a copy of a completed backup job that's created for offsite storage. NetWorker 7.3 gave storage administrators the ability to set different retention policies for clones made to disk than for clones made to tape.

IBM Tivoli Storage Manager (TSM) 5.3.1
TSM Express. Targeted at the small- to medium-sized market, this is a disk-to-disk backup product (with the ability to make copies to tape) that lacks the archive or storage management capabilities normally found in TSM. Initially for Windows; Linux support to follow.

Archive management. Holds data called for by a judge until the data is released by the court. There's also an event-based trigger that, for example, can be set to wait 30 years from the time of an accident until employee data can be deleted.

Symantec Corp. NetBackup 6.0
Cold-metal restores. Allows restores to a different set of hardware than where the image was originally made. For instance, if a network interface card (NIC) on the server being used for the restore is different from the original host, the software will detect the different NIC and install the appropriate drivers for it.

SharePoint integration. Only NetBackup integrates with Microsoft's SharePoint.

Catalog backups. Online full or incremental NetBackup catalog backups are now possible.

Securing the data
When disk becomes the primary backup target, tape can assume its more appropriate role as the medium for portability and long-term data protection. Storage managers are also increasingly interested in encrypting data stored on tape. Liberty University's Mathes, for example, hopes to start encrypting data on tape in the next six to nine months.

For users who now wish to encrypt data, the three big backup software vendors offer varying levels of client-side encryption. While NetWorker and TSM each include encryption with the core product, NetWorker supports only the 256-bit AES option, while TSM is limited to 56-bit DES and 128-bit AES. Symantec users will need to purchase NetBackup's Encryption Option, which includes four encryption levels: 40- and 56-bit DES for customers with legacy encryption needs, and 128- and 256-bit AES that satisfy the more current, stringent U.S. and corporate encryption standards.

NetWorker 7.3 lets each client set a pass-phrase that's used as the key to lock/unlock encrypted data. However, the pass-phrase is needed to recover the data, and only servers in the NetWorker zone in which the pass-phrase was created can recover the data. Pass-phrases in TSM 5.3 are retained by the client and only that client can recover the data. Tricia Jiang, Tivoli's technical attache, warns that client encryption processing will impact that server's performance during backups.
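The article doesn't describe how either product turns a pass-phrase into an encryption key, so the Python sketch below is a generic illustration rather than the NetWorker or TSM method. It uses PBKDF2 from the standard library to derive a 256-bit key. The function name, salt handling and iteration count are assumptions for the example.

```python
import hashlib
import hmac


def derive_key(pass_phrase: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Derive a 32-byte (256-bit) key from a pass-phrase with PBKDF2.

    The salt is not secret but must be stored with the backup, since
    the same salt and pass-phrase are needed to rebuild the key.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", pass_phrase.encode("utf-8"), salt, iterations, dklen=32
    )


salt = b"example-salt-16b"  # hypothetical fixed salt for the demo
backup_key = derive_key("correct horse battery staple", salt)
restore_key = derive_key("correct horse battery staple", salt)
wrong_key = derive_key("correct horse battery stapler", salt)

print(hmac.compare_digest(backup_key, restore_key))
print(hmac.compare_digest(backup_key, wrong_key))
```

Because the key exists only as a function of the pass-phrase, a lost or mistyped pass-phrase produces a different key and the backup can't be decrypted.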

But none of these vendors offers a way to centrally manage encrypted data because there's still considerable debate as to where encryption should occur--at the host, network or tape drive level--and who should handle the key management. Managing pass-phrases at either the client/server level or within the zones to which specific servers belong is problematic, as the individual or group who created the encrypted backup needs to maintain and protect the pass-phrase list to ensure they can recover the data at some future point in time. Most users appear willing to wait for industry standards to emerge before widely deploying encryption. "Encryption is definitely not something you just turn on," says Liberty University's Mathes.

Single-instance storage
Another new backup technology that users are viewing cautiously is single-instance storage. Single-instance storage is a data de-duplication technology that stores identical blocks of data only once and then creates metadata that lets the original data be reconstructed. There are no industry standards for data-reduction technologies to date, and data recoveries are potentially time consuming and management intensive. "Most vendors have implemented [data reduction] in a proprietary fashion and charge an absolute premium for this technology," says Health First's Shim.
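A minimal Python sketch shows the general mechanism: fixed-size blocks are identified by a hash, each unique block is stored once, and a list of hashes serves as the metadata needed to rebuild the original data. It isn't any vendor's implementation. The `SingleInstanceStore` class, the 4KB block size and the choice of SHA-256 are assumptions for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical fixed block size


class SingleInstanceStore:
    """Toy block store that keeps one copy of each distinct block."""

    def __init__(self):
        self.blocks = {}  # digest -> block contents

    def put(self, data: bytes):
        """Store data and return its recipe: the ordered list of block
        digests (the metadata) needed to reconstruct it."""
        recipe = []
        for start in range(0, len(data), BLOCK_SIZE):
            block = data[start:start + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)  # skip known blocks
            recipe.append(digest)
        return recipe

    def get(self, recipe):
        """Rebuild the original data from its recipe."""
        return b"".join(self.blocks[digest] for digest in recipe)


store = SingleInstanceStore()
payload = b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE
recipe = store.put(payload)
print(len(recipe), "blocks referenced,", len(store.blocks), "blocks stored")
```

The restore path also hints at the recovery cost mentioned above: every read goes through the metadata, and losing the recipe makes the stored blocks useless.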

Symantec is the first and only one of the major vendors to introduce and offer a single-instance storage backup option targeted at the remote office. PureDisk Server is NetBackup's data-reduction technology. Users may install a PureDisk client or server at each remote office to back up the office's data to a PureDisk master server in the data center. Once an initial backup is complete, only changed blocks are captured and sent back to the central office, greatly reducing the amount of stored data and the bandwidth required between the two offices.
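The bandwidth saving from sending only changed blocks can be illustrated with a short Python sketch. It compares block hashes from the previous backup with the current data and returns only the blocks that differ. The function names and the 4KB block size are assumptions, and PureDisk's actual protocol isn't described in the article. The sketch also ignores files that shrink between backups.

```python
import hashlib


def block_digests(data: bytes, block_size: int = 4096):
    """Hash each fixed-size block of data."""
    return [
        hashlib.sha256(data[start:start + block_size]).hexdigest()
        for start in range(0, len(data), block_size)
    ]


def changed_blocks(previous_digests, current: bytes, block_size: int = 4096):
    """Return {block_index: block_bytes} for blocks that are new or
    differ from the previous backup."""
    changes = {}
    for index, digest in enumerate(block_digests(current, block_size)):
        if index >= len(previous_digests) or previous_digests[index] != digest:
            start = index * block_size
            changes[index] = current[start:start + block_size]
    return changes


old = b"A" * 4096 + b"B" * 4096 + b"C" * 4096
new = b"A" * 4096 + b"X" * 4096 + b"C" * 4096 + b"D" * 100
delta = changed_blocks(block_digests(old), new)
bytes_sent = sum(len(block) for block in delta.values())
print(sorted(delta), bytes_sent, "of", len(new), "bytes sent")
```

In this example only the modified second block and the appended tail cross the link, rather than the whole file.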

However, the major backup software vendors are just starting to respond to the significant changes that cheap disk is bringing to backup and restore. Their products require major changes to stay relevant and support user needs for data security and remote-office protection. Although users shouldn't be in a hurry to abandon their current backup software products, viable alternatives from new startups are rapidly emerging. Stay tuned.
