BACKUP AND RECOVERY

The pros and cons of globally deduplicating data backup appliances


Jerome M. Wendt
06.13.2008




Data deduplication appliances are often deployed on the premise that they expedite backups while keeping large amounts of data on disk. But as retention periods grow and companies look to keep their archived and backup data stores together, individual data deduplication appliances can create new scaling, scope and management problems. To overcome these issues, global deduplication is emerging as a way to achieve higher data reduction ratios and use less storage capacity, while giving companies more flexibility to keep data online for longer periods.

Appliances that support global deduplication can provide the following key benefits:

  • Create smaller storage footprints
  • Decrease network bandwidth requirements for data replication
  • Eliminate data silos
  • Lower storage costs
  • Simplify and centralize the management of deduplication appliances

The choice of a data deduplication appliance depends on the amount of data that a company anticipates storing and how many geographic locations it needs to protect. Companies with a single location and combined deduplicated archive and backup stores of fewer than 20 TB will find that almost any deduplication appliance meets their needs. The need for a backup appliance that supports globally deduplicated data stores becomes more evident when companies grow their deduplicated data stores beyond 20 TB or need to protect multiple sites.

Global deduplication for ROBOs

The most prevalent way that global deduplication is implemented is as part of a company's scheme for protecting remote and branch offices (ROBOs). Configured in a hub-and-spoke architecture, deduplication appliances are deployed at each ROBO, usually with a larger, master deduplication appliance located at the home office.

Global deduplication occurs only after the data in each ROBO has been backed up, deduplicated and stored on the appliance at its site. At regularly scheduled intervals, typically nights or weekends, the deduplicated data at each ROBO is replicated back to the master backup appliance in the home office.

To minimize the amount of data replicated back to the home office, an index of the deduplicated chunks of data at the ROBO is first sent to the master backup appliance in the home office. The master backup appliance then compares this list to its own, larger index to identify which chunks of data it already has in its data store.

After this comparison is completed, the master appliance creates a list of the chunks it doesn't have and sends that list back to the appliance at the ROBO, which then replicates only those chunks. This helps to minimize the amount of data that needs to be sent, the amount of network bandwidth required and the length of time replication takes to complete.

Products that support these types of hub-and-spoke global deduplication configurations include Data Domain's DDX Arrays, EMC Corp.'s new 3D disk libraries, NEC's Hydrastor and Quantum Corp.'s DXi-Series of backup appliances. Appliances from ExaGrid Systems Inc. and Sepaton Inc. have similar but more limited global deduplication features.
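The index exchange described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any vendor's actual protocol: the `Appliance` class, the `replicate` function, the fixed 4-byte chunk size and the use of SHA-256 fingerprints are all hypothetical choices made to keep the example small.

```python
import hashlib


def fingerprint(chunk):
    """Identify a chunk by its SHA-256 digest (an assumed fingerprint scheme)."""
    return hashlib.sha256(chunk).hexdigest()


class Appliance:
    """Hypothetical deduplicating appliance: stores each unique chunk once."""

    def __init__(self, name):
        self.name = name
        self.store = {}  # fingerprint -> chunk bytes

    def ingest(self, data, chunk_size=4):
        # Fixed-size chunking is an assumption made for brevity.
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            self.store.setdefault(fingerprint(chunk), chunk)

    def index(self):
        """The list of fingerprints this appliance holds."""
        return set(self.store)

    def missing(self, remote_index):
        """Fingerprints in the remote index that this appliance lacks."""
        return {fp for fp in remote_index if fp not in self.store}


def replicate(robo, master):
    """Send only the chunks the master lacks; return the bytes transferred."""
    wanted = master.missing(robo.index())  # step 1: index sent, compared
    sent = 0
    for fp in wanted:                      # step 2: only missing chunks move
        chunk = robo.store[fp]
        master.store[fp] = chunk
        sent += len(chunk)
    return sent


master = Appliance("home-office")
master.ingest(b"AAAABBBBCCCC")
robo = Appliance("branch")
robo.ingest(b"BBBBCCCCDDDD")

print(replicate(robo, master))  # 4: only the DDDD chunk crosses the network
print(replicate(robo, master))  # 0: nothing new to send the second time
```

In this sketch the branch appliance holds 12 bytes of deduplicated data, but only 4 bytes are transferred, because the master already holds two of the three chunks.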

It's important to note that, in enterprise environments, global deduplication appliances can have capacity and performance limitations. These limitations may make themselves evident in the following ways:

  • The amount of data to deduplicate exceeds the capacity of the master backup appliance. In these circumstances, companies may need to either purchase a larger appliance and migrate all of the data to it, or purchase a second appliance. If a second backup appliance is purchased, verify that it can access the deduplication index created by the first appliance. If it cannot, it has to start deduplicating data from scratch. This creates a separate data silo and recreates the problem that global deduplication was initially intended to solve.

  • The master backup appliance has insufficient processor and memory resources to support all of the replication and global deduplication functions. The master backup appliance may have to receive and deduplicate data from ROBOs while concurrently handling and deduplicating incoming backup streams in the home office. Managing all of these jobs on a nightly basis could extend backup windows while slowing the deduplication and replication of data from edge appliances to the master appliance.
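The silo problem in the first point above can be made concrete with a small calculation. The following sketch assumes fixed-size 4-byte chunks and SHA-256 fingerprints, and the `unique_chunks` helper and sample data are hypothetical. It compares how many chunks are stored when two appliances keep independent indexes with how many are stored when they share one.

```python
import hashlib


def unique_chunks(streams, chunk_size=4):
    """Count the distinct chunks across all streams sharing one index."""
    seen = set()
    for data in streams:
        for i in range(0, len(data), chunk_size):
            seen.add(hashlib.sha256(data[i:i + chunk_size]).digest())
    return len(seen)


# Hypothetical backup data landing on two appliances; two chunks overlap.
first_appliance = b"AAAABBBBCCCC"
second_appliance = b"BBBBCCCCDDDD"

# Independent indexes: each appliance deduplicates only against itself.
siloed = unique_chunks([first_appliance]) + unique_chunks([second_appliance])

# Shared index: duplicates across both appliances are stored once.
shared = unique_chunks([first_appliance, second_appliance])

print(siloed)  # 6 chunks stored across the two silos
print(shared)  # 4 chunks stored with a shared index
```

The gap between the two numbers grows with the amount of data the appliances have in common, which is why a second appliance that cannot read the first appliance's index erodes the benefit of global deduplication.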

Global deduplication allows companies to consolidate and centralize their deduplicated data. However, not all data deduplication appliances are created equal. Companies looking to deduplicate and then replicate data from their ROBOs back to a central site need to ensure that the appliances in the central site can scale in performance and capacity to meet their global deduplication needs, and that they provide the granularity of control needed when replicating data. Properly implemented, global deduplication can centralize and enhance corporate-wide data management, protection and recovery options while minimizing data stores and related storage costs.

About the author: Jerome M. Wendt is lead analyst and president of DCIG Inc.












All Rights Reserved, Copyright 2008 - 2009, TechTarget