Global deduplication and backup: A primer

Learn about global deduplication and backup in part two of our data deduplication tutorial.

By Dave Raffo, Senior News Director

Now that data deduplication is rapidly becoming a mainstream feature in data backup and recovery, a big question for any data dedupe product is, "Does it do global deduplication?"

Global deduplication reduces backup data across multiple devices and sites, so the devices act as one large system. The alternative is local dedupe, which reduces data on each device individually. Global deduplication matters more as the amount of data you back up and the number of devices you use grow, because it can often improve deduplication ratios. But one of its biggest benefits is the ability to manage multiple devices efficiently; for instance, global dedupe allows load balancing and high availability.
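As a rough illustration of the difference, the sketch below models deduplication as fixed-size chunk hashing. That model is an assumption made for brevity; commercial products use their own chunking and indexing methods. The function names, the device names and the four-byte chunk size are all hypothetical.

```python
import hashlib

CHUNK_SIZE = 4  # tiny chunk size, for illustration only (assumption)


def chunk_hashes(data, size=CHUNK_SIZE):
    """Split data into fixed-size chunks and return one SHA-256 digest per chunk."""
    return [hashlib.sha256(data[i:i + size]).hexdigest()
            for i in range(0, len(data), size)]


def local_stored_chunks(backups_per_device):
    """Local dedupe: each device keeps its own index, so a chunk that
    appears on two devices is stored twice."""
    total = 0
    for backups in backups_per_device.values():
        index = set()
        for data in backups:
            index.update(chunk_hashes(data))
        total += len(index)
    return total


def global_stored_chunks(backups_per_device):
    """Global dedupe: all devices share one index, so a chunk is stored
    once no matter which device received it."""
    index = set()
    for backups in backups_per_device.values():
        for data in backups:
            index.update(chunk_hashes(data))
    return len(index)


# Two hypothetical devices whose backups share the chunks AAAA and BBBB.
backups = {
    "device_a": [b"AAAABBBBCCCC"],
    "device_b": [b"AAAABBBBDDDD"],
}

raw_chunks = sum(len(chunk_hashes(d)) for b in backups.values() for d in b)
local_count = local_stored_chunks(backups)
global_count = global_stored_chunks(backups)

print("raw chunks:", raw_chunks)
print("stored with local dedupe:", local_count)
print("stored with global dedupe:", global_count)
```

In this toy data set the shared chunks are stored on both devices under local dedupe but only once under global dedupe, which is where the better ratio comes from.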


The major data backup software applications with dedupe today offer global deduplication. These include Asigra Inc.'s Cloud Backup, EMC Corp.'s Avamar, CommVault's Simpana and Symantec Corp.'s PureDisk.

ExaGrid Systems' EX Series, FalconStor's File-interface Deduplication System (FDS) and Virtual Tape Library, Hewlett-Packard (HP) Co.'s Virtual Library System (VLS), IBM Corp.'s ProtecTIER, NEC's HYDRAstor and Sepaton's DeltaStor support global dedupe on virtual tape libraries and disk targets.

Some of the large deduplication vendors still support only local deduplication, including market leader EMC's Data Domain and Quantum Corp.'s DXi Series. Local dedupe devices each use their own repository, which limits the deduplication ratio when a backup job is shifted to a different appliance, because the new appliance has no record of the data the job has already sent elsewhere. This means a 16-controller Data Domain array, for instance, acts as 16 separate systems. Data Domain execs argue that their systems are large enough and fast enough that global dedupe isn't necessary, although Data Domain also plans to include global dedupe for two nodes in its next product release this year.
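The effect of shifting a job between appliances can be sketched in a few lines. This is a simplified model, not any vendor's implementation: the `Repository` class, the chunk contents and the two-night schedule are hypothetical, and each backup is treated as a short list of whole chunks.

```python
import hashlib


class Repository:
    """Hypothetical chunk index standing in for a dedupe repository."""

    def __init__(self):
        self.seen = set()

    def ingest(self, chunks):
        """Store chunks not seen before; return how many new chunks were written."""
        written = 0
        for chunk in chunks:
            digest = hashlib.sha256(chunk).digest()
            if digest not in self.seen:
                self.seen.add(digest)
                written += 1
        return written


# The same backup job on two consecutive nights; only one chunk changes.
night1 = [b"os-image", b"database", b"mail-store"]
night2 = [b"os-image", b"database", b"mail-store-v2"]

# Local dedupe: night 1 goes to appliance 1, then the job shifts to appliance 2,
# whose separate repository has never seen this job's data.
appliance_1, appliance_2 = Repository(), Repository()
appliance_1.ingest(night1)
local_written = appliance_2.ingest(night2)

# Global dedupe: both nights are checked against one shared repository.
shared = Repository()
shared.ingest(night1)
global_written = shared.ingest(night2)

print("chunks written on night 2, local dedupe:", local_written)
print("chunks written on night 2, global dedupe:", global_written)
```

With separate repositories the second appliance writes every chunk again, while the shared repository writes only the chunk that changed.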

Independent backup expert W. Curtis Preston suggests that organizations backing up more than 50 TB of data a night should use global dedupe. Those with smaller but rapidly growing backups should also consider it.


