
Using data deduplication with backup applications: Source vs. target dedupe

22 Oct 2009 | SearchDataBackup.com


By W. Curtis Preston

Your data backup software company wants some of Data Domain's revenue -- seriously. Backup software companies didn't see the intelligent disk target (IDT) market coming. The next thing they knew, companies like Data Domain were making millions of dollars a year selling such devices. Then the independent software vendors (ISVs) that make backup software started having the same thought: "If we offered dedupe for regular backups, customers would pay the data deduplication premium to us instead of to those appliance companies." And a line in the sand was drawn.

Source deduplication is your friend

The IDT vs. backup software battle is just beginning, and this article describes the products that have entered it. First, however, we should discuss the battle that's completely over: backup of small amounts of data coming from remote sites. In this fight for your storage dollars, source deduplication has won hands down. Whether you're backing up a single home computer with your personal data, hundreds of remote users with laptops, or many remote offices with less than a terabyte of data each, source dedupe is your friend.

Without source dedupe, backups of smaller data sets and remote data sets can be quite challenging. Home users have historically used free products that are included with their OS or USB drive. Remote offices typically use something like Symantec Corp.'s Backup Exec and a DAT drive. Only the most conscientious laptop users have any kind of backup plan at all, other than occasionally copying their data to a server that gets backed up. All of these methods are fraught with problems, and human error is the biggest of them.

Installing a source dedupe product on these systems allows them to back up to a source dedupe server over a WAN connection -- completely automating this most important business function. They can back up to a source dedupe server managed by the IT department, or to a cloud backup service managed by an outside company.

Source dedupe lets you back up large amounts of data over such a small connection because the source dedupe client communicates with the source dedupe backup server to identify and transmit only the blocks that are new. The client starts by asking the file system for the files that have changed since the last backup, then examines each of those files for blocks that have changed. This method of backup is obviously very well suited for remote data or mobile data.
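To make the block-level exchange concrete, here is a minimal Python sketch of the idea. It is an illustration only, not how any particular product works: the `DedupeServer` class, the `backup` function, the fixed 4 KB block size and the use of SHA-256 fingerprints are all assumptions made for this example, and the "server" is just an in-memory dictionary rather than something reached over a WAN.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size, chosen only for illustration


class DedupeServer:
    """Hypothetical stand-in for a source dedupe backup server."""

    def __init__(self):
        self.blocks = {}  # fingerprint -> block contents

    def missing(self, fingerprints):
        """Return the fingerprints the server has not stored yet."""
        return {fp for fp in fingerprints if fp not in self.blocks}

    def store(self, fingerprint, block):
        self.blocks[fingerprint] = block


def split_blocks(data, size=BLOCK_SIZE):
    """Cut a file's contents into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def backup(data, server):
    """Back up one file's contents and return the number of bytes sent."""
    blocks = split_blocks(data)
    fingerprints = [hashlib.sha256(b).hexdigest() for b in blocks]

    # Ask the server which fingerprints it has never seen.
    needed = server.missing(fingerprints)

    sent = 0
    for fp, block in zip(fingerprints, blocks):
        if fp in needed:
            server.store(fp, block)
            needed.discard(fp)  # a block repeated within the file is sent once
            sent += len(block)
    return sent
```

Backing up three distinct blocks to an empty server sends all three. Changing one of them and backing up again sends only the changed block, and a third run with no changes sends nothing.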

Cloud backup and source dedupe

One interesting way that some companies can begin using source dedupe is to use a cloud backup provider that will manage the backups for them. All they have to do is install the cloud backup provider's software on their servers and start backing up to the cloud service. There's no backup server to install or manage. The only challenge some companies may have is getting the first backup done, since the first backup obviously has to send all the blocks. Some cloud providers offer a "seeding" option where they ship you a disk drive that you back up to locally and then ship back to them. They copy this backup to their servers, thus "seeding" your initial full. Once that has been done, your servers only have to back up the blocks that have changed since you backed up to the seeding system.
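A quick back-of-the-envelope calculation shows why seeding matters. The figures below (a 500 GB first full backup, 5 GB of new blocks per day and a 10 Mbps WAN link) are assumptions picked for illustration, not numbers from any provider, and the calculation ignores protocol overhead and compression.

```python
def transfer_hours(gigabytes, megabits_per_second):
    """Hours needed to move `gigabytes` over a link of the given speed."""
    bits = gigabytes * 8 * 1000**3
    seconds = bits / (megabits_per_second * 1000**2)
    return seconds / 3600


# All three values are assumed for the sake of the example.
FULL_GB = 500          # size of the first full backup
DAILY_CHANGE_GB = 5    # new blocks sent on a typical night
LINK_MBPS = 10         # WAN link speed

first_full = transfer_hours(FULL_GB, LINK_MBPS)
nightly = transfer_hours(DAILY_CHANGE_GB, LINK_MBPS)

print(f"First full over the WAN: {first_full:.1f} hours")
print(f"Nightly changed blocks:  {nightly:.1f} hours")
```

Under these assumptions the first full would tie up the link for roughly 111 hours, or more than four days, while each night's changed blocks would take a little over an hour. Shipping a disk drive removes the first number from the equation.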

Target deduplication

Where source dedupe is perfect for smaller, remote data sets, target dedupe is meant for larger data sets where you have essentially unlimited bandwidth between the backup client and the backup server. This is the market that the appliance vendors have focused on, and some have done quite well selling you appliances that will ingest native, un-deduped backups and dedupe them for you. That's what made backup software companies sit up and take notice.


The first company to make a move was Symantec. They took NetBackup PureDisk (a source dedupe product) and moved it inside the media server, allowing it to receive and dedupe regular NetBackup backups. The media server dedupes the data inline as it receives the data, and the deduped data is sent over IP to a PureDisk server.

IBM Corp.'s Tivoli Storage Manager (TSM) followed with TSM server dedupe in TSM 6.1. TSM's implementation is a post-process implementation that looks at backups that have been sent to a disk-type device and dedupes them after the fact. CA announced similar capability for its ARCserve Backup product.
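The difference between the inline and post-process approaches described above can be sketched with a toy model in Python. This is not how NetBackup, TSM or ARCserve is implemented; the `InlineStore` and `PostProcessStore` classes and their methods are hypothetical names invented for this example. It shows only the general trade-off: inline dedupe never writes a duplicate block to disk, while post-process dedupe first lands the raw backup and reclaims the space afterward.

```python
import hashlib


def fingerprint(block):
    return hashlib.sha256(block).hexdigest()


class InlineStore:
    """Toy model: each block is deduped before it is written to disk."""

    def __init__(self):
        self.disk = {}  # fingerprint -> block

    def ingest(self, blocks):
        for block in blocks:
            self.disk.setdefault(fingerprint(block), block)

    def used(self):
        return sum(len(b) for b in self.disk.values())


class PostProcessStore:
    """Toy model: raw backups land on disk first and are deduped later."""

    def __init__(self):
        self.landing = []  # raw, un-deduped blocks
        self.disk = {}     # fingerprint -> block

    def ingest(self, blocks):
        self.landing.extend(blocks)

    def dedupe(self):
        # Runs after the backup has finished.
        for block in self.landing:
            self.disk.setdefault(fingerprint(block), block)
        self.landing.clear()

    def used(self):
        raw = sum(len(b) for b in self.landing)
        return raw + sum(len(b) for b in self.disk.values())
```

Feed both stores the same three 1 KB blocks, two of which are identical, and the inline store holds 2 KB as soon as the backup finishes. The post-process store holds 3 KB until `dedupe()` runs, then drops to the same 2 KB.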

CommVault Systems Inc. is the latest vendor to enter the fray with its media agent dedupe option. Backups that are sent to a Simpana media agent are deduped inline before they are stored on disk. If you wanted to back up a remote site using this feature, CommVault says a media agent in a remote site could write deduped data to a CIFS share that was mounted from the central site.

Both CommVault and Symantec are making claims that you should use their dedupe software instead of buying a dedupe appliance, although CommVault's claims tend to be a little bolder. (Dave West, vice president of marketing and business development at CommVault, wrote in his blog that he sees no use case where any CommVault customer would need to buy a dedupe appliance.)

Can source dedupe or target dedupe from your backup application meet your needs? That will depend on your environment. We definitely see cases where a company's throughput requirements for target dedupe can only be met by an appliance. But there are also plenty of cases where either will work and it's simply a matter of negotiating over price. Just be sure to perform a proof of concept of any vendor claims before signing that check.

About this author: W. Curtis Preston (a.k.a. "Mr. Backup"), Executive Editor and Independent Backup Expert, has been singularly focused on data backup and recovery for more than 15 years. From starting as a backup admin at a $35 billion credit card company to being one of the most sought-after consultants, writers and speakers in this space, it's hard to find someone more focused on recovering lost data. He is the webmaster of BackupCentral.com and the author of hundreds of articles and the books "Backup and Recovery" and "Using SANs and NAS."



All Rights Reserved, Copyright 2008 - 2009, TechTarget | Read our Privacy Policy