Inline vs. post-processing deduplication appliances


Jerome M. Wendt
05.28.2008




Choosing between appliances that do inline or post-processing data deduplication can be difficult, and the answer as to which method is best for your environment is often "it depends." To help you decide between the competing approaches, here are some general guidelines for selecting the appliance that provides the right data deduplication approach for your environment.

Data backup time

If minimizing backup times is your primary objective, then a post-processing appliance is almost always the best approach. With post-processing, the backup data is first stored to disk in its native backup format and then deduplicated after the backup is complete. Conversely, the overhead required to deduplicate data inline can create a bottleneck because, before it can store the data, an inline data deduplication appliance must first break apart the data in the incoming backup stream into smaller chunks and then compare these chunks to data that has already been deduplicated.
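To make the inline overhead concrete, here is a minimal Python sketch of the chunk-and-compare step described above. It is an illustration only, not any vendor's implementation: the fixed 4 KB chunk size, the SHA-256 fingerprints, the in-memory dictionary used as the chunk store, and the names dedupe_inline and restore are all assumptions made for this example.

```python
import hashlib


def dedupe_inline(stream, store, chunk_size=4096):
    """Split a backup stream into fixed-size chunks and store only unseen ones.

    `store` maps a chunk fingerprint to the chunk bytes (a hypothetical
    in-memory stand-in for the appliance's deduplicated disk pool).
    Returns the "recipe": the ordered fingerprints needed to rebuild the stream.
    """
    recipe = []
    for offset in range(0, len(stream), chunk_size):
        chunk = stream[offset:offset + chunk_size]
        fingerprint = hashlib.sha256(chunk).digest()
        # The lookup below is the work an inline appliance must finish
        # before the incoming data can be considered stored.
        if fingerprint not in store:
            store[fingerprint] = chunk
        recipe.append(fingerprint)
    return recipe


def restore(recipe, store):
    """Reassemble the original stream from its recipe."""
    return b"".join(store[fingerprint] for fingerprint in recipe)


store = {}
backup = b"A" * 4096 * 3 + b"B" * 4096
recipe = dedupe_inline(backup, store)
print(len(recipe), "chunks referenced,", len(store), "chunks stored")
```

Every chunk costs a hash computation and an index lookup before the write can finish, which is the per-chunk work that can slow an inline backup. A post-processing appliance would run the same kind of loop later, against data already sitting on disk.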

Quantity of redundant data

The quantity of redundant data in your backups is a critical piece of information that your company should try to ascertain ahead of time. Most businesses have large quantities of redundant data that changes little from day to day. If you suspect, or better yet can document, that this is the case at your company, it can help alleviate performance concerns around inline deduplication. A post-processing appliance is oblivious to the data in the incoming backup stream, whereas an inline appliance recognizes redundant data across different backup streams as it arrives and can deduplicate that data more efficiently.

Disk capacity

A post-processing deduplication appliance needs to maintain a disk cache that's large enough to store the largest night's backup plus enough additional capacity to store the deduplicated data. Because inline deduplication appliances immediately deduplicate data, they don't have a requirement for this additional disk cache and won't need as much disk capacity.
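The sizing rule above is simple arithmetic, sketched here in Python. The function names and the 10 TB and 30 TB figures are hypothetical, chosen only to illustrate the comparison; real sizing also depends on retention, growth and vendor overhead.

```python
def post_processing_capacity_tb(largest_nightly_backup_tb, deduplicated_store_tb):
    """Disk cache for the largest night's backup plus the deduplicated data."""
    return largest_nightly_backup_tb + deduplicated_store_tb


def inline_capacity_tb(deduplicated_store_tb):
    """Inline appliances deduplicate on arrival, so no landing cache is added."""
    return deduplicated_store_tb


# Hypothetical environment: 10 TB largest nightly backup, 30 TB deduplicated data.
print(post_processing_capacity_tb(10, 30))  # 40
print(inline_capacity_tb(30))               # 30
```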

Offsite replication requirements

If you need to quickly replicate deduplicated data to an offsite location, such as a disaster recovery (DR) site, you should give preference to an inline deduplication appliance. Even though it will likely take longer to complete the backup than using a post-processing appliance, the post-processing appliance requires a window of time to deduplicate the data after the backup is complete. The backup window plus the deduplication window may be longer than the amount of time it takes to deduplicate all of the data inline, so by using an inline deduplication appliance, the process of sending the deduplicated data over the WAN can start sooner.
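The timing argument above can be checked with a few lines of Python. This is a rough sketch under stated assumptions: replication is taken to start only once all data has been deduplicated, and the hour values are hypothetical numbers picked for illustration, not measurements.

```python
def post_processing_ready_hours(backup_hours, dedupe_hours):
    """Hours until deduplicated data is ready: backup window plus dedupe window."""
    return backup_hours + dedupe_hours


def inline_ready_hours(inline_backup_hours):
    """Hours until deduplicated data is ready: the (longer) inline backup window."""
    return inline_backup_hours


# Hypothetical example: post-processing backs up in 4 hours and then
# deduplicates for 5 hours; the inline backup takes 7 hours.
post_ready = post_processing_ready_hours(4, 5)
inline_ready = inline_ready_hours(7)
print(post_ready, inline_ready)   # 9 7
print(inline_ready < post_ready)  # True
```

With these figures the inline backup itself finishes three hours later than the post-processing backup, yet deduplicated data is ready for the WAN two hours sooner.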

Copying backup data from disk to tape

If copying data from disk to tape is going to remain a part of a company's data protection strategy, then the company needs to establish what set of data it plans to copy from disk to tape. If the copy is going to occur immediately (zero to 12 hours after the backup completes), then a post-processing appliance has an edge because it doesn't need to reconstruct the deduplicated data before copying it to tape. However, if copying older backup data (day-old or week-old) to tape is the objective, then neither approach necessarily has an advantage.

Multiple clustered server nodes

Clustering servers that provide the processing and memory power necessary to deduplicate data is important for a company that anticipates backing up and deduplicating tens of terabytes or more every night. If a company has fewer than 20 TB of data to back up, either an inline or a post-processing approach will generally work. However, once a company scales beyond 20 TB of backup data, it needs to carefully examine the deduplication appliance's architecture and whether that architecture can scale to deduplicate the data in its environment. This matters more in inline deduplication architectures because the processing power and memory an inline appliance can allocate to deduplication determine how quickly backups complete.

About the author: Jerome M. Wendt is lead analyst and president of DCIG Inc.





All Rights Reserved, Copyright 2008 - 2009, TechTarget