Inline vs. post-processing deduplication appliances


Jerome M. Wendt
05.28.2008


Choosing between appliances that do inline or post-processing data deduplication can be difficult, and the answer as to which method is best for your environment is often "it depends." To help you decide between the competing approaches, here are some general guidelines for selecting the appliance that provides the right data deduplication approach for your environment.

Data backup time

If minimizing backup times is your primary objective, then a post-processing appliance is almost always the best approach. With post-processing, the backup data is first stored to disk in its native backup format and then deduplicated after the backup is complete. Conversely, the overhead required to deduplicate data inline can create a bottleneck because, before it can store the data, an inline data deduplication appliance must first break the data in the incoming backup stream into smaller chunks and then compare these chunks to data that has already been deduplicated.
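
The chunk-and-compare step described above can be illustrated with a minimal sketch. It is not how any particular appliance works: the fixed 4 KB chunk size, the SHA-256 fingerprints, the in-memory dictionary used as the chunk store, and the function names are all assumptions made for illustration.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size in bytes (an assumption)


def dedupe_inline(stream: bytes, store: dict, chunk_size: int = CHUNK_SIZE) -> list:
    """Split a backup stream into fixed-size chunks and store only unseen chunks.

    Returns the ordered list of chunk fingerprints needed to rebuild the stream.
    """
    recipe = []
    for offset in range(0, len(stream), chunk_size):
        chunk = stream[offset:offset + chunk_size]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        # This lookup is the per-chunk work done before any data is stored.
        if fingerprint not in store:
            store[fingerprint] = chunk
        recipe.append(fingerprint)
    return recipe


def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original stream from its list of fingerprints."""
    return b"".join(store[fp] for fp in recipe)


# Illustrative data: the second night's backup repeats the first and adds one chunk.
store = {}
night1 = b"A" * 8192 + b"B" * 4096
night2 = night1 + b"C" * 4096
recipe1 = dedupe_inline(night1, store)
recipe2 = dedupe_inline(night2, store)
print(len(recipe2), "chunks referenced,", len(store), "chunks stored")
```

Every chunk passes through the fingerprint lookup before it is written, which is where the inline overhead comes from. A post-processing appliance would write the stream to disk first and run the same kind of comparison afterward.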

Quantity of redundant data

This is a critical piece of information that your company should try to ascertain ahead of time. Most businesses have large quantities of redundant data that changes little from day to day. If you suspect, or better yet can document, that this is the case with your company, it can help alleviate performance concerns around inline deduplication. A post-processing appliance is oblivious to data in the incoming backup stream; an inline appliance recognizes redundant data in different backup streams and can deduplicate data more efficiently.

Disk capacity

A post-processing deduplication appliance needs to maintain a disk cache that's large enough to store the largest night's backup plus enough additional capacity to store the deduplicated data. Because inline deduplication appliances immediately deduplicate data, they don't have a requirement for this additional disk cache and won't need as much disk capacity.
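
As a rough worked example of the capacity difference, the sketch below adds the disk cache to the deduplicated data for a post-processing appliance and counts only the deduplicated data for an inline appliance. The function names and the 10 TB and 30 TB figures are hypothetical and chosen only to show the arithmetic.

```python
def post_processing_capacity_tb(largest_nightly_backup_tb: float,
                                deduplicated_data_tb: float) -> float:
    """Disk cache for the largest night's backup plus the deduplicated data."""
    return largest_nightly_backup_tb + deduplicated_data_tb


def inline_capacity_tb(deduplicated_data_tb: float) -> float:
    """Inline appliances deduplicate immediately, so no extra disk cache is counted."""
    return deduplicated_data_tb


# Hypothetical figures: a 10 TB largest nightly backup and 30 TB of deduplicated data.
print(post_processing_capacity_tb(10.0, 30.0))  # 40.0
print(inline_capacity_tb(30.0))                 # 30.0
```

In this illustration the post-processing appliance needs 10 TB more disk, which is exactly the size of the cache for the largest nightly backup.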

Offsite replication requirements

If you need to quickly replicate deduplicated data to an offsite location, such as a disaster recovery (DR) site, you should give preference to an inline deduplication appliance. Even though it will likely take longer to complete the backup than using a post-processing appliance, the post-processing appliance requires a window of time to deduplicate the data after the backup is complete. The backup window plus the deduplication window may be longer than the amount of time it takes to deduplicate all of the data inline, so by using an inline deduplication appliance, the process of sending the deduplicated data over the WAN can start sooner.
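
The timing comparison can be sketched with a few lines of arithmetic. The function names and the hour values below are hypothetical, and the sketch follows the description above in assuming that a post-processing appliance starts deduplicating only after the backup is complete.

```python
def post_processing_ready_hours(backup_hours: float, dedupe_hours: float) -> float:
    """Hours until deduplicated data exists: backup window plus deduplication window."""
    return backup_hours + dedupe_hours


def inline_ready_hours(inline_backup_hours: float) -> float:
    """Hours until deduplicated data exists: data is deduplicated as it is backed up."""
    return inline_backup_hours


# Hypothetical figures: a 6-hour backup followed by 4 hours of deduplication,
# versus a slower 8-hour backup that is deduplicated inline.
post = post_processing_ready_hours(6.0, 4.0)
inline = inline_ready_hours(8.0)
print(post, inline, post - inline)  # 10.0 8.0 2.0
```

With these illustrative numbers the inline backup takes two hours longer to finish, yet its deduplicated data is ready for replication two hours earlier.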

Copying backup data from disk to tape

If copying data from disk to tape is going to remain a part of a company's data protection strategy, then the company needs to establish what set of data it plans to copy from disk to tape. If the copy is going to occur immediately (zero to 12 hours after the backup completes), then a post-processing appliance has an edge because it doesn't need to reconstruct the deduplicated data before copying it to tape. However, if copying older backup data (day-old or week-old) to tape is the objective, then neither approach necessarily has an advantage.

Multiple clustered server nodes

Clustering servers that provide the processing and memory power necessary to deduplicate data is important for a company that anticipates backing up and deduplicating tens of terabytes or more every night. If a company has less than 20 TB of data to back up, either an inline or a post-processing approach will generally work. However, once a company scales beyond 20 TB of backup data, it needs to carefully examine the deduplication appliance's architecture and whether that architecture can scale to deduplicate data in its environment. This matters more in inline deduplication architectures because the processing power and memory an inline appliance can allocate to deduplication affect how quickly backups complete.

About the author: Jerome M. Wendt is lead analyst and president of DCIG Inc.


All Rights Reserved, Copyright 2008, TechTarget