Inline deduplication vs. post-processing: Data dedupe best practices

By Dave Raffo, Senior News Director

Much of the early debate about data deduplication focused on inline deduplication vs. post-processing deduplication. Inline deduplication reduces data as it is being sent to the backup device, while post-processing backs up data first and then reduces it. Both methods have advantages and disadvantages. Post-processing backs up data faster and shortens the backup window, but it requires more disk because backup data is temporarily stored in unreduced form to speed the process.
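The tradeoff between the two methods can be illustrated with a minimal Python sketch. It is not any vendor's implementation: the function names, the use of SHA-256 fingerprints, and the in-memory dictionary standing in for disk are all assumptions made for illustration. It also assumes staging space is reclaimed as data is reduced, so the post-process peak equals the raw backup size.

```python
import hashlib


def fingerprint(chunk: bytes) -> str:
    """Identify a chunk by its content (SHA-256 is an illustrative choice)."""
    return hashlib.sha256(chunk).hexdigest()


def inline_backup(chunks):
    """Dedupe each chunk on its way to disk; only unique chunks are written."""
    store = {}        # hypothetical on-disk store: fingerprint -> chunk
    peak_bytes = 0    # nothing is deleted, so the peak is the final size
    for chunk in chunks:
        fp = fingerprint(chunk)
        if fp not in store:
            store[fp] = chunk
            peak_bytes += len(chunk)
    return store, peak_bytes


def post_process_backup(chunks):
    """Write the whole backup to a staging area first, then dedupe it."""
    staging = list(chunks)                     # raw copy, no hashing in the write path
    peak_bytes = sum(len(c) for c in staging)  # staging must hold the full backup
    store = {}
    for chunk in staging:                      # reduction runs after the backup lands
        store.setdefault(fingerprint(chunk), chunk)
    return store, peak_bytes


if __name__ == "__main__":
    backup = [b"A" * 4, b"B" * 4, b"A" * 4, b"A" * 4]
    for method in (inline_backup, post_process_backup):
        store, peak = method(backup)
        print(method.__name__, "unique chunks:", len(store), "peak bytes:", peak)
```

Both functions end with the same unique chunks. The difference is where the fingerprinting work happens (in the write path for inline, afterward for post-process) and how much disk is needed at the peak.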

Inline deduplication products include EMC Corp.'s Data Domain and Avamar, IBM Corp.'s ProtecTier, Symantec Corp.'s PureDisk, CommVault's Simpana and NEC's HydraStor. Post-process products include ExaGrid EX, FalconStor FDS and Sepaton DeltaStor. Quantum Corp.'s DXi platform gives customers the choice of post-process or inline dedupe.

FalconStor Software and Sepaton Inc. call their methods concurrent processing because, while they move data to a disk staging area first, they don't wait for backups to finish before deduping.

Both inline and post-process methods have their advocates, but experts say neither is universally better; it depends on the type of backup environment you have.

Deduplication is often combined with replication for disaster recovery. Because deduplication reduces the amount of data, it lowers the bandwidth required to copy data offsite. EMC Data Domain, Quantum, IBM ProtecTier, FalconStor and Sepaton are among the vendors that have beefed up their replication capabilities over the past year, often increasing the number of remote sites that can fan in to the data center.
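The bandwidth savings can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of how any of these products replicate: the `replicate` function and the `remote_index` set of fingerprints are hypothetical, and SHA-256 is an arbitrary choice of hash.

```python
import hashlib


def replicate(local_chunks, remote_index):
    """Copy only chunks the remote site lacks; return the bytes sent.

    remote_index is a hypothetical set of fingerprints already held offsite.
    """
    sent_bytes = 0
    for chunk in local_chunks:
        fp = hashlib.sha256(chunk).hexdigest()
        if fp not in remote_index:
            remote_index.add(fp)      # the remote site now holds this chunk
            sent_bytes += len(chunk)
    return sent_bytes


if __name__ == "__main__":
    remote = set()
    backup = [b"x" * 1024] * 9 + [b"y" * 1024]   # 10 KB raw, 2 unique chunks
    print("raw bytes:", sum(len(c) for c in backup))
    print("first replication sent:", replicate(backup, remote))
    print("repeat replication sent:", replicate(backup, remote))
```

In this sketch a repeat backup of unchanged data sends nothing, which is the effect that makes replicating from many remote sites into one data center practical over limited links.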

Editor's Tip: Learn more about data deduplication products and disaster recovery in this article on data deduplication and IT disaster recovery strategies, and in this FAQ guide from W. Curtis Preston.

This was first published in March 2010