When does data deduplication using backup software on the host make sense?

There are a couple of factors that you really need to consider. If you're bandwidth constrained and have large amounts of backup data coming over the network, then using data deduplication at the host makes a lot of sense, because redundant data is removed before it is ever sent. That can dramatically reduce the amount of bandwidth the backup consumes.

It is also important that the host can sustain the initial performance hit. Data deduplication requires memory and CPU cycles on the host, and the first backup is the most demanding. It might be a good idea to run the initial backup over a weekend, when the backup window is a bit longer.
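
To make the tradeoff concrete, here is a minimal sketch of host-side deduplication in Python. It is not how any particular backup product works: the fixed 4 KB chunk size, the SHA-256 fingerprint, the in-memory index and the `backup` function are all assumptions chosen for illustration. Hashing each chunk is the CPU cost on the host, the index of fingerprints is the memory cost, and only chunks that have not been seen before would cross the network.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, for illustration only


def backup(data: bytes, seen: set) -> int:
    """Return how many bytes would have to be sent over the network.

    `seen` is a hypothetical in-memory index of chunk fingerprints
    that persists between backup runs.
    """
    sent = 0
    for start in range(0, len(data), CHUNK_SIZE):
        chunk = data[start:start + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).digest()  # CPU cost on the host
        if digest not in seen:
            seen.add(digest)    # memory cost on the host
            sent += len(chunk)  # only new chunks use bandwidth
    return sent


index = set()

# Initial backup: four distinct chunks, so everything must be sent.
day1 = b"".join(bytes([i]) * CHUNK_SIZE for i in range(4))
first_sent = backup(day1, index)

# Next backup: only the second chunk has changed since the first run.
day2 = day1[:CHUNK_SIZE] + bytes([9]) * CHUNK_SIZE + day1[2 * CHUNK_SIZE:]
second_sent = backup(day2, index)

print(first_sent, second_sent)
```

In this sketch the first run sends all 16,384 bytes, which is the initial hit, while the second run sends only the one changed 4,096-byte chunk.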

Some companies are taking steps to mitigate this initial performance hit. I recently talked to Symantec, and for that initial backup they are putting some of the intelligence out on the individual nodes, so that some level of deduplication takes place before the data leaves the node. That helps reduce some of the overhead.


This was first published in December 2007
