There are also concerns about data loss due to hash collisions. How have manufacturers addressed that?

I've spoken with a number of vendors about this, and for the most part they say the problem is largely resolved by the newest hashing algorithms. When I talked to Permabit about it, the company said the only way to be absolutely sure that each chunk of data is unique is to compare each new chunk against every other chunk already stored. It also said that this isn't practical, because doing so creates a huge time delay. So the compromise is to use a hashing algorithm.
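To make that trade-off concrete, here is a minimal sketch of hash-based deduplication. It is not any vendor's implementation: the `ChunkStore` class, the `dedup_stream` helper, the fixed 4 KB chunk size and the choice of SHA-256 are all assumptions made for illustration. The optional `verify` flag shows the middle ground between the two extremes: rather than comparing a new chunk with every stored chunk, it compares only against the one chunk that has the same hash.

```python
import hashlib


class ChunkStore:
    """Toy in-memory deduplicating store (hypothetical, for illustration only)."""

    def __init__(self, verify=False):
        self.chunks = {}      # hex digest -> chunk bytes
        self.verify = verify  # if True, byte-compare whenever a digest already exists

    def put(self, chunk: bytes) -> str:
        """Store a chunk once per digest and return the digest as its reference."""
        digest = hashlib.sha256(chunk).hexdigest()
        existing = self.chunks.get(digest)
        if existing is None:
            self.chunks[digest] = chunk
        elif self.verify and existing != chunk:
            # Same digest, different bytes: a genuine hash collision.
            raise ValueError("hash collision detected for digest " + digest)
        return digest

    def get(self, digest: str) -> bytes:
        return self.chunks[digest]


def dedup_stream(data: bytes, store: ChunkStore, chunk_size: int = 4096):
    """Split data into fixed-size chunks and return the list of chunk digests."""
    return [
        store.put(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


if __name__ == "__main__":
    store = ChunkStore(verify=True)
    data = b"A" * 8192 + b"B" * 4096  # two identical chunks plus one distinct chunk
    recipe = dedup_stream(data, store)
    restored = b"".join(store.get(d) for d in recipe)
    print(len(recipe), "chunks referenced,", len(store.chunks), "chunks stored")
    print("restored correctly:", restored == data)
```

With `verify=False` the store trusts the hash alone, which is the compromise described above. With `verify=True` it pays for one extra comparison per duplicate chunk in exchange for detecting a collision instead of silently keeping the wrong data.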

Other companies take a different approach. ExaGrid, for example, cuts data into large segments and analyzes the contents of each segment to see how they are related to one another; it then performs byte-level differencing on each segment and stores the data that way.
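The general idea of byte-level differencing can be sketched as follows. This is only an illustration of the technique and not ExaGrid's method, which is not described in detail here. The function names `make_delta` and `apply_delta` are hypothetical, and Python's standard `difflib.SequenceMatcher` stands in for whatever matching algorithm a real product would use. A new segment is stored as a list of "copy these bytes from the old segment" and "insert these new bytes" operations, so no hash comparison is involved.

```python
from difflib import SequenceMatcher


def make_delta(old: bytes, new: bytes):
    """Describe `new` as copy ranges from `old` plus literal inserted bytes."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))        # bytes already present in old
        elif tag in ("replace", "insert"):
            ops.append(("insert", new[j1:j2]))  # bytes that exist only in new
        # "delete" needs no operation: those old bytes are simply not copied
    return ops


def apply_delta(old: bytes, ops) -> bytes:
    """Rebuild the new segment from the old segment and the delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


if __name__ == "__main__":
    old = b"backup of monday: hello world"
    new = b"backup of tuesday: hello world!"
    delta = make_delta(old, new)
    print(delta)
    print("rebuilt correctly:", apply_delta(old, delta) == new)
```

Because the delta is applied byte for byte against a known earlier segment, reconstruction does not depend on two different chunks never sharing a hash.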

Check out the entire Data Deduplication FAQ.

This was first published in December 2007
