There are also concerns about data loss due to hash collisions. How have manufacturers addressed that?


I've had a chance to speak with a number of vendors about this, and for the most part they say the issue is largely resolved with the newest hashing algorithms. I spoke to Permabit about this subject, and it said the only way to be absolutely sure that each chunk of data is unique is to take each new chunk and compare it, byte for byte, against every chunk already stored. It also said that's really not practical, because doing that creates a huge time delay. So the compromise is to use a hashing algorithm.

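To make the trade-off concrete, here is a minimal sketch of hash-based chunk deduplication. It assumes fixed-size chunking and SHA-256, with an in-memory dictionary standing in for the backend chunk store; it is not any vendor's actual implementation, and the optional byte-for-byte check is only the kind of extra verification discussed above.

```python
import hashlib

CHUNK_SIZE = 8 * 1024  # assumed fixed-size chunking; real products vary

store = {}  # fingerprint -> chunk bytes (stands in for the chunk store)

def dedupe_write(data: bytes, verify: bool = False) -> int:
    """Write data chunk by chunk, keeping one copy of each unique chunk.

    Returns the number of new chunks actually stored.
    """
    new_chunks = 0
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        if fingerprint in store:
            # Optional paranoia: a byte-for-byte comparison against the
            # matching stored chunk guards against a hash collision, at the
            # cost of reading that chunk back.
            if verify and store[fingerprint] != chunk:
                raise RuntimeError("hash collision detected")
            continue  # duplicate chunk; keep only a reference
        store[fingerprint] = chunk
        new_chunks += 1
    return new_chunks

# Writing the same data twice stores its chunks only once.
payload = b"example data " * 10_000
print(dedupe_write(payload))  # some number of new chunks
print(dedupe_write(payload))  # 0, everything was already stored
```
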
Other companies take a different approach. ExaGrid, for example, cuts data into large segments, analyzes the contents of each segment to see how the segments relate to one another, and then performs byte-level differencing on related segments and stores the data that way.

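The sketch below illustrates the byte-level-differencing idea in general terms; it is not ExaGrid's algorithm. It uses Python's difflib to store a segment as a delta against a similar reference segment, so a nearly identical segment costs little more than the bytes that actually changed.

```python
import difflib

def delta_encode(reference: bytes, segment: bytes):
    """Store a segment as byte-level differences against a similar reference.

    Unchanged runs only record their position in the reference, so
    near-identical segments take very little space.
    """
    ops = []
    matcher = difflib.SequenceMatcher(None, reference, segment, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))            # reuse bytes from the reference
        else:
            ops.append(("insert", segment[j1:j2]))  # keep only the new bytes
    return ops

def delta_decode(reference: bytes, ops) -> bytes:
    """Rebuild the original segment from the reference and the stored delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += reference[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)

ref = b"A" * 1000 + b"header-v1" + b"B" * 1000
seg = b"A" * 1000 + b"header-v2" + b"B" * 1000  # nearly identical segment
ops = delta_encode(ref, seg)
assert delta_decode(ref, ops) == seg

```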
Check out the entire Data Deduplication FAQ.

This was first published in December 2007
