There are also concerns about data loss due to hash collisions. How have manufacturers addressed that?

I've had a chance to speak to a number of vendors about this, and for the most part they say the issue is pretty much resolved with the newest hashing algorithms. When I talked to Permabit about this subject, it said the only way you can be absolutely sure that each chunk of data is completely unique is to take each new chunk and compare it to every other chunk already stored. It also said that's really not practical, because doing so creates a huge time delay. So, the compromise is to use a hashing algorithm.
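To make the trade-off concrete, here is a minimal sketch of hash-based chunk identification. It is not any vendor's actual implementation; it assumes fixed-size chunks and SHA-256, and the chunk size and store structure are placeholders for illustration.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size; real products vary


def deduplicate(data: bytes, store: dict) -> list:
    """Index chunks by their SHA-256 digest; identical chunks are stored once."""
    refs = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:       # new, unique chunk: keep the bytes
            store[digest] = chunk
        refs.append(digest)           # duplicate chunks keep only a reference
    return refs
```

The compromise described above shows up in the lookup: each new chunk costs one digest computation and one index check, rather than a byte-by-byte comparison against every chunk already stored. The residual risk is the chance, vanishingly small with modern algorithms, that two different chunks produce the same digest.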

Other companies take a different approach. ExaGrid, for example, cuts data into large segments, analyzes the contents of each segment to see how they relate to one another, and then performs byte-level differencing on each segment and stores the data that way. A rough sketch of that idea follows.
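The following is a simplified illustration of byte-level differencing, not ExaGrid's actual algorithm: a new segment is stored as a delta against a similar existing segment, keeping only the byte ranges that changed.

```python
from difflib import SequenceMatcher


def byte_level_delta(base: bytes, new: bytes) -> list:
    """Record the new segment as copy/data operations against a similar base segment."""
    matcher = SequenceMatcher(None, base, new, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == 'equal':
            delta.append(('copy', i1, i2))        # unchanged bytes: reference the base
        else:
            delta.append(('data', new[j1:j2]))    # changed bytes: store them literally
    return delta


def apply_delta(base: bytes, delta: list) -> bytes:
    """Rebuild the new segment from the base segment plus the stored delta."""
    parts = []
    for op in delta:
        if op[0] == 'copy':
            parts.append(base[op[1]:op[2]])
        else:
            parts.append(op[1])
    return b''.join(parts)
```

Because the delta only refers to byte ranges, this style of approach avoids relying on hash identity between whole chunks; it instead spends effort finding a related segment to diff against.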

Check out the entire Data Deduplication FAQ.

This was last published in December 2007
