Three or four years ago, there was something of a deduplication arms race going on. Vendors competed fiercely to achieve the highest possible deduplication ratio. Just as one vendor would advertise a 20:1 ratio, another would issue a press release stating it had achieved a 50:1 ratio.
Today, these figures are hardly even worth considering. For all practical purposes, vendor-advertised deduplication ratios have become meaningless for two main reasons:
- When a vendor states it can achieve a 50:1 deduplication ratio, that number indicates a best-case situation. In the real world, deduplication ratios are often much lower than what vendors advertise as being possible. Remember, deduplication works by removing redundant data. If no redundant data exists, then deduplication is impossible. Some types of data are already compressed and therefore contain very little redundancy. This is especially true of media files such as MPEG videos or JPEG images.
- As the ratio increases, the data deduplication process yields diminishing returns. For example, if you deduplicate 1 TB (1,024 GB) of data, a 2:1 deduplication ratio (which is very low) eliminates half the data (512 GB). By the time you get to a 20:1 ratio, 95% of the data has been eliminated and your 1 TB of data has been reduced to a mere 51.2 GB. Increasing the ratio to 25:1 cannot eliminate much more, because most of the redundancy is already gone: 96% of the data is eliminated, leaving about 41 GB. Moving from a 20:1 to a 25:1 ratio therefore removes only one more percentage point of the original data, or approximately 10 GB, which is insignificant compared to the original 1 TB. The reductions become increasingly insignificant as the deduplication ratios get larger.
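The diminishing returns described above can be checked with a little arithmetic. The following is a minimal Python sketch, not anything a vendor ships: the function names `remaining_gb` and `percent_eliminated` are made up for illustration, and it assumes 1 TB = 1,024 GB and that the stated ratio is actually achieved on the whole data set.

```python
def remaining_gb(original_gb: float, ratio: float) -> float:
    """Return the data left after deduplication at ratio:1 (hypothetical helper)."""
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    return original_gb / ratio


def percent_eliminated(ratio: float) -> float:
    """Return the percentage of the original data removed at ratio:1."""
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    return (1 - 1 / ratio) * 100


if __name__ == "__main__":
    ORIGINAL_GB = 1024  # assumption: 1 TB expressed as 1,024 GB

    # Print how much data is left, and how much was removed, at each ratio.
    for ratio in (2, 5, 10, 20, 25, 50):
        left = remaining_gb(ORIGINAL_GB, ratio)
        gone = percent_eliminated(ratio)
        print(f"{ratio:>2}:1  eliminated {gone:5.1f}%  remaining {left:7.2f} GB")
```

Under these assumptions, going from 20:1 to 25:1 frees only about 10 GB more, and even a 50:1 ratio leaves roughly 20 GB, just 31 GB less than 20:1 does.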