Three or four years ago, there was something of a deduplication arms race going on. Vendors competed fiercely to achieve the highest possible deduplication ratio. Just as one vendor would advertise a 20:1 ratio, another would issue a press release stating it had achieved a 50:1 ratio.
Today, these figures are hardly even worth considering. For all practical purposes, vendor-advertised deduplication ratios have become meaningless for two main reasons:
- When a vendor states it can achieve a 50:1 deduplication ratio, that number indicates a best-case situation. In the real world, deduplication ratios are often much lower than what vendors advertise as being possible. Remember, deduplication works by removing redundant data. If no redundant data exists, then deduplication is impossible. Some types of data are already compressed and therefore contain very little redundancy. This is especially true of media files such as MPEG videos or JPEG images.
- As the ratio increases, the data deduplication process yields diminishing returns. For example, if you deduplicate 1 TB (1,024 GB) of data, a 2:1 deduplication ratio (which is very low) eliminates half the data, or 512 GB. By the time you get to a 20:1 ratio, 95% of the data has been eliminated and your 1 TB of data has been reduced to a mere 51.2 GB. If you increase the deduplication ratio to 25:1, there is not much more data left to eliminate because most of the redundancy has already been removed: 96% of the data is gone and about 41 GB remains. Moving from a 20:1 to a 25:1 ratio therefore reduces the data by only another 1% and the data volume by approximately 10 GB, which is insignificant compared to the original 1 TB. The data reductions become increasingly insignificant as the deduplication ratios get larger.
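The diminishing returns described above follow from simple arithmetic: an N:1 ratio leaves 1/N of the original data. The short Python sketch below reproduces the figures for a few ratios. The function names and the choice of 1,024 GB for 1 TB are assumptions made for illustration; they do not come from any vendor tool.

```python
def remaining_gb(original_gb: float, ratio: float) -> float:
    """Return the data volume left after deduplication at an N:1 ratio."""
    if ratio < 1:
        raise ValueError("ratio must be at least 1 (1:1 means no reduction)")
    return original_gb / ratio


def percent_eliminated(ratio: float) -> float:
    """Return the percentage of the original data removed at an N:1 ratio."""
    if ratio < 1:
        raise ValueError("ratio must be at least 1 (1:1 means no reduction)")
    return (1 - 1 / ratio) * 100


if __name__ == "__main__":
    ORIGINAL_GB = 1024  # assumption: 1 TB counted as 1,024 GB
    for ratio in (2, 10, 20, 25, 50):
        left = remaining_gb(ORIGINAL_GB, ratio)
        gone = percent_eliminated(ratio)
        print(f"{ratio:>2}:1  eliminated {gone:5.1f}%  remaining {left:6.1f} GB")
```

Running it shows how little is gained at the high end: the step from 2:1 to 10:1 removes roughly another 400 GB, while the step from 20:1 to 25:1 removes only about 10 GB.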