Understanding data deduplication ratios in backup systems


Lauren Whitehouse
05.11.2009

The effectiveness of data deduplication is often expressed as a deduplication or reduction ratio, denoting the ratio of protected capacity to the actual physical capacity stored. A 10:1 ratio means that 10 times more data is protected than the physical space required to store it, and a 20:1 ratio means that 20 times more data can be protected. Factoring in data growth and retention, and assuming deduplication ratios in the 20:1 range, 2 TB of storage capacity could protect up to 40 TB of retained backup data.

How are these data deduplication ratios determined? The ratio is calculated by taking the total capacity of data to back up (i.e., the data that will be examined for duplicates) and dividing it by the actual capacity used (i.e., the deduplicated amount of data).
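The calculation above can be sketched in a few lines of Python. This is only an illustration: the function names dedupe_ratio and protected_capacity are hypothetical, and the 2 TB and 40 TB figures are the ones used earlier in this tip.

```python
def dedupe_ratio(protected_tb: float, stored_tb: float) -> float:
    """Return N for an N:1 ratio: data examined divided by capacity actually used."""
    if stored_tb <= 0:
        raise ValueError("stored capacity must be greater than zero")
    return protected_tb / stored_tb


def protected_capacity(physical_tb: float, ratio: float) -> float:
    """Return how much backup data a given physical capacity can protect at N:1."""
    return physical_tb * ratio


if __name__ == "__main__":
    # 40 TB of retained backup data held on 2 TB of disk
    print(f"{dedupe_ratio(40, 2):g}:1")
    # 2 TB of disk at an assumed 20:1 ratio
    print(f"{protected_capacity(2, 20):g} TB protected")
```

Both values are measured over the same retention window; comparing one night's backup with the total stored across a month would overstate or understate the ratio.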

What's a realistic data dedupe ratio?

Enterprise Strategy Group (ESG) research found that, of respondents currently using data deduplication technology, approximately one-third (33%) have experienced less than a 10-times reduction in capacity requirements, 48% report a 10-times to 20-times reduction, and 18% report reductions ranging from 21 times to more than 100 times.

Several factors influence deduplication ratios, including:

  • Data backup policies: the greater the frequency of "full" backups (versus "incremental" or "differential" backups), the higher the deduplication potential since data will be redundant from day to day.
  • Data retention settings: the longer data is retained on disk, the greater the opportunity for the deduplication engine to find redundancy.
  • Data type: some data types are inherently more prone to duplication than others. Higher deduplication ratios are more reasonable to expect if the environment contains primarily Windows servers with similar files, or VMware virtual machines.
  • Rate of change: the smaller the rate of change, the higher the likelihood of finding duplicate data.
  • Deduplication domain: the wider the scope of the inspection and comparison process, the higher the likelihood of detecting duplicates. Local deduplication refers to the examination of redundancy at the local resource, while global deduplication refers to inspecting data across multiple sources to locate and eliminate duplicates.

For example, a daily full backup of data changing at a rate of 1% or less that is retained for 30 backups has at least 99% of every backup duplicated. After 30 days, the ratio could approach 30:1. If, on the other hand, only weekly full backups were retained for a month, the ratio would reach only about 4:1.
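The retention example can be checked with a rough model. The sketch below assumes a deliberately simplified picture that is not taken from any vendor's product: every full backup is the same size, the first one is stored in full, and each later one adds only its changed fraction as new data. The function name full_backup_ratio is hypothetical.

```python
def full_backup_ratio(num_backups: int, change_rate: float) -> float:
    """Estimate the N in N:1 for repeated full backups of one data set.

    Simplified model: each full backup protects 1 unit of data; the first is
    stored whole and every later backup stores only its changed fraction.
    """
    if num_backups < 1:
        raise ValueError("at least one backup is required")
    if not 0.0 <= change_rate <= 1.0:
        raise ValueError("change_rate must be between 0 and 1")
    protected = float(num_backups)
    stored = 1.0 + (num_backups - 1) * change_rate
    return protected / stored


if __name__ == "__main__":
    # 30 daily fulls: the upper bound (no change) and a 1% change between backups
    print(f"{full_backup_ratio(30, 0.0):.1f}:1")
    print(f"{full_backup_ratio(30, 0.01):.1f}:1")
    # 4 weekly fulls retained for a month, upper bound
    print(f"{full_backup_ratio(4, 0.0):.1f}:1")
```

Under this model the 30:1 and 4:1 figures are ceilings reached only when nothing changes between backups; at a full 1% change per backup, 30 retained fulls work out to roughly 23:1.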

Deduplication rates can be confusing. Some vendors express reduction as a percentage of savings instead of a ratio. If a vendor cites a 50% capacity savings, it's equivalent to a 2:1 deduplication ratio. A ratio of 10:1 is the same as 90% savings, which means that 10 TB of data can be backed up to 1 TB of physical storage capacity. A 20:1 ratio increases the savings by only five percentage points (to 95%).
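Converting between the two ways of quoting reduction is simple arithmetic: savings = (1 - 1/ratio) x 100. A minimal sketch follows, with hypothetical function names, for comparing vendor claims on the same scale.

```python
def ratio_to_savings(ratio: float) -> float:
    """Convert an N:1 deduplication ratio to percent capacity savings."""
    if ratio < 1:
        raise ValueError("ratio must be at least 1 (1:1 means no reduction)")
    return (1.0 - 1.0 / ratio) * 100.0


def savings_to_ratio(savings_pct: float) -> float:
    """Convert percent capacity savings to the N in an N:1 ratio."""
    if not 0.0 <= savings_pct < 100.0:
        raise ValueError("savings must be at least 0 and below 100 percent")
    return 1.0 / (1.0 - savings_pct / 100.0)


if __name__ == "__main__":
    for ratio in (2, 10, 20):
        print(f"{ratio}:1 -> {ratio_to_savings(ratio):.0f}% savings")
    print(f"50% savings -> {savings_to_ratio(50):.0f}:1")
```

Savings flatten quickly as the ratio climbs: doubling the ratio from 10:1 to 20:1 halves the disk needed, but moves the savings figure only from 90% to 95%.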

Evaluating a dedupe product

When evaluating data deduplication, it's important to trial vendors' products in your environment with your own data over several backup cycles to determine a product's impact on your backup/recovery environment. Reduction ratios should carry less weight as a decision factor. ESG research (ESG Research Report, "Data Protection Market Trends," January 2008) found that, not surprisingly, the cost of the deduplication solution was the most frequently cited factor (although savings garnered from capacity reduction often overcome financial objections to deploying deduplication). Beyond cost, the survey data suggests that ease of deployment, ease of use and the impact on backup/recovery performance were important considerations, more so than technical details such as the deduplication ratio.

About this author: Lauren Whitehouse is an analyst with Enterprise Strategy Group and covers data protection technologies. Lauren is a 20-plus-year veteran in the software industry, formerly serving in marketing and software development roles.


All Rights Reserved, Copyright 2008 - 2009, TechTarget