Data deduplication technology: The business case for dedupe


Pierre Dorion

Data deduplication is a relatively new technology that has made its way into many data storage environments. But what makes it a justified expenditure in one environment will not necessarily hold true in all cases; you need to understand whether dedupe will fill a gap, help you meet a requirement or reduce costs. Storage vendors are typically better at finding a need for their technology in your environment than at finding a technology that will actually meet your needs. Beware of vendor ROI calculators that spew out fantastic dedupe savings, as mileage will most certainly vary.

When talking about building a business case for dedupe, the term expenditure is preferred here over investment because we are talking about backups. Rarely does data backup technology generate revenue unless it's used by a backup service provider. For most companies, backups are a way to prevent losses, so the mindset is around saving money. You don't really hear about "investing in a backup technology to increase revenue." Cost reduction is therefore a good place to start in order to build a solid business case for data deduplication.

What are you trying to solve with data deduplication?

What are you trying to solve with dedupe? This should be the first question asked. While there is nothing wrong with adopting new technologies and improving the way certain IT processes work, obtaining funding is always easier when the project is aimed at cutting costs or fixing something that is failing to meet requirements. Here are some pros and cons that can help build a case for deduplication.

Advantages of data deduplication

Remote offices: Deduplication can help address a common situation for remote offices, where no one onsite has the skills to manage backups. Using a dedupe-capable disk array as the primary target to store backup data eliminates the need to ensure a tape is always available and to have someone mount a tape for restores. Add to that the ability to replicate deduplicated data across the WAN and you have a backup solution with low management overhead. Additionally, replicating deduplicated data across the WAN reduces the network bandwidth requirement, making this a cheaper alternative to disk mirroring. This does not necessarily translate into immediate savings over tape, but it can eliminate frequently failing or missed backups.
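
To put rough numbers on the bandwidth point, here is a minimal back-of-the-envelope sketch. The function name, the 500 GB backup size, the 10 Mbit/s link and the 10:1 ratio are all assumptions chosen for illustration, and treating replication traffic as "backup size divided by reduction ratio" is a simplification that ignores protocol overhead and metadata.

```python
def replication_hours(backup_gb, link_mbit_s, reduction_ratio=1.0):
    """Hours needed to push one backup across a WAN link.

    Hypothetical helper: assumes the link is fully available and that
    only backup_gb / reduction_ratio actually crosses the wire.
    """
    sent_gb = backup_gb / reduction_ratio
    sent_megabits = sent_gb * 8 * 1000  # decimal units: 1 GB = 8,000 Mbit
    return sent_megabits / link_mbit_s / 3600


if __name__ == "__main__":
    # Illustrative figures only, not measurements from any environment.
    print(f"raw:     {replication_hours(500, 10):.1f} h")
    print(f"deduped: {replication_hours(500, 10, reduction_ratio=10):.1f} h")
```

Plugging in your own backup size, link speed and an honestly measured reduction ratio shows quickly whether replication fits the backup window.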

Data deduplication and duplicate files: Eliminating duplicate files is one of the most appealing reasons for data deduplication. Environments with large amounts of duplicate or similar files have a lot to gain from a storage cost-reduction perspective. Deduplication yields the best data reduction results when it encounters large volumes of identical data segments. In instances where full backups are frequent and data change rates are moderate to low, data reduction can be very impressive and can result in significant storage savings. A data reduction ratio between 5:1 and 10:1 is not uncommon, but ratios of 20:1 and higher have been observed in some environments.
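
The relationship between repeated full backups, change rate and reduction ratio can be sketched with a toy model. This is not how any particular product works: it assumes fixed-size segments fingerprinted with SHA-256, and the segment size, segment count and change rate are invented parameters.

```python
import hashlib
import random

SEGMENT_SIZE = 1024  # assumed fixed segment size in bytes


def random_segment(rng):
    """Return SEGMENT_SIZE pseudo-random bytes standing in for file data."""
    return rng.getrandbits(8 * SEGMENT_SIZE).to_bytes(SEGMENT_SIZE, "big")


def simulate_full_backups(count, segments=200, changed_per_run=2, seed=0):
    """Build `count` full backup images with a few segments changing per run."""
    rng = random.Random(seed)
    data = [random_segment(rng) for _ in range(segments)]
    images = []
    for _ in range(count):
        images.append(b"".join(data))
        # Overwrite a few segments to model the change rate before the next run.
        for index in rng.sample(range(segments), changed_per_run):
            data[index] = random_segment(rng)
    return images


def reduction_ratio(images):
    """Logical bytes received divided by unique bytes actually stored."""
    seen = set()  # fingerprints of segments already on disk
    logical = stored = 0
    for image in images:
        for offset in range(0, len(image), SEGMENT_SIZE):
            segment = image[offset:offset + SEGMENT_SIZE]
            logical += len(segment)
            fingerprint = hashlib.sha256(segment).digest()
            if fingerprint not in seen:
                seen.add(fingerprint)
                stored += len(segment)
    return logical / stored


if __name__ == "__main__":
    for runs in (1, 5, 10, 20):
        ratio = reduction_ratio(simulate_full_backups(runs))
        print(f"{runs:2d} full backups -> {ratio:.1f}:1")
```

With these made-up parameters (1% of segments changing between runs), a single backup gets no reduction at all and the ratio climbs as more full backups are retained. Real data will behave differently, which is the reason to measure rather than trust a calculator.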

Reduced media handling: For environments still needing tape operators and racks to store media because the tape library is near capacity, deduplication offers a great opportunity to reduce media handling, allowing resources to be redeployed in other areas where they are needed. Once more, the ability to replicate data to a remote site after it has been deduped can eliminate the need for offsite media handling without requiring major network bandwidth to meet backup windows. Organizations with at least two locations already connected via a network link can leverage replication of deduplicated data without significant capital expenditure while reducing their offsite storage budget and reallocating resources to more productive tasks.

Space reclamation: Given the cost of data center space, it may make a lot of sense to reclaim some of the space occupied by a very large tape library and replace it with reduced-footprint, dedupe-capable disk arrays.

Tape upgrade: Any organization considering a tape technology update should seriously consider disk deduplication. While it does not necessarily make financial sense to rip and replace a tape subsystem that is still meeting requirements, the need for a technology update always offers an opportunity to evaluate other options.

The disadvantages of data deduplication

Data type: Not all data is a good candidate for deduplication; image, video, audio and other types of compressed data will gain little from deduplication.

Encryption: For security-minded organizations that implement data encryption at the source, deduplication at the backup level is not the best choice, as encryption's first job is to make data unrecognizable without the keys. This nullifies most benefits of deduplication unless encryption is applied post-deduplication.
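
The effect of source-side encryption can be seen in a small sketch. It assumes fixed-size segments and an encryption scheme that uses a fresh random nonce for every backup; `toy_encrypt` is a hypothetical stand-in built from SHA-256 purely for illustration and must not be used to protect real data.

```python
import hashlib
import os

SEGMENT_SIZE = 4096  # assumed fixed segment size in bytes


def toy_encrypt(data, key):
    """Toy stream cipher with a random nonce. Illustration only, NOT secure."""
    nonce = os.urandom(16)
    out = bytearray()
    for offset in range(0, len(data), 32):
        # Derive 32 keystream bytes per block from key, nonce and position.
        keystream = hashlib.sha256(key + nonce + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], keystream))
    return nonce + bytes(out)


def count_segments(images):
    """Return (total segments, unique segments) across all backup images."""
    digests = [
        hashlib.sha256(image[i:i + SEGMENT_SIZE]).digest()
        for image in images
        for i in range(0, len(image), SEGMENT_SIZE)
    ]
    return len(digests), len(set(digests))


if __name__ == "__main__":
    key = os.urandom(32)
    file_data = os.urandom(SEGMENT_SIZE * 16)  # one "file" of 16 segments

    # Two backups of the same unchanged file, in the clear and encrypted.
    print("plain    :", count_segments([file_data, file_data]))
    print("encrypted:", count_segments([toy_encrypt(file_data, key),
                                        toy_encrypt(file_data, key)]))
```

In the clear, the second backup adds no new segments. Once each backup is encrypted before it reaches the dedupe engine, the two copies no longer share any segments, so there is nothing left to eliminate.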

Transient data: Data with very low retention parameters will typically see a poor dedupe or reduction ratio. This is because deduplication needs to build a base of identical data segments before it really becomes effective. Pass-through or very short-term retention data does not typically reside long enough on the storage array to allow dedupe algorithms to build history. Deduplication is definitely better suited for longer term retention.

Deduplication misconceptions

Deduplication-capable virtual tape libraries (VTLs) should not be considered an endless source of tape devices. While manufacturers may enable you to configure 128 logical tape drives or more, this does not automatically translate into a massive performance gain. For example, streaming data to more than 100 virtual tape drives over a gigabit link will still not exceed gigabit performance. You may find yourself with the same performance bottleneck tens of thousands of dollars later.
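
The arithmetic behind the gigabit example is simple enough to sketch. The function name is hypothetical, and the calculation assumes all virtual drives share a single link equally and ignores protocol overhead, so real figures would be somewhat lower.

```python
def per_drive_mb_s(link_gbit_s, drive_count):
    """Upper bound on MB/s per virtual drive when all drives share one link."""
    link_mb_s = link_gbit_s * 1000 / 8  # 1 Gbit/s is 125 MB/s before overhead
    return link_mb_s / drive_count


if __name__ == "__main__":
    for drives in (1, 10, 100, 128):
        print(f"{drives:3d} drives -> {per_drive_mb_s(1, drives):7.2f} MB/s each")
```

Adding drives only divides the same 125 MB/s into smaller slices; the aggregate never rises above what the link can carry.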

Many vendors will leverage the fact that deduplication-capable disk arrays can be faster than tape, but there are still limitations. Data deduplication to disk is not mirroring or snapshot technology; data must be reassembled and, if managed by a backup product, it must also be written back to a file system in a format that is readable by the applications accessing it. Depending on the deduplication technology in use, performance for large restore operations can also be disappointing.

Deduplication should be presented like any other technology. Unless using it will address the shortcomings of another technology or truly help reduce operational costs beyond the initial capital cost over the solution's usable life, it will be a tough sell.

About this author: Pierre Dorion is the Data Center Practice Director and a Senior Consultant with Long View Systems Inc. in Phoenix, AZ, specializing in the areas of business continuity and disaster recovery planning services, and corporate data protection.


All Rights Reserved, Copyright 2008 - 2010, TechTarget