This content is part of the Essential Guide: Complete guide to backup deduplication

Target versus source deduplication today

Rachel Dines, analyst with Forrester Research, discusses target and source deduplication today, including the pros and cons of each.

What are the pros and cons of source and target deduplication?

With source deduplication, hashing and processing occur on the client itself before data is transmitted over the network. Because deduplication occurs at the source, less data is transmitted over the network and ultimately stored. However, it does add some processing overhead on the client. How much overhead varies by vendor, but it usually ranges from 15% to 25%. Source-based deduplication is especially useful in highly virtualized environments and branch-office environments where bandwidth is scarce, but it usually isn't suitable for high-transaction environments.

With target deduplication, hashing and processing occur on a media server, a proxy server or the disk appliance. Because deduplication occurs on the target side, it does not reduce the amount of data transferred from the client, but it does not add any processing overhead to the client, either.
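To make the trade-off concrete, here is a minimal Python sketch that contrasts the two approaches. It is an illustration under stated assumptions, not how any particular vendor implements deduplication: the `ChunkStore` class and both backup functions are hypothetical, chunks are fixed-size (many products use variable-size chunking), and the small hash lookups a source-side client sends to the target are not counted as network traffic.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; unrealistically small so the example is easy to follow


def chunks(data: bytes, size: int = CHUNK_SIZE):
    """Split data into fixed-size chunks (an assumption made for simplicity)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class ChunkStore:
    """Hypothetical backup target that keeps one copy of each unique chunk."""

    def __init__(self):
        self.by_hash = {}

    def has(self, digest: str) -> bool:
        return digest in self.by_hash

    def put(self, digest: str, chunk: bytes) -> None:
        # Store the chunk only if this hash has not been seen before.
        self.by_hash.setdefault(digest, chunk)

    def stored_bytes(self) -> int:
        return sum(len(c) for c in self.by_hash.values())


def source_dedup_backup(data: bytes, store: ChunkStore) -> int:
    """Client hashes each chunk and sends only chunks the store lacks.

    Returns the payload bytes sent over the network.
    """
    sent = 0
    for chunk in chunks(data):
        digest = hashlib.sha256(chunk).hexdigest()  # CPU cost paid on the client
        if not store.has(digest):
            store.put(digest, chunk)
            sent += len(chunk)
    return sent


def target_dedup_backup(data: bytes, store: ChunkStore) -> int:
    """Client sends everything; the target hashes and discards duplicates."""
    sent = 0
    for chunk in chunks(data):
        sent += len(chunk)  # every chunk crosses the network
        digest = hashlib.sha256(chunk).hexdigest()  # CPU cost paid on the target
        store.put(digest, chunk)
    return sent


data = b"AAAABBBBAAAACCCCAAAABBBB"  # 6 chunks, 3 of them unique
src_store, tgt_store = ChunkStore(), ChunkStore()
src_sent = source_dedup_backup(data, src_store)
tgt_sent = target_dedup_backup(data, tgt_store)

print("bytes sent   (source, target):", src_sent, tgt_sent)
print("bytes stored (source, target):", src_store.stored_bytes(), tgt_store.stored_bytes())
```

In this toy run both approaches store the same 12 bytes, but the source-side client sends 12 bytes while the target-side client sends all 24. The difference is where the hashing work happens and how much data crosses the network.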

Over the past few years, many enterprise backup software products have evolved to include both source and target deduplication. The deduplication rates among vendors depend on the type of algorithm used, but they tend to be very competitive. On average, I see companies getting deduplication ratios of around 7:1 to 10:1, depending on backup schema, end-user patterns and data types. For example, companies that use "incremental forever" backups -- where they take one full backup at installation and, from then on, take only incremental backups -- see less dramatic deduplication ratios, often in the range of 4:1 to 6:1. This doesn't necessarily mean they are storing more data than their peers using a traditional weekly full, nightly incremental approach, since "incremental forever" backups are innately more space-efficient.
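A short worked example shows why a lower ratio need not mean more data on disk. The figures below are invented for illustration and do not come from the analyst or any vendor: 10 TB of primary data, 0.2 TB of daily change, 28 days of retention, and an assumed 3:1 reduction within the unique data itself. The deduplication ratio is taken here as logical data backed up divided by physical data stored.

```python
# All inputs are illustrative assumptions, not measured values.
primary_tb = 10.0        # size of one full backup
daily_change_tb = 0.2    # size of one incremental backup
days = 28                # retention window, one backup per day
inner_reduction = 3.0    # assumed reduction within the unique data itself


def dedup_ratio(logical_tb: float, physical_tb: float) -> float:
    """Logical data sent to backup divided by physical data actually stored."""
    return logical_tb / physical_tb


# Weekly full + nightly incremental: 4 fulls and 24 incrementals in 28 days.
weekly_logical = 4 * primary_tb + 24 * daily_change_tb

# Incremental forever: 1 full, then 27 incrementals.
forever_logical = 1 * primary_tb + (days - 1) * daily_change_tb

# Under these assumptions both schemes end up holding the same unique data:
# the initial full plus 27 days of changes, reduced by the inner factor.
unique_tb = primary_tb + (days - 1) * daily_change_tb
weekly_physical = unique_tb / inner_reduction
forever_physical = unique_tb / inner_reduction

weekly_ratio = dedup_ratio(weekly_logical, weekly_physical)
forever_ratio = dedup_ratio(forever_logical, forever_physical)

print(f"weekly full:         {weekly_ratio:.1f}:1 on {weekly_physical:.2f} TB stored")
print(f"incremental forever: {forever_ratio:.1f}:1 on {forever_physical:.2f} TB stored")
```

With these made-up inputs the weekly-full scheme reports roughly 8.7:1 and incremental forever reports 3.0:1, yet both occupy the same physical capacity. The weekly-full ratio is higher only because that scheme sends far more redundant data to be deduplicated in the first place.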


1 comment


Would love to see a real analysis one day of what kinds of ratios can actually be seen.
Is it difficult to take 10 TB of data (structured and unstructured), give a percentage for each, and benchmark the major players over 15 or 30 days? Is there something to hide? Dedup is not new, and there is still no real competitive analysis from experts. Miguel