Essential Guide

Complete guide to backup deduplication

A comprehensive collection of articles, videos and more, hand-picked by our editors

Can your deduplication process be improved with semantic dedupe?

While semantic deduplication is a legacy technology, a modern global deduplication process can perform a similar job. Expert Brien Posey compares the two approaches.


Semantic deduplication, more commonly known as semantic-aware multi-tiered deduplication, or SAM, is a legacy deduplication process that has existed since at least 2010.

The technology was designed to strike a balance between deduplication ratios and the overhead of the deduplication process. When semantic deduplication was introduced, achieving high deduplication ratios meant committing significant hardware resources to deduplication, which tended to degrade performance. The more performance-friendly deduplication algorithms of the time achieved lower deduplication ratios.

Striking a balance between deduplication and performance is much less of an issue than it was five years ago. Deduplication is now a mature technology, and much of the overhead produced by the deduplication process can be countered by more efficient deduplication algorithms and by hardware offloading.

Semantic deduplication worked by taking a multi-tier approach to the deduplication process. File data was globally deduplicated in an effort to remove redundant files. Data was also deduplicated locally at the chunk level. This multi-tier approach resulted in higher deduplication ratios with less overhead.
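The multi-tier idea can be sketched in a few lines of Python. This is a minimal illustration only, not how any shipping product is implemented: it assumes fixed-size chunking and SHA-256 hashes, where real deduplication systems typically use content-defined chunking and persistent indexes.

```python
import hashlib

def dedupe_two_tier(files, chunk_size=4096):
    """Illustrative two-tier dedup: whole-file pass, then chunk pass.

    `files` maps a file name to its contents (bytes). Returns the
    surviving unique chunks as a dict of chunk hash -> chunk bytes.
    """
    file_index = {}   # tier 1: whole-file hash -> first file name seen
    chunk_store = {}  # tier 2: chunk hash -> chunk bytes

    for name, data in files.items():
        file_hash = hashlib.sha256(data).hexdigest()
        if file_hash in file_index:
            # Tier 1: an identical file is already stored; skip it entirely.
            continue
        file_index[file_hash] = name
        # Tier 2: chunk the unique file and keep only unseen chunks.
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            chunk_store[hashlib.sha256(chunk).hexdigest()] = chunk
    return chunk_store
```

The first tier cheaply discards exact file duplicates before the more expensive chunk-level pass runs, which is the overhead/ratio trade-off the technique was aiming at.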

Even though semantic deduplication is a legacy technology, a number of products in use today perform global deduplication using a very similar technique. Imagine that 10 servers need to be backed up, and each server runs the same operating system. Performing block-level deduplication on each server would eliminate redundancy within that server, but redundancy would still exist at the backup target because the servers resemble one another. A second deduplication pass at the backup target level can eliminate this cross-server redundancy. While this approach isn't exactly the same as semantic deduplication, the similarities cannot be ignored.
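The two-pass effect described above can be demonstrated with a hypothetical sketch in Python. The function name, fixed-size chunking and in-memory sets are illustrative assumptions, not any vendor's implementation; the point is simply that a shared index at the backup target removes chunks that per-server deduplication cannot see.

```python
import hashlib

def backup_with_global_pass(servers, chunk_size=4096):
    """Illustrative two-pass dedup for a fleet of servers.

    `servers` maps a server name to its backup data (bytes).
    Returns (local_total, global_total): the number of chunks kept
    after per-server dedup, and after the global pass at the target.
    """
    local_sets = []
    for data in servers.values():
        # Pass 1: deduplicate chunks within a single server's backup.
        seen = set()
        for i in range(0, len(data), chunk_size):
            seen.add(hashlib.sha256(data[i:i + chunk_size]).hexdigest())
        local_sets.append(seen)

    # Pass 2: a global index at the backup target merges the per-server
    # sets, discarding chunks duplicated across servers (e.g. shared OS files).
    global_index = set()
    for chunk_hashes in local_sets:
        global_index |= chunk_hashes

    local_total = sum(len(s) for s in local_sets)
    return local_total, len(global_index)
```

With two servers that share identical operating system data but hold different application data, the global pass stores the shared chunks only once, which per-server deduplication alone cannot achieve.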

Next Steps

Available deduplication options

Vendors trying to speed up the deduplication process

Deduplication can benefit your organization

This was last published in November 2015
