Backup with deduplication: An essential guide

Learn about data deduplication best practices in this essential guide on data dedupe and backup technology.

Backup with deduplication is the hottest thing to hit enterprise data storage since disk began popping up in backup configurations. With its ability to stretch usable capacity to accommodate ever-growing data stores, data deduplication might just be the single technology that keeps disk in the picture as a viable alternative to tape for short-term retention of backup data.

Using disk as a backup target for short- or long-term retention is a no-brainer at this point -- backup windows aren't shattered, restores have never been faster or easier, and relatively cheap disk makes a perfect home for data before it gets spun off to tape. It's a pretty picture, but one marred by an ugly reality: There doesn't seem to be an end in sight to the growing amount of data that needs to be backed up.

The idea behind data dedupe technology is simple -- it trims the amount of data to be stored by eliminating the redundancies so typical of backups, allowing you to effectively cram far more data into the same physical space by factors ranging from 10 to 50 times or more.
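To put those factors in perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function names and the 20 TB figure are made up for the example, and the ratios a real product achieves depend on the data and the retention policy.

```python
def logical_capacity_tb(physical_tb: float, dedupe_ratio: float) -> float:
    """Backup data (in TB) that fits on physical_tb of disk at a given ratio.

    A dedupe_ratio of 10 means 10:1, i.e. 10 TB of backups per 1 TB of disk.
    """
    if dedupe_ratio < 1:
        raise ValueError("dedupe_ratio must be at least 1 (1 means no reduction)")
    return physical_tb * dedupe_ratio


def space_saved_percent(dedupe_ratio: float) -> float:
    """Percentage of raw backup data that never has to be stored."""
    if dedupe_ratio < 1:
        raise ValueError("dedupe_ratio must be at least 1 (1 means no reduction)")
    return 100.0 * (dedupe_ratio - 1) / dedupe_ratio


if __name__ == "__main__":
    physical_tb = 20  # hypothetical disk target size
    for ratio in (10, 50):
        print(
            f"{ratio}:1 -> {logical_capacity_tb(physical_tb, ratio):.0f} TB of backups "
            f"on {physical_tb} TB of disk, "
            f"{space_saved_percent(ratio):.1f}% space saved"
        )
```

Note that going from 10:1 to 50:1 moves the space saved from 90% to 98%, so the biggest gains come from the first few multiples of reduction.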

The benefits are apparent and make pitching a data dedupe deployment to upper management a relatively easy exercise. But dedupe does have its finer points, with capabilities, efficiencies and administrative issues varying from product to product, and from one environment to another. To determine the best fit for your backup setup, you'll need to sort through the types of dedupe available and make decisions regarding file or block methods, hash-based systems vs. those that use byte-level comparison techniques and inline vs. post-processing deduplication implementations. Even then, there are still numerous details to work out to get optimal dedupe performance and to restore deduped data in a timely manner. A little homework up front will avoid some grief later, while helping you set reasonable expectations for deduplication.

In this essential guide on backup with data deduplication, learn about getting started with deduplication software, file-level vs. block-level dedupe, problems with restoring deduped data, and other timesaving tips and tricks. Download our free guide on data dedupe technology and backup, and be sure to email SearchDataBackup or ask our data backup experts if you have any questions about data deduplication technology.

--Rich Castagna, Storage Media Group Editorial Director


Top five data dedupe tips: The best way to select, implement and integrate a data deduplication product varies depending on how the deduplication is performed. Read this article to learn about the general principles you can follow to select the right deduplicating approach and then integrate it into your environment. Following these steps will get you the best results when deduplicating backup data.

File-level vs. block-level deduplication: The pros and cons: Data dedupe has dramatically improved the value proposition of disk-based data protection, as well as WAN-based remote and branch-office backup consolidation and disaster recovery (DR) strategies. It identifies duplicate data, removing redundancies and reducing the overall volume of data transferred and stored. Some deduplication approaches operate at the file level, while others go deeper to examine data at a sub-file or block level. Determining uniqueness at either level offers benefits, but the results will vary. The differences lie in the amount of reduction each approach produces, and the time each method takes to determine what is unique.
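The difference between the two approaches can be sketched in Python. This is a minimal, hash-based illustration rather than how any particular product works: it assumes SHA-256 fingerprints, fixed-size blocks and an unrealistically small 4-byte block size so the sample data stays readable. The function names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # bytes; far smaller than any real system, chosen for readability


def file_level_stored_bytes(files: list[bytes]) -> int:
    """Store each file once; a file counts as a duplicate only if it matches entirely."""
    seen: set[bytes] = set()
    stored = 0
    for data in files:
        fingerprint = hashlib.sha256(data).digest()
        if fingerprint not in seen:
            seen.add(fingerprint)
            stored += len(data)
    return stored


def block_level_stored_bytes(files: list[bytes], block_size: int = BLOCK_SIZE) -> int:
    """Split files into fixed-size blocks and store each unique block once."""
    seen: set[bytes] = set()
    stored = 0
    for data in files:
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            fingerprint = hashlib.sha256(block).digest()
            if fingerprint not in seen:
                seen.add(fingerprint)
                stored += len(block)
    return stored


if __name__ == "__main__":
    # Two identical files plus a third that differs only in its last block.
    files = [b"AAAABBBBCCCC", b"AAAABBBBCCCC", b"AAAABBBBDDDD"]
    print("raw bytes:        ", sum(len(f) for f in files))
    print("file-level bytes: ", file_level_stored_bytes(files))
    print("block-level bytes:", block_level_stored_bytes(files))
```

In this sample, file-level dedupe eliminates only the exact copy, while block-level dedupe also avoids storing the unchanged blocks of the modified file. The trade-off is more fingerprints to compute and look up, which is where the difference in processing time comes from.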

The truth about data deduplication: Dispelling data dedupe myths: Exaggerated claims, rapidly changing technology and persistent myths make data deduplication treacherous territory. But the rewards of a successful dedupe installation are indisputable. Sorting out the deduplication myths is just the first part of a backup administrator's or storage manager's job. The tips in this article will help managers deploy deduplication while avoiding common pitfalls.

Restoring deduped data: The capacity reduction achieved through data deduplication reduces network traffic. Depending on where the deduplication occurs, this can reduce the volume of data transferred over a LAN, SAN or WAN, and make it more practical for organizations to implement backup consolidation for remote/branch offices and offsite replication for disaster recovery protection. Both scenarios introduce significant improvements over tape-based strategies where media has to be physically handled and transferred between sites. Learn about the problems surrounding the restoration of deduped data in this article.
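The effect on network traffic when deduplication happens at the source can be sketched as follows. This is a simplified Python model under stated assumptions, not a real backup protocol: the `Target` class, the `backup` function and the 4-byte block size are all hypothetical, and the "network" is just a count of block bytes handed to the target (the fingerprints exchanged are not counted).

```python
import hashlib


class Target:
    """Hypothetical backup target that keeps one copy of each unique block."""

    def __init__(self) -> None:
        self.store: dict[str, bytes] = {}

    def missing(self, fingerprints: list[str]) -> list[str]:
        """Return the fingerprints the target has not stored yet."""
        return [fp for fp in fingerprints if fp not in self.store]

    def put(self, fingerprint: str, block: bytes) -> None:
        self.store[fingerprint] = block


def backup(data: bytes, target: Target, block_size: int = 4) -> int:
    """Send only blocks the target lacks; return the block bytes transferred."""
    blocks = {}
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        blocks[hashlib.sha256(block).hexdigest()] = block

    sent = 0
    for fingerprint in target.missing(list(blocks)):
        target.put(fingerprint, blocks[fingerprint])
        sent += len(blocks[fingerprint])
    return sent


if __name__ == "__main__":
    target = Target()
    print("first backup sent: ", backup(b"AAAABBBBCCCC", target), "bytes")
    print("second backup sent:", backup(b"AAAABBBBDDDD", target), "bytes")
```

In this model the second backup transfers only the one changed block, which is the property that makes remote-office consolidation and offsite replication over a WAN more practical. Restores run the other way: the target has to reassemble every block of the requested data, which is part of why restore behavior deserves separate attention.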
