Pros and cons of different deduping methods

Date: Mar 14, 2013

In this Storage Decisions video, Marc Staimer outlines three major types of deduplication -- file, block or blocklet, and content-aware -- and he describes the pros and cons of each method.

"Storage-based, file-based dedupe reduces duplicate files and reduces primary storage consumption. It does it on a file basis. So if you have duplicate files, it will reduce the exact duplicate file. Typically, it's free. You can get it with NetApp; you can get it with EMC; you can get it with a variety of players. Realistically, they're giving it away for free because there are downsides to this technology," said Staimer, president of Dragon Slayer Consulting.

Staimer said the strengths of this form of deduping include its effectiveness in handling duplicate email attachments, duplicate ISO files and golden images. He said it offers a reduction of roughly two-to-one to three-to-one on primary data.

"That's about the best you're going to see on primary data. Secondary data, you see much better reductions. So you just need to be aware of that from that perspective," Staimer said.
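To put the two-to-one and three-to-one figures in terms of capacity, an N-to-one reduction leaves 1/N of the original data on disk, so the fraction saved is 1 - 1/N. The short sketch below only illustrates that arithmetic; the function name is hypothetical and does not come from the talk.

```python
def space_saved_fraction(ratio):
    """Return the fraction of capacity saved by an N-to-one reduction ratio."""
    if ratio < 1:
        raise ValueError("a reduction ratio must be at least 1")
    # N-to-one means only 1/N of the logical data is physically stored.
    return 1 - 1 / ratio

two_to_one = space_saved_fraction(2)    # half of the capacity is saved
three_to_one = space_saved_fraction(3)  # about two-thirds is saved
```

Going from two-to-one to three-to-one therefore adds only about 17 percentage points of savings, which is one reason higher ratios on secondary data matter more for backup than for primary storage.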

He said read and write latencies are higher with this form of deduping than with others, which is why it is frequently performed post-process rather than inline, especially for primary data deduplication.
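The file-level approach described above can be sketched in a few lines: hash each file's full contents and treat files with the same digest as exact duplicates. This is a minimal illustration, not how any vendor's product is implemented. The function names, the in-memory sample files and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
from collections import defaultdict

def find_duplicate_files(files):
    """Group file names by the SHA-256 digest of their full contents."""
    by_digest = defaultdict(list)
    for name, data in files.items():
        by_digest[hashlib.sha256(data).hexdigest()].append(name)
    # Only digests shared by two or more files represent duplicates.
    return [sorted(names) for names in by_digest.values() if len(names) > 1]

def file_level_ratio(files):
    """Logical bytes divided by bytes kept when one copy per digest is stored."""
    logical = sum(len(data) for data in files.values())
    unique = {hashlib.sha256(data).hexdigest(): len(data) for data in files.values()}
    return logical / sum(unique.values())

# Hypothetical in-memory "files"; a real system would read them from storage.
sample_files = {
    "mail/a/report.pdf": b"quarterly report" * 100,
    "mail/b/report.pdf": b"quarterly report" * 100,
    # Same size, but the last 16 bytes differ, so it is not an exact duplicate.
    "mail/c/report_v2.pdf": b"quarterly report" * 99 + b"quarterly REPORT",
}

duplicates = find_duplicate_files(sample_files)
ratio = file_level_ratio(sample_files)
```

Note that the third file differs from the others by only a few bytes yet is stored in full. That is the limitation of working at file granularity: only exact duplicates are reduced.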

Another form of deduping reduces storage consumption by looking not at the file layer but at individual blocks and blocklets, the latter being smaller than 512 bytes. This fine-grained approach provides "excellent" deduplication and is effective with backup data. But storage that includes this dedupe method comes at a cost, Staimer noted.

"Be aware that storage that has this built in tends to carry a premium," he said. "You're going to pay more for it than other storage."
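To illustrate why block-level deduping suits backup data, the sketch below splits a byte stream into fixed-size blocks, stores each distinct block once, and keeps an ordered list of digests for rebuilding the original. It is a simplified example under stated assumptions: the 4,096-byte block size, the fixed-size chunking, SHA-256 and all names are chosen for illustration, and real products may use smaller blocklets or variable-size chunks.

```python
import hashlib

def block_dedupe(data, block_size=4096):
    """Split data into fixed-size blocks and store each distinct block once."""
    store = {}   # digest -> block bytes (one physical copy per distinct block)
    recipe = []  # ordered digests needed to rebuild the original stream
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).digest()
        store.setdefault(digest, block)
        recipe.append(digest)
    return store, recipe

def rebuild(store, recipe):
    """Reassemble the original stream from the block store and recipe."""
    return b"".join(store[digest] for digest in recipe)

# Two hypothetical backups that differ in only one block.
backup_1 = b"A" * 4096 + b"B" * 4096 + b"C" * 4096
backup_2 = b"A" * 4096 + b"X" * 4096 + b"C" * 4096
stream = backup_1 + backup_2

store, recipe = block_dedupe(stream)
stored_bytes = sum(len(block) for block in store.values())
```

Here six logical blocks reduce to four stored blocks, even though the two backups are not identical as whole files. A file-level method would have kept both backups in full.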

A third method -- content-aware storage deduping -- is available through Dell, he said.

"It takes the data … and looks for common storage pieces among the different files. And then it recompresses it in its new format, because you have to have a reader piece of software to read the data after they compress. So it does really well with different file types," said Staimer, who noted that the method requires special reader software to view the data and that deduping must be scheduled outside an organization's normal operating hours.
