Data deduplication serves a dual purpose. It reduces the amount of physical storage used for backup data and, because deduplication is typically implemented at the disk level, it is indirectly a way to replace tape as the primary storage medium for backups.
A common application for deduplication is in remote offices, where there might be a requirement to eliminate tape handling because no IT skills are available on site. Disk is the primary way to meet that requirement, and deduplication makes better use of the more expensive disk storage.
Deduplication solutions also significantly reduce the bandwidth requirements for replicating data to a remote location.
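The bandwidth saving can be illustrated with a minimal sketch. It assumes fixed-size chunks identified by SHA-256 digests and a remote site that keeps an index of the chunks it already holds. The names `chunk_hashes`, `bytes_to_replicate` and `remote_index` are hypothetical, the backup contents are made up for the example, and real products often use variable-size chunking and their own protocols.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, chosen only for illustration


def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[tuple[str, bytes]]:
    """Split data into fixed-size chunks and pair each chunk with its SHA-256 digest."""
    return [
        (hashlib.sha256(data[i:i + chunk_size]).hexdigest(), data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


def bytes_to_replicate(backup: bytes, remote_index: set[str]) -> int:
    """Count the bytes that must be sent, adding newly sent digests to remote_index."""
    sent = 0
    for digest, chunk in chunk_hashes(backup):
        if digest not in remote_index:
            # The remote site has never seen this chunk, so it must be transferred.
            remote_index.add(digest)
            sent += len(chunk)
    return sent


# Hypothetical data: a first backup of ten distinct chunks, then a second
# backup in which only the last chunk has changed.
monday = b"".join(bytes([i]) * CHUNK_SIZE for i in range(10))
tuesday = monday[:-CHUNK_SIZE] + b"\xff" * CHUNK_SIZE

remote_index: set[str] = set()
first_transfer = bytes_to_replicate(monday, remote_index)
second_transfer = bytes_to_replicate(tuesday, remote_index)
print(first_transfer, second_transfer)
```

In this sketch the first backup sends every chunk, while the second sends only the one chunk that changed. That is the effect that reduces replication traffic.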
Other reasons to use disk-based backups include the reliability of RAID-based disk storage and the security of not having tapes with sensitive data leave your environment.
However, there are instances when deduplication is not a good choice. For instance, imaging-type data and data encrypted at the source do not benefit much from deduplication. In those cases the disk has to hold close to the full volume of backup data, which makes disk-based backups a pricier proposition.
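A rough way to see why encrypted data deduplicates poorly is to compare the ratio of total chunks to unique chunks for repetitive data and for random bytes. The sketch below makes several assumptions: fixed-size chunks, SHA-256 digests as chunk identifiers, and `os.urandom` output as a stand-in for well-encrypted data, which has no repeated chunks to find. The `dedup_ratio` function is a hypothetical helper, not any vendor's method.

```python
import hashlib
import os


def dedup_ratio(data: bytes, chunk_size: int = 4096) -> float:
    """Return total chunks divided by unique chunks (1.0 means no savings)."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    if not chunks:
        return 1.0
    unique = {hashlib.sha256(chunk).digest() for chunk in chunks}
    return len(chunks) / len(unique)


# Repetitive data: the same two 4 KB chunks repeated 50 times (100 chunks in total).
repetitive = (b"A" * 4096 + b"B" * 4096) * 50

# Random bytes as a stand-in for data encrypted at the source (100 chunks).
random_like = os.urandom(100 * 4096)

print(dedup_ratio(repetitive))   # 100 chunks, 2 unique
print(dedup_ratio(random_like))  # every chunk is effectively unique
```

Running a measurement along these lines on a sample of your own backup data, before buying, gives a first indication of whether deduplication will pay off.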
Fairly establishing which deduplication solutions are best, and which three vendors lead the field, would be the object of a multi-page white paper rather than this short column. Suffice it to say that most vendors will claim to have the fastest, most effective and most reliable deduplication solution and, to a large degree, they are all telling the truth. Your choice should be based on the features you need, ease of integration with your current environment, and cost. Most importantly, your decision should be based on whether you will actually benefit from implementing deduplication.
This was first published in July 2008