The desire to cut the purchase and energy costs of disk drives, and to reduce the bandwidth required to replicate data among sites (for disaster recovery, for example), has made data deduplication a must-have feature for enterprise virtual tape libraries. Some vendors are also introducing massive array of idle disks (MAID) or spin-down features that power up disks only when data must be read from or written to them.
Deduplication can be done inline (as the data is taken into the virtual tape library) or in post-processing (after the data is written to disk). Inline deduplication reduces the amount of disk space required but can slow performance; post-processing requires more disk space, but can speed backups by allowing the disks to work at full speed as data is written to them.
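The tradeoff between the two approaches can be illustrated with a minimal Python sketch. It assumes fixed-size chunks identified by SHA-256 hashes, which is only one of several ways to deduplicate and is not a description of any vendor's product; the function names and the tiny chunk size are hypothetical and chosen for readability.

```python
import hashlib

CHUNK_SIZE = 4  # unrealistically small, chosen only to keep the example readable


def chunks(data: bytes, size: int = CHUNK_SIZE):
    """Split data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def inline_dedupe(data: bytes):
    """Deduplicate during ingest: only unique chunks ever reach the store."""
    store = {}   # chunk hash -> chunk bytes
    recipe = []  # ordered hashes needed to rebuild the original stream
    for chunk in chunks(data):
        digest = hashlib.sha256(chunk).hexdigest()  # hashing cost is paid during the backup
        store.setdefault(digest, chunk)             # keep only the first copy of each chunk
        recipe.append(digest)
    disk_bytes = sum(len(c) for c in store.values())
    return store, recipe, disk_bytes


def post_process_dedupe(data: bytes):
    """Land the full stream on disk first, then deduplicate afterwards."""
    landing_area = bytes(data)        # full copy written without any hashing delay
    disk_bytes = len(landing_area)    # space occupied before the dedupe pass runs
    store, recipe, _ = inline_dedupe(landing_area)  # same reduction, done later
    return store, recipe, disk_bytes


def restore(store, recipe) -> bytes:
    """Rebuild the original stream from the unique chunks and the recipe."""
    return b"".join(store[d] for d in recipe)


if __name__ == "__main__":
    backup = b"AAAABBBBAAAAAAAACCCC"
    _, _, inline_bytes = inline_dedupe(backup)
    _, _, landed_bytes = post_process_dedupe(backup)
    print("inline disk bytes:", inline_bytes)
    print("post-process landing bytes:", landed_bytes)
```

In this sketch both paths end up with the same unique chunks; the difference is when the hashing work happens and how much disk the data occupies before it is reduced.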
Many vendors offer their own spin on deduplication, seeking to reduce the amount of time, processing power or disk space consumed by the dedupe process. Diligent Technologies Corp. first identifies similarities among the data being backed up, and then submits only those similarities for detailed byte-by-byte deduplication. IBM Corp. will incorporate Diligent's software into its own products, said Tom Grave, Diligent's director of product management. EMC Corp. has added deduplication functions (licensed from Quantum Corp.) to its virtual tape libraries.
Quantum's DXi7500 high-end virtual tape library asks a user to choose parameters such as their backup window and the amount of data they need to back up, and then automatically chooses whether to perform deduplication inline or after the data is backed up, said Mike Sparkes, Quantum's product marketing manager for enterprise disk systems.
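The article does not describe how the DXi7500 makes this choice, so the following Python sketch is purely illustrative. It assumes one simple rule: use inline deduplication if the backup would still finish inside the window at the slower inline ingest rate, and fall back to post-processing otherwise. The function name, parameters and throughput figures are all hypothetical.

```python
def choose_dedupe_mode(data_tb: float, window_hours: float,
                       inline_tb_per_hour: float) -> str:
    """Pick a dedupe mode from the backup size and window (hypothetical rule).

    inline_tb_per_hour is an assumed ingest rate with deduplication running
    during the backup; it is not a published figure for any product.
    """
    if data_tb < 0 or window_hours <= 0 or inline_tb_per_hour <= 0:
        raise ValueError("sizes and rates must be positive")
    hours_needed = data_tb / inline_tb_per_hour
    # Inline saves disk space, so prefer it whenever the window allows.
    return "inline" if hours_needed <= window_hours else "post-process"


if __name__ == "__main__":
    print(choose_dedupe_mode(data_tb=10, window_hours=8, inline_tb_per_hour=2))
    print(choose_dedupe_mode(data_tb=40, window_hours=8, inline_tb_per_hour=2))
```

A real product would likely weigh more factors, such as available landing capacity and replication schedules; this sketch captures only the two parameters the article mentions.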
While many deduplication vendors boast of deduplication ratios of 20:1, said Sparkes, Quantum's customers report data-reduction ratios of anywhere from 5:1 or 6:1 to 30:1 or 40:1. Just how much space deduplication will save a given customer varies based on the amount of data being stored, how long it's kept and how often full backups are done, as well as the deduplication technology being used.
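To put those ratios in concrete terms, the short Python sketch below converts a deduplication ratio into stored capacity and space saved. It also includes a deliberately simplified model, an assumption rather than anything from Quantum, of how retaining more full backups raises the ratio: each full after the first is assumed to add only a fixed fraction of new data. All function names are hypothetical.

```python
def stored_tb(logical_tb: float, ratio: float) -> float:
    """Physical capacity needed for logical_tb of backups at ratio:1."""
    return logical_tb / ratio


def space_saved_fraction(ratio: float) -> float:
    """Fraction of disk space saved at ratio:1 (20:1 saves about 95 percent)."""
    return 1.0 - 1.0 / ratio


def ratio_for_repeated_fulls(num_fulls: int, change_rate: float) -> float:
    """Simplified model: every full backup after the first adds only
    change_rate (for example, 0.05) of new, unique data."""
    logical = float(num_fulls)                        # in units of one full backup
    physical = 1.0 + (num_fulls - 1) * change_rate    # first full plus the changes
    return logical / physical


if __name__ == "__main__":
    for ratio in (5, 20, 40):
        print(f"{ratio}:1 -> {stored_tb(100, ratio)} TB stored for 100 TB of backups")
    for fulls in (4, 12, 52):
        print(f"{fulls} retained fulls -> {ratio_for_repeated_fulls(fulls, 0.05):.1f}:1")
```

Under this model, a single full backup yields no reduction at all, while longer retention of frequent fulls pushes the ratio steadily upward, which is consistent with the wide range of results customers report.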
This article originally appeared in Storage magazine.
About this author: Robert L. Scheier is a frequent contributor to "Storage" magazine.