Deduplication to tape doesn’t seem to have caught on. Why?
I have pretty strong opinions on deduplication products, but the theory behind using dedupe with a disk-based virtual tape library (VTL) is sound. Rather than storing seven full backups, you apply deduplication so that disk holds only the first full copy plus the data that changed afterward. That uses less space.
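To make the space savings concrete, here is a minimal Python sketch of the idea. It is not any vendor’s implementation: the fixed 4-byte chunks, the SHA-256 chunk IDs, and the names `dedupe_store` and `make_week_of_fulls` are all assumptions chosen for illustration. It simulates seven full backups in which one small region changes each day, then compares the raw size with what a chunk pool would actually hold.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; unrealistically small so the example is easy to follow


def dedupe_store(backups, chunk_size=CHUNK_SIZE):
    """Store each unique chunk once; keep a per-backup list of chunk IDs."""
    pool = {}     # chunk ID -> chunk bytes (each unique chunk stored once)
    recipes = []  # one ordered list of chunk IDs per backup
    for data in backups:
        recipe = []
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            pool.setdefault(digest, chunk)  # no-op if the chunk is already stored
            recipe.append(digest)
        recipes.append(recipe)
    return pool, recipes


def make_week_of_fulls():
    """Hypothetical data set: seven fulls, one 4-byte region changed per day."""
    base = bytearray(b"AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHH")
    backups = []
    for day in range(7):
        if day > 0:
            base[day * 4:(day + 1) * 4] = bytes([ord("a") + day]) * 4
        backups.append(bytes(base))
    return backups


backups = make_week_of_fulls()
pool, recipes = dedupe_store(backups)

raw_bytes = sum(len(b) for b in backups)
stored_bytes = sum(len(c) for c in pool.values())
print(f"raw: {raw_bytes} bytes, deduplicated: {stored_bytes} bytes")
```

In this toy case the seven fulls total 224 bytes, while the pool holds 56: the first full copy plus the six changed chunks. Real products use much larger or variable-size chunks, and the ratio depends entirely on how much your data changes between backups.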
The real issue with dedupe in the companies I visit is that it has never been tested in court. If you are publicly traded, the SEC requires that your financial data be disclosed in a full and unaltered form. Dedupe has never been subjected to scrutiny by the SEC or by the courts to decide whether it materially alters data. To many of my clients (especially those in financial services), it doesn’t matter whether dedupe actually changes data or not. They don’t want to bear the court costs of fighting a battle over the admissibility of their data if it is ever called into question.
Deduplication to tape (like encryption) introduces another hurdle I need to jump when I am trying to restore my applications under tight timeframes. I just don’t like the extra step, or the issues that come with it: depending on the kind of dedupe you use, you need the index or other structures on hand to “rehydrate” the deduplicated data before you can use it.
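The dependency on the index can be shown with a small Python sketch. Again, this is an illustration under stated assumptions rather than how any particular product works: the `rehydrate` and `chunk_id` helpers, the SHA-256 chunk IDs, and the sample data are all hypothetical. A deduplicated backup is modeled as a “recipe” (an ordered list of chunk IDs), and the restore succeeds only if every chunk it references can be looked up.

```python
import hashlib


def chunk_id(chunk):
    """Hypothetical chunk identifier: the SHA-256 digest of the chunk."""
    return hashlib.sha256(chunk).hexdigest()


def rehydrate(recipe, pool):
    """Rebuild the original bytes from an ordered list of chunk IDs.

    Raises LookupError if any referenced chunk is not in the pool.
    """
    missing = [cid for cid in recipe if cid not in pool]
    if missing:
        raise LookupError(
            f"cannot rehydrate: {len(missing)} chunk reference(s) not found"
        )
    return b"".join(pool[cid] for cid in recipe)


# Sample data: the first and third chunks are identical, so they share one entry.
chunks = [b"payroll-", b"ledger--", b"payroll-"]
pool = {chunk_id(c): c for c in chunks}
recipe = [chunk_id(c) for c in chunks]

# Normal restore: recipe and pool are both available.
restored = rehydrate(recipe, pool)

# Simulated disaster: the recipe made it to the recovery site, the pool did not.
try:
    rehydrate(recipe, {})
    restore_failed = False
except LookupError:
    restore_failed = True

print(restored, restore_failed)
```

The second call fails even though the recipe is intact, which is the point: the backup is only as restorable as the lookup structures that travel with it.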
This was first published in March 2012