Months after making data deduplication a feature of its operating system to reduce primary storage consumption, NetApp Inc. has added data deduplication to its NearStore virtual tape library (VTL).
All of the other major VTL vendors already offer data dedupe, but NetApp resisted adding the technology through partnership or acquisition, as some of its main rivals did. By developing its own data dedupe IP, NetApp can include it at no charge in its VTLs, just as it does with its storage systems.
While NetApp put its VTL data deduplication through alpha and beta testing with customers, some customers went ahead and deployed NetApp filers with data deduplication for their backups without waiting for integration with the VTL interface. "Part of our NetApp SAN filers are already being used to store backups with deduplication," said David Waterhouse, senior system administrator for Dexma Inc.
"We were able to leverage features that were already in OnTap and bring it to market more quickly," Rogers said. "For the VTL we had to design it from the ground up."
IBM, Hewlett-Packard, Hitachi Data Systems and EMC have added data deduplication through partners or acquisitions. Having its own data dedupe technology will make it easier for NetApp to offer upgrades, Rogers said.
Now that the VTL has data deduplication, Waterhouse will probably deploy it at his secondary site for offloading data to tape. "Replication to a secondary site is about as far as we've gotten, but I'd like to replicate the data and then offload it to the VTL and traditional tape storage," he said.
NetApp also deploys data dedupe for its VTL in a different way than its competitors. For example, while the data deduplication is post-process by default, the fingerprinting or hashing process can be done inline. "Ultimately, we find both post-processing and inline deduplication approaches are needed and will be supplied by all vendors," Rogers said. "But the reason people value a VTL is performance, and we didn't want to put anything in the performance path."
If the inline fingerprinting process interferes with performance, the VTL can switch to pure post-processing. Rogers also said users can set data deduplication policies for individual virtual libraries and even individual virtual tapes. "It's taken them longer to introduce dedupe, but it seems they've been able to develop functionality that incorporates the best of multiple approaches," said Forrester Research analyst Stephanie Balaouras.
But Balaouras added that she didn't think most customers would mix dedupe and nondedupe workloads. "You can, but the question is whether you can set SLAs for each workload, and it's also what customers decide to do in practice," she said.
"There are many aspects of NetApp VTL from deduplication … to … retention policies that can be adjusted based on specific SLA requirements," Rogers said.
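The split between inline fingerprinting and post-process deduplication described above can be illustrated with a minimal Python sketch. It is not NetApp's implementation: the fixed 4 KB block size, the use of SHA-256 and the names VirtualTape, post_process_dedupe and restore are all assumptions made for illustration. The write path only stores blocks and, optionally, records a hash for each one; duplicate removal happens in a separate pass after the backup finishes.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size, for illustration only


class VirtualTape:
    """Toy model of a virtual tape (hypothetical, not a vendor API)."""

    def __init__(self, inline_fingerprinting=True):
        # Per-tape policy: hash on the write path, or defer hashing entirely.
        self.inline_fingerprinting = inline_fingerprinting
        self.blocks = []        # raw blocks in write order
        self.fingerprints = []  # one entry per block; None until hashed

    def write(self, data):
        # The write path stores every block as-is; at most it records a hash.
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            self.blocks.append(block)
            if self.inline_fingerprinting:
                self.fingerprints.append(hashlib.sha256(block).hexdigest())
            else:
                self.fingerprints.append(None)


def post_process_dedupe(tape):
    """Run after the backup: keep one stored copy per unique fingerprint."""
    store = {}   # fingerprint -> block
    layout = []  # fingerprints in original order, enough to rebuild the tape
    for block, fp in zip(tape.blocks, tape.fingerprints):
        if fp is None:
            # Pure post-processing: the hash was not computed inline.
            fp = hashlib.sha256(block).hexdigest()
        store.setdefault(fp, block)
        layout.append(fp)
    return store, layout


def restore(store, layout):
    """Rebuild the original byte stream from the deduplicated store."""
    return b"".join(store[fp] for fp in layout)
```

Setting inline_fingerprinting=False on a tape models the fallback to pure post-processing: the deduplicated result is the same, but all hashing cost moves out of the backup window. A production system would also need to guard against hash collisions, which this sketch ignores.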
Data deduplication has been a hot technology in recent years, but it is still in relatively early days. Data Domain, which uses data deduplication as the main technology in its backup and nearline storage systems, is considered the market leader. But the large storage and VTL vendors, such as Quantum and Sepaton, have added or upgraded their data deduplication products since the start of the year.