Network Appliance Inc. (NetApp) announced general availability of block-level data deduplication within its NearStore R200 and FAS storage systems. The license key is free for NearStore users.
The data deduplication development is based on NetApp's Advanced Single Instance Storage (A-SIS), from its SnapLock product, and was four years in the making. The first instance of A-SIS came in a joint product with Symantec Corp., SnapVault Backup.
NetApp used a feature of its Write Anywhere File Layout (WAFL) to add A-SIS to its filers. WAFL already calculates a 16-bit checksum for each block of data it stores. For data deduplication, those checksums are pulled into a database, and "redundancy candidates" that look similar are identified. The candidate blocks are then compared bit by bit, and if they are identical, the new block is discarded.
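The two-step approach described above (a cheap checksum to find candidates, then a full comparison to confirm them) can be sketched roughly as follows. This is an illustration only, not NetApp's implementation: the 4 KB block size, the use of a truncated CRC-32 as the 16-bit checksum, and the names checksum16 and deduplicate are all assumptions made for the example.

```python
import zlib
from collections import defaultdict

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def checksum16(block: bytes) -> int:
    """Stand-in 16-bit checksum (assumption: low 16 bits of CRC-32)."""
    return zlib.crc32(block) & 0xFFFF


def deduplicate(blocks):
    """Return (kept_blocks, block_map) for a list of equal-sized blocks.

    kept_blocks holds one copy of each distinct block; block_map[i] is the
    index in kept_blocks that logical block i now points to.
    """
    candidates = defaultdict(list)  # checksum -> indices into kept_blocks
    kept_blocks = []
    block_map = []
    for block in blocks:
        key = checksum16(block)
        # A matching checksum only makes a block a "redundancy candidate";
        # a full byte-for-byte comparison decides whether it is a duplicate.
        for idx in candidates[key]:
            if kept_blocks[idx] == block:
                block_map.append(idx)  # identical: discard the new copy
                break
        else:
            kept_blocks.append(block)
            new_idx = len(kept_blocks) - 1
            candidates[key].append(new_idx)
            block_map.append(new_idx)
    return kept_blocks, block_map


if __name__ == "__main__":
    data = [b"a" * BLOCK_SIZE, b"b" * BLOCK_SIZE, b"a" * BLOCK_SIZE]
    kept, mapping = deduplicate(data)
    print(len(kept), mapping)
```

Because a 16-bit checksum has only 65,536 possible values, unrelated blocks will often share one, which is why the full comparison is needed before a block is discarded.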
"That allows them to have deduplication with little to no performance impact," said W. Curtis Preston, vice president of data protection services at GlassHouse Technologies Inc. "That's huge."
NetApp released a white paper earlier this year that cautioned beta testers of the product about a number of limitations: A-SIS could only be used on CIFS or NFS data, it could not be applied to snapshots, and it could not deduplicate across FlexVols, which at the time had a size limit of 4 terabytes (TB) on the supported platforms.
Those instructions were meant for beta testers first putting the product through its paces, according to Ravi Thota, director of product marketing, data protection and retention for NetApp. A-SIS can now be deployed on any FAS system, including those in storage area networks (SANs), and can be applied to snapshots. It still can't deduplicate across FlexVols, but the size limit is now 16 TB because bigger platforms are supported.
"You still wouldn't want to run it frequently with snapshots because it creates a small amount of metadata overhead each time data is deduplicated," Thota said. "If you're taking frequent snapshots -- that can add up. It'll still work, but that's just a best practice we're advising for our customers."
Ironically, the data deduplication now available on NetApp's SAN and network-attached storage (NAS) platforms is still not available for its virtual tape library (VTL), even though secondary storage has been the most common place for data deduplication so far. Like IBM, NetApp remains concerned about the performance impact of data deduplication on the VTL, though rumor has it the capability is now in beta tests with users. NetApp will demonstrate A-SIS with the VTL at Symantec's Vision conference next month.
"Even though they still don't have it for the VTL, segmenting products is good because it's honesty, and it's making sure the application will work in real environments," according to Preston.
He also praised the idea of adding data deduplication to WAFL. "They are the first to offer this type of product for direct storage of user data and not just backups," he said. "Other deduplication products have NFS and CIFS interface, but end users can't store their data directly on the box."
The deduplication license is available now and is bundled into the NearStore "personality" license. For FAS systems, the license costs $3,000.