NetApp adds data deduplication for primary storage

NetApp adds block-level deduplication within its WAFL file system, though it's still not offered on its VTL.

Network Appliance Inc. (NetApp) announced general availability of block-level data deduplication within its NearStore R200 and FAS storage systems. The license key is free for NearStore users.

The data deduplication development is based on NetApp's Advanced Single Instance Storage (A-SIS), from its SnapLock product, and was four years in the making. The first instance of A-SIS came in a joint product with Symantec Corp., SnapVault Backup.

NetApp used a feature of its Write Anywhere File Layout (WAFL) to add A-SIS to its filers. WAFL already calculates a 16-bit checksum for each block of data it stores. For data deduplication, those checksums are pulled into a database and used to identify "redundancy candidates," blocks that look similar. The candidate blocks are then compared bit by bit, and if they are identical, the new block is discarded.
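The checksum-then-compare approach described above can be sketched in a few lines of Python. This is an illustrative sketch only, not NetApp's implementation: the 4 KB block size, the use of the low 16 bits of CRC-32 as the checksum, and the names `checksum16` and `deduplicate` are all assumptions made for the example, and the sketch works on blocks as they arrive rather than from a separate checksum database.

```python
import zlib
from collections import defaultdict

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def checksum16(block: bytes) -> int:
    """Stand-in 16-bit checksum: the low 16 bits of CRC-32.

    This is an assumption for the sketch, not the checksum WAFL uses.
    """
    return zlib.crc32(block) & 0xFFFF


def deduplicate(blocks):
    """Return (stored, block_map) for a sequence of byte blocks.

    stored    -- the unique blocks that are actually kept
    block_map -- for each input block, the index in `stored` that holds its data
    """
    stored = []
    candidates = defaultdict(list)  # checksum -> indices into `stored`
    block_map = []

    for block in blocks:
        key = checksum16(block)
        for idx in candidates[key]:
            # A matching checksum only marks a redundancy candidate;
            # a full byte-for-byte comparison confirms the duplicate.
            if stored[idx] == block:
                block_map.append(idx)  # discard the new block, point at the old one
                break
        else:
            # No identical block found: keep this one and index its checksum.
            stored.append(block)
            candidates[key].append(len(stored) - 1)
            block_map.append(len(stored) - 1)

    return stored, block_map


if __name__ == "__main__":
    data = [b"a" * BLOCK_SIZE, b"b" * BLOCK_SIZE, b"a" * BLOCK_SIZE]
    stored, block_map = deduplicate(data)
    print(len(stored), block_map)
```

Because a 16-bit checksum has only 65,536 possible values, different blocks will often share a checksum, which is why the full comparison step is needed before a block is discarded.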

"That allows them to have deduplication with little to no performance impact," said W. Curtis Preston, vice president of data protection services at GlassHouse Technologies Inc. "That's huge."

NetApp released a white paper earlier this year that cautioned beta testers of the product about a number of limitations: A-SIS could be used only on CIFS or NFS data, it could not be applied to snapshots and it could not deduplicate across FlexVols, which at the time had a size limit of 4 terabytes (TB) on the supported platforms.

Those instructions were meant for beta testers first putting the product through its paces, according to Ravi Thota, director of product marketing, data protection and retention for NetApp. A-SIS can now be deployed on any FAS system, including in storage area network (SAN) environments, and can be applied to snapshots. It still can't deduplicate across FlexVols, but the size limit is now 16 TB because bigger platforms are supported.

"You still wouldn't want to run it frequently with snapshots because it creates a small amount of metadata overhead each time data is deduplicated," Thota said. "If you're taking frequent snapshots -- that can add up. It'll still work, but that's just a best practice we're advising for our customers."

Ironically, the data deduplication now applicable to NetApp's SAN and network-attached storage (NAS) platforms is still not available for its virtual tape library (VTL), despite the fact that secondary storage has been the most common place for data deduplication so far. Like IBM, NetApp remains concerned about the performance impact of data deduplication on the VTL, though rumor has it the capability is now in beta tests with users. NetApp will be doing demonstrations of A-SIS with the VTL at Symantec's Vision conference next month.

"Even though they still don't have it for the VTL, segmenting products is good because it's honesty, and it's making sure the application will work in real environments," according to Preston.

He also praised the idea of adding data deduplication to WAFL. "They are the first to offer this type of product for direct storage of user data and not just backups," he said. "Other deduplication products have NFS and CIFS interface, but end users can't store their data directly on the box."

The deduplication license is available now and is bundled into the NearStore "personality" license. For FAS systems, the license costs $3,000.

