The Virginia Credit Union (VACU) is using NetApp Inc.'s data deduplication on primary storage, freeing up space for disk-backup and other projects.
The company has 90 TB of capacity on two FAS6070 filers and replicates snapshot backups to two FAS3020s at an offsite location for disaster recovery. Currently about 60 TB of that capacity is used, which represents growth in disk capacity over the last eight months, according to Richard Barlow, VACU systems architect and engineering manager.
"We've seen about 50% data reduction on our SQL database data and between 70% and 80% reduction on VMware volumes, but when you do data reduction, you end up using the space for something else," he said.
In VACU's case, that something else was a rollout of new applications using VMware clones and the elimination of tape for backups.
"We didn't have the storage to do that, and we didn't have the budget to get new storage to do it," Barlow recalled. Bringing in deduplication reduced those 3 TB down to about 65 GB across several volumes. Windows XP system images take up just 18 GB. "When you do data deduplication on virtual machine images, you can actually end up with less data than you started with," he said.
To back up VMware volumes, VACU used NetApp's FlexClones to create writeable snapshots that save only changes. Some of those writeable snapshots replaced fresh VMware mounts, saving more space.
With the space saved by delta-only snapshots and deduplication, VACU then set about getting rid of tape backups. Before going to all-disk backup, VACU administrators had been hand-delivering tapes to a secondary data center around 30 minutes away. VACU was also able to cut the VMware Consolidated Backup (VCB) proxy server out of the equation and reduce recovery times.
"With a snapshot you can restore a single object if you need to," Barlow said. "Instead of going through my tape backups to restore files for an hour or hour and a half, I type one command and have it back in a second."
Barlow said VACU has seen only about a 1% overall performance hit on its FAS6070 systems since deploying data deduplication, but acknowledged that he has had to do some careful volume management because NetApp doesn't deduplicate data across FlexVols. Part of the reason for this might be to load-balance deduplication processing over smaller volumes, rather than make the system crunch through it all at once, he said.
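The effect of deduplicating within each volume rather than across volumes can be illustrated with a toy model. The sketch below is not NetApp's implementation; the SHA-256 fingerprints, the 4 KB block size, the volume names and the `stored_blocks` helper are all assumptions chosen for illustration.

```python
import hashlib


def stored_blocks(volumes: dict[str, list[bytes]], per_volume: bool) -> int:
    """Count the unique blocks kept after fingerprint-based deduplication."""
    if per_volume:
        # Each volume keeps its own fingerprint set, so a block that also
        # exists in another volume is stored again there.
        return sum(
            len({hashlib.sha256(block).digest() for block in blocks})
            for blocks in volumes.values()
        )
    # Hypothetical cross-volume case: one fingerprint set for everything.
    return len(
        {
            hashlib.sha256(block).digest()
            for blocks in volumes.values()
            for block in blocks
        }
    )


os_block = b"A" * 4096  # stands in for a block shared by many VM images
volumes = {
    "vol_a": [os_block, os_block, b"B" * 4096],
    "vol_b": [os_block, b"C" * 4096],
}

per_vol = stored_blocks(volumes, per_volume=True)
cross_vol = stored_blocks(volumes, per_volume=False)
print(per_vol, cross_vol)  # prints "4 3"
```

In this model the shared block is stored once per volume instead of once overall, which suggests why grouping similar data on the same volume would matter when deduplication stops at the volume boundary.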
Even so, Barlow said NetApp representatives advised him not to run the data deduplication process more than once every few days. "Otherwise, there will be too much metadata overhead created, and you'll get a performance penalty," he said.
Barlow experimented with running the deduplication process in the background of a VMware virtual desktop while an end user had it running. The desktop machine didn't skip a beat, and neither did the array, but Barlow said he saw one of the processors on the FAS6070 take a fairly sizeable impact, working at 25% load. "Usually, we don't see the processors on a 6070 go above 15%," he said.