
Big storage shops clamor for data deduplication

By Beth Pariseau, News Writer
27 Sep 2007 | SearchStorage.com


NEW YORK, NY -- Data deduplication is a hot topic at this year's Storage Decisions conference, with users saying they're gung-ho about deploying the technology. However, those with large storage environments say they've had trouble finding a product that fits their requirements.

Brian Greenberg, director of data protection services for a large financial company based in Chicago, called data deduplication the "Holy Grail" of disk-based backup Wednesday during a presentation on disk-based backup.

Still, Greenberg's company, which he declined to name, is sticking to tape for backup for now while waiting for deduplication to become more useful for disaster recovery.

A cost analysis model Greenberg performed using systems-analysis software called iThink from Isee Systems showed that with a three-year retention scheme, the cost of media for about 68,000 tapes over the next five years would amount to $3.4 million. The cost of disk capacity for the same amount of data, not including power and cooling, comes out to $103 million -- and twice that amount for replication. However, he said, data deduplication at a ratio of 30:1 brought the disk costs down to about $3.2 million. "Data deduplication is the key to being able to do disk-based backup in our environment," he said.
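The arithmetic behind those figures can be roughed out as follows. This is a minimal sketch, not Greenberg's iThink model: it assumes disk cost scales linearly with stored capacity, and the function and constant names are illustrative. Under that assumption, a 30:1 ratio applied to $103 million gives roughly $3.4 million, close to but not exactly the "about $3.2 million" reported, so the actual model presumably includes inputs the article does not describe.

```python
# Figures reported in the article, in millions of US dollars.
TAPE_MEDIA_COST_M = 3.4            # about 68,000 tapes over five years
RAW_DISK_COST_M = 103.0            # disk capacity, excluding power and cooling
REPORTED_DEDUPE_DISK_COST_M = 3.2  # figure Greenberg cited at 30:1
DEDUPE_RATIO = 30.0                # 30:1 data deduplication


def disk_cost_after_dedupe(raw_cost_m: float, ratio: float) -> float:
    """Estimate disk cost after deduplication.

    Assumption (not from the article): cost scales linearly with the
    capacity actually stored, so cost is divided by the dedupe ratio.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return raw_cost_m / ratio


naive_cost_m = disk_cost_after_dedupe(RAW_DISK_COST_M, DEDUPE_RATIO)

print(f"Tape media:              ${TAPE_MEDIA_COST_M:.2f}M")
print(f"Raw disk:                ${RAW_DISK_COST_M:.2f}M")
print(f"Disk at 30:1 (estimate): ${naive_cost_m:.2f}M")
print(f"Disk at 30:1 (reported): ${REPORTED_DEDUPE_DISK_COST_M:.2f}M")
```

Either way, the estimate lands in the same range as the tape media cost, which is the point Greenberg was making.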

So why isn't he using it? Greenberg said he will not deploy a data deduplication appliance until he finds one that can copy its deduped data store and its index to tape for disaster recovery purposes. He could copy data from most data deduplication systems to tape by "rehydrating" the data and backing up the same data separately, but Greenberg said he wants to save space on tape, too. "Being able to back up the catalog is a standard feature of a tape backup environment," he said. "Many of the vendors have asked me why I'd want to do tape backup when I can replicate between systems, but what if there's a rolling disaster that corrupts both?"

Pete Fischer, storage administrator for a large paper and packaging manufacturing company, said his company is desperate to find a product that can reduce the 400 TB of data it must protect every 24 hours. The company uses IBM's Tivoli Storage Manager (TSM) to send data from EMC Clariion CX500, 600 and 700 systems with a total of 27 TB usable capacity to Clariion Disk Library (CDL) virtual tape library (VTL) systems.

"We have barely enough room to keep our incremental backup data in the disk pool," Fischer said. Any overflow gets sent directly to the CDLs, which are also trying to back up data from the disk pool, causing bottlenecks. Fischer also said he's running out of capacity in his tape libraries, estimating that a fully populated Sun StorageTek SL8500 has about 30 percent of the drives he needs.

Fischer's company has brought in a Data Domain box for testing. He's also evaluating Diligent Technologies, but favors Data Domain because Diligent is strictly a VTL. "We're leery of VTL and tape in general at this point," Fischer said. His firm is putting Data Domain DD560 systems through rigorous performance testing, and Fischer said he's not satisfied with the product's scalability. The DD560s hold just over 1 TB of disk apiece, so he will need to deploy at least eight boxes and silo his data according to application. "What I want is to have the boxes be aware of each other, and to be able to get even more data reduction across applications," he said.

Mark Glazerman, storage and backup admin for a plastics manufacturing company, is happily running Data Domain DD560 and DD430 boxes to back up 25 TB. Glazerman said his most recent monitoring reports from his Data Domain systems show an average throughput of 10 MBps over 24 hours. That satisfies Glazerman, but won't work for everybody. [Update: Following publication of this article, Glazerman contacted SearchStorage.com to clarify that the 10 MBps throughput rate reported by the system is per drive, rather than for his entire system. At 15 drives, the entire system is getting an average throughput of 130 MBps, Glazerman said.]
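To put those throughput figures in context, the short sketch below converts a sustained rate into the volume moved in 24 hours, and computes the rate needed to move a given volume in a 24-hour window. It assumes decimal units (1 TB = 1,000,000 MB) and a constant rate around the clock; the helper names are illustrative, and the 400 TB figure is the daily protection load Fischer described above.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MB_PER_TB = 1_000_000  # assumption: decimal units, 1 TB = 1,000,000 MB


def tb_moved_per_day(rate_mbps: float) -> float:
    """Terabytes moved in 24 hours at a constant rate in MB per second."""
    return rate_mbps * SECONDS_PER_DAY / MB_PER_TB


def required_rate_mbps(volume_tb: float, window_hours: float = 24.0) -> float:
    """Sustained MB per second needed to move volume_tb within the window."""
    return volume_tb * MB_PER_TB / (window_hours * 3600)


per_drive_tb = tb_moved_per_day(10)     # Glazerman's per-drive rate
whole_system_tb = tb_moved_per_day(130)  # Glazerman's whole-system rate
fischer_rate = required_rate_mbps(400)   # Fischer's 400 TB every 24 hours

print(f"10 MBps  -> {per_drive_tb:.3f} TB per day")
print(f"130 MBps -> {whole_system_tb:.3f} TB per day")
print(f"400 TB in 24 hours needs about {fischer_rate:.0f} MBps sustained")
```

At these rates a single drive moves under 1 TB per day and the whole system a little over 11 TB, which is comfortable for a 25 TB environment with deduplicated incrementals but far from the multi-GBps rates a shop like Fischer's would need.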

Jannes Kleveberg, solution area manager for ATEA, a consulting firm that manages storage at a large automobile manufacturer's facilities in Europe, has considered deduplication for his client's 600 TB shop. He heard Glazerman's per-drive performance numbers with Data Domain and said, "That kind of performance won't do in a large environment."

Kleveberg said he's concerned about post-process systems causing contention with the servers they draw data from after the backup window is over. "For us it always comes back to the performance issue," Kleveberg said.

Ed Reidenbach, Data Domain's director of product management, said users may blame deduplication for poor performance because it's an unfamiliar technology. "We spend a lot of time debugging customer networks to resolve the issue, but since we're the new player in the environment [users] think we're the problem," he said. According to Beth White, vice president of marketing, Data Domain is working on letting individual boxes connect through a global namespace to scale better. "We're still pushing the upper limits of our product," she said. "All of us [vendors] in this market are still working our way up the food chain to those megascale data center environments."








