After trying cloud and tape backups, Teach For America Vice President of Technology Operations Thomas Licciardello said his organization saved money and a lot of time by switching to disk backups with deduplication in its remote offices and data centers.
Teach For America is a New York-based nonprofit organization that recruits teachers for districts with shortages. Licciardello said Teach For America has been an EMC customer for primary storage for years, beginning with Celerra network-attached storage (NAS) and now moving to a VNX unified system. About four years ago, it set out to revamp its backup, which consisted of tape in the data center and online backup at its remote offices.
Teach For America has 1,900 full-time staffers with up to 10,000 teachers in its databases. The organization relies on government grants, so it is crucial for its data to be secure and up to date.
"We're a data-driven organization," Licciardello said. "We keep records on alumni and donors, and we have to be accurate with financial data because we have to protect government grants. We can lose grants without the right documentation. We definitely have to keep everything backed up and secured."
The backup overhaul started with EMC NetWorker backup software in its data center and an entry-level Avamar appliance with 2.2 TB of capacity for remote data backup at one site.
"We saw benefits right off the bat" with a 6-to-1 dedupe ratio, Licciardello said.
Now, Teach For America uses Avamar software on file servers in remote offices and on appliances in its New York office and SunGard data center colocation site in New Jersey. It also added Data Domain DD670 target devices in the New York and New Jersey sites in 2010. Licciardello said the Data Domain appliances provide more than 10-to-1 dedupe ratios.
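To put those ratios in perspective, the arithmetic behind an N-to-1 dedupe ratio can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, the 2.2 TB figure is simply the appliance capacity quoted above, and the results are not measurements reported by Teach For America.

```python
def space_saved_fraction(ratio: float) -> float:
    """Fraction of logical backup data that does not need to be stored
    for an N-to-1 dedupe ratio (hypothetical helper)."""
    if ratio < 1:
        raise ValueError("a dedupe ratio is expected to be at least 1")
    return 1 - 1 / ratio


def logical_capacity_tb(physical_tb: float, ratio: float) -> float:
    """Logical backup data that fits on a given amount of physical disk,
    assuming the ratio holds across the whole data set."""
    return physical_tb * ratio


# The two ratios cited in the article.
for ratio in (6, 10):
    print(f"{ratio}-to-1 stores the same backups on "
          f"{space_saved_fraction(ratio):.1%} less disk")

# Illustrative only: 2.2 TB of physical capacity at a 6-to-1 ratio.
print(f"{logical_capacity_tb(2.2, 6):.1f} TB of logical backup data")
```

The jump from 6-to-1 to 10-to-1 sounds large, but in terms of disk saved it is a move from roughly 83% to 90%, which is why ratios alone can overstate the difference between products.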
"Our data was growing exponentially, so deduplication was the next process," he said.
He said Teach For America uses Avamar's source dedupe products to protect virtual machines and unstructured data and Data Domain target dedupe for structured data. The organization uses NetWorker for tape backups for long-term retention.
As for having to manage three backup products, Licciardello said, "It's not bad. We have a smart team to handle it."
He said EMC's work to integrate management in the past few years has helped. "We don't have three products with three separate interfaces that do different things," he said.
One way EMC has tried to integrate management of its backup products is through DD Boost, which distributes the deduplication process between Data Domain target appliances and Avamar clients. Licciardello said Teach For America is testing DD Boost with its Oracle RAC database.
He said his backup window is about one-third of what it was before adding Avamar and Data Domain, and backups are easier to manage.
"The stability of backups has improved," Licciardello said. "We don't need people to come in and re-seed backups in the middle of the night anymore. We had one person fighting with backups all the time. Now we can do restores in minutes as opposed to days waiting for tapes to come back."
For remote data backups, Teach For America previously used cloud-based backups that required installing agents at each office.
"That was OK when we had 15 offices, but when we grew to about 40 offices, it became hard to manage because there was no central management," Licciardello said. "You had to manage each office individually, and costs were getting astronomical. Backing up to the cloud is great if you're a small company, but when you're enterprise-wide and have the amount of data we have in regional offices, it's not cost-beneficial. Moving backups in-house saved us money."
Teach For America has several options for disaster recovery. It replicates between its New York and New Jersey sites, and has a warm site in Atlanta. "Our New York office and [colocation] sites are so close, so we're creating insurance for ourselves," Licciardello said.
Based in midtown Manhattan, Teach For America was ready for the worst during Hurricane Sandy in October, but its offices didn't get hit as hard as other locations in its area.
"We had our procedures ready in case we needed to shut down anything," Licciardello said. "We had our backups ready. Luckily, we have some staff in Chicago that was able to monitor the situation for us because a lot of people here lost power during and after the storm."