Storage administrators at the largest companies have learned that without better data backup management, it will be impossible to tame their data growth regardless of what data reduction or protection tools they use.
In part one of our series on the state of data backup, we looked at how many users favor a disk-based approach when buying new data backup hardware. In parts two and three, you learned about modern data protection applications and how cloud computing backup is changing backup and recovery today. In this part, you will learn how data backup management tools can make your job easier.
Data backup reporting tools can help, but there are still challenges
Because of its size, financial resources, and business model, Yahoo Inc. has extensive in-house programming and application coding resources. These resources have allowed Yahoo's manager of data protection, Marcellus Tabor, and his team to develop a centralized custom reporting tool for both primary storage and backup. The backup reporting tool is based on a MySQL database with a PHP interface and draws in data through SNMP and other means. Yahoo had no choice but to develop it in-house, Tabor says, because he has yet to find a commercial offering that would be as effective.
He says it makes no sense that data classification is omitted from backup apps, leaving it up to end users and application administrators rather than the storage and backup team.
"Backup vendors are going to have to get more intelligent about the way they do incrementals," he said. "If you have [a certain] kind of data set, incrementals can take longer than fulls." Backup software will eventually have to locate changed blocks of data without crawling an entire file system, Tabor predicted. "And any kind of parallelization they can add can't come soon enough," he added.
Tom Becchetti, a storage administrator for a large medical equipment manufacturing company, said he's found Aptare Inc.'s StorageConsole helpful in getting a good sense of his backup environment. The tool helped him find a network bottleneck, a discovery that saved his company from adding tape drives. "It helped me quickly see that we're not pushing the tape drives and start asking why," he said.
But Becchetti says he's yet to find a classification tool that's universally interoperable. "The thing about mainframes is you can apply security and management policy to the name of a file, it'll enforce adherence to the name across systems," he said. "The hurdle in open systems is that there are so many different operating systems and file systems. To get market share is very difficult."
Better data backup management strategies and archiving could cut the data glut
Beth Israel Deaconess Medical Center storage architect Michael Passe said traditional backup management paradigms will lose their usefulness when storage grows to the petabyte level. "We're going to have to shift gears," he said. "The systems you're starting to see come out for that amount of data are focused on low cost per gigabyte, internal redundancy with clustering, and geographic replication."
To prepare for that shift, Beth Israel has purchased an F5 Networks ARX Series file virtualization switch to automate the movement of unstructured data between tiers of storage for backup and archiving purposes. Automated migration will help make that task more manageable with large data repositories, Passe added. "Going forward, the focus will be on keeping active data on primary storage, while anything that doesn't change should come out of the data stream," he said.
The plan is to use the ARX switch, originally an Acopia Networks product, to segment out 180 days of data to be retained on a Data Domain box as an archive. Passe said he's aware of the risk of shifting vendor lock-in from a tape drive maker to the file virtualization switch vendor, but said the tradeoff is greater flexibility in his choice of backup and archiving products.
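As a rough illustration of age-based tiering, the sketch below lists files that have not been modified within a cutoff window. It assumes a simple policy in which anything untouched for 180 days is an archive candidate; the function name and the policy are illustrative assumptions, not a description of how the switch or Beth Israel's configuration actually works.

```python
import os
import time


def find_archive_candidates(root, max_age_days=180, now=None):
    """Return sorted paths under `root` not modified in `max_age_days` days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # st_mtime is the last modification time in seconds since the epoch
            if os.stat(path).st_mtime < cutoff:
                candidates.append(path)
    return sorted(candidates)
```

A file virtualization layer would move such files to a cheaper tier transparently; this sketch only identifies them.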
Ultimately, improving process may be the only way out
Becchetti said his background in mainframes gives him a different perspective on the data growth currently happening in the open-systems world.
"In the mainframe world, backups are backups, and you backup for two weeks and only for two weeks," he said. "The shift in open system is that you back up incrementally every day, a full weekly, and once a month you send a backup off site forever."
Changing this will require more of a change in mindset than technology, Becchetti said. "I've done things to try to bring up end users' awareness, like cutting down on backups of home share," he said. "Next year, I hope to do away with backing up home shares, and if someone needs a file backed up, they'll have to put it in the department share."
Patrick Banghart, manager of the Windows server team for a large health insurance company in California, was told to revamp his company's data backup management strategy when he was hired in 2007.
At the time, his company was using an outdated version of CA Inc.'s ARCserve software for tape backup. Banghart is now deploying backup to disk using the disk staging storage unit (DSSU) feature in Symantec Corp.'s NetBackup 6.5.