Unless you are brand new to the IT field, or have not read anything about data storage in the past five or six years, you know that data growth for most organizations has been steadily in excess of 50% per year for at least as long. This has been the source of countless headaches for storage administrators when it comes to provisioning. With lower cost storage somewhat easing the pain, the spotlight is now on
Information or data lifecycle management (ILM) is really where it all begins. Although many vendors were quick to associate the acronym with hardware/software solutions, ILM is really about corporate decisions, policies and regulations regarding where data is stored, and how long it should be kept; technology only assists in automating those policies. Any data deleted or archived as a result means less data to back up (and restore).
Understanding what data an organization stores is arguably the first step to ILM. It also helps identify how much data there is, where it is stored, and whether and how it should be backed up. However, this is not a trivial task; it requires both time and resources.
Because technology is sometimes cheaper than lawyers and can be implemented faster than policies, listed below are some other options.
How long backup data is kept will have a significant impact on the backup environment. Organizations must seriously question how useful a 30-day-old database is, or whether they need the ability to restore email messages that are 30 or 60 days old. Remember the distinction between archive and backup: backups protect against data loss and should not be used for long-term retention. In addition, archived data no longer needs to be backed up daily or weekly, nor does it need to be restored after a system failure.
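To make the retention question concrete, the following is a rough sketch of how retained backup volume scales with the retention period. The schedule it assumes (one full backup every seven days, incrementals on the other days, a fixed daily change rate) and the function name `retained_backup_gb` are illustrative assumptions, not figures from any particular backup product.

```python
import math


def retained_backup_gb(full_gb, daily_change_rate, retention_days, full_every_days=7):
    """Estimate the backup storage held for a given retention period.

    Assumed (hypothetical) schedule: one full backup every
    `full_every_days` days, an incremental on every other day, and each
    incremental sized at `daily_change_rate` times the full backup.
    """
    fulls = math.ceil(retention_days / full_every_days)
    incrementals = retention_days - fulls
    return fulls * full_gb + incrementals * full_gb * daily_change_rate


# Compare retention periods for a 1 TB data set with a 5% daily change rate.
for days in (14, 30, 60):
    print(days, "days:", retained_backup_gb(1000, 0.05, days), "GB")
```

Under these assumptions, retained volume grows roughly in proportion to the retention period, which is why shortening retention (and moving long-term data to an archive) has such a direct effect on backup storage.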
Archives for email, file server, database
Symantec Enterprise Vault, Zantaz EAS, EMC EmailXtender and Centera, CommVault DataMigrator, Princeton Softech Optim, OpenText Livelink ECM, AXS-One AXS-Link and IBM Tivoli Archive Manager are just a few of the products that allow organizations to archive application data and manage retention. Once again, data archived at the source is no longer backed up daily, thus reducing backup storage utilization.
Hierarchical storage management (HSM) solutions, such as IBM TSM, Storage Migrator and DiskXtender, to name only a few, can also help reduce backup and restore pains by migrating less frequently used data off primary storage and leaving a "stub file" in place that points to the actual file. Backup products integrated with these solutions will only back up or restore the stubs, thus minimizing the amount of data backed up.
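The stub mechanism can be illustrated with a small sketch. This is not how any of the products named above work internally; it is a simplified model in which files untouched for a given number of days are moved to an archive directory and replaced by a tiny JSON pointer. The `.stub` suffix, the JSON layout and the function names are all hypothetical, and name collisions in the archive are ignored for brevity.

```python
import json
import os
import shutil
import time

STUB_SUFFIX = ".stub"  # hypothetical naming convention for stub files


def migrate_cold_files(primary_dir, archive_dir, max_idle_days, now=None):
    """Move files not modified for `max_idle_days` to `archive_dir`,
    leaving a small stub in `primary_dir` that records the new location."""
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400
    os.makedirs(archive_dir, exist_ok=True)
    migrated = []
    for name in sorted(os.listdir(primary_dir)):
        path = os.path.join(primary_dir, name)
        if not os.path.isfile(path) or name.endswith(STUB_SUFFIX):
            continue  # skip directories and existing stubs
        info = os.stat(path)
        if info.st_mtime >= cutoff:
            continue  # modified recently: stays on primary storage
        target = os.path.join(archive_dir, name)
        shutil.move(path, target)
        with open(path + STUB_SUFFIX, "w") as stub:
            json.dump({"archived_to": target, "size": info.st_size}, stub)
        migrated.append(name)
    return migrated


def recall(stub_path):
    """Bring a migrated file back to primary storage and remove its stub."""
    with open(stub_path) as stub:
        target = json.load(stub)["archived_to"]
    original = stub_path[: -len(STUB_SUFFIX)]
    shutil.move(target, original)
    os.remove(stub_path)
    return original
```

A backup job that walks the primary directory after migration sees only the recently changed files and a handful of stubs a few dozen bytes long, which is where the reduction in backup volume comes from.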
Recovery time objectives (RTO) and recovery point objectives (RPO) determine whether data merely needs to be recoverable or must be highly available. Data subject to zero downtime and zero loss requirements should be made highly available via replication, rather than simply backed up. For such data, many IT organizations have switched to multiple point-in-time copies on disk for primary data protection and only use traditional backups for added peace of mind.
Single instance storage
Single instance storage, or data deduplication, solutions are probably among the most refreshing advancements in the backup storage arena. Products such as DataDomain, Avamar, DoubleTake and NearStore can be implemented as disk arrays or virtual tape libraries (VTL) and, in some cases, can be fully integrated with an existing backup solution. These products can dramatically reduce the amount of storage required for backup data by storing only one instance of duplicate files. In some instances, granularity can be such that even duplicate data sequences within files will not be stored twice.
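The difference between file-level single instancing and finer-grained deduplication can be sketched in a few lines. The class below is a toy in-memory model, not a description of any vendor's implementation: it identifies data by SHA-256 digest and, when a `chunk_size` is given, splits each file into fixed-size chunks (commercial products may use other schemes, such as variable-size chunking). The class and method names are hypothetical.

```python
import hashlib


class SingleInstanceStore:
    """Toy content-addressed store: each unique piece of data is kept once."""

    def __init__(self, chunk_size=None):
        self.chunk_size = chunk_size  # None means whole-file granularity
        self.blocks = {}              # digest -> bytes, stored only once
        self.catalog = {}             # file name -> ordered list of digests
        self.logical_bytes = 0        # total bytes submitted for backup

    def put(self, name, data):
        size = self.chunk_size or max(len(data), 1)
        pieces = [data[i:i + size] for i in range(0, len(data), size)] or [b""]
        digests = []
        for piece in pieces:
            digest = hashlib.sha256(piece).hexdigest()
            self.blocks.setdefault(digest, piece)  # duplicates are not stored again
            digests.append(digest)
        self.catalog[name] = digests
        self.logical_bytes += len(data)

    def get(self, name):
        return b"".join(self.blocks[d] for d in self.catalog[name])

    def stored_bytes(self):
        return sum(len(block) for block in self.blocks.values())


# Two identical files plus a third that shares only its first half.
report_v1 = b"A" * 4096 + b"B" * 4096
report_v2 = b"A" * 4096 + b"C" * 4096

file_level = SingleInstanceStore()
chunk_level = SingleInstanceStore(chunk_size=4096)
for store in (file_level, chunk_level):
    store.put("report.doc", report_v1)
    store.put("copy of report.doc", report_v1)
    store.put("report-edited.doc", report_v2)
    print(store.logical_bytes, "bytes backed up,", store.stored_bytes(), "bytes stored")
```

In this example, whole-file single instancing eliminates only the exact copy, while chunk-level deduplication also avoids storing the unchanged first half of the edited file a second time.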
Well implemented, these processes, policies and tools will result in much leaner storage and backup environments in the new year.
This was first published in December 2006