When developing a policy for data retention, it's important to consider the reason why the organization is archiving data in the first place.
Before we get to data retention policy best practices, keep two questions in mind: Does the IT department need to free up space on some of the servers? Are the servers' contents so cluttered that data is increasingly difficult to locate? The answers have a major impact on how the data retention policy should be constructed.
Data retention policies are serious matters, and it is important to consider the long-term consequences of implementing them.
What data to retain
Some data must be retained by law for a certain period of time. Other data must be retained under a company's internal rules. Still other data is nice to keep around but isn't strictly required. Common types of retained data include files, email messages and database records.
One of the first best practices to keep in mind is knowing what data should remain live and what data should be archived. Typically, this determination is based on data age, but not always. In some cases, it is also important to examine other criteria, such as when the data was last accessed and the data type.
Suppose, for example, an organization has plenty of free space on the file server, but it wants to cut down on some of the clutter. With this decluttering goal in mind, it decides to create an archive policy that moves anything older than five years to the archives and then deletes anything more than 10 years old.
Although this might sound like a reasonable approach to creating a data retention policy, it may have unwanted consequences. For example, what happens if a spreadsheet was created six years ago but is regularly updated? If the data retention policy only looks at the creation date, then the spreadsheet would be archived, even though it is regularly used. It tends to be much more effective to base a retention policy on the last access date rather than the creation date.
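The five-year and 10-year rule from the example can be sketched in a few lines of Python, keyed on last access rather than creation date. This is a minimal illustration, not a finished tool: the names `Action`, `ARCHIVE_AFTER`, `DELETE_AFTER` and `classify` are hypothetical, and a year is approximated as 365 days.

```python
from datetime import datetime, timedelta
from enum import Enum


class Action(Enum):
    KEEP_LIVE = "keep live"
    ARCHIVE = "archive"
    DELETE = "delete"


# Thresholds taken from the example policy; a year is approximated as 365 days.
ARCHIVE_AFTER = timedelta(days=5 * 365)
DELETE_AFTER = timedelta(days=10 * 365)


def classify(last_accessed: datetime, now: datetime) -> Action:
    """Decide what to do with an item based on how long it has sat unused."""
    idle = now - last_accessed
    if idle > DELETE_AFTER:
        return Action.DELETE
    if idle > ARCHIVE_AFTER:
        return Action.ARCHIVE
    return Action.KEEP_LIVE


if __name__ == "__main__":
    now = datetime(2024, 1, 1)
    # A spreadsheet created six years ago but opened last month stays live,
    # because only the last access date is considered.
    print(classify(datetime(2023, 12, 1), now))
```

Because the creation date never enters the decision, the regularly used spreadsheet stays live. One caveat: some systems record access times coarsely or not at all, so it is worth confirming that the timestamp being read is reliable before acting on it.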
Data retention policies can also backfire in other ways. Take document retention: Suppose your organization signed a 15-year lease for its office building 11 years ago. In all likelihood, nobody has looked at the document in the last 10 years, but you probably want to keep it because the lease is still in force. The policy should account for cases like this.
A data retention policy should be comprehensive, but it should also be one that an organization can easily manage and enforce, so being concise and clear matters as well.
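One way to handle a document like the lease is an explicit keep-until date that overrides any age-based rule. The sketch below assumes a hypothetical `retain_until` field on each record; the `Record` class, the `may_delete` function and the 10-year idle limit are illustrative names and values, not part of any particular product.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Record:
    name: str
    last_accessed: date
    # Hypothetical override: never delete before this date, however idle.
    retain_until: Optional[date] = None


def may_delete(record: Record, today: date, max_idle_years: int = 10) -> bool:
    """Return True only if the record is idle long enough and not under a hold."""
    if record.retain_until is not None and today <= record.retain_until:
        return False
    idle_days = (today - record.last_accessed).days
    return idle_days > max_idle_years * 365


if __name__ == "__main__":
    # A 15-year lease signed 11 years ago and untouched for more than 10 years.
    lease = Record("office-lease.pdf", date(2013, 6, 1), retain_until=date(2028, 1, 1))
    print(may_delete(lease, date(2024, 1, 1)))
```

The hold is checked first, so the age rule is only consulted once the hold has lapsed.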
Compliance is one of the major reasons for a company to retain data. In addition to a company's internal compliance rules, there are several laws and regulations that a company needs to consider in forming its data retention policy. It's important to figure out the applicable laws; an outside auditor can help.
The European Union's GDPR, for example, which went into effect in May 2018, sets mandates that apply to personal data produced by EU residents, no matter where it's stored. A data-collecting organization should have a data retention policy that specifically addresses GDPR compliance.
Other regulations that feature data retention requirements include the Sarbanes-Oxley Act and the Payment Card Industry Data Security Standard. Particularly where these regulations apply, an organization should keep only the personal data it needs.
Staying compliant is a common business concern. Penalties for violations include fines and loss of reputation. Having a data retention schedule on hand can be a helpful tool for compliance. It's important to keep it updated, though, since data and laws change often, and an old schedule does not provide much value.
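A retention schedule can be kept as structured data so that out-of-date entries are easy to spot. The sketch below is one possible shape: the `ScheduleEntry` fields, the one-year review interval and the sample entries are all hypothetical placeholders, not legal guidance on how long any type of data must be kept.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class ScheduleEntry:
    data_type: str
    retention_years: int  # placeholder value, not a legal requirement
    basis: str            # the law or internal rule behind the period
    last_reviewed: date


def stale_entries(schedule: List[ScheduleEntry], today: date,
                  max_age_days: int = 365) -> List[ScheduleEntry]:
    """Return entries whose last review is older than the allowed interval."""
    return [e for e in schedule if (today - e.last_reviewed).days > max_age_days]


if __name__ == "__main__":
    schedule = [
        ScheduleEntry("example category A", 7, "internal rule (placeholder)", date(2024, 1, 15)),
        ScheduleEntry("example category B", 3, "regulation (placeholder)", date(2022, 3, 1)),
    ]
    for entry in stale_entries(schedule, date(2024, 6, 1)):
        print(f"Review needed: {entry.data_type}")
```

Running a check like this on a regular basis gives the schedule's owner a short list of entries to revisit when data or laws change.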
Retention period and how to store the data
A retention period is often determined by rules and regulations. Since retention periods range from minutes to years, an organization may need different types of media for storing data.
The public cloud is a popular storage location for long-term retention. Amazon Glacier, Microsoft Azure Blob Storage and Google Nearline are among the options for low-cost archival storage in the cloud. The storage is off-site, which is good for data protection. Restore times and costs can run high, though, depending on how much an organization needs to bring out of the cloud.
Tape is another medium for long-term storage, and it is cheaper than other options, such as disk. Durability is typically stated at up to 30 years for the latest LTO tape cartridges. LTO-8 provides up to 30 TB of compressed storage capacity per cartridge (12 TB native). Restore speed is slow, though, so an organization shouldn't rely solely on tape to retain data that needs quick recovery.
Disk is more expensive but faster than tape. It's not a cost-effective place to store lots of data that needs long-term retention and probably won't be accessed.
Your data retention policy should outline which media types you use for given data sets.
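The tradeoffs above can be written down as a simple selection rule so that the policy states a medium for each data set. This is a rough sketch under stated assumptions: the `choose_medium` function, the one-year `LONG_TERM_DAYS` cutoff and the rule that an off-site requirement points to cloud archive storage are illustrative choices, and a real policy would weigh cost and restore times in more detail.

```python
# Hypothetical cutoff: retention of a year or more counts as long term.
LONG_TERM_DAYS = 365


def choose_medium(retention_days: int, needs_fast_restore: bool,
                  must_be_offsite: bool = False) -> str:
    """Pick a storage medium for a data set from its retention needs."""
    # Disk costs more but restores fastest, so it suits short-lived data
    # and anything that must be recovered quickly.
    if needs_fast_restore or retention_days < LONG_TERM_DAYS:
        return "disk"
    # Cloud archive tiers are off-site by nature (assumed rule for this sketch).
    if must_be_offsite:
        return "cloud archive"
    # Long-term data that is rarely accessed can live on cheaper tape.
    return "tape"


if __name__ == "__main__":
    data_sets = {
        "daily operational backups": (30, True, False),
        "closed project files": (10 * 365, False, False),
        "off-site compliance archive": (7 * 365, False, True),
    }
    for name, (days, fast, offsite) in data_sets.items():
        print(f"{name}: {choose_medium(days, fast, offsite)}")
```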