I've written and spoken a ton on what to back up, how to back up and how to prepare for the worst disaster. I have rarely taken the opportunity to discuss, in essence, what not to back up as part of your backup management process.
It's a bit of a foreign concept, really -- the idea that you shouldn't back up some portions of your environment. But in a world where recovery windows are short -- which, in turn, makes the backup windows equally small -- you need to be certain the data, applications and systems you're including in your backup strategy are truly necessary.
Now, before you start commenting about how this backup management process is completely wrong, I'm not (entirely) saying that you shouldn't back up certain data sets ever. I want you to be thinking about the concept and use it as a lens when scrutinizing data sets and determining backup frequencies, storage locations and recovery strategies.
Data sets should typically follow the "3-2-1 backup" rule: three copies (including production as one), on two different media, with one backup held in an off-site location. However, here are four data set examples that should not be included in your frequent backups.
- Archive and cold data. This is usually older reference data that does not change: historical email, scientific readings and, in the future, internet of things sensor data. Once you have a backup of it -- particularly in cloud-based cold storage, where the cloud vendor guarantees its accessibility -- you can leave this data out of your typical backup management process.
- Workstations. This usually applies only to workstations deemed "noncritical," which tends to rule out those belonging to the executive team, as those are often backed up. Assuming you have a gold image and a process for both deploying and updating a new workstation, these can generally be excluded from backups. The exception is when you're planning for ransomware attacks and have determined which machines should be protected so they can be back up and running more quickly. Having said that, even in that backup management process, most organizations still come up with the same list of critical workstations.
- DevOps virtual machines (VMs). Developers can spin up, use and discard VMs faster than you can finish reading this article. Unless development specifically asks for a VM to be included -- for example, when the VM will remain up in development for an extended period of time -- you should have a default position of not backing these VMs up.
- File-level backups. In cases where you have VM-level backups being generated, you don't need separate file-level backups as well. Most backup vendors support the virtual mounting of VMs, even directly from a backup, so that you can access the VM's file system to retrieve needed individual files.
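To make the four exclusions above concrete, here is a minimal sketch of how they might be written down as a default policy. Everything in it is hypothetical: the `DataSet` fields, the `Kind` categories and the sample names are illustrative assumptions, not part of any backup product, and your own criteria will likely differ.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Kind(Enum):
    ARCHIVE = auto()       # cold, unchanging reference data
    WORKSTATION = auto()
    DEVOPS_VM = auto()
    FILE_LEVEL = auto()    # a file-level job for some machine
    OTHER = auto()


@dataclass
class DataSet:
    # All fields are illustrative assumptions, not a vendor schema.
    name: str
    kind: Kind
    already_backed_up: bool = False     # archive: a backup copy already exists
    critical: bool = False              # workstation: on the critical list
    has_gold_image: bool = False        # workstation: can be redeployed from an image
    requested_by_dev: bool = False      # DevOps VM: development asked for it
    covered_by_vm_backup: bool = False  # file-level: a VM-level backup exists


def include_in_frequent_backup(ds: DataSet) -> bool:
    """Answer "Do I really need to back this up?" for one data set."""
    if ds.kind is Kind.ARCHIVE:
        # Unchanging data needs a backup once, not on every run.
        return not ds.already_backed_up
    if ds.kind is Kind.WORKSTATION:
        # Noncritical machines with a gold image can be redeployed instead.
        return ds.critical or not ds.has_gold_image
    if ds.kind is Kind.DEVOPS_VM:
        # Default position: skip unless development asks for it.
        return ds.requested_by_dev
    if ds.kind is Kind.FILE_LEVEL:
        # Files can be retrieved by mounting the VM-level backup.
        return not ds.covered_by_vm_backup
    return True


candidates = [
    DataSet("mail-archive-2012", Kind.ARCHIVE, already_backed_up=True),
    DataSet("cfo-laptop", Kind.WORKSTATION, critical=True, has_gold_image=True),
    DataSet("front-desk-pc", Kind.WORKSTATION, has_gold_image=True),
    DataSet("ci-scratch-vm", Kind.DEVOPS_VM),
    DataSet("long-lived-dev-vm", Kind.DEVOPS_VM, requested_by_dev=True),
    DataSet("erp-database", Kind.OTHER),
]

selected = [ds.name for ds in candidates if include_in_frequent_backup(ds)]
print(selected)
```

The point of the sketch is the default position: anything not explicitly excluded stays in the frequent backup, and each exclusion carries its own exception (a critical workstation, a VM that development asked to keep).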
The important thing here is that you are thinking about your backup management process and the fact that not everything should be backed up. When defining backup sets, you should be asking yourself: Do I really need to back this up? By doing so, you'll shorten backup and recovery windows, reduce needed storage and lower the overall cost of your backup strategy.
I'm guessing each of you reading this article could probably add to the list. Please do so in the comments section.