Home > Data Backup Tips > Backup and recovery > Backup vs. archive
Data Backup Tips:
EMAIL THIS
 TIPS & NEWSLETTERS TOPICS 

BACKUP AND RECOVERY

Backup vs. archive


Pierre Dorion
10.20.2008
Rating: -4.54- (out of 5)


Data backup technical tips
Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us    Add to Google


When discussing the difference between backups and archives, we should probably start with introducing some generally accepted definitions for the terms "backup" and "archive." I use the "generally accepted" qualifier because in IT, terminology is rarely absolute. Furthermore, the world of IT is full of smart people and smart people like to argue a lot.

Backup: A backup refers to a copy of data that may be used to restore the original in the event the latter is lost or damaged beyond repair. It is a safeguard for data that is being used.

Archive: An archive refers to a single or a collection of historical records specifically selected for long-term retention and future reference. It is usually data that is no longer actively used.

One of the differences worth noting in the above descriptions is that a backup is always a copy while an archive should be the original that was removed from its initial location and sent elsewhere for long-term retention.

Frequently, backup software is configured in an attempt to fulfill both roles. Data backup schedules often include daily incremental backups kept for seven days, a weekly full kept for a month, a monthly full kept for a year and finally, a yearly full kept for seven years. The yearly full backups are often referred to as seven-year archives. The problem with that scheme is that it becomes very difficult to single out specific files for long-term retention out of an entire backup job, so a user of this method often ends up having to lump everything into on large "archive" package, which is a full backup. Some email programs allow you to create so-called archives from a user's mailbox, but they all end up in one large file (i.e., a PST file for Exchange). So, retrieving an archived record can become a daunting task.

The lifecycle of backup copies

When we no longer need a backup copy of a file, usually because we have a sufficient number of point-in-time copies, we simply de...


Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us    Add to Google



RELATED CONTENT
Backup and recovery
Criteria for choosing the right tape encryption solution for your data backup plan
Creating a System Recovery Disk in Windows 7: A step-by-step tutorial
Modern data backup and recovery system considerations
SQL Server data backup and recovery best practices
Secure your data backups with encryption key management best practices
Using data deduplication with backup applications: Source vs. target dedupe
Data backup for virtual machines: Alternative methods to VMware Consolidated Backup
Upgrading from LTO-3 to LTO-4 tape for data backup and recovery
Is VMware Consolidated Backup right for your enterprise?
Is cloud data backup service right for your organization?

Data storage backup tools
HP expands laptop and desktop data backup with Data Protector Notebook Extension
Data backup and recovery news briefs: Rackspace unveils cloud-based file storage apps
EMC gives Avamar desktop and laptop support
Terremark acquires managed data backup and recovery provider DS3 DataVaulting
Data backup and recovery news briefs: Dynamic Solutions introduces data deduplication products
Creating a System Recovery Disk in Windows 7: A step-by-step tutorial
Modern data backup and recovery system considerations
Data backup and recovery news briefs: Thales Group releases CryptoStor Tape 3.0 appliance
Data archiving reduces data backup workload prior to data deduplication
Symantec releases Linux version of Backup Exec System Recovery

Tape backup and tape libraries
Texas Tech turns to data deduplication for data backup, disaster recovery
Data backup and recovery news briefs: Rackspace unveils cloud-based file storage apps
Spectra Logic looks to leapfrog high-end tape storage market with T-Finity tape library
Secure your data backups with encryption key management best practices
Data backup news briefs: ProStor Systems ships InfiniVault removable disk backup appliance for SMBs
Upgrading from LTO-3 to LTO-4 tape for data backup and recovery
W. Curtis Preston: Articles and podcasts on data backup and recovery
The tape storage end game: The pros and cons of recycling backup tapes
Data backup and recovery news briefs: Tandberg Data introduces DAT tape drives and media
Community Health Centers Alliance takes control of data backup and recovery

RELATED RESOURCES
2020software.com, trial software downloads for accounting software, ERP software, CRM software and business software systems
Search Bitpipe.com for the latest white papers and business webcasts
Whatis.com, the online computer dictionary


lete the oldest copy or backup job of which the file is part. This step is typically automated for us. Likewise, when we restore a file from a backup copy, we know what we are looking for and we may opt to select a point-in-time copy or version based on a date criteria.

If long-term backups are used in lieu of archives, things can get a lot more complicated. Other than to satisfy a legal or regulatory compliance requirement, archives are used to free up primary (production) disk storage space from data that is no longer actively used but must be retained. Keeping an entire backup job for seven years as per our earlier example is not a very cost-effective way to use storage. Because backups only copy data, the original file is left in place, which frees up no space at all. In addition, the backup creates yet another copy of the data, which means we are now using twice as much storage space as before unless files are deleted manually after the backup. This only adds more data management overhead.

Furthermore, because a lot of backups are organized in a sequence (i.e., full and incremental), a full backup has to potentially be taken out of sequence to capture the desired point-in-time full copy of a file. Day 3 of an incremental backup sequence would be useless without the previous full and incremental backups.

Archives

One way that archiving software is different from backup software is the cataloging and search capabilities. Metadata about archived objects is stored in a database and can be searched based on user-supplied criteria. Some backup software products also offer basic archiving capabilities where files can be archived, individually or in groups, and retained independently from the backups. Archived objects can be named or labeled based on the type of data, date, ownership or any other searchable criteria deemed appropriate to ease the search process for future reference.

However, for organizations generating and handling large volumes of archives such as email, imaging data, etc., the ability to create a simple searchable label is no longer sufficient. More sophisticated archive solutions providing search capabilities based on content, ownership, etc. are required. Just picture trying to find a 5-year-old patient record or court case transcript using your backup software's search wizard!

Another challenge with archives that backup software simply can't handle is the static nature of aging archived data vs. the very dynamic technology used to access that same data. As much as software vendors try to maintain backward compatibility with some applications, data eventually ages to a point where it can no longer be easily or usefully accessed with the latest release of an application. Some sophisticated archive solutions have integrated conversion capabilities to allow future access to certain data using a more universal format. An example would be the conversion of reports to PDF format, which allows future access to those reports without requiring the application that created them. Archived files are typically maintained "as is" and are not alterable.

So, an archive really isn't a backup copy kept for a long time. We can also add that using backup software to produce large amounts of archives that may eventually need to be searched (i.e., legal discovery) is a bad idea.

About the author: Pierre Dorion is the Data Center Practice Director and a Senior Consultant with Long View Systems Inc. in Phoenix, AZ, specializing in the areas of business continuity and disaster recovery planning services, and corporate data protection.


Rate this Tip
To rate tips, you must be a member of SearchDataBackup.com.
Register now to start rating these tips. Log in if you are already a member.




DISCLAIMER: Our Tips Exchange is a forum for you to share technical advice and expertise with your peers and to learn from other enterprise IT professionals. TechTarget provides the infrastructure to facilitate this sharing of information. However, we cannot guarantee the accuracy or validity of the material submitted. You agree that your use of the Ask The Expert services and your reliance on any questions, answers, information or other materials received through this Web site is at your own risk.



Enterprise Backup Solutions - Continuous Data Protection (CDP)
About Us  |  Contact Us  |  For Advertisers  |  For Business Partners  |  Site Index  |  RSS
SEARCH 
TechTarget provides technology professionals with the information they need to perform their jobs - from developing strategy, to making cost-effective purchase decisions and managing their organizations' technology projects - with its network of technology-specific websites, events and online magazines.

TechTarget Corporate Web Site  |  Media Kits  |  Site Map




All Rights Reserved, Copyright 2008 - 2009, TechTarget | Read our Privacy Policy
  TechTarget - The IT Media ROI Experts