SharePoint 2007 backup reminds me of scuba diving. A lot of people enjoy it, but some people hate it. If you do it right, it won't be cheap, but nothing should go terribly wrong. If you do it wrong, well, let's just say that most of your training focuses on the various ways you could die. Diving without proper training can be literal suicide, and starting a SharePoint deployment without proper training can be career suicide.
To back up SharePoint, you must first understand it. To accomplish its missions of content management, search and information sharing, it uses a variety of applications. Each SharePoint portal is made up of one or more Web servers, application servers, query servers and index servers, all storing content in a SQL Server database server. All of these parts of a SharePoint environment can be loaded onto a single server or distributed across several servers in a farm. Backing up SharePoint running on a single server is relatively straightforward; backing up a farm, not so much.
SharePoint recycle bins
SharePoint 2007 introduced multistage recycle bins to handle problems such as accidental deletion. Deleted files, documents, list items, lists and document libraries are first stored in the user's recycle bin, where the user can retrieve them without help. If the user purges those recycled items, they move to the second-stage recycle bin, where only the administrator can restore them.
Before deciding how to back up and restore your installation of SharePoint, you should review all of the options available to you using this TechNet article. (The TechNet article goes into much more detail than I have space to do here.) Getting back to my scuba analogy for a second, this list reminds me of all the options you are given when you learn to dive with enriched air, or Nitrox, which contains a higher percentage of oxygen. They tell you how to dive with Nitrox tables and air tables, and even give you formulas to calculate your own values. Then they say -- or you can get a dive computer. I don't know anyone who dives Nitrox without a computer -- it's just too risky. Get the tables or formulas wrong, and you end up with decompression sickness or oxygen toxicity.
What does this have to do with backing up SharePoint, you ask? While the TechNet article provides a lot of options, there are so many that the average person will be left confused. Therefore, this article provides the "just buy a dive computer" answer. I value simplicity and automation when diving Nitrox and when backing up data, so let's look at these options in that light. The graphical tool built into the Central Administration GUI cannot be used to restore the administration content database or the configuration database -- and it can't be scheduled. So much for that one. The stsadm command-line tool can be run as a Scheduled Task, but it also can't restore the administration content or configuration databases.
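To make the scheduling point concrete, here is a minimal Python sketch of a wrapper that a Scheduled Task could run. It only assembles and launches the stsadm command line for a farm-level backup. The install path of stsadm.exe, the share name in the usage note and the function names are assumptions for illustration, so check the switches against the stsadm help output on your own server before relying on it.

```python
import subprocess

# Assumed default location of stsadm.exe on a SharePoint 2007 server;
# adjust for your own installation.
STSADM_EXE = (
    r"C:\Program Files\Common Files\Microsoft Shared"
    r"\web server extensions\12\BIN\stsadm.exe"
)


def farm_backup_command(backup_dir, method="full", stsadm=STSADM_EXE):
    """Build the argument list for a farm-level stsadm backup.

    backup_dir is the folder or UNC share that receives the backup files.
    method is either "full" or "differential".
    """
    if method not in ("full", "differential"):
        raise ValueError("method must be 'full' or 'differential'")
    return [stsadm, "-o", "backup",
            "-directory", backup_dir,
            "-backupmethod", method]


def run_farm_backup(backup_dir, method="full"):
    """Run the backup and raise CalledProcessError if stsadm fails."""
    subprocess.run(farm_backup_command(backup_dir, method), check=True)
```

A nightly task might call run_farm_backup(r"\\backupsrv\sharepoint"), where the share name is hypothetical. Scheduling the command does not remove the limitation described above: a backup taken this way still cannot restore the administration content or configuration databases.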
The most common method of backing up SharePoint is to back up the SQL Server database where its data resides. Unfortunately, this is also the method with the most limitations. It cannot restore the configuration and administration databases, and restoring the search database is not supported either. It cannot recover at any level other than the database, and it doesn't back up configuration changes or customizations made in SharePoint.
The problem with all of the methods so far has been that there are several different places where SharePoint stores information, and you must be able to restore all of them to the same point in time for everything to work properly; otherwise you end up with referential integrity problems. (It seems to me that ease of backup and recovery was not a priority for Microsoft when this product was designed.)
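One way to picture the point-in-time problem is as a skew check across the separate backups. The Python sketch below is only an illustration: the component names, the timestamps and the idea of a tolerance are all hypothetical, and passing the check does not prove that a restore would be consistent. It simply flags the obvious case where the pieces were captured at different times.

```python
from datetime import datetime, timedelta


def backup_skew(timestamps):
    """Return the gap between the earliest and latest backup times.

    timestamps maps a component name to the datetime its backup was taken.
    """
    times = list(timestamps.values())
    return max(times) - min(times)


def same_point_in_time(timestamps, tolerance=timedelta(0)):
    """True if every component was captured within the given tolerance."""
    return backup_skew(timestamps) <= tolerance


# Hypothetical example: the content database was backed up 45 minutes
# after the search index, so the two no longer describe the same moment.
nightly = {
    "content_db": datetime(2008, 6, 1, 1, 45),
    "search_index": datetime(2008, 6, 1, 1, 0),
    "config_db": datetime(2008, 6, 1, 1, 0),
}
print(backup_skew(nightly))         # size of the gap between components
print(same_point_in_time(nightly))  # False: the backups do not line up
```

A snapshot-style backup aims for a skew of zero by capturing everything at once, which is the property to look for in the tools discussed next.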
Microsoft's Data Protection Manager (DPM) is the first tool that looks promising. Of course, it's also the first one to cost money (a DPM and Enterprise DP license will set you back $579 and $431 respectively). The TechNet article above mentions only benefits (no limitations) of using Data Protection Manager to protect SharePoint. DPM uses the Volume Shadow Copy Service (VSS) to snapshot the entire configuration and replicate it to a second location. Because it's using VSS, it can create a quick "backup" of the entire SharePoint farm at a single point in time, allowing it to restore the farm to that single point in time during recovery. It also has the ability to recover at multiple levels (e.g., farm, database or object). If you're open to using DPM, it looks worthy of consideration.
If you're already using a commercial backup product (other than DPM) to back up the rest of your environment, you should examine its SharePoint backup and recovery agent. I am generally a fan of using officially supported backup and recovery agents, and this is one of the strongest cases for one that I've seen. If you're not using the product's agent, then you're doing one of two things: you're using SharePoint's or SQL Server's built-in tools to back up to a disk file and then backing that file up with your commercial tool, or you're backing up via SQL Server using your commercial tool's SQL Server agent. Both of these approaches have the same major limitations mentioned in the first half of this article (e.g., inability to recover configuration or administration databases), and I would not recommend them at all.
The basic operations of commercial SharePoint agents are the same. SharePoint provides an API for commercial backup software products to connect to. When it is time to back up SharePoint, they connect to this API and transfer the data directly to your backup destination of choice (e.g., disk, virtual tape library or tape). They do the reverse during recovery. Each product then layers value-added services on top of this basic functionality, such as the ability to back up the entire site or portions of the site, or to recover individual objects or SharePoint's Active Directory components. You should investigate the agent available to you to ensure that it provides all of the functionality that you need. If it doesn't, then perhaps you should examine using Data Protection Manager just for SharePoint.
One final note: wherever you back up SharePoint, I think you should use data deduplication. SharePoint creates more internally duplicated data in its database than any application I've ever seen. Using deduplication should dramatically reduce the amount of disk needed to back up SharePoint.
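To show why duplicated data shrinks so well, here is a toy Python sketch of deduplication using fixed-size chunks and SHA-256 fingerprints. It is not how any particular backup product works (commercial products often use variable-length chunking and their own formats), and the chunk size and sample data are assumptions chosen only to make the arithmetic easy to follow.

```python
import hashlib


def dedup_ratio(blobs, chunk_size=4096):
    """Return raw bytes divided by bytes stored after deduplication.

    Each blob is split into fixed-size chunks; a chunk is stored only the
    first time its SHA-256 fingerprint is seen.
    """
    seen = set()
    raw_bytes = 0
    stored_bytes = 0
    for blob in blobs:
        for start in range(0, len(blob), chunk_size):
            chunk = blob[start:start + chunk_size]
            raw_bytes += len(chunk)
            fingerprint = hashlib.sha256(chunk).digest()
            if fingerprint not in seen:
                seen.add(fingerprint)
                stored_bytes += len(chunk)
    return raw_bytes / stored_bytes if stored_bytes else 1.0


# Hypothetical example: two versions of a document that share their first
# 4 KB chunk and differ only in the second.
version_1 = b"x" * 4096 + b"y" * 4096
version_2 = b"x" * 4096 + b"z" * 4096
print(dedup_ratio([version_1, version_2]))  # 16384 raw / 12288 stored
```

The more near-identical copies the database holds, the higher the ratio climbs, which is why the disk savings for a SharePoint backup can be so large.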
W. Curtis Preston (a.k.a. "Mr. Backup"), executive editor and independent backup expert, has been singularly focused on data backup and recovery for more than 15 years. From starting as a backup admin at a $35 billion credit card company to being one of the most sought-after consultants, writers and speakers in this space, it's hard to find someone more focused on recovering lost data. He is the webmaster of BackupCentral.com, the author of hundreds of articles, and the books "Backup and Recovery" and "Using SANs and NAS."