Bad data, an operating system problem or a hardware failure. Any one of these three problems can lead to errors while backing up your Microsoft Exchange Server.
A -1018 error while backing up MS Exchange Server probably means you have a checksum failure. The error prevents backups from completing because Exchange Server will not let you back up potentially corrupted data. This situation is largely unique to Exchange Server in the Microsoft world because it is one of the few programs that performs such rigorous checksum validation of its data.
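To make the idea of page-level checksum validation concrete, here is a minimal sketch. It is not the actual Exchange database format or checksum algorithm: the 4,096-byte page size, the CRC32 checksum and the 4-byte header layout are all assumptions chosen for illustration, and the names make_page and failed_pages are hypothetical.

```python
import struct
import zlib
from typing import List

PAGE_SIZE = 4096   # assumed page size, for illustration only
HEADER_SIZE = 4    # assumed: a 4-byte checksum stored at the start of each page


def make_page(payload: bytes) -> bytes:
    """Build one page: a little-endian CRC32 header followed by the padded payload."""
    if len(payload) > PAGE_SIZE - HEADER_SIZE:
        raise ValueError("payload does not fit in one page")
    body = payload.ljust(PAGE_SIZE - HEADER_SIZE, b"\x00")
    return struct.pack("<I", zlib.crc32(body)) + body


def failed_pages(data: bytes) -> List[int]:
    """Return the zero-based numbers of pages whose stored checksum does not match."""
    bad = []
    for offset in range(0, len(data), PAGE_SIZE):
        page = data[offset:offset + PAGE_SIZE]
        page_number = offset // PAGE_SIZE
        if len(page) < PAGE_SIZE:
            # A truncated final page cannot be verified, so report it as failed.
            bad.append(page_number)
            continue
        (stored,) = struct.unpack("<I", page[:HEADER_SIZE])
        if zlib.crc32(page[HEADER_SIZE:]) != stored:
            bad.append(page_number)
    return bad
```

A backup routine built on this idea would refuse to continue as soon as failed_pages returned anything, which mirrors the behavior described above: one damaged page is enough to stop the backup.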
How to fix it?
You can double-check the status of the Exchange database by running the checksum command of the esefile utility, which is included in the support directory of Exchange Server 5.5 Service Pack 3. This gives you a list of the pages that have failed the checksum test. Do not run more than one instance of esefile at a time, as this will cause the computer to hang.
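If you have several database files to check, the one-instance-at-a-time rule can be enforced with a small wrapper script. The sketch below is only an illustration: the article does not give the exact esefile command line, so the command prefix is supplied by the caller, and the function name run_checks_serially is hypothetical.

```python
import subprocess
from typing import Dict, Sequence


def run_checks_serially(command_prefix: Sequence[str],
                        databases: Sequence[str]) -> Dict[str, int]:
    """Run the checker once per database, strictly one after another.

    command_prefix is the checker command and its options (an assumption
    supplied by the caller); each database path is appended as the last
    argument. Returns a mapping of database path to process exit code.
    """
    results = {}
    for db in databases:
        # subprocess.run blocks until the process exits, so a second
        # instance is never started while the first is still running.
        completed = subprocess.run([*command_prefix, db], check=False)
        results[db] = completed.returncode
    return results
```

Check the utility's own documentation for the correct options and for the meaning of its exit codes before relying on the results.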
As usual, the first step in tracking down the problem is to look for recent unusual events. If you have been having power problems, such as surges or sags, they may have triggered data loss in a marginal caching disk controller. If the server has been turned off without being cleanly shut down to replace a drive, cache data may have been lost when the RAID rebuilt. Events like these can offer important clues to what has gone wrong.
Often checksum errors indicate an underlying hardware problem. However, sometimes you get lucky and the fix is simple. One of the first things to check is your SCSI terminations. SCSI problems often lead to -1018 errors, and sometimes the fix is as simple as plugging the termination back in.
Another possibility is that another piece of software is modifying the data directly. According to Microsoft this isn't common, but it has been known to happen. In this case, you need to check with the software vendor about possible solutions.
One other place to look for a possible quick fix is your caching controller, if you use one. The controller needs to be completely fault tolerant, which means, among other things, having battery backup to support the cache. If the battery goes dead you could have a problem. Microsoft recommends using the cache mirroring feature if your controller supports it.
From that point, diagnosing the problem is mostly a matter of walking the decision tree and old-fashioned elimination. A good place to start is the SCSI subsystem, since this is a common source of checksum errors.
The ultimate fix?
So what is the ultimate fix? Microsoft claims it's setting up a fault tolerant system for Exchange Server. While -1018 problems are a royal pain in the neck, they are also a warning of a vulnerability in your Exchange Server system. It's worth acting on that warning to make the system more fault tolerant before something a lot worse happens.
Rick Cook has been writing about mass storage since the days when the term meant an 80K floppy disk. The computers he learned on used ferrite cores and magnetic drums. For the last twenty years he has been a freelance writer specializing in storage and other computer issues.