Published: 08 Aug 2002
There's an old saying that the devil you know is better than the devil you don't know. For example, take the Windows operating system. It's easy to use, but this ease of use conceals a number of issues with the backup and recovery of NTFS and FAT filesystems.
The first issue involves long and short file names. For compatibility with MS-DOS, earlier versions of Windows and earlier applications, every version of Windows to date generates an 8.3 file name for every file whose name exceeds the original MS-DOS 8.3 limitation. In Windows 2000 and XP, the 8.3 file name is generated as follows:
- Delete spaces and any characters (including Unicode characters) that are illegal in MS-DOS file names.
- Delete all periods but one (8.3 names can have only one period).
- Truncate the file name to its first six characters.
- Append a tilde (~) and a number.
- Truncate the file name extension to three or fewer characters.
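The steps above can be sketched in a few lines of Python. This is only an illustration of the list, not Microsoft's actual implementation: the function name short_name, the ILLEGAL character set and the taken set are all my own assumptions, spaces are dropped because the My Documents example implies it, and the sketch ignores any refinements the real filesystem applies (for example, names that already fit the 8.3 format).

```python
# Characters this sketch treats as illegal in an 8.3 name (an assumption,
# not an exhaustive list), plus the space character.
ILLEGAL = set(' "+,;=[]')


def short_name(long_name, taken):
    """Return a simplified 8.3 alias for long_name, avoiding names in `taken`."""
    # Drop characters the sketch treats as illegal, and anything non-ASCII.
    cleaned = "".join(c for c in long_name if c not in ILLEGAL and ord(c) < 128)

    # Keep only the last period: the text after it is the extension.
    base, dot, ext = cleaned.rpartition(".")
    if not dot:
        base, ext = cleaned, ""
    base = base.replace(".", "")

    # Truncate the base to six characters and the extension to three.
    stem = base[:6].upper()
    ext = ext[:3].upper()

    # Append a tilde and the first number that is not already in use.
    n = 1
    while True:
        candidate = f"{stem}~{n}" + (f".{ext}" if ext else "")
        if candidate not in taken:
            taken.add(candidate)
            return candidate
        n += 1


if __name__ == "__main__":
    in_use = set()
    for name in ["My Documents", "My DocuDramas"]:
        print(name, "->", short_name(name, in_use))
```

The taken set stands in for the names already present in the directory, which is what makes the number after the tilde depend on what was created first.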
Here's where the devil starts rearing its ugly head: As the following listing shows, C:\My Documents becomes C:\MYDOCU~1. If you then created another directory called C:\My DocuDramas, its 8.3 file name would become C:\MYDOCU~2.
    C:\> dir /x my*
     Volume in drive C has no label.
     Volume Serial Number is 1234-B6A4

     Directory of C:\

    08/12/2001  10:09p      <DIR>          MYDOCU~1     My Documents
    08/12/2001  11:19p      <DIR>          MYDOCU~2     My DocuDramas
                   0 File(s)              0 bytes
                   2 Dir(s)   1,818,880,512 bytes free
Most people understand the basics of what I've just described. What they usually don't know is that this dynamic creation of the 8.3 file name also occurs during a restore. Therefore, the order in which files are restored determines their 8.3 file names. Since that order might not match the order in which the files were created, the 8.3 file name of any given file may be different after a restore. This is discussed in a Microsoft support document that can be found at: http://support.microsoft.com/support/kb/articles/Q240/2/40.ASP.
Microsoft says the Windows backup program will restore the file names in alphabetical order. This means that C:\My DocuDramas will get restored before C:\My Documents. Therefore, the 8.3 file name for C:\My DocuDramas will be C:\MYDOCU~1, and C:\My Documents will become C:\MYDOCU~2.
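To make the swap concrete, here is a small Python sketch that numbers the short names in arrival order, once for the original creation order and once for an alphabetical restore. The helper assign_short_names is hypothetical and deliberately crude (it only strips spaces, truncates to six characters and counts); it is meant to show the ordering effect, not to reproduce what the filesystem really does.

```python
def assign_short_names(names_in_order):
    """Map each long name to a simplified 8.3 alias, numbered in arrival order."""
    counters = {}   # six-character stem -> last number handed out
    aliases = {}
    for name in names_in_order:
        stem = name.replace(" ", "")[:6].upper()
        counters[stem] = counters.get(stem, 0) + 1
        aliases[name] = f"{stem}~{counters[stem]}"
    return aliases


created = ["My Documents", "My DocuDramas"]   # original creation order
restored = sorted(created, key=str.lower)     # alphabetical restore order

before = assign_short_names(created)
after = assign_short_names(restored)

# Names whose short alias differs once the restore has finished.
changed = [name for name in created if before[name] != after[name]]

if __name__ == "__main__":
    for name in created:
        print(f"{name}: {before[name]} -> {after[name]}")
```

Any script or registry entry that still points at the old alias of one of the names in changed would now resolve to the other directory.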
It's important to note that although the Microsoft support article only mentions Windows 2000's native backup program, the problem exists for any backup and recovery program. This is because Microsoft provides no API for backing up or recovering the short file name associated with a particular long file name. Therefore, all backup and recovery products will have this issue.
My Windows-savvy friends aren't that concerned about this particular issue because they believe no modern application would make use of 8.3 file names. A quick scan of my Windows 2000 registry shows otherwise. I found dozens of references to short file names by many applications, most of which I have purchased within the last year. Also, people are tempted to use short file names when creating batch files or scripts, because doing otherwise would require quoting the file name. Obviously, only you can tell the degree to which this problem will affect your work.
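If you want to run a similar scan yourself, one rough approach is to export the registry (or gather your batch files) as text and search for anything shaped like a generated short name. The Python sketch below does that with a regular expression. The pattern is my own approximation of "up to six characters, a tilde, a number, an optional extension", so expect some false positives and misses, and the sample lines (including the Acme path) are invented for illustration.

```python
import re

# Approximate shape of a generated 8.3 name: up to six name characters,
# a tilde, a number, and an optional extension of up to three characters.
SHORT_NAME = re.compile(
    r"(?<![A-Z0-9_-])[A-Z0-9_-]{1,6}~\d+(?:\.[A-Z0-9]{1,3})?(?![A-Z0-9])",
    re.IGNORECASE,
)


def find_short_names(text):
    """Return the distinct short-name-like tokens found in text, upper-cased."""
    return sorted({m.group(0).upper() for m in SHORT_NAME.finditer(text)})


if __name__ == "__main__":
    # Invented sample in the style of an exported registry file.
    sample = r'''
"InstallPath"="C:\\PROGRA~1\\Acme\\acme.exe"
"DataDir"="C:\\MYDOCU~1\\REPORT~2.DOC"
"Normal"="C:\\Program Files\\Acme"
'''
    for token in find_short_names(sample):
        print(token)
```

Each hit is a place where a restore that renumbers the short names could silently point an application at the wrong file or directory.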
Another Windows peculiarity allows a user to block a storage administrator's access to their files. Unix also allows an individual user to specify who can access their files, but no matter what, the root account can see any file. This is important for many reasons, one of which is backup and recovery. If root can't access your files, then the backup application can't back them up.
With NTFS, however, any user can right-click any directory, select Properties and open the Security tab. (To see this tab in Windows XP, you must disable simple file sharing.) From there, the user can deny other users, including Administrator and System, access to their files by changing the access control lists (ACLs) for those files.
The default ACL contains permissions for Administrators, Everyone and System. The Administrators entry specifies what permissions someone logged in to the computer as Administrator will have. The System entry specifies what access will be given to programs running under the System account; commercial backup programs normally run under this account. The Everyone entry specifies the access given to anyone not specifically listed.
Users can, however, create an additional entry for the Administrator and then specify that the Administrator is denied all access. As strange as this may seem, this allows any user to deny access to the very people - and programs - that need access in order to do their job. Suppose a paranoid employee had an important file they wanted to keep private. They could deny access to everyone except themselves, including the System account. However, what they don't realize is that by doing that, they have made sure that their file won't get backed up either.
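The effect of such a deny entry can be shown with a toy model in Python. This is a simplification I am assuming for illustration: the Ace class and has_access function are hypothetical, rights are plain strings, and the model simply checks deny entries before allow entries. The real NTFS access check is considerably more involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ace:
    kind: str          # "allow" or "deny"
    principal: str     # account or group the entry applies to
    rights: frozenset  # rights covered by the entry, e.g. {"read"}


def has_access(acl, identities, wanted):
    """Return True if `identities` are granted `wanted` under this toy model."""
    # Deny entries are examined first, so a single matching deny wins.
    for ace in sorted(acl, key=lambda a: a.kind != "deny"):
        if ace.principal in identities and wanted in ace.rights:
            return ace.kind == "allow"
    return False  # no entry matched, so no access


READ = frozenset({"read"})
default_acl = [
    Ace("allow", "Administrators", READ),
    Ace("allow", "Everyone", READ),
    Ace("allow", "System", READ),
]
# The paranoid employee adds one extra entry.
locked_acl = default_acl + [Ace("deny", "System", READ)]

if __name__ == "__main__":
    backup = {"System", "Everyone"}  # identities the backup program runs with
    print("default ACL:", has_access(default_acl, backup, "read"))
    print("with deny entry:", has_access(locked_acl, backup, "read"))
```

In this model the allow entries for System and Everyone are still present, yet the backup program is refused because the deny entry is matched first.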
Here's a fix
Although an individual user may think that removing access from the Administrator's account will keep their file private, it will only do so as long as the Administrator allows it to stay private. At any time, the Administrator can reset the ACLs on any file within the system.
As with many things in Windows, the difficulty comes in automating Administrator's access from the command line. Ideally, you would run a program that looks for directories and files that have denied access to the System account, and then reset the ACLs on those files. However, I know of no command that's capable of searching for files in this manner. That doesn't mean that such a command doesn't exist. If someone knows of such a command, please let me know.
What you can do is monitor your backup logs. Any decent backup program will create some type of log that will tell you about any files that it couldn't back up. You could look for files that weren't backed up due to ACL issues, and then use the subinacl command from the resource kit to reset them.
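As a rough illustration of that log monitoring, the Python sketch below pulls the paths of files skipped for permission reasons out of a backup log. Every backup product writes its log differently, so the line format used here ("SKIPPED <path> : Access is denied.") is purely hypothetical, as are the function name denied_paths and the sample paths; you would need to adapt the pattern to whatever your own product writes.

```python
import re

# Hypothetical log line format, invented for this sketch:
#   SKIPPED <path> : Access is denied.
DENIED = re.compile(
    r"^SKIPPED\s+(?P<path>.+?)\s+:\s+Access is denied\.?\s*$",
    re.IGNORECASE,
)


def denied_paths(log_lines):
    """Return the paths from log lines that report an access-denied skip."""
    paths = []
    for line in log_lines:
        match = DENIED.match(line)
        if match:
            paths.append(match.group("path"))
    return paths


if __name__ == "__main__":
    # Invented sample log lines.
    sample_log = [
        r"OK C:\Documents and Settings\pat\notes.txt",
        r"SKIPPED C:\Documents and Settings\pat\secret.doc : Access is denied.",
        r"SKIPPED D:\data\plan.xls : File in use.",
    ]
    for path in denied_paths(sample_log):
        print(path)
```

The resulting list is what you would then review and feed, file by file, to whichever tool you use to reset the ACLs.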
Windows alternate data streams
Another issue with Windows NT and 2000 is NTFS alternate data streams. In NTFS, a file consists of multiple streams of data. One holds the security information (ACLs, etc.), and another holds the actual data of the file. One suggested application would be to keep the formatting of a document in one stream and the text to be formatted in another. Each file can actually have a number of streams holding various types of information. This data is completely hidden from the average user, and the only way to retrieve the data in a hidden data stream is to know the name of the stream.
The listing below shows how you can make your very own hidden alternate data streams.
    C:\>echo HIDDEN TEXT >myfile.txt
    C:\>type myfile.txt >visible.txt:hidden.txt
    C:\>type visible.txt
    C:\>more <visible.txt:hidden.txt >newfile.txt
    C:\>type newfile.txt
    HIDDEN TEXT
The first command (echo HIDDEN TEXT) creates the file myfile.txt containing the text "HIDDEN TEXT." The second command (type myfile.txt) creates the file visible.txt with the hidden alternate data stream hidden.txt, which holds that same text. The next command (type visible.txt) prints nothing, showing that visible.txt appears to be empty. However, if we know the name of the alternate data stream, hidden.txt, we can retrieve its contents using the more command, here redirected into newfile.txt. The final type newfile.txt command shows that our operation was successful. If you're as curious about this as I was when I first heard about it, there's a free tool called LADS (List Alternate Data Streams) that's available for download at http://www.heysoft.de. This tool shows whether you have any files with alternate data streams.
Alternate data streams aren't an issue as long as you don't use them, or if you have a utility that supports the backup and recovery of these streams. The best way to ensure that your backup and recovery utilities support alternate data streams is to:
- Create a file with an alternate data stream.
- Back it up with your standard backup method.
- Recover the file.
- Use the method shown above to see if the alternate data stream is still there.
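The first and last steps of that test can also be scripted. The Python sketch below writes a named stream with the same file:stream syntax used in the listing above and later checks whether it is still there. The helper names are my own, and it assumes the script runs on Windows against an NTFS volume; on other filesystems the colon syntax does not create a real alternate data stream.

```python
def write_stream(path, stream, text):
    """Write text into the named alternate data stream of path."""
    # Make sure the host file exists before attaching a stream to it.
    open(path, "a").close()
    with open(f"{path}:{stream}", "w") as handle:
        handle.write(text)


def read_stream(path, stream):
    """Return the stream's text, or None if the stream cannot be opened."""
    try:
        with open(f"{path}:{stream}") as handle:
            return handle.read()
    except OSError:
        return None


def stream_survived(path, stream, expected):
    """Return True if the named stream exists and still holds `expected`."""
    return read_stream(path, stream) == expected


if __name__ == "__main__":
    # Step 1: create the test file before running your backup.
    write_stream("visible.txt", "hidden.txt", "HIDDEN TEXT")
    # Step 4: rerun just this check after recovering the file.
    print(stream_survived("visible.txt", "hidden.txt", "HIDDEN TEXT"))
```

Between the two calls you would back up visible.txt, delete it and recover it with your normal tools; a result of False after the restore suggests the stream was lost along the way.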
Needless to say, alternate data streams don't convert to NFS very well. In fact, they don't even translate to a FAT filesystem. If you copy a file that has alternate data streams to a FAT filesystem, and then back to an NTFS filesystem, the alternate data streams will be lost. This means that if your method of backing up a Windows system involves an NFS mount, it will definitely not support the backup and recovery of alternate data streams.
One of the most common statements made by people who find out about alternate data streams (ADS) is, "Gee, that would be a great place for someone to plant a virus!" In addition to talking to your backup and recovery software vendor about ADS, you also might want to talk to your virus protection software vendor about them. If you have a file with a virus that has been quarantined but is still available on your hard drive, you might try using the technique above to place it into an alternate data stream. Then run your virus scan against the new file and make sure that it finds the virus. As long as you don't execute the virus file, you should be fine. But, of course, you do this at your own risk.
I don't want to pretend to be an expert on the various Windows operating systems. I also don't want anyone thinking I'm bashing Windows or Microsoft. My biggest difficulty with these issues is that most people I talk to don't know about them. Hopefully, this article will help clear up some of these issues. If anyone knows any workarounds to any of the issues mentioned in this article, please e-mail me.