Answered By Heather Stern
This is the reformatting and basically kick-in-the-pants of a question that's been in the mill for a few months. For 3 months this fellow patiently sent the message again, certain that someday, we would get to him.
Before I get started with the actual question, I'd like to make it completely clear to our readers... we do enjoy answering questions. For some strange reason, that is part of what is fun about Linux for those of us here in the Answer Gang. The Gazette exists to make Linux a little more fun so here we are.
However, we're all volunteers, we all have day jobs and most of us have families and pillows we like to visit with once in a while. There is no guarantee that anyone who sends us mail gets an answer.
[ He also had some problems that made his mail a good candidate to get ignored. Since we had another thread elsewhere on features that will help you get an answer, I moved my comments there, and you saw those in Creed of the Querent earlier this month. ]
I've added paragraphs, and hit it with my best shot, and maybe the Gang can help out a bit further. Comments from you, gentle readers, are always welcome too!
So now, on to the tasty question.
I wanted to build a fresh installation on my portable (Red Hat 5.2 upgraded to 6.0), but I didn't want to just erase the old one.
So I pulled the notebook's hard drive, plugged it into my server (Red Hat 6.2) and archived the contents with cp -a file file. The -a (archive) tells cp to preserve links, preserve file attributes if possible and to copy directories recursively. The copy process didn't return any errors... so far so good.
[Heather] Okay. So far, we have that you wanted to upgrade, so you planned to back it up. That's a good idea, but the method isn't so hot.
cp -a really only works if you're root, and I can't tell if you were, or not. But it's not the method I would use to do a proper backup of everything. I normally use GNU tar:
tar czvfpS /usr/local/otherbox-60-backup.tgz .
The options stand for (in order) [c]reate, use g[z]ip compression, be [v]erbose, send the output to a [f]ile instead of a tape (this option needs a parameter), save the [p]ermissions, be careful about [S]parse files if they exist. The file parameter given has a tgz extension to remind myself later that it's a tar in gzip format, and I put it in /usr/local because that usually has lots of free space. The very last parameter is a period, so that I'm indicating the current directory. I do not want to accidentally mix up any parts from my server into my otherbox backup.
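For anyone who finds the bundled letters hard to read, here is a small sketch of the same invocation written with GNU tar's long option names. It runs against a throwaway directory instead of a real disk; the scratch directories and the sample file are made up for illustration, and only the options come from the command above.

```shell
# Scratch locations (hypothetical stand-ins for the mounted disk and /usr/local).
src=$(mktemp -d)
dest=$(mktemp -d)
mkdir -p "$src/etc"
echo "sample" > "$src/etc/motd"

# Same meaning as "tar czvfpS ARCHIVE ." but spelled out.
# cd first, so that "." is the tree being backed up and not the server's own files.
(
  cd "$src" &&
  tar --create --gzip --verbose --preserve-permissions --sparse \
      --file "$dest/otherbox-backup.tgz" .
)

# List what went into the archive.
listing=$(tar --list --gzip --file "$dest/otherbox-backup.tgz")
echo "$listing"
```

Writing the archive somewhere outside the tree being archived, as here, also keeps tar from trying to back up its own output.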
Among other things it properly deals with backing up device files... all those strange things you'd normally use mknod to create.
[Mike] Before untarring, you MUST do "umask 000" or risk having /dev/null and other stuff not be world-writable.
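The umask concern can be tried out safely without touching a real /dev. The sketch below assumes GNU tar and GNU stat, and every path in it is a hypothetical scratch directory. It extracts one archive twice under umask 022: once with --no-same-permissions (the default for ordinary users) and once with --preserve-permissions (the p flag).

```shell
# Hypothetical scratch directories; nothing here touches a real /dev.
src=$(mktemp -d); plain=$(mktemp -d); kept=$(mktemp -d); arc=$(mktemp -d)

# A world-writable file standing in for something like /dev/null.
touch "$src/wide-open"
chmod 666 "$src/wide-open"
(cd "$src" && tar --create --gzip --file "$arc/demo.tgz" .)

umask 022

# Ordinary-user behaviour: the umask is applied to the restored mode.
tar --extract --gzip --no-same-permissions \
    --file "$arc/demo.tgz" --directory "$plain"

# With p (--preserve-permissions) the mode recorded in the archive is restored.
tar --extract --gzip --preserve-permissions \
    --file "$arc/demo.tgz" --directory "$kept"

plain_mode=$(stat -c '%a' "$plain/wide-open")
kept_mode=$(stat -c '%a' "$kept/wide-open")
echo "without p: $plain_mode   with p: $kept_mode"
```

GNU tar turns on permission preservation by default when run as root, which may be why the problem does not show up for everyone.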
[Heather] I haven't encountered that (I think this is what the p flag for tar solves) but good catch! Now this works okay for most circumstances and the nice thing is that you have a very easy way to check it is okay:
tar dzvfpS /usr/local/otherbox-60-backup.tgz .
Where the d stands for diff and all the rest is the same. Diff does have a glitch, and will complain about one special kind of file called a socket. X often has at least one of these, the log system usually uses one, and the mouse often uses one too. It's okay to ignore that because a socket depends on the context of the program that owns it, and right now, there's no program running from that disk to give it the right context anyway. (Okay, I'm guessing. But that is a theory I have which seems to fit all the ways I see sockets used.)
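The check can also be scripted, since GNU tar reports differences through its exit status as well as on the screen. A minimal sketch, assuming GNU tar and using made-up scratch paths and file names:

```shell
# Hypothetical scratch tree and archive location.
src=$(mktemp -d); arc=$(mktemp -d)
echo "original" > "$src/notes.txt"
(cd "$src" && tar --create --gzip --file "$arc/check.tgz" .)

# Untouched tree: the comparison should succeed quietly.
clean_status=0
(cd "$src" && tar --diff --gzip --file "$arc/check.tgz" .) || clean_status=$?

# Change a file after the backup, then compare again.
echo "changed after backup" > "$src/notes.txt"
dirty_status=0
(cd "$src" && tar --diff --gzip --file "$arc/check.tgz" . 2>&1) || dirty_status=$?

echo "clean: $clean_status   after edit: $dirty_status"
```

A non-zero status on a disk that nothing has written to since the backup is worth investigating before the old installation is erased.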
Now my husband Jim doesn't always trust tar, and sometimes uses cpio. I'll let him or one of the rest of the Gang answer with a better description of using cpio correctly. What I will tell you is why. When you are about to do a full restore of a tarball, it checks to see if it can assign the permissions back to the correct original owners. However, a complete restore will be to an empty disk, which won't have correct passwd and group files yet. Oops.
But there is a fix for this too, and I use it all the time when restoring:
tar xzvfpS /usr/local/otherbox-60-backup.tgz --numeric-owner .
It's just as valid to use Midnight Commander to create /mnt/emptydisk/etc, open up the backup tgz file, and copy across the precious /etc/shadow, /etc/group, and /etc/passwd files before issuing your restore command.
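The same rescue of the account files can be done with tar alone. The sketch below assumes GNU tar and works on a fake tree in scratch directories; the file contents are invented, and /etc/shadow is left out only because this is a toy. It first lists the archive with raw numeric IDs, then extracts just the named members into an empty target.

```shell
# Hypothetical scratch tree, archive location, and empty restore target.
src=$(mktemp -d); arc=$(mktemp -d); empty=$(mktemp -d)
mkdir -p "$src/etc" "$src/home"
echo "root:x:0:0:root:/root:/bin/sh" > "$src/etc/passwd"
echo "root:x:0:" > "$src/etc/group"
echo "unrelated" > "$src/home/notes"
(cd "$src" && tar --create --gzip --file "$arc/demo.tgz" .)

# Show uid/gid numbers instead of names looked up on the current system.
tar --list --verbose --numeric-owner --gzip --file "$arc/demo.tgz"

# Pull only the account files into the empty target. Member names must be
# spelled the way they were stored, here with the leading "./".
tar --extract --gzip --file "$arc/demo.tgz" --directory "$empty" \
    ./etc/passwd ./etc/group

restored=$(cd "$empty" && find . -type f | sort)
echo "$restored"
```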
But when I ran diff -r file file, I got screen-fulls of errors. The most obvious problem was that diff was stuck in a loop with "/usr/bin/mh", a symbolic link pointing back to "/usr/bin". Make a pair of directories, each containing a symbolic link pointing back at the directory it resides in, and then run diff -r on those two directories and you can see what I mean.
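The pair of directories described above can be built in a scratch area like this. The sketch assumes GNU find and a GNU diff new enough to have --no-dereference (diffutils 3.3 or later, which did not exist when this question was asked); the directory and file names are hypothetical. Rather than letting diff follow the links, it finds the self-referencing ones and then compares the links as links.

```shell
# Hypothetical scratch area with two look-alike directories.
work=$(mktemp -d)
mkdir "$work/a" "$work/b"
ln -s . "$work/a/mh"   # a link back to its own directory, like /usr/bin/mh
ln -s . "$work/b/mh"
echo "same" > "$work/a/file"
echo "same" > "$work/b/file"

# Spot symbolic links whose target is "." (the directory holding them).
loops=$(find "$work" -type l -lname . | wc -l)

# Compare the links themselves instead of descending through them.
status=0
diff -r --no-dereference "$work/a" "$work/b" || status=$?
echo "self-referencing links: $loops, diff exit status: $status"
```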
The diff program doesn't fail on all symbolic links... just those that lead into loops and some others (I didn't take time to explore what it was about the others). I removed "/usr/bin/mh" (I'd have preferred not to have had to, but I wanted to move along and see what other snags I could hit), ran diff again with output redirected to a file and started taking the file apart with grep and wc to find out what general classes of error I was dealing with... turns out diff was failing on "character special" files, sockets and "block special" files.
I don't know what any of those three are, but I used find and wc again on the file system and discovered that diff was failing on every single one that it tried to compare. Does anybody know what to do about these problems?
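For reference, character special and block special files are device nodes (the things mknod creates, normally living in /dev), and sockets are endpoints that running programs use to talk to each other. None of them holds ordinary file contents to compare. A small sketch for counting what is in a tree, assuming GNU find; the scratch tree is hypothetical and uses a named pipe as its only special file, because creating device nodes needs root.

```shell
# Hypothetical scratch tree with one regular file, one pipe, one symlink.
tree=$(mktemp -d)
echo "plain" > "$tree/regular"
mkfifo "$tree/pipe"
ln -s regular "$tree/link"

# %y prints one letter per entry: f regular, d directory, l symlink,
# p named pipe, s socket, c character special, b block special.
summary=$(find "$tree" -xdev -printf '%y\n' | sort | uniq -c)
echo "$summary"

# Count the entries a content comparison cannot sensibly read.
special=$(find "$tree" -xdev \( -type b -o -type c -o -type s -o -type p \) | wc -l)
echo "special files: $special"
```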
update: After a week of trying, I'm unable to duplicate the event above. I installed Red Hat 6.0 on a pair of Gateways... basically the same procedure as I did for my disk usage article at the other end of that link.
When I ran the diff, it seemed to start looping somewhere in "/tmp/orbit-root"... I let it run for about 24 hours and the hard drive light was still flashing the next day, no error message.
I tried 6.0 transplanted into a 6.2 box... same thing. I put 6.0 on my portable, pulled the drive and attached it to my server, and got the same thing. I put 5.2 on my portable, upgraded it to 6.0, pulled the drive and attached it to my server... same circumstances as the original event... and diff looped somewhere in "/tmp/.X11-unix" instead of "/tmp/orbit-root".
[Heather] I simply don't recommend that full backups ever waste any time capturing /tmp. The point of this directory is to have a big place where programs can create the files if they need to. Make the programs do their own dirty work making sure they have the right parts. In my case, /tmp is a separate partition, and I wouldn't even mount it in rescue mode.
While we're mentioning filesystems to skip, make sure not to bother getting the /proc filesystem, either. The -l (little ell) switch to tar, when making a backup, will make sure it won't wander across filesystem borders unless you specify them on the command line. (Newer GNU tar releases reassigned -l to --check-links; there the same behavior is spelled --one-file-system.)
(mount its subvolumes here ... skip tmp, proc, and devfs if you have it)
tar czvfpSl /usr/local/otherbox-60-backup.tgz . usr var home
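When /tmp or /proc is not a separate filesystem, staying on one filesystem will not skip it, so explicit exclusions are a useful belt-and-braces addition. The sketch below assumes GNU tar and a fake tree in scratch directories (all names hypothetical). It uses the long --one-file-system spelling, and excludes the contents of tmp and proc while keeping the empty directories so they exist after a restore.

```shell
# Hypothetical scratch tree standing in for the mounted disk.
src=$(mktemp -d); arc=$(mktemp -d)
mkdir -p "$src/etc" "$src/tmp" "$src/proc"
echo "keep" > "$src/etc/fstab"
echo "scratch" > "$src/tmp/junk"

# Exclusions are given before the paths to archive.
(
  cd "$src" &&
  tar --create --gzip --preserve-permissions --sparse --one-file-system \
      --exclude='./tmp/*' --exclude='./proc/*' \
      --file "$arc/demo.tgz" .
)

members=$(tar --list --gzip --file "$arc/demo.tgz")
echo "$members"
```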
The diff program definitely has issues with most types of non-regular files (directories excluded), as well as the problem of at other times looping without ever generating an error message (which could, of course, be related to the same basic problem with non-regular files). Suggestion(s), anyone?
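One possible way around both problems, offered as a sketch and not as what anyone above actually ran, is to compare checksums of regular files only, so that device nodes, sockets, and pipes are never opened and symbolic links are never followed. It assumes GNU find and sha256sum; the scratch trees and the manifest helper are hypothetical.

```shell
# Hypothetical original tree and a copy of it made with cp -a.
orig=$(mktemp -d); copy=$(mktemp -d)
mkdir "$orig/sub"
echo "data" > "$orig/sub/file"
mkfifo "$orig/pipe"              # a special file that would trip a content diff
cp -a "$orig/." "$copy/"

manifest() {
  # Checksums for regular files only, staying on one filesystem, sorted by path.
  (cd "$1" && find . -xdev -type f -exec sha256sum {} + | sort -k 2)
}

# Manifests are written beside the trees, not inside them.
manifest "$orig" > "$orig.sums"
manifest "$copy" > "$copy.sums"

status=0
diff "$orig.sums" "$copy.sums" || status=$?
echo "manifest diff exit status: $status"
```

This says nothing about ownership, permissions, or the special files themselves, so it complements a tar d run instead of replacing it.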
[Heather] If any of you kind readers have other interesting ways to make sure your backups work when you do a restore... their only reason for existence, after all... let us know!