...making Linux just a little more fun!
By Dave Bechtel
If you were incredibly lucky (like me), perhaps you received an external USB hard drive for Christmas. Or perhaps you have one lying around already, with plenty of free space. And perhaps you also read the recent Slashdot article about compression software and have lots of fairly sizable gzipped files lying about.
After reading the comments in that article, I was dismayed to learn that my compression tool of choice (gzip) has no error-correction capabilities. While I deem it to be the best all-around for quick backups with a decent compression ratio, gzip will choke if it gets a data error on restore - and there's something to be said for data integrity.
So, having this nice shiny new USB external drive and some time on my hands, I wrote a Bash utility script to re-compress gzip files to bzip2, using the external drive. It takes an order of magnitude longer to compress, but at least I'll save some space and have a hope of recovering the compressed data if things go wrong... Right??
My particular external drive is a 120GB model that came factory-formatted as a single FAT32 partition. Now, any Linux guru worth their salt knows that this thing practically begs to be customized, since FAT32 has a 2GB (Linux) or 4GB (Windows) file size limit - depending on who's writing to it.
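If you plan to park large backup files on a FAT32 partition, it can help to check their sizes first. What follows is only a minimal sketch and is not part of rezip: the function names are made up for illustration, and the default limit is the format's 4GB ceiling (2^32 - 1 bytes). Pass 2147483647 as a second argument to model the 2GB limit mentioned above.

```shell
#!/bin/bash
# Hypothetical helpers (not from rezip): check whether a file size fits on FAT32.

# Largest file FAT32 can store: 2^32 - 1 bytes, just under 4GB.
FAT32_MAX_FILE_BYTES=$(( (1 << 32) - 1 ))

# fits_on_fat32 SIZE_IN_BYTES [LIMIT_IN_BYTES]
# Succeeds if the size is at or below the limit.
fits_on_fat32() {
    local size=$1
    local limit=${2:-$FAT32_MAX_FILE_BYTES}
    [ "$size" -le "$limit" ]
}

# file_fits_on_fat32 PATH
# Looks up the size of an existing file and applies the default limit.
file_fits_on_fat32() {
    local size
    size=$(wc -c < "$1") || return 1
    fits_on_fat32 "$size"
}
```

For example, ' file_fits_on_fat32 bigbackup.tar.gz || echo "too big for FAT32" ' (the file name is just a placeholder).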
So, I fired up my Knoppix HD install and repartitioned it. Nothing fancy, just good old fdisk.
Here's how it looks now:
$ fdisk -l /dev/sdb

Disk /dev/sdb: 255 heads, 63 sectors, 14593 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/sdb1   *         1         1      8032   83  Linux
/dev/sdb2             2     14593 117210240    f  Win95 Ext'd (LBA)
/dev/sdb5             2        18    136552   82  Linux swap
/dev/sdb6            19      4999  40009882    c  Win95 FAT32 (LBA)
/dev/sdb7          5000      5622   5004247   83  Linux
/dev/sdb8          5623     14593  72059557   83  Linux
(I did make a note of the fact that the factory-default was one big type "c", in case I needed to go back to that.)
Notice the 40GB FAT32 partition. In my other life (sssshhh!) I run Windows 2000 Professional - and was forcibly reminded that Windows 2000 and the NT-based versions after it refuse to format a FAT32 partition larger than 32GB. Note that the limitation is on formatting, not on accessing - this is by design, and Microsoft has publicly admitted it.
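To see how far over the limit that partition is, here is a rough sketch of the arithmetic. It assumes that fdisk's "Blocks" column counts 1K blocks and that the 32GB ceiling means 32 * 1024 MB; the function and variable names are invented for this example.

```shell
#!/bin/bash
# Hypothetical helpers for sizing a partition from fdisk's "Blocks" column.

# Blocks are 1K each, so MB = blocks / 1024 (integer division).
blocks_to_mib() {
    echo $(( $1 / 1024 ))
}

# Assumption: the Windows 2000 FAT32 format ceiling is taken as 32 * 1024 MB.
WIN2K_FAT32_FORMAT_LIMIT_MIB=$(( 32 * 1024 ))

# Succeeds if a partition with the given block count is over that ceiling.
exceeds_win2k_format_limit() {
    [ "$(blocks_to_mib "$1")" -gt "$WIN2K_FAT32_FORMAT_LIMIT_MIB" ]
}

blocks_to_mib 40009882    # /dev/sdb6 in the listing; prints 39072
```

That is about 38GB in binary units, so the partition is well past what the Windows 2000 formatter will accept.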
After going through several free Windows tools for formatting and repartitioning (and running into a brick wall), I eventually gave up on formatting the thing from Windows 2000. The vendor has a utility on their website to restore the drive to factory-default partitioning, but that doesn't really help my intended use of the drive. I could have formatted it in Windows 98, but that's no fun - and it would need a separate driver for the OS to recognize the drive.
So, rather than give up a perfectly usable 8GB, good old Linux to the rescue again:
$ mkdosfs -F 32 -v -n wdfat40 /dev/sdb6
Presto! Windows 2000 recognizes the drive just fine now, and it passes all the chkdsk tests. And for all you dual-booters out there, a wonderful utility exists called Ext2IFS ( http://www.fs-driver.org/ ). This allows NT-based systems like Windows 2000 to access ext2/ext3 partitions just like a regular drive - read/write, so no need for NTFS!
The Linux partitions were formatted like so:
$ mke2fs -j -c -m1 -v /dev/sdbX
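The -m1 option lowers the share of blocks reserved for root from the mke2fs default of 5% down to 1%, which adds up on a big data-only partition. A quick sketch of the arithmetic, using the block count of /dev/sdb8 from the fdisk listing (the helper name is hypothetical, and the blocks are assumed to be 1K each):

```shell
#!/bin/bash
# reserved_mib BLOCKS_1K PERCENT
# Prints roughly how many MB a given reserved-blocks percentage sets aside.
reserved_mib() {
    echo $(( $1 * $2 / 100 / 1024 ))
}

reserved_mib 72059557 5    # default reserve on /dev/sdb8; prints 3518
reserved_mib 72059557 1    # with -m1; prints 703
```

So on the largest partition, -m1 hands back somewhere around 2.7GB for ordinary files.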
Here are the /etc/fstab entries I created for the drive, BTW:
/dev/sdb6  /mnt/wdfat40  auto  defaults,noauto,noatime,user,suid,noexec,uid=dave  0 0
/dev/sdb7  /mnt/wdlinux  ext3  defaults,noauto,noatime,rw  0 0
/dev/sdb8  /mnt/wdvast   ext3  defaults,noauto,noatime,rw  0 0
Note the "uid=dave" in that first line. That's so my non-root user account will have write access to the drive by default.
Now on to the good part - the "rezip" script.
At first, I started out by writing a fairly basic script with a simple function call and manually-entered filenames. Then I sat down and took another look at it - and practically rewrote it from scratch, with some features that occurred to me after several test runs.
rezip Currently Features:
$ find /mnt/bkps -name \*.gz > ~/rezipp-files.txt && rezip
Note: if you abort the script and then re-run it, you have to manually delete the last (partial) .bz2 file it was working on, or that will be skipped as well. This is where the log comes in handy. :)
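To give a feel for the general approach, here is a stripped-down sketch of a recompression loop. It is NOT the published rezip script: the function names, the log format, and the log file name are assumptions for illustration, and unlike the real thing it has no size threshold. It reads the list written by the "find" command above, skips anything that already has a .bz2 next to it, and leaves the original .gz files in place.

```shell
#!/bin/bash
# Sketch only - not the real rezip. All names and defaults here are assumptions.

LISTFILE=${LISTFILE:-$HOME/rezipp-files.txt}   # list written by the "find" step
LOGFILE=${LOGFILE:-rezip.log}                  # assumed log file name

# rezip_one FILE.gz
# Recompresses one file to FILE.bz2 beside it; returns non-zero on failure.
rezip_one() {
    local gz=$1
    local bz=${gz%.gz}.bz2

    # An existing .bz2 is treated as done. A partial one from an aborted run
    # also lands here, which is why it has to be deleted by hand.
    if [ -e "$bz" ]; then
        echo "SKIP $gz" >> "$LOGFILE"
        return 0
    fi

    # Refuse to convert a .gz that fails gzip's own integrity test.
    if ! gzip -t "$gz" 2>/dev/null; then
        echo "BAD  $gz" >> "$LOGFILE"
        return 1
    fi

    # Decompress into a pipe and recompress; drop the output if bzip2 fails
    # or the new file does not pass bzip2's integrity test.
    if gzip -dc "$gz" | bzip2 -9 > "$bz" && bzip2 -t "$bz" 2>/dev/null; then
        echo "OK   $gz" >> "$LOGFILE"
    else
        rm -f "$bz"
        echo "FAIL $gz" >> "$LOGFILE"
        return 1
    fi
}

# rezip_list
# Processes every non-empty line of $LISTFILE; fails if any file failed.
rezip_list() {
    local f failures=0
    while IFS= read -r f; do
        [ -n "$f" ] || continue
        rezip_one "$f" || failures=$((failures + 1))
    done < "$LISTFILE"
    [ "$failures" -eq 0 ]
}
```

Running ' rezip_list ' after the "find" would then work through the list, with one line per file ending up in the log.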
-- KNOWN BUG(s):
$ kill %jobnumber
-- Example:
^Z
[1]+ Stopped rezip
$ kill %1
[1]+ Terminated rezip
$ >rezip.log
will reset it to 0 length.
$ dd if=any-gz-file-more-than-20MB.gz of=KNOWNBAD.gz bs=1M count=21
and redo your "find" to include it.
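The "dd" trick above works because a gzip file cut off partway through fails its integrity test. A small sketch of the idea follows; the function names are hypothetical, and it only produces a bad file if the source is larger than the amount copied (more than 21MB with the defaults used here).

```shell
#!/bin/bash
# Hypothetical helpers for building and checking a deliberately broken .gz file.

# make_known_bad SRC DST [BLOCKS] [BLOCKSIZE]
# Copies only the first BLOCKS * BLOCKSIZE bytes of SRC (default: 21 x 1M).
make_known_bad() {
    local src=$1 dst=$2
    local blocks=${3:-21}
    local bs=${4:-1M}
    dd if="$src" of="$dst" bs="$bs" count="$blocks" 2>/dev/null
}

# is_good_gz FILE
# Succeeds only if the file passes gzip's built-in integrity test.
is_good_gz() {
    gzip -t "$1" 2>/dev/null
}
```

After ' make_known_bad some-big-file.gz KNOWNBAD.gz ' (the source name is a placeholder), ' is_good_gz KNOWNBAD.gz ' should fail, which makes the file a handy test case for error handling.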
During the course of writing the script, I had hard-coded most of the defaults, such as the size of files to skip, the log file name, etc. These were eventually changed to be variables before the script was published for LG - so that you, the end-user, can have More Control (TM) over its actions. ;-)
I encourage everyone to READ THE SOURCE CODE before running rezip. You may find it handy to view it in an editor that colorizes or highlights executable syntax, such as ' mcedit ' or ' jstar '.
Comments, feature requests, bug reports, etc., are welcome.
( Don't forget to ' chmod +x rezip ' and put it somewhere in your $PATH - /usr/local/bin is suggested. )
Bio: Born in 1972, Dave Bechtel grew up programming in BASIC with Apple ][e's, TI-99/4A, IBM PC (640K!) and a Tandy 1000SX, none of which actually had hard drives -- 360K floppy only. And we LIKED IT! ;-)
Eventually left BASIC behind, and moved on to programming in REXX and Bash.
Got interested in Linux around 1997. Started with Red Hat and went on to SuSE, tried several other distros and a *BSD or two, and has now settled on Knoppix/Debian/Ubuntu, in roughly that order. Currently living in Lake Zurich, IL.
Likes: Computers, motorcycles, Linux, reading and watching sci-fi (currently Star Trek TOS, Stargate, and Battlestar Galactica)