...making Linux just a little more fun!

May 2008 (#150):


This month's answers created by:

[ Sayantini Ghosh, Amit Kumar Saha, Ben Okopnik, Kapil Hari Paranjape, Karl-Heinz Herrmann, René Pfeiffer, Neil Youngman, Rick Moen, Thomas Adam ]
...and you, our readers!

Our Mailbag

Delete the contents

sunil pradhan [kumar22.sunil at gmail.com]

Fri, 11 Apr 2008 17:56:11 +0530

Hi Sir ,

Can you help me how to delete the contents of the file..


Sunil Pradhan.


[ Thread continues here (3 messages/1.67kB) ]
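
One common way to empty a file without deleting it is to truncate it from the shell. The following is only a minimal sketch, assuming a POSIX shell; the function name "empty_file" and the file name "notes.txt" are made up for illustration.

```shell
# empty_file FILE: truncate FILE to zero length while keeping the file
# itself (and its owner and permissions) in place.
empty_file() {
    : > "$1"
}

# Illustration with a made-up file name in a scratch directory.
demo_dir=$(mktemp -d)
printf 'some old contents\n' > "$demo_dir/notes.txt"
empty_file "$demo_dir/notes.txt"
wc -c < "$demo_dir/notes.txt"    # size in bytes after truncation
```

The redirection alone does the work: the shell opens the file for writing, which truncates it, and the ':' command writes nothing.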

Scalable TCP Tuning

René Pfeiffer [lynx at luchs.at]

Mon, 31 Mar 2008 21:29:06 +0200

On Mar 31, 2008 at 0024 -0700, Erik van Zijst appeared and said:

> René Pfeiffer wrote:
>> [...]
>>  - /proc/sys/net/ipv4/tcp_low_latency controls if the data is forwarded
>>    directly through the TCP stack to the application buffer (=1) or not
>>    (=0). I have never benchmarked or compared this setting, though it's
>>    always on on my laptop (as I noticed just now, I must have fiddled
>>    with sysctl.conf here).
> I'm not sure what that one does exactly, but the problem is not the 
> client-side, as it is fast enough to read the video from the socket. 
> Instead, it's the server-side that saturates the socket, filling up the 
> entire send buffer and thereby increasing the end-to-end time it takes for 
> data to travel from server to client.

I meant to try this on the server. I think it is designed to work on the client side, but I am not sure.

> The way our streaming solution works is by letting the server anticipate 
> congestion (blocking write calls) by reducing the video bitrate in 
> real-time. As a result, the send buffer is usually completely filled. For 
> that same reason, disabling Nagle's algorithm has no effect either: the send 
> buffer always contains more than one MSS of data.

I see.

> This is fine, but as I frequently get buffer underruns on networks with 
> highly fluctuating Bandwidth-Delay-Products, it looks like Linux is happy to 
> increase the send buffer's capacity when beneficial, but less so to decrease 
> it again when circumstances change.

Judging from the measurements I've seen when playing with the congestion algorithms, the Linux kernel seems to be able to decrease the sender window. However, I think the behaviour is really targeted at having a full buffer and a suitable queue all of the time. You could check which one of the algorithms works best for your application and create another kernel module with the desired window behaviour. I make the distinction between buffer and window size since I believe that the congestion algorithms only affect the window handling, not the buffer handling.

>>  - The application keeps its own buffer, but you can also influence the
>>    maximum socket buffers of the TCP stack in the kernel.
>>    http://dsd.lbl.gov/TCP-tuning/linux.html describes the maximum size
>>    of send/receive buffers. You could try reducing this, but maybe you
>>    can't influence both sides of the connection.
> Yes, I've been tempted to manually shrink the send buffer from the 
> application-level, but since the fluctuating bandwidth and delay justify a 
> dynamic buffer size, I'm reluctant to try and hardwire any fixed values in 
> user space.

Yes, I agree, having an algorithm doing that automatically would be more useful.

[ ... ]

[ Thread continues here (7 messages/12.56kB) ]
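
The settings discussed in this thread live under /proc/sys/net/ipv4, so a quick way to see what a given server is currently using is to print them. This is just a sketch: the function name and its optional directory argument are made up here, and which entries exist depends on the kernel version.

```shell
# show_tcp_settings [DIR]: print a few TCP tunables from DIR
# (default /proc/sys/net/ipv4). Entries that are missing on this
# kernel are reported as not available.
show_tcp_settings() {
    dir=${1:-/proc/sys/net/ipv4}
    for name in tcp_low_latency tcp_wmem tcp_rmem \
                tcp_congestion_control tcp_available_congestion_control
    do
        if [ -r "$dir/$name" ]; then
            printf '%s = %s\n' "$name" "$(cat "$dir/$name")"
        else
            printf '%s = (not available)\n' "$name"
        fi
    done
}

show_tcp_settings
```

tcp_wmem holds three numbers (minimum, default, and maximum send buffer size in bytes), which is the range the kernel works within when it resizes the send buffer. Changing any of these values requires root.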

Sendmail and capacity

Dennis Veatch [dennisveatch at bellsouth.net]

Fri, 25 Apr 2008 10:34:49 -0400

Hi guys and gals.

I have what I thought would be a simple question. How do you figure out how many emails sendmail can process and not drive the load average over say 2 or 3? After much googling around and trying to glean information from the sendmail FAQs, etc I am still stumped. I know it depends on hardware configuration, the number of mailboxes, how many emails are sent and received for a given time frame, etc. But I can't even find a general rule of thumb to even get a ball park idea. Can ya help me out?

Perhaps I am approaching this from the wrong perspective, as I realize the above statements are most likely way too general to give even a ball park answer, though if you could that would be great.

You can tuna piano but you can't tune a fish.

[ Thread continues here (7 messages/14.66kB) ]

Netscape to OpenLDAP Migration

top mungkala [pakin8 at gmail.com]

Tue, 22 Apr 2008 11:56:36 +0700

I'm beginning a project to upgrade our mail hosting. In the process, I have to migrate data from an old Netscape LDAP server to an OpenLDAP server. I'm a newbie at UNIX shell scripting, and my task is the mail address book migration. I have only one text file, which has data like this:

dn: cn=ldap://:389,dc=yomo,dc=aaa,dc=bbb,dc=ccc
cn: ldap://:389
objectclass: top
objectclass: applicationprocess
objectclass: ldapserver
generation: 020000318055502
aci: (targetattr = "*")(version 3.0; acl "Configuration Adminstrators Group";
 allow (all) groupdn = "ldap:///cn=Configuration Administrators, ou=Groups, o
 u=TopologyManagement, o=NetscapeRoot";)
aci: (targetattr = "*")(version 3.0; acl "Configuration Adminstrator"; allow (
 all) userdn = "ldap:///uid=admin,ou=Administrators, ou=TopologyManagement, o
aci: (targetattr = "*")(version 3.0; acl "Local Directory Adminstrators Group"
 ; allow (all) groupdn = "ldap:///ou=Directory Administrators, o=arc.net.my";
aci: (targetattr = "*")(version 3.0; acl "XXX Group"; allow (all)groupdn = "ld
 ap:///cn=slapd-yomo, cn=Netscape Directory Server, cn=Server Group, cn=yom
 o.aaa.bbb.ccc, ou=aaa.bbb.ccc, o=NetscapeRoot";)
modifiersname: uid=admin,ou=Administrators,ou=TopologyManagement,o=NetscapeRoo
modifytimestamp: 20000318055506Z

dn: un=RMohana4bdbd8,ou=sharonscy,ou=People,o=aaa.bbb.ccc,o=aaa.bbb.ccc,o=pab
un: RMohana4bdbd8
objectclass: top
objectclass: person
objectclass: organizationalPerson
objectclass: inetOrgPerson
objectclass: pabperson
memberofpab: AddressBook3e0c2d8
mail: rmohan@aaa.bbb.ccc
givenname: R.
sn: Mohan
cn: R. Mohan
creatorsname: uid=msg-admin-1,ou=People,o=aaa.bbb.ccc,o=aaa.bbb.ccc
modifiersname: uid=msg-admin-1,ou=People,o=aaa.bbb.ccc,o=aaa.bbb.ccc
createtimestamp: 20050622142039Z
modifytimestamp: 20050622142039Z

After reviewing the file, I found that every entry containing "objectclass: pabperson" is an e-mail address book entry, so first I want to detect "objectclass: pabperson" and cut out the whole entry it belongs to. Entries are separated from each other by an empty line. Could you please give me any pointers on how to do this successfully with a shell script?

Thank You,


[ Thread continues here (4 messages/5.54kB) ]
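
Since the entries are separated by empty lines, awk's "paragraph mode" is one way to pick out whole entries. What follows is a sketch rather than a tested migration tool: the function name "extract_pab" and the file names in the usage note are made up, and it assumes that "objectclass: pabperson" is not itself wrapped onto a continuation line.

```shell
# extract_pab FILE: print only the LDIF entries that contain
# "objectclass: pabperson". With RS="" awk treats each block of lines
# separated by empty lines as one record, so a match prints the whole
# entry, continuation lines included.
extract_pab() {
    awk 'BEGIN { RS = ""; ORS = "\n\n" }
         tolower($0) ~ /objectclass: pabperson/' "$1"
}
```

It could then be used as "extract_pab export.ldif > addressbook.ldif". The tolower() call is there because attribute names in LDIF may appear in mixed case.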

WEP: a 1-minute wonder

Ben Okopnik [ben at linuxgazette.net]

Mon, 14 Apr 2008 14:44:47 -0400

WEP, since pretty much its very beginning, was acknowledged as a stop-gap protocol. Seems that the gap has been bridged:


  Abstract: We demonstrate an active attack on the WEP protocol that is
  able to recover a 104-bit WEP key using less than 40.000 frames in 50%
  of all cases. The IV of these packets can be randomly chosen. This is an
  improvement in the number of required frames by more than an order of
  magnitude over the best known key-recovery attacks for WEP. On a IEEE
  802.11g network, the number of frames required can be obtained by
  re-injection in less than a minute. The required computational effort is
  approximately 2^{20} RC4 key setups, which on current desktop and laptop
  CPUs is neglegible.
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *

[ Thread continues here (9 messages/14.50kB) ]

how to create hard link

Ben Okopnik [ben at linuxgazette.net]

Sun, 6 Apr 2008 23:34:38 -0400

----- Forwarded message from kailas <kailas1711@rediffmail.com> -----

Date: 5 Apr 2008 08:41:39 -0000
From: kailas  <kailas1711@rediffmail.com>
Reply-To: kailas  <kailas1711@rediffmail.com>
To: editor@linuxgazette.net
Subject: how to create hard link
Respected sir, I require information about how to create a hard link in Linux, with suitable examples. So please provide this information.

----- End forwarded message -----

* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *

[ Thread continues here (2 messages/1.61kB) ]
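
A hard link is made with "ln" without the "-s" option. The sketch below walks through it in a scratch directory; the function name and file names are made up for the demonstration.

```shell
# hardlink_demo DIR: create a file in DIR, give it a second name with a
# hard link, then show that the data survives removal of the first name.
hardlink_demo() {
    dir=$1
    printf 'hello\n' > "$dir/original.txt"

    # ln TARGET LINKNAME (no -s): a second name for the same inode.
    ln "$dir/original.txt" "$dir/hardlink.txt"

    # Both names list the same inode number and a link count of 2.
    ls -li "$dir/original.txt" "$dir/hardlink.txt"

    # Removing one name leaves the data reachable through the other.
    rm "$dir/original.txt"
    cat "$dir/hardlink.txt"
}

hardlink_demo "$(mktemp -d)"
```

Hard links only work within a single filesystem and, for ordinary users, cannot point at directories; for those cases a symbolic link ("ln -s") is the usual alternative.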

NT description

Petr Vavrinec [Petr.Vavrinec at seznam.cz]

Tue, 08 Apr 2008 08:10:54 +0200 (CEST)

Almighty TAG,

How can I find out the NetBIOS name and "NT description" (or "server string" in Samba terminology) of a Windows box, knowing its IP address?

"nmblookup -A <win_ip_address>" gives me the NetBIOS name. That's OK.

But the "server string" - I'm not able to find it anywhere :-( Can you help me? Any info is greatly appreciated.

TIA, Petr

 Petr Vavrinec                       E-Mail: petr.vavrinec@seznam.cz
 Vysice 8, 388 01 Blatna, CZECHIA    Voice :          +420 383490147

[ Thread continues here (4 messages/3.94kB) ]

Shell scripting Help

Amit Kumar Saha [amitsaha.in at gmail.com]

Wed, 16 Apr 2008 16:33:49 +0530

Hello all,

I have got a shell variable (passed as an argument) which stores some value such as:

$4 = 'abc,def'

Now, I want to replace all the ',' by a ' ' such that the resulting value is 'abc def'.

How do I do it?

This seems to be very basic, so PLEASE do not flame me :-)

I tried doing this:

echo $4 > devlist.tmp
#awk script to extract the fields (individual devices) in the list
awk 'BEGIN { FS = "," } ; {print $1, $2 }' devlist.tmp > awk_tmp.tmp
devs='cat awk_tmp.tmp';
echo $devs

Seems like I am going nowhere.

Do suggest a solution!

regards, Amit

Amit Kumar Saha
*NetBeans Community
Docs Coordinator*

[ Thread continues here (15 messages/17.41kB) ]
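
No temporary files are needed for this; "tr" can do the replacement in a pipeline. A minimal sketch, with a made-up function name and the sample value from the question standing in for "$4":

```shell
# commas_to_spaces STRING: print STRING with every ',' replaced by a space.
commas_to_spaces() {
    printf '%s\n' "$1" | tr ',' ' '
}

# Inside the script this would read: devs=$(commas_to_spaces "$4")
devs=$(commas_to_spaces 'abc,def')
echo "$devs"    # prints: abc def
```

In bash specifically, the same result is available without an external command as devs=${4//,/ }.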

sunversion on linux

Ben Okopnik [ben at linuxgazette.net]

Mon, 14 Apr 2008 13:12:20 -0400

----- Forwarded message from deepali wadekar <wadekardeepali@gmail.com> -----

Date: Mon, 14 Apr 2008 11:20:21 +0530
From: deepali wadekar <wadekardeepali@gmail.com>
To: editor@linuxgazette.net
Subject: sunversion on linux
   I am Deepali Wadekar. I am working as a Linux admin and am a fresher.
   I have read some PDFs about Subversion.
   I installed Subversion on Linux (RHEL 4),
   but how do I access it?
   So please, can you guide me
   on how to use Subversion?
   With Regards
   Miss Deepali Wadekar
   Mobile: 09225775467

----- End forwarded message -----

* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *

[ Thread continues here (2 messages/2.20kB) ]

Talkback: Discuss this article with The Answer Gang

Published in Issue 150 of Linux Gazette, May 2008



[ In reference to "Laptop review: Averatec 5400 series" in LG#108 ]

Ben Okopnik [ben at linuxgazette.net]

Wed, 16 Apr 2008 20:17:53 -0400

----- Forwarded message from Edward Blaize <edwardblaize@gmail.com> -----

Date: Wed, 16 Apr 2008 17:48:06 -0500
From: Edward Blaize <edwardblaize@gmail.com>
To: editor@linuxgazette.net
Subject: 5400 series averatec & linux.
Hello, this is directed towards Ben. I just read his review of how the 5400 series worked with Linux; honestly, most of it was over my head, and I didn't really know what he was talking about with all the technical stuff. Having said that, I own an Averatec 5400 series laptop, have had it for 3.5 years, and I love it. I am interested in starting to use Linux and have tried several distros, but can't get them to work. Which one would he recommend to a non-technophile like me, who just hates Windows and is willing to learn to program if I have to, but doesn't really have the time? I used to program in machine language, BASIC, and Fortran 77, but was a beginner, and I forgot those things a lifetime ago. I'm looking for a distro that I can install, that is relatively easy to use, and that will function well on this machine. I want to set up a dual-boot system with Windows/Linux. Any help would be greatly appreciated. Thank you.

----- End forwarded message -----

* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *

[ Thread continues here (9 messages/13.61kB) ]


[ In reference to "TCP and Linux' Pluggable Congestion Control Algorithms" in LG#135 ]

René Pfeiffer [lynx at luchs.at]

Sun, 30 Mar 2008 23:37:14 +0200

I forward this request since I asked for

----- Forwarded message from René Pfeiffer <lynx@luchs.at> -----

From: René Pfeiffer <lynx@luchs.at>
Date: Sun, 30 Mar 2008 20:26:00 +0200
To: Erik van Zijst <erik.van.zijst@layerstream.com>
Subject: Re: Scalable TCP Tuning
Message-ID: <20080330182600.GC4927@nephtys.luchs.at>
In-Reply-To: <47EF3432.9090307@layerstream.com>

Hello, Erik!

I'll answer you in private, but please let me know if I can also send this answer to The Answer Gang mailing list. We like to keep all feedback there, so our readers can find it.

On Mar 29, 2008 at 2333 -0700, Erik van Zijst appeared and said:

> Hi Rene,
> I read some of your Linux Gazette articles, specifically the one on TCP's 
> pluggable congestion control and maybe you can give me a little push in the 
> right direction.
> I'm in a startup doing streaming video over TCP (rather than UDP with 
> forward error correction), but TCP is giving me some latency headaches. To 
> ensure uninterrupted playback, we use a chunky client-side playback buffer, 
> but in its relentless quest for throughput-optimization, often under high 
> bandwidth and transcontinental RTT's, TCP manages to introduce enough 
> latency to underrun any buffer.
> With streaming video, keeping latency within bounds is often more important 
> than squeezing out a few percent extra throughput. I've looked at the 
> pluggable congestion control algorithms which are great, but pretty much all 
> of them focus on high throughput, rather than latency.

Yes, most of the algorithms deal with increasing throughput on fat pipes and high latency. Only the algorithms TCP Veno and Westwood deal with other scenarios (frequent packet loss on wireless links). Apart from that, Interactive TCP (iTCP) might be interesting, but it isn't available as a module (yet). http://www.medianet.kent.edu/itcp/main.html

> TCP maintains the dynamic send buffer between user- and kernel-space and in 
> order to minimize context switches, Linux seems to have a tendency to making 
> these really large. On high-latency, transcontinental connections, I often 
> get 1MB+ send buffers that can easily contain over 10 seconds of video. From 
> what I see, the kernel modules mostly seem to tune only the size of the cwnd 
> within the send buffer, rather than the send buffer as a whole, but since 
> this is probably the main cause for increased latency, I'm looking for a way 
> to tune this and always keep it as small as possible. Linux already seems to 
> increase the send buffer's capacity when the cwnd increases, but never seems 
> to shrink it again.
> Would you have any tips for a Linux-based startup when it comes to 
> low-latency TCP tuning?

The only thing I noticed are the following settings in /proc.

[ ... ]

[ Thread continues here (1 message/4.14kB) ]


[ In reference to "Measuring TCP Congestion Windows" in LG#136 ]

Thu, 10 Apr 2008 10:35:37 -0700


Read your article in the Gazette.

I was curious to know if there is a way to read TCP ACKs programmatically. If so, how?

Would appreciate any feedback you can provide.




[ In reference to "2-Cent Tips" in LG#149 ]

Mulyadi Santosa [mulyadi.santosa at gmail.com]

Wed, 16 Apr 2008 08:58:00 +0700

Hi Rolland... :)

On Wed, Apr 16, 2008 at 2:46 AM, Rolland Sovarszki <rollandsov@gmail.com> wrote:

> Hi there
> I am new to Linux, but with a big desire to become better. I have recently
> read your article "2-cent tips: convert the collection of your mp3 files
> into ogg"
> in Linux Gazette. I have found it very interesting, and I wanted to try it
> out for myself. Everything went fine, except the script for replacing the "
> " (blanks) with "_".
>  After a little digging, and study of sed command, I have found out that the
> problem came from this line of code:
> do mv -v "$a" $(echo $a | sed s/\/\_/g);
> So I have replaced it with:
>  do mv -v "$a" $(echo $a | sed s/" "/\_/g);

Thanks a ton! :) If you check the discussion thread in TAG following my post, there are various criticisms of this tip, both improvements and corrections. Therefore, I highly appreciate your feedback and am CC'ing your e-mail to TAG, so everybody can benefit.



[ Thread continues here (4 messages/4.92kB) ]
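
For comparison, here is a variant of the renaming loop that quotes every expansion, so names with several blanks (or other odd characters) are passed to "mv" intact. It is only a sketch; the function name is made up, and it uses "tr" in place of the "sed" expression from the tip.

```shell
# underscore_names DIR: rename every file in DIR whose name contains a
# space, replacing each space with an underscore.
underscore_names() {
    for a in "$1"/*' '*
    do
        [ -e "$a" ] || continue    # the pattern matched nothing
        dir=${a%/*}                # directory part
        base=${a##*/}              # file name part
        new=$(printf '%s' "$base" | tr ' ' '_')
        mv -- "$a" "$dir/$new"
    done
}
```

Called as underscore_names ~/music, for example, it only touches the names that actually contain a blank. Adding "-v" to "mv", as in the original tip, shows each rename as it happens.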


[ In reference to "VPN Networking" in LG#149 ]

Tue, 15 Apr 2008 09:09:34 -0400

This article may give me a start, but I am looking for something a little closer to my requirements. I need to emulate a network of embedded devices, preferably by creating a pseudo-router with something that acts like those devices behind it. Every article I have seen suggests putting the target devices into virtual machines with a router VM in front. But the embedded devices I need to emulate are neither PCs nor OS based. We buy the primary device from another company, and it comes with a Rabbit 2000 and Ethernet interface. That company provides us with firmware to our specifications, which we simply write into flash memory with utilities they also provide. I need to emulate a network of 2000 of these devices to do a decent load test on our servers. This emulation needs to duplicate the behavior of the TCP/IP stack as well as initiate socket connections, data requests and transactions just as the target devices do. Some of this behavior will have to be characterized by analyzing Wireshark traces and some experimentation.

Any suggestions?

Thank you,

Bob McConnell

Principal Communications Programmer
The CBORD Group, Inc.
61 Brown Road
Ithaca NY, 14850
Phone 607 257-2410
FAX 607 257-1902
Email rvm@cbord.com
Web www.cbord.com

[ Thread continues here (2 messages/3.03kB) ]


[ In reference to "Plotting the spirograph equations with 'gnuplot'" in LG#133 ]

Ben Okopnik [ben at linuxgazette.net]

Fri, 11 Apr 2008 14:35:01 -0400

----- Forwarded message from arnoldmashava@gmail.com -----

Date: Thu, 10 Apr 2008 23:03:36 +0200
From: Arnold Mashava <arnoldmashava@gmail.com>
To: tag@lists.linuxgazette.net
Subject: Talkback:133/luana.html
Cheers for the gnuplot article. I am an openSUSE Linux user and still doing my MSc.Eng at the University of Natal, Durban, South Africa. What's the Open Source equivalent of MathType and Office 2007? OpenOffice.org still has a long way to go until it catches up with MS Office 2007.


----- End forwarded message -----

* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *

[ Thread continues here (3 messages/2.35kB) ]


[ In reference to "Migrating a Mail Server to Postfix/Cyrus/OpenLDAP" in LG#124 ]

René Pfeiffer [lynx at luchs.at]

Sun, 30 Mar 2008 20:01:01 +0200

Hello, Pawel!

On Mar 11, 2008 at 1426 +0000, Pawel Eljasz appeared and said:

> Hi Rene.
> My name is Pawel and firstly I'd like to thank you for your contribution
> into open source, for sharing knowledge.

My pleasure, thanks.

> I've read your article on the topic in the subject of this email. I have to
> mention that I don't know much about LDAP and Postfix, but
> I do my best to change that state of affairs. I wonder if you could
> briefly advise and put me on the right line of thought.
> This is a problem I've come across:
> when I run your script:
> ../cgate_migrate_ldap.pl --source ubuntu --target localhost --verbose 4
> Connected to source server :   ubuntu
> Connected to target server :   localhost
> account has been moved to a remote system at ./cgate_migrate_ldap.pl
> line 174, <DATA> line 228.

I just saw that there is a password hardcoded into the script. I wanted to avoid that but fortunately the old server got decommissioned after the migration. :) You have to use the matching password for access to the LDAP tree on the source server. Usually you use a privileged password on the source LDAP server for that since you want to read everything from the source tree.

> I'm not sure if I understand it correctly when in the article you say:
> "...
> We use the LDAP tree of our organisation dc=example,dc=net and create a
> subtree for all our accounts.
> Then we create another subtree for the Postfix settings... "
> Do you mean here the actual creation of the structure you mention just
> after, with the help of
> LDAP client tools like ldapadd and ldapmodify? And do I get the above error
> because I did not do it?

No, I think you got the error simply because of the hardcoded password in the script (which is hopefully wrong to access your server).

Of course, apart from that, you can prepare any structures in your LDAP tree with the client tools if you like. However, the cgate_migrate_ldap.pl script will overwrite existing structures, so be careful.

> In case that is not the reason for the error I get, here is some info on
> systems I work on:
> cgpro 4.1.8 on ubuntu 2.6.20-16-server and target machine is fedora 8
> with openLdap 2.3.39

Should work fine, we also migrated from a 4.1.x CommuniGate Pro server.

> Sorry if I got so straight to the point, I hope you don't find it impolite.

No, I like direct questions so I can give direct answers. :)

Best regards, René.

  )\._.,--....,'``.      Let GNU/Linux work for you while you take a nap.
 /,   _.. \   _\  (`._ ,. R. Pfeiffer <lynx at luchs.at> + http://web.luchs.at/
`._.-(,_..'--(,_..'`-.;.'  - System administration + Consulting + Teaching -
Got mail delivery problems?  http://web.luchs.at/information/blockedmail.php


[ In reference to "Automatic creation of an Impress presentation from a series of images" in LG#116 ]

Karl-Heinz Herrmann [kh1 at khherrmann.de]

Sun, 30 Mar 2008 21:34:43 +0200


since this is a question directly related to the linux-gazette I'm CC'ing the mailinglist there (TAG).

On Mon, 24 Mar 2008 14:44:43 -0400 "Jim Shupert, Jr." <jshupert@pps-inc.com> wrote:

> http://linuxgazette.net/116/herrmann.html
> I have the complete perl script.

be aware that recent versions of ::OODoc broke the script. A tiny correction will make it work again, see: http://linuxgazette.net/132/lg_talkback.html#talkback.02 http://www.khherrmann.de/Programs.shtml http://www.khherrmann.de/Code/img2ooImpress.pl

> 1 matter eludes me - i do not understand -
> Where is the info to tell the program where to find a dir of images.

the inbuilt mini-help says:

  img2ooImpress.pl ImageFileList [outputfile.odp]
  ImageFileList is a file containing all images to import

and yes, the script expects a filename containing all images, each on its own line. You can easily create a file like that by following the examples in the original linux-gazette article.

essentially, after making sure your files are sorted properly by ls something like:

ls image*.jpg > filelist

will write all jpg-images starting with "image" into the file "filelist".

running img2ooImpress.pl filelist

will then create the output odp file (default odp name, if you pass a second parameter it will be used as output file).

> i see this
> my $imgFileList=shift;
> how is this imagefile list made? I am expecting

no -- it simply fetches the first commandline argument into $imgFileList -- and subsequently opens that file (or fails).

> to see a path to a dir of images.... I have just reread your code
> before sending this and think I might understand now. The usage:
> img2ooImpress.pl ImageFileList [outputfile.swi]
> The perl prog, then a txtfile [ and this is optional? ] And then should
> the imageFileList be in the same dir with the perl?

No -- you can keep the perl script in ~/bin (and add ~/bin to your PATH). Also the given file-name for the file-list can contain a full path -- so there is no need to have it local in the same directory as the perl script OR the images.

[ ... ]

[ Thread continues here (1 message/3.55kB) ]


2-Cent Tips

2-cent tip: Convert a bunch of Images from one format to another

Amit Kumar Saha [amitsaha.in at gmail.com]

Tue, 22 Apr 2008 11:21:30 +0530

convert is a command line tool available with ImageMagick which can be used to convert between image formats. This shell script uses it to convert a bunch of images from '.ps' to '.png'. Follow embedded instructions to customize it to your needs.

#!/bin/bash
#Script begins here
# Uses the 'convert' utility to convert a bunch of images from one
#format to another
# the script should reside in the same directory as the images
# You need to have 'ImageMagick' installed
# By Amit K.Saha
# Help taken from http://howtoforge.com/forums/showthread.php?t=4493
#to convert to/from different formats
# change 'ext' to reflect the format you want to convert "from"
# Change target to the format you want to convert to
ext=".ps"
target="png"
for i in *"$ext"
do
    base=$(basename "$i" "$ext") #to extract only the filename and NOT the extension
    convert "$i" "$base.$target"
done
#Script ends here

Try it and suggest improvements!

Thanks, Amit

Amit Kumar Saha
*NetBeans Community
Docs Coordinator*

[ Thread continues here (17 messages/35.76kB) ]

2-cent tip: Linux Magic System Request Keys

René Pfeiffer [lynx at luchs.at]

Fri, 11 Apr 2008 22:08:26 +0200


Some of you might know the Linux Magic System Request Keys (see the file sysrq.txt in the kernel documentation). These keys allow you to send some commands directly to the Linux kernel. This is dangerous, but it can be very useful, since you can sync all buffers to disk, see all processes, dump diagnostic information to the console, and immediately reboot the system. You can enter the commands on x86 systems with the keys ALT-SysRq-<command key>.

Well, what happens if your server is co-located and you don't have any keyboard at hand? You can still use the magic keys by means of writing to /proc/sysrq-trigger.

"echo s > /proc/sysrq-trigger" "presses" the magic key 's' that syncs all buffers. Likewise "echo b > /proc/sysrq-trigger" reboots the system immediately (i.e. without notifying applications and without writing data to disk!). The latter was very useful when recovering a co-located server from a crash of the kernel crypt daemon which blocked access to the filesystem.

Best, René.
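
The two commands above can be wrapped in a small helper that refuses anything other than a single command key. This is a sketch under assumptions: the function name "sysrq" and the SYSRQ_TRIGGER override are made up (the override exists only so the helper can be tried against an ordinary file), and writing to the real trigger requires root.

```shell
# sysrq KEY: send one Magic SysRq command key to the kernel.
# By default this writes to /proc/sysrq-trigger; setting the (made-up)
# SYSRQ_TRIGGER variable redirects the write to another file for testing.
sysrq() {
    trigger=${SYSRQ_TRIGGER:-/proc/sysrq-trigger}
    case "$1" in
        ?) printf '%s\n' "$1" > "$trigger" ;;
        *) echo "usage: sysrq <single command key>" >&2
           return 1 ;;
    esac
}
```

On a machine that has to be rebooted the hard way, a gentler sequence than a bare 'b' is "sysrq s" (sync), "sysrq u" (remount filesystems read-only), and only then "sysrq b".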

2-cent tips: KeyTouch, .sig generators, Corrupted ISOs, etc.

Jonathan Clark [clarjon1 at gmail.com]

Wed, 2 Apr 2008 21:58:01 -0400

Hello, all. On my laptop, I've got those lovely little function keys which, by default, do absolutely NOTHING in Linux. However, all that can change!

I've found with my current distro, PCLinuxOS, a handy little utility called KeyTouch. This program allows you to map your keys to what they are supposed to do. You can easily map your keys like this, and, from what I understand, this doesn't work just with laptops, but also with multimedia keyboards, Microsoft keyboards, and so forth. Very useful.

Now, on my laptop, I've got one of those buttons which, I suspect, is supposed to change the power usage profile in Windows. However, I've found a more useful job for it to do. In keytouch-editor, I have it set so that when that key combo is pressed, it will run:

kdialog --passivepopup "`acpi`"
which pops up a little 'speech bubble' to show me the current battery status, as well as time remaining for discharge or recharging the battery. Very useful, in my opinion.

What else do I have? Oh, yes. In Compiz-Fusion, there seems to be a bug with the alt-tab application switcher, which will occasionally kill off the Emerald window manager. I thought it was a bug with the keyboard shortcut, until I remapped it to win-tab and it would still kill it. So, if this is annoying you, disable the "Application Switcher" module, enable something like "Ring Switcher", and map it to Alt-Tab. Just like that, the decoration issue is solved.

Ever wondered how people get those static-yet-dynamic sig-generators? It's easy to do, with a simple bash script. First, create a file, let's call it sig. Put in the information that you are wanting to stay static. Make sure that there is an empty line at the end of the file. Now, you probably want to use something like fortune to get the changing part of the sig, right? Here's a shell script I use, called sig.sh:

cat ~/sig
fortune
Set this little script to be executable, then go to the mail client of your choice, tell it to get the sig from a program, and tell it to use your sig.sh.

And, finally, corrupted ISO downloads. A friend taught me how to fix these: If your ISO is corrupted, it seems like all you have to do to fix it is to create/get a torrent of it, stick the corrupted download in the directory where the ISO will be downloaded to by default, and let the torrent system correct it for you!

Hope that these may help some users, will be giving more when I can think of some more!

Proud Linux User.
PCLinuxOS on Dell Inspiron 1501,1 Gig ram,80 Gig Hard drive
1.7gHz AMD Athlon 64bit dualcore processor
Gmail/gtalk:  clarjon1
irc: #pclinuxos,#pclinuxos-support,##linux on freenode
Baruch's Observation:

[ Thread continues here (5 messages/12.39kB) ]

2-cent tip: Snapedit

Ben Okopnik [ben at linuxgazette.net]

Sat, 19 Apr 2008 14:00:33 -0400

When I see a post in The Answer Gang and want to try out some submitted code, I often want to run it to see what it does - but the procedure to do this (open another xterm, fire up 'vi', put it into "append" mode, paste the code, etc.) is a pain. So, I've created a script that helps me do all of this conveniently.

#!/bin/bash
# Created by Ben Okopnik on Sun Apr 13 11:22:45 EDT 2008
# Requires 'dialog' and 'xclip'
cd /tmp
label="New filename:"
while :
do
	fname=`/usr/bin/dialog --stdout --inputbox "$label" 7 40`
	# WEIRD: '-f' doesn't expand/handle '~'! We'll borrow the shell's brain.
	fname=`eval /bin/echo $fname`
	if [ -f "$fname" ]
	then
		label="\"$fname\" already exists. New name:"
	else
		[ "$fname" = "" ] && exit
		/usr/bin/xclip -o > "$fname"
		/bin/vi "$fname"
		exit
	fi
done

I also created an icon on my Window Manager toolbar that executes "/usr/bin/xterm -e /usr/local/bin/snapedit". Now, whenever I highlight any kind of text, I can click on the icon, enter a filename (optionally including a path - otherwise, the file will be created in "/tmp") in the dialog box, and get a 'vi' session with the selected content already in the file. Ever since I created this thing, I've found more and more uses for it. Give it a try, and you will too!

* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *

[ Thread continues here (7 messages/10.11kB) ]

2-cent tips: Using git

Jimmy O'Regan [joregan at gmail.com]

Tue, 8 Apr 2008 01:04:34 +0100

Using git to recreate an old CVS layout
I had to go hunting for old data in a CVS repository today, and wanted to hang on to the data in one place. git-cvsimport does a nice job of maintaining the CVS history of each module, but because of the way the repository was laid out, each module had to be grabbed separately. To join them back together, I created a new git repository, and used 'git-submodule add' to drag in each of the converted modules.

It's not as convenient as being able to look at the whole project history at once, but at least it's a lot easier to pass to the next poor soul who finds themselves in my situation.

Git as a better SVN
I've been using git-svn for decentralised SVN access for a few months - sometimes I don't have access to an internet connection, sometimes I don't want to commit something until it's finished, sometimes I want to be able to work on a branch without having to redo everything when I want to merge - but today I found another use: to revert a whole commit, instead of individual files.

There's probably a proper way to do it, but being able to open gitk, right click on the commit and save it as a file, chop off the mail headers and use patch -R -p1 was a lot quicker than the subversion way. (Though SVN is probably better if you just want to revert one file from a commit).

[ Thread continues here (4 messages/4.51kB) ]
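
For undoing a whole commit, git also has a built-in command, "git revert", which records a new commit containing the inverse change. The wrapper below is only a sketch with a made-up function name; it assumes a git recent enough to understand the "-C" option.

```shell
# revert_commit REPO COMMIT: undo COMMIT in REPO by adding a new commit
# that applies the inverse change. History is not rewritten.
revert_commit() {
    git -C "$1" revert --no-edit "$2"
}
```

For example, revert_commit . HEAD undoes the most recent commit in the current repository. In a git-svn checkout, the resulting commit would still have to be sent to the Subversion server afterwards with "git svn dcommit".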



By Kat Tanaka Okopnik, Howard Dyckoff, and Deividson Luiz Okopnik

News Bytes


Please submit your News Bytes items in plain text; other formats may be rejected without reading. A one- or two-paragraph summary plus a URL has a much higher chance of being published than an entire press release. Submit items to bytes@linuxgazette.net.

News in General

MySQL announces 5.1 GA, future 6.0 dual version

MySQL 5.1 was at version 5.1.24 before the GA announcement - in part because the company and community wanted to release a better and more stable product than the 5.0 release 2 years earlier. A near-final release candidate of the GPL software is available for download now at http://dev.mysql.com/downloads/.

MySQL 5.1 supports five forms of horizontal data partitioning: range, hash, key, list, and composite (sub-partitioning). By partitioning table and index data, faster query response times can be achieved as only the relevant partitions of data need to be scanned, instead of the entire table or index. It also features a new Event Scheduler for common recurring SQL-based tasks to execute on the database server instead of using external cron jobs.

Discussions at Slashdot and elsewhere speculated that MySQL was moving in a proprietary direction, based on mention of 'commercial extensions' planned for 6.0 that would only be available to subscribers to the enterprise edition of MySQL 6; however, Marten Mickos, formerly the CEO and spokesperson for the recently acquired MySQL Corporation, now SVP for Database Group at Sun Microsystems, clarified the thinking at MySQL in an email to Linux Gazette:

First and foremost, version 6.0 will be a great and fully functional free and open source software product, available for anyone to download, use, modify and redistribute. It is always our goal to expand the (non-paying) user base of MySQL. There exist today a number of ways to take backups of MySQL databases, but in 6.0 we will have a built-in and more advanced functionality for it. We are ourselves developing such add-ons, and we plan to deliver them to subscription customers only. Examples of such add-ons are encryption and compression of the backup. Users of MySQL 6.0 can manage very well without those add-ons. They can also build add-ons themselves, commission others to build add-ons, or buy add-ons from MySQL's partners and probably also competitors. And customers can buy the subscription from MySQL/Sun and thereby gain access to the add-ons we are producing. Also, let me note that this model is part of our MySQL Enterprise subscription offering. Since 3 years we have a similar model for the MySQL Monitor, which is an add-on tool that we ship to subscription customers only. The decision on this was made long before we were acquired by Sun (so it is entirely incorrect to think that Sun is behind this).

The list of new features in MySQL 6.0: http://dev.mysql.com/doc/refman/6.0/en/mysql-nutshell.html

For downloads and more information on MySQL 5.1, please visit http://dev.mysql.com/downloads/.


Interop Las Vegas - 2008
April 27 - May 2, Mandalay Bay, Las Vegas, NV

JavaOne 2008
May 6 - 9, San Francisco, CA

IIW2008 Internet Identity Workshop
May 12 - 14, Mountain View, CA http://iiw.idcommons.net/index.php/Iiw2008a

ChicagoCon 2008 Spring (ChiCon 2008s) - Ethical Hacking Conference
May 12 - 18, MicroTrain Center, Lombard, IL

May 13 - 15, Chicago, IL
Free Expo Pass Code: EM1

Forrester's IT Forum 2008
May 20 - 23, The Venetian, Las Vegas, NV

Data Governance 2008
June 2 - 5, Hotel Kabuki, San Francisco, CA

DC PHP Conference & Expo 2008
June 2 - 4, George Washington University, Washington, DC

Gartner IT Security Summit
June 2 - 4, Washington DC

Symantec Vision 2008
June 9 - 12, The Venetian, Las Vegas, NV

Red Hat Summit 2008
June 18 - 20, Hynes Convention Center, Boston, MA

Gilbane Content Management Conference SF
June 18 - 28, San Francisco, CA

The 2008 USENIX Annual Technical Conference (USENIX '08)
June 22 - 27, Boston, MA
Register by June 6 and save up to $300!

Dr. Dobb's Architecture & Design World 2008
July 21 - 24, Hyatt Regency, Chicago, IL

The 17th USENIX Security Symposium
July 28 - August 1, San Jose, CA
Join top security researchers and practitioners in San Jose, CA, for a 5-day program that includes in-depth tutorials by experts such as Simson Garfinkel, Bruce Potter, and Radu Sion; a comprehensive technical program including a keynote address by Debra Bowen, California Secretary of State; invited talks including "Hackernomics," by Hugh Thompson; the refereed papers track including 27 papers presenting the best new research; Work-in-Progress reports; and a poster session. Learn the latest in security research including voting and trusted systems, privacy, botnet detection, and more.
USENIX Security '08
Register by July 14 and save up to $250!

Linuxworld Conference
August 4 - 7, San Francisco, CA

Distro News

openSUSE 11.0 Beta 1 Announced

The openSUSE team announced the first Beta release of openSUSE 11.0. According to banner ads associated with the release, the target for final release is near the end of May.

openSUSE 11.0 beta 1 includes changes and new features including:

To help test KDE and GNOME in openSUSE 11.0, see the wiki for info on reporting bugs.

Ubuntu 8.04 (Hardy Heron) LTS Desktop and Server Editions Released

As scheduled, Ubuntu 8.04 (code named "Hardy Heron") was released on April 24. This is a very special and long-awaited release; Hardy Heron is the newest LTS (Long Term Support) version of the distribution. It comes in two versions - Desktop, supported until 2011, and Server, supported until 2013. That means this distribution will have updates and security fixes for quite a long time, a great feature for both home and business use.

With this release, Canonical tries to focus even more on the home and office uses of GNU/Linux, bundling together several of the best open source alternatives to the most common commercial software.

This release includes several updates, including Firefox 3 (beta 5), clock and calendar integration, better and easier multimedia capabilities, YouTube/MythTV integration in the default movie player, and automatic recognition of cameras, phones, PSPs, and other devices - all to make things easy for the less-experienced Linux user, but without losing any of the security options that Linux has always had.

Final Considerations: Loads of updates and some new features, combined with all the features Ubuntu already had (like Printing, Windows Compatibility, Automatic Updates) and the Long Term Support, make this release a very special one, well worth checking out.

Home Page: http://www.ubuntu.com/
Downloads: http://www.ubuntu.com/getubuntu/download
Free CD: http://shipit.ubuntu.com/ (even the worldwide shipping is free!).


Open Office 2.4 released, 3.0 on the way

OpenOffice 2.4 has finally been released. It includes new features, enhancements, and bug fixes to all its core components. OpenOffice 2.4 is available for immediate download from http://download.openoffice.org .

New features:

The next major release - 3.0 - is planned for the autumn of 2008. That major release will offer support for the unique file formats of MS-Office 2007 and will have official support for Mac. Multiple Page Views has been implemented in the 3.0 beta, which will allow pages to be shown side-by-side instead of all pages stacked top-to-bottom. The "View Layout" control in 3.0 switches between single page, several pages side by side, and book layout views.


Kickfire Appliance Accelerates MySQL

At the MySQL User Conference in April, Kickfire announced the first high-performance database appliance for the expanding MySQL market. Kickfire had been in stealth mode prior to the event.

Separately, Kickfire and Sun Microsystems announced record-breaking TPC-H price/performance benchmark results that demonstrate the performance efficiency and price/performance leadership of Kickfire's design. Some SQL queries run at 100x the speed they would on a dedicated server.

Based on a patented SQL chip that packs the power of tens of CPUs into an exceptionally small, low-power form factor, Kickfire appliances avoid the server build out, power, and space costs of today's data warehouse and database offerings. Kickfire's query performance enables organizations to use MySQL for demanding business intelligence, reporting, and analysis applications rather than non-open source alternatives. Kickfire appliances scale from gigabytes to terabytes and are based on commodity hardware and Linux. They leverage existing storage as well as the openness of MySQL and its entire ecosystem to ensure compatibility and rapid deployment.

The appliance uses MySQL running on a standard x86 Linux server but processes SQL on a dedicated chip. Customers have the option to connect their own storage devices.

"Kickfire's Database Appliance delivers query performance of half a room of hardware in a double-height pizza box with the power requirement of a microwave oven," said Raj Cherabuddi, CEO and co-Founder, Kickfire. "No longer do MySQL database customers need to migrate away from the world's leading open source database to scale up to higher performance as data volumes grow."

Kickfire also announced partnerships with five leading vendors in the open source world to provide joint customers the ability to deploy business intelligence solutions that leverage its high-performance MySQL database appliance. Working with JasperSoft, Pentaho, Sun Microsystems, Talend, and Zmanda, Kickfire customers can deploy high-performance, end-to-end business intelligence solutions with online backup based on MySQL.

The Kickfire Database Appliance is in beta testing. For more information, please visit http://www.kickfire.com.

Defensics Test Platform Fuzzes Network Equipment

In April, Codenomicon released DEFENSICS 3.0, the third generation of its security and quality testing platform that allows network product and service manufacturers and vendors to identify and fix flaws before their offerings reach the market. Defensics 3.0 is a series of test modules that apply model-based fuzzing and penetration techniques to network and communications protocols and allows user-supplied testing as well.

Since 2001, the Codenomicon Defensics test platform has been applying fuzzing techniques to provide preemptive security testing for network equipment. Defensics is a Java-based software package that runs on any standard computing platform with JVM support. This includes COTS x86 hardware, Macs, Sparc systems and Power architecture servers. This can provide a significant cost advantage over dedicated appliances. Pentium-class hardware with 1 GB of RAM is sufficient for running the software in a small environment with a few targets.

The new version has support for several new and exciting protocols and digital media formats such as WiMAX, STUN, TURN, vCal, iCal, vCard, OCSP, SOAP, XMPP, and many others. The previous version of Defensics already had 140 communications protocols included. It can test wireless devices, consumer electronics and Web services.

Ari Takanen, founder and CTO of Codenomicon, told Linux Gazette that his company is providing several open source projects with full access to its testing solutions, through Codenomicon's CROSS program, to find and fix a large number of critical flaws very rapidly. This differs from the traditional FOSS model of users and security researchers reporting bugs one by one. In the first phase of the CROSS initiative, Codenomicon has targeted 15-20 open source projects to work with. After evaluating the program, Codenomicon hopes to open it to more open source projects. Takanen told Linux Gazette that a large number of vulnerabilities had been detected and the results could not be publicly released until these had been addressed by the participating projects.

A DEFENSICS 3.0 overview and a new features list are available at http://www.codenomicon.com/d3/

Here is an MP3 interview from 2007 with Rick Lohorn, CISO at DataLine and Codenomicon's Senior Security Analyst, Heikki Kortti: http://www.codenomicon.com/resources/webcasts/radio_071005.mp3

InnoDB Plugin 1.0 for MySQL 5.1 Announced

At the April MySQL User Conference, InnoDB announced the initial availability of the early adopter release of the InnoDB Plugin. Testers can now download an early adopter release of the upgraded version. The update supports several new features for improved performance, reliability, ease of use and flexibility. See here for a description of the new features for MySQL 5.1. Documentation for the plugin is available here.

The early adopter release is available in source and binary (for most platforms) and is licensed under GPLv2. Users can dynamically install it within MySQL 5.1 on Linux and other Unix-like operating systems (and soon on Windows) without compiling from source or relinking MySQL. (For now, Windows users must re-build from source.)

Databases created with the built-in InnoDB in MySQL are compatible with the InnoDB Plugin, and support is available via the new InnoDB Forums.

Avnet announces launch of Complete MicroBlaze Processor Linux Design Solution

Avnet announced the launch of a Complete MicroBlaze Processor Linux Design Solution, including:

The stand-alone Linux for MicroBlaze Processor DVD is based on both the PetaLogix Petalinux and the LynuxWorks BlueCat Linux distributions and tool chains. The DVD demonstrates how to port Linux into a Field Programmable Gate Array (FPGA) design using the 32-bit Xilinx MicroBlaze processor. It also highlights the benefits and tradeoffs when using the new Memory Management Unit (MMU) in the MicroBlaze processor. The new MMU enables designers to use commercial-grade operating systems when implementing their embedded designs with Xilinx FPGAs.

The MicroBlaze Processor Linux Starter Kit includes the Linux for MicroBlaze Processor DVD, the Xilinx(r) Embedded Development Kit - Spartan(tm)-3A DSP MicroBlaze Processor Edition, and attendance at an Avnet Linux for MicroBlaze Processor Workshop, providing a complete platform for running embedded Linux on a MicroBlaze processor implemented in a Spartan-3A DSP FPGA.

Avnet has also launched the Linux for MicroBlaze Processor Workshop -- part of the Avnet SpeedWay Design Workshop Series. The workshop offers support for embedded developers through a full-day hands-on session that provides practical knowledge on utilizing Xilinx tools and Intellectual Property (IP) cores to create a MicroBlaze processor-based Linux system within an FPGA. The course is aimed at software and hardware designers considering an operating system for their next embedded processor-based application.

The Linux for MicroBlaze DVD and starter kit are available through Avnet in the Americas, Japan and selected countries in Europe and Asia. For more information and to purchase, please visit: www.em.avnet.com/linuxmb

Macraigor Systems, Eclipse Foundation Member, Provides Eclipse Integration for GNU Embedded Development and Debugging Toolset on Linux Hosts

Macraigor Systems announced the immediate availability of a free Eclipse-compliant embedded debugging solution with sample Eclipse projects that run on many standard evaluation boards for hosting on the Linux platform. This provides embedded systems engineers with an integrated platform for developing and debugging embedded systems using the Eclipse platform.

The Macraigor Eclipse + GNU Tools Suite (http://www.macraigor.com/Eclipse/) is an implementation that packages Eclipse 3.2.1, several of the open source GNU tools and utilities, and a program called OCDRemote, which provides an interface between Eclipse, the GDB debugger and a Macraigor On-Chip Debug device. The GNU tools provided by Macraigor Systems include binutils, gcc, gdb and gdbtui.

Truviso Contributes PostgreSQL Enhancements to Open Source Community

As part of its commitment to the open source community, Truviso, a leading provider of next-generation business intelligence solutions, announced that it has completed an enhancement to the PostgreSQL open source database system that further extends its suitability for streaming data analysis. Truviso also announced it will contribute this enhancement to the PostgreSQL community, reinforcing the company's foundational tenet of building upon and sharing mutually beneficial improvements to open source code with the PostgreSQL community.


Vyatta Community Edition 4 Scales from Branch Office to Data Center

Vyatta, a leader in Linux-based networking, announced Vyatta Community Edition 4 (VC4), the latest release of its reliable, commercially-supported open-source network operating system. VC4 delivers significant scalability improvements and expanded application support to the pre-existing router/firewall/VPN feature set.

VC4 now scales from DSL to 10 Gigabit Ethernet environments, providing enterprises and service providers with high-performance networks a robust, open alternative to expensive proprietary systems.


VIA Announces Strategic Open Source Driver Development Initiative

VIA will provide a vehicle for improved collaboration with the Open Source community with the opening of the official VIA Linux Website. The VIA Linux Portal will offer drivers, technical documentation and source code for popular Linux distributions such as Ubuntu.

As the first step in this initiative, VIA opened its official VIA Linux website in April. The site will initially host drivers, technical documentation, source code, and information regarding the VIA CN700, CX700/M, CN896, and the new VIA VX800 chipsets.

The VIA Linux Portal currently offers graphics drivers for the VIA CN896 digital media IGP chipset for the new Ubuntu 8.04 LTS distribution. Documentation and source code for these drivers will be released over the coming weeks, with official forums and bug tracking scheduled for implementation later this year. The VIA Linux Portal will also adhere to a regular release schedule that is aligned with kernel changes and the release of major Linux distributions.

Over the following months, VIA will work with the community to enable 2D, 3D and video playback acceleration to ensure the best possible Open Source experience on VIA Processor Platforms.

"We welcome the steps being taken by VIA to improve its support to the Open Source community," said Chris Kenyon, Director of Business Development at Canonical. "We look forward to working with VIA to ensure these drivers get built into Ubuntu by default and that Ubuntu developers and users enjoy a great experience when using VIA platforms."

"VIA is excited to be taking a more active role within the open source ecosystem," said Richard Brown, Vice President of Corporate Marketing, VIA Technologies, Inc. "Opening the VIA Linux Portal is an important step in our long term open source initiative and offering support for Ubuntu, one of the most widely known of the Linux distributions, is an ideal place to start."

The beta version of the VIA Linux Portal is located at http://linux.via.com.tw and currently offers driver files for Ubuntu 8.04 LTS and SUSE Linux Enterprise Desktop 10 Service Pack 1 for the VIA CN896 chipset with two south bridge options.



Kat likes to tell people she's one of the youngest people to have learned to program using punchcards on a mainframe (back in '83); but the truth is that since then, despite many hours in front of various computer screens, she's a computer user rather than a computer programmer.

Her transition away from other OSes started with the design of a massively multilingual wedding invitation.

When away from the keyboard, her hands have been found wielding knitting needles, various pens, henna, red-hot welding tools, upholsterer's shears, and a pneumatic scaler. More often these days, she's occupied with managing her latest project.


Howard Dyckoff is a long term IT professional with primary experience at Fortune 100 and 200 firms. Before his IT career, he worked for Aviation Week and Space Technology magazine and before that used to edit SkyCom, a newsletter for astronomers and rocketeers. He hails from the Republic of Brooklyn [and Polytechnic Institute] and now, after several trips to Himalayan mountain tops, resides in the SF Bay Area with a large book collection and several pet rocks.

Howard maintains the Technology-Events blog at blogspot.com from which he contributes the Events listing for Linux Gazette. Visit the blog to preview some of the next month's NewsBytes Events.


Deividson was born in União da Vitória, PR, Brazil, on 14/04/1984. He became interested in computing when he was still a kid, and started to code when he was 12 years old. He is a graduate in Information Systems and is finishing his specialization in Networks and Web Development. He codes in several languages, including C/C++/C#, PHP, Visual Basic, Object Pascal and others.

Deividson works in Porto União's Town Hall as a Computer Technician, and specializes in Web and Desktop system development, and Database/Network Maintenance.

Copyright © 2008, Kat Tanaka Okopnik, Howard Dyckoff, and Deividson Luiz Okopnik. Released under the Open Publication License unless otherwise noted in the body of the article. Linux Gazette is not produced, sponsored, or endorsed by its prior host, SSC, Inc.


Deividson on Databases: Stored Procedures

By Deividson Luiz Okopnik

Stored Procedures

Stored Procedures are subroutines that are stored inside the database. They allow you to select and manipulate data, and, with the use of control structures and loops, perform complex computations and return the calculated result to the client. This saves considerable amounts of client/server communication.

PostgreSQL allows Stored Procedures to be written in several different Procedural Languages, including Perl, Python, TCL, and pgSQL - the PostgreSQL internal procedure language. User-defined Procedural Languages can also be used, and several of these languages are easily downloadable, e.g. PL/Java.

In this article, we will be using PL/pgSQL. PL/pgSQL is very similar to normal SQL, but adds many more features to it, like control structures and user-defined data types and functions.

Example 1: The Basic Stored Procedure

Let's get started with a very basic stored procedure that returns "Hello World!" - not very useful, I know, but it will get us started with the basic syntax of PL/pgSQL. Here's the code:

create or replace function hello() RETURNS text AS $$
DECLARE
  hello text;
BEGIN
  hello := 'Hello World!';
  return hello;
END;
$$ LANGUAGE plpgsql;

Here's what it does:

create or replace function hello() RETURNS text AS $$

Creates the function called hello which receives no parameters and returns text. You must always define what the function returns; use VOID if you don't need to return anything.

DECLARE

Opens the variable declarations block.

hello text;

Declares a variable called "hello" of type "text". To define multiple variables, end each declaration with ";". You can use any of the standard types used in tables, like integer and float, and even user-defined types or domains.

BEGIN

Starts the actual function code.

	hello := 'Hello World!';

Populates the variable "hello" with 'Hello World!'. Note that you have to use single quotes for string/text values.

return hello; 

Returns our value.

END;

Ends the function.

$$ LANGUAGE plpgsql;	

Defines what language we used - 'plpgsql' in this case. To call that function, you use the following SQL code.

select * from hello();

The output will be a text field called "hello", with the value of "Hello World!".
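As a small variation on the same syntax, here is a sketch that takes a parameter, declares two variables, and shows a function that returns nothing. The function names greet() and say_hello() and the parameter "who" are made up for this illustration; they are not part of the example above.

```sql
-- Hypothetical variation on hello(): one parameter, two declared variables.
create or replace function greet(who text) RETURNS text AS $$
DECLARE
  greeting text := 'Hello';   -- a variable can be given an initial value
  result text;                -- each declaration ends with ";"
BEGIN
  result := greeting || ' ' || who || '!';
  return result;
END;
$$ LANGUAGE plpgsql;

-- A function with nothing to return is declared RETURNS void.
create or replace function say_hello() RETURNS void AS $$
BEGIN
  RAISE NOTICE 'Hello World!';   -- sent to the client as a notice, not as a result
END;
$$ LANGUAGE plpgsql;

select greet('Gazette');   -- returns 'Hello Gazette!'
select say_hello();
```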

Example 2: Populating a Table with Test Data

This is another use of a Stored Procedure ('SP' from now on) - generating test data for your tables. Let's use last month's article as an example - we used a SP to generate 500K rows of data for one of our tables. Here's the code:

create or replace function test_data_computer()
RETURNS integer AS $$
DECLARE
  count integer;
  sql text;
BEGIN
  count = 1;
  LOOP
    sql = 'insert into computer(computer_id, computer_ram, cpu_id, video_id) values';
    sql = sql || '('|| count ||', ' || random()*1024 || ', ' || (random()*49999)+1 || ', ' || (random()*49999)+1 || ')';
    EXECUTE sql;
    count = count + 1;
    EXIT WHEN count > 500000;
  END LOOP;
  return count;
END;
$$ LANGUAGE plpgsql;

It starts much like our previous example, but this time we declare 2 variables instead of one. Things become different at line 8, where we introduce the LOOP statement. The loop is a basic repeating structure: it repeats the code inside indefinitely, until it finds an EXIT or EXIT WHEN clause.

Lines 9 and 10 are used to generate the SQL code that inserts a single record into our table. The double pipe ("||") is the concatenation operator. Random() generates a random float number between 0 and 1 (so "(random()*49999)+1" will generate a random number between 1 and 50000).
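The two building blocks used on those lines can be tried on their own from any SQL prompt. This is only a sketch; the column aliases are arbitrary, and the floor() variant is an alternative that is not used in the function above.

```sql
-- "||" joins text; a non-text value next to a text value is converted for us.
select 'id=' || 42 as joined;                   -- returns 'id=42'

-- The expression from the function: a float from 1 up to (but not including) 50000.
select (random() * 49999) + 1 as float_value;

-- Alternative, if a whole number from 1 to 50000 is wanted:
-- floor() drops the fractional part before 1 is added.
select floor(random() * 50000)::integer + 1 as whole_value;
```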

Line 11 executes the SQL code stored inside the sql variable, adding the record to the table.

Lines 12 and 13 are used to control the flow of the LOOP, and if omitted will make the loop an infinite one. "EXIT WHEN count > 500000;" makes the loop stop when the condition is met (when "count" goes over 500000 in this case.)

Line 14 closes the LOOP block, making the function go back to line 8, executing everything that is inside the loop again (and again, and again).

Line 15 returns the number of added records (plus one in this case).
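For comparison, the same kind of fill can be sketched with PL/pgSQL's integer FOR loop, which handles the counter and the exit condition by itself, and with a plain INSERT instead of a string passed to EXECUTE. This is an alternative, not the function used last month: the name test_data_computer_for and its parameter n are made up here, and the sketch assumes the four columns of "computer" accept numeric values.

```sql
-- Hypothetical variant: the row count is a parameter and the loop is a FOR loop.
create or replace function test_data_computer_for(n integer)
RETURNS integer AS $$
BEGIN
  -- "i" is declared automatically and runs from 1 to n; no EXIT is needed.
  FOR i IN 1..n LOOP
    insert into computer(computer_id, computer_ram, cpu_id, video_id)
    values (i, random()*1024, (random()*49999)+1, (random()*49999)+1);
  END LOOP;
  return n;   -- the number of rows requested
END;
$$ LANGUAGE plpgsql;

-- Usage: insert 1000 test rows.
select test_data_computer_for(1000);
```

Since nothing in this statement has to be built at run time, EXECUTE is not needed; it earns its keep when the text of the query itself changes, as in the next example.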

Example 3: Calculations and Date/Time Handling

Let's make up a scenario for this one. Imagine that you are building a system for a doctor, and one of the bits of data he wants is exactly how much time he spends with his patients (NOT just idling in the office.) Even more, he wants to be able to select the data for a given date or date interval, and he wants the option of selecting the records of either a single patient or all of them. Complex scenario, right? Well, we can solve it all with a single SP. These are the tables our database will have:

create table patient (
patient_id serial primary key, 
patient_name text );

create table visits (
v_id serial  primary key, 
patient_id integer references patient,
v_date date,
v_time_start time,
v_time_end time );

One for the patients, another one to store the visits, with the date, start, and end time. Let's now populate the tables with some data:

insert into patient (patient_name) values ('Deividson');
insert into patient (patient_name) values ('John');
insert into patient (patient_name) values ('Benjamin');
insert into visits (patient_id, v_date, v_time_start, v_time_end) values (1, '10/04/2008', '08:00', '09:00');
insert into visits (patient_id, v_date, v_time_start, v_time_end) values (1, '14/04/2008', '13:00', '13:45');
insert into visits (patient_id, v_date, v_time_start, v_time_end) values (1, '18/04/2008', '10:00', '10:15');
insert into visits (patient_id, v_date, v_time_start, v_time_end) values (2, '11/04/2008', '14:00', '15:00');
insert into visits (patient_id, v_date, v_time_start, v_time_end) values (2, '12/04/2008', '14:00', '15:45');
insert into visits (patient_id, v_date, v_time_start, v_time_end) values (2, '17/04/2008', '14:00', '15:15');
insert into visits (patient_id, v_date, v_time_start, v_time_end) values (3, '15/04/2008', '08:00', '12:00');
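One caveat about the date literals above: they are written day/month/year, and PostgreSQL reads them that way only when the session's DateStyle uses DMY ordering. Under MDY ordering, a value such as '14/04/2008' is rejected. A minimal sketch of two ways around this, assuming your session is not already set to DMY:

```sql
-- Tell the session to read ambiguous dates as day/month/year.
SET datestyle = 'ISO, DMY';

-- With DMY in effect, this is the 10th of April, not the 4th of October.
select '10/04/2008'::date;

-- ISO-style literals are read the same way under any DateStyle setting.
select '2008-04-10'::date;
```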

Three patients, seven records - enough to test our SP. Here is the code:

CREATE OR REPLACE FUNCTION total( date1 date, date2 date, patient integer ) 
RETURNS interval AS $$
DECLARE
  total interval;
  rec record;
  sql text;
BEGIN
  total = '00:00:00'::time;
  sql = 'select * from visits';
  if date1 is not null OR patient is not null then
		sql = sql || ' where ';
  end if;

  if patient is not null then
		sql = sql || '(patient_id = ' || patient || ')';
  end if;

  if date2 is not null AND date1 is not null then
		if patient is not null then
			sql = sql || ' AND ';
		end if;
		sql = sql || '(v_date between ''' || date1 || ''' and ''' || date2 || ''')';	
  else
		if date1 is not null then
			if patient is not null then
				sql = sql || ' AND ';
			end if;
			sql = sql || '(v_date = ''' || date1 || ''')';	
		end if;  
  end if;

  for rec in EXECUTE sql loop
    total = total + (rec.v_time_end - rec.v_time_start);
  end loop;
  return total;
END;
$$ LANGUAGE plpgsql;

Wow! Big one this time, eh? Let's take a look at it. The start of the code is pretty similar to the other examples, but we have 3 variables this time. 'total' will store the total time to return to the client, and 'rec' (of type record) is a variable that will hold the result of the query we will run.

On line 8, we start the variable total with the value of 00:00:00 - the "::" is a typecast - ":: time" means the string we're passing ("00:00:00") needs to be turned into a time.
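The cast can be tried outside a function as well. A short sketch; the literal values are only examples:

```sql
-- The PostgreSQL "::" shorthand...
select '00:00:00'::time;

-- ...and the standard SQL spelling of the same cast.
select cast('00:00:00' as time);

-- Subtracting one time from another gives an interval (01:00:00 here),
-- which is what the function adds up for each visit.
select '09:00'::time - '08:00'::time;
```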

From line 9 all the way down to line 31, all we are doing is creating the SQL statement that will select the data we want. Here, we use another type of structure - the IF. IFs are basic flow-control structures, and their syntax is (as in most programming languages):

IF (condition) THEN (commands) [ELSE (commands)] END IF;

The condition can be any logical comparison ( <, > , =, IS NULL, or IS NOT NULL), and you can combine multiple conditions using the logical operators (AND, OR, etc). If the condition is true, then the execution will continue with the commands inside the THEN clause, or if it's false, execution will move to the commands in the ELSE (if it exists), or to after the END IF.

It's in those IFs that we create the conditions (single date, date interval, single patient, etc.)
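As a stand-alone illustration of the IF syntax, here is a sketch that also uses ELSIF, which PL/pgSQL offers for chains of conditions. The function describe_visit and its 60-minute threshold are invented for this example and have nothing to do with the doctor's tables.

```sql
-- Hypothetical helper: label a visit by its length in minutes.
create or replace function describe_visit(minutes integer) RETURNS text AS $$
BEGIN
  IF minutes IS NULL THEN
    return 'unknown';
  ELSIF minutes > 60 THEN      -- tested only when the first condition is false
    return 'long';
  ELSE
    return 'short';
  END IF;
END;
$$ LANGUAGE plpgsql;

select describe_visit(45);     -- returns 'short'
select describe_visit(NULL);   -- returns 'unknown'
```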

On lines 32 to 34, we execute the SQL code we generated, looping over each of the records of the result. We add the duration of each visit to the 'total' variable, and return the result when there are no more records available.
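For comparison only, the same selection rules can be sketched without building a string, by letting a single query handle the NULL parameters and letting sum() do the adding. This is an alternative formulation, not the SP discussed in this article; the name total_static is made up, and the sketch assumes the "visits" table defined earlier.

```sql
-- Hypothetical alternative to total(): static SQL, no EXECUTE, no explicit loop.
create or replace function total_static(date1 date, date2 date, patient integer)
RETURNS interval AS $$
DECLARE
  result interval;
BEGIN
  select coalesce(sum(v.v_time_end - v.v_time_start), '00:00:00'::interval)
    into result
    from visits v
   where (patient is null or v.patient_id = patient)   -- NULL patient means all patients
     and (date1 is null                                 -- no start date: no date filter
          or (date2 is null and v.v_date = date1)       -- single date
          or (date2 is not null and v.v_date between date1 and date2));  -- date range
  return result;   -- coalesce() gives 00:00:00 when no visit matches
END;
$$ LANGUAGE plpgsql;

-- Same call style as total():
select * from total_static(NULL, NULL, 3);
```

The dynamic version sends the database only the conditions that are actually needed, while this one is shorter and leaves no quoting to get wrong; which one suits you better depends on the case.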

We can call this SP in one of several different ways, each way selecting a different data set and giving us a different result:

-- All the records, from all patients
select * from total(NULL, NULL, NULL);
-- All the records, from patient #3 only
select * from total(NULL, NULL, 3);
-- Records from '14/04/2008', all patients
select * from total('14/04/2008', NULL, NULL);
-- Records from '14/04/2008', patient #1 only
select * from total('14/04/2008', NULL, 1);
-- Records from '14/04/2008' through '17/04/2008', all patients
select * from total('14/04/2008', '17/04/2008', NULL);
-- Records from '14/04/2008' through '17/04/2008', patient #2 only.
select * from total('14/04/2008', '17/04/2008', 2);


Stored Procedures are powerful and flexible, and can be a very good way to help you pre-select and pre-process data, as well as allowing you to manipulate data and run code directly on the server.

PostgreSQL offers a comprehensive manual on their site, including a chapter about PL/pgSQL. You can find it here: http://www.postgresql.org/docs/8.3/static/plpgsql.html

That's it for Stored Procedures - see you next month, when we'll discuss Triggers!




Copyright © 2008, Deividson Luiz Okopnik. Released under the Open Publication License unless otherwise noted in the body of the article. Linux Gazette is not produced, sponsored, or endorsed by its prior host, SSC, Inc.


Knoppix 5.3.1

By Edgar Howell

At the recent CeBIT computer fair, Klaus Knopper finally released a new version of his popular Knoppix live-CD/DVD variety of GNU/Linux. The initial version was fine for the CeBIT crowd but had some problems, the most obvious of which was a mixture of languages in menus if you booted with the parameter "lang=" (by the way, it is now "lang=us" rather than "lang=en"; this will take getting used to after many years of using "lang=en".)

The vital statistics on the post-CeBIT version, 5.3.1, are:

The DVD is close to full and the software is distributed between 2 compressed directories. This DVD should include just about everything but the proverbial kitchen sink.

New to me (absent appropriate hardware) but not new to the 5.x.x Knoppix series is recognition of dual-core CPUs: on boot, Knoppix shows you Tux twice. However, new to 5.3 (at least since 5.0) is the use of a key that for me has always been a dead-key: the "flag" to the left of "alt" to the left of the space-bar. This key now activates the start menu as if you had clicked on the icon at the far left of the panel. Those who have learned to use the keyboard ought to appreciate that progress away from mouse-dependency.

To my knowledge virtualization is also a new topic with this release. You will find KVM as well as a XEN-kernel and VirtualBox. I played briefly with VirtualBox, and it was necessary to load the 'vboxdrv' module manually before starting the graphics interface.

For what it's worth, during the last nine months or so everything I've done in the office here other than USB and burning CD/DVD has been under VMs. It is a topic well worth investigating, particularly in this environment.

By now, most readers certainly have heard about Knoppix and ought to have some idea of what it is all about - and this was the first time that I encountered any significant problems with a new release of Knoppix. So, I think that topic deserves some comment this time around.

Linux has a certain reputation - it's supposed to behave well on older hardware. While this is indeed true in large measure, if all you have is Knoppix on a DVD, you won't be able to do much with it on a machine that can't boot from DVD or doesn't even have a DVD drive! But this was the first time that I ever had trouble with hardware that has worked fine with various distributions for many years.

And it was a strange experience indeed. The desktop failed in two ways, both error messages from KDesktop: "Unable to create io-slave" and "Signal 6 SIGABRT". Unfortunately, the former error didn't prevent Konqueror from starting and hiding the small window with the error message, which I initially missed because I am typically busy with other things while Knoppix boots from DVD. These errors occurred to one degree or another on a 5-year-old PC, a 3-year-old notebook and a notebook I've only had for a couple of weeks. In desperation I even tried the beta of KDE 4, to no avail.

As it turns out, KDE development seems to be moving in a direction that I have no use for anyhow: the best I can say about 'compiz' is that it seems to be a significant waste of developers' time and CPU cycles. Trying to re-position a window is like trying to push a wet noodle. The floppy window and flames that delay closing don't make much sense to me either. Fortunately, Knoppix lets us set the "no3d" parameter.

The beta of KDE 4 was a turn-off from the start menu. It looks just like SuSE 10.3 for crying out loud! Why!? Why would any sane person think it reasonable to use 5 times as much space as necessary for a line of text in a menu? This just forces some entries into another level in the hierarchy thus hiding things deeper and more mouse-clicks away. That makes about as much sense as adding another level of bureaucracy to the government...

So, the time had come to abandon KDE and re-master the standard release of Knoppix with a default configuration that I like. Isn't that why we all use Linux? Because we are in charge?

Re-mastering a distribution may sound like pretty heavy-duty guru-type stuff, but it needn't be difficult at all. If you want to add software to one of the standard locations (rather than just putting it somewhere on the DVD), it is pretty much a 2-pass algorithm: you add something to a directory, that directory is converted into a compressed file, and the compressed file then has to fit, along with everything else, into an ISO image that is not too big for the DVD or CD. But in this case, all that is necessary is to make a tiny alteration to a boot parameter.

Here are the commands needed, to be executed as root as usual:

mount -o loop /media/hda3/KNOPPIX_V5.3.1DVD-2008-03-26-EN.iso /mnt
mkdir KNOPPIX_master
cp -Rp /mnt/* KNOPPIX_master/
vi KNOPPIX_master/boot/isolinux/isolinux.cfg
/usr/bin/mkisofs -pad -l -r -J -v -V "Knoppix 5.3.1" -P "Linux Gazette"   \
                 -no-emul-boot -boot-load-size 4 -boot-info-table         \
                 -b boot/isolinux/isolinux.bin -c boot/isolinux/boot.cat  \
                 -hide-rr-moved -o knoppix.iso KNOPPIX_master/ 

This is what's going on here:

For my purposes it was sufficient to add the following part (shown in bold in the original: "keyboard=de desktop=icewm") at the end of the second line of "isolinux.cfg":

APPEND ramdisk_size=100000 init=/etc/init lang=us apm=power-off \
  vga=791 initrd=minirt.gz nomce highres=off loglevel=0         \
  libata.atapi_enabled=1 quiet SELINUX_INIT=NO nmi_watchdog=0   \
  BOOT_IMAGE=knoppix keyboard=de desktop=icewm

After having made this change, and burning the ISO-image to a DVD of course, booting from the DVD without entering any parameters at all brings up Knoppix with the keyboard needed and IceWM as the desktop.

Given the parameters one can use when booting Knoppix, re-mastering really isn't absolutely necessary. But as easy as it is to do and as convenient as it makes later use of Knoppix, why not?

An alternative to re-mastering would be to create a USB-device with persistent settings; I've used this technique in the past and it works nicely. But on one occasion a release change introduced an incompatibility and I decided it is just as easy to have a USB-device with a script or two that can be executed as needed: if the notebook is attached to the office LAN and there is a need for the network printer then it is easy to run the script that sets up CUPS.

So when all is said and done in spite of a couple of "minor" problems I am still an enthusiastic user of Knoppix. It definitely belongs in every toolbox.

And the fact that I don't like what is happening with KDE is absolutely irrelevant: the world is a big place, there is room for lots of different opinions. It's just a matter of taste. You might like it; you might not want to re-master - that's great too. Someone else might want Gnome instead of IceWM. The point is that with Linux, you have options. Go for it!

Edgar is a consultant in the Cologne/Bonn area in Germany. His day job involves helping a customer with payroll, maintaining ancient IBM Assembler programs, some occasional COBOL, and otherwise using QMF, PL/1 and DB/2 under MVS.

Copyright © 2008, Edgar Howell. Released under the Open Publication License unless otherwise noted in the body of the article. Linux Gazette is not produced, sponsored, or endorsed by its prior host, SSC, Inc.

Virtualizing without Virtualizing

By Kapil Hari Paranjape


Nowadays one reads a lot about virtualization. The Gazette even ran an article on it. Most of these reports talk about Xen, Vserver, OpenVZ, Qemu, User Mode Linux or co-Linux. Each of those modern technologies has its place no doubt, but this article will concentrate on the humble chroot way of running programs in a virtual environment.

Here is the EULA [1] that you must agree to if you want to read further:

Here are some possible scenarios where one might want to run programs in a virtual context in spite of these (admittedly rather stringent) conditions:

Ask the super

Your super-user avatar (SU) must carry out the following steps for you to enter the virtual context.

Allocate disk space

Either there is already enough disk space on some mounted partition or you have an unused partition. You will need this empty space to appear in some fixed location like /srv/schroot. SU can use mount (with the --bind option in the former case) to set this up.

Create a new Debian installation

Do not reach for that Debian boot CD! Here is a way that does not destroy your uptime record. SU should install debootstrap:

apt-get install debootstrap

and run it:

debootstrap --include=iceweasel,mozplugger lenny /srv/schroot

SU can of course replace iceweasel and mozplugger by any comma-separated list of programs which need to be installed. It is nice to add a local mirror at the tail end of the debootstrap invocation so as to get a faster download.

The main program

Since you run Debian, installing schroot is as easy as SU running:

apt-get install schroot

You also need to configure schroot; there are a number of rather interesting options. The following configuration stanza seemed to be just right:

description=Debian lenny (testing)

If you are planning to use a 32-bit chroot under a 64-bit system then you need to have personality=linux32 as part of this configuration. The above stanza is placed in the file schroot.conf in the configuration directory /etc/schroot/. It says that SU allows the user luser to use schroot to run under the directory /srv/schroot after various standard setup and startup scripts have been executed.

You should also look through the setup script 10mount in the setup.d sub-directory of the above directory as you may need to create some additional mounts. For example, adding the line

do_mount "-o rw,bind" "/dev/snd"    "${CHROOT_PATH}/dev/snd"

to 10mount together with the creation of the dev/snd subdirectory of /srv/schroot ensures that the (ALSA) sound devices are accessible in the chroot. This is extremely important for flash!

That completes the setup that the super-user needs to do. After this, luser can go ahead and play in this newly created “sand-box”.

All play and no work

The authorized user luser can execute a shell by running

schroot -p -c lenny

The shell will normally run in an environment where the users' home directories, /tmp and /dev will be mounted from the base system. Hence it should be possible to execute commands which need the X window environment as well. (The -p option given above is required to preserve the environment, which includes the DISPLAY variable.)

Another way to run a command like iceweasel directly is

schroot -p -c lenny iceweasel

Note that each such command creates a new schroot “session”. To re-use an already created session, you must save the session identifier and use it. For example, you can start a new session, without any command:

SCHROOT_SESSION=$(schroot -b -c lenny)

If you then issue the command

schroot -p -r --chroot "$SCHROOT_SESSION" iceweasel

the weasel will start up and run in that session. If you run this command another time, you will not create a new session.


The chroot command has existed for “eons”, but it was often felt that it is “for the super-user only”. By using schroot it becomes quite safely accessible to the regular user of the system. Using this kind of minimal virtualisation is certainly not in the same league security-wise as the “real” virtualisation techniques but has no overhead (except disk-usage); I hope the article demonstrates that schroot is at least as easy to set up.

Using schroot is a good solution to the frequently asked question:

How do I run the late-esht version of <name your favourite rapidly developing application> on Debian?

In my opinion, the above solution is to be preferred over running a mixed stable/testing version of Debian. Even backports are slightly worse, as a mixed stable/backports environment is not what the packages are being tested in by most developers.

The motivation to write this article came from a discussion I saw in the letters to Linux Weekly News where people said that Debian was hindering those who want to run 32-bit programs on a 64-bit system. If indeed mixed library setups are well packaged and maintained, then that is easier than the solution herein. All the same, Debian does have this solution!

The title of this article is inspired by John Archibald Wheeler, one of the most fascinating physicists of the 20th century, who passed away recently.

[1] This is a dyslexic acronym for “Lookout - Advanced Experimental Usage”
[2] Mark Shuttleworth: Benevolent Dictator For Life for Ubuntu
[3] or boys as the case may be

Kapil Hari Paranjape has been a "hack"-er since his punch-card days. Specifically, this means that he has never written a "real" program. He has merely tinkered with programs written by others. After playing with Minix in 1990-91 he thought of writing his first program - a "genuine" *nix kernel for the x86 class of machines. Luckily for him a certain L. Torvalds got there first - thereby saving him the trouble (once again) of actually writing code. In eternal gratitude he has spent a lot of time tinkering with and promoting Linux and GNU since those days - much to the dismay of many around him who think he should concentrate on mathematical research - which is his paying job. The interplay between actual running programs, what can be computed in principle and what can be shown to exist continues to fascinate him.

Copyright © 2008, Kapil Hari Paranjape. Released under the Open Publication License unless otherwise noted in the body of the article. Linux Gazette is not produced, sponsored, or endorsed by its prior host, SSC, Inc.


By Aurelian Melinte

"When two trains approach each other at a crossing, both shall come to a
full stop and neither shall start up again until the other has gone."
 -- old Kansas statute


The dynamic linker allows us to override functions exported by the shared objects that are used by programs. In this article, we will use this interposition functionality and build a library that wraps around the 'pthreads' library to diagnose mutex-related problems, including the well-known deadlock.

Deadlocks and how to avoid them

Simply put, a deadlock is a run-time condition where threads are waiting for resources in a circular chain. The well-known solution to this is to assign a process-wide order to these resources and have each thread acquire the resources it needs in that particular order; threads should release the taken resources in reverse order.
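
To make the ordering rule concrete, here is a minimal sketch. It assumes, purely for illustration, that the process-wide order is the ascending memory address of the mutexes; lock_pair() and unlock_pair() are hypothetical helper names, not part of pthreads or of the library discussed in this article.

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical helper: lock two distinct mutexes in a fixed, process-wide
 * order. The order used here (ascending address) is an assumption made
 * for this sketch; any order works as long as every thread follows it. */
int lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_t *first, *second;
    int rc;

    if (a == b)
        return EINVAL;                 /* locking it twice would self-deadlock */

    first  = ((uintptr_t)a < (uintptr_t)b) ? a : b;
    second = (first == a) ? b : a;

    if ((rc = pthread_mutex_lock(first)) != 0)
        return rc;
    if ((rc = pthread_mutex_lock(second)) != 0) {
        pthread_mutex_unlock(first);   /* back out on failure */
        return rc;
    }
    return 0;
}

/* Release both mutexes in the reverse of the acquisition order. */
void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_t *first  = ((uintptr_t)a < (uintptr_t)b) ? a : b;
    pthread_mutex_t *second = (first == a) ? b : a;

    pthread_mutex_unlock(second);
    pthread_mutex_unlock(first);
}
```

A thread that needs both mutexes would call lock_pair(&a, &b), do its work, and then call unlock_pair(&a, &b). Because every thread ends up taking the lower-addressed mutex first, no circular wait can form between these two locks, whatever order the arguments are passed in.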

To keep things simple for our debugging library, let's assume the resources that threads are competing for are all mutexes. The techniques presented here can, however, be extended to other types of resources that threads compete for (e.g. semaphores, condition variables).

With pthreads, a thread can also self-deadlock if it attempts to lock a mutex it already owns. The pthreads library can help with self-locking: when creating the mutex, you can create it:

One should certainly understand exactly whether and why a particular thread needs to lock the same mutex multiple times before making use of such techniques.
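
As an illustration of what the mutex type changes, POSIX defines, among others, PTHREAD_MUTEX_ERRORCHECK (a relock by the owner fails with EDEADLK instead of blocking) and PTHREAD_MUTEX_RECURSIVE (the owner may lock repeatedly and must unlock the same number of times). The sketch below wraps the attribute handling in a hypothetical helper, make_mutex(); the helper name is an assumption made for this example.

```c
#define _GNU_SOURCE
#include <errno.h>      /* EDEADLK and friends, for callers checking results */
#include <pthread.h>

/* Hypothetical helper: initialise *m with the given POSIX mutex type,
 * e.g. PTHREAD_MUTEX_ERRORCHECK or PTHREAD_MUTEX_RECURSIVE.
 * Returns 0 on success or a pthreads error number. */
int make_mutex(pthread_mutex_t *m, int type)
{
    pthread_mutexattr_t attr;
    int rc;

    if ((rc = pthread_mutexattr_init(&attr)) != 0)
        return rc;

    rc = pthread_mutexattr_settype(&attr, type);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);

    /* The attribute object is no longer needed once the mutex exists. */
    pthread_mutexattr_destroy(&attr);
    return rc;
}
```

With an error-checking mutex, a second pthread_mutex_lock() by the owning thread returns EDEADLK rather than hanging, which turns a silent self-deadlock into something the caller can report.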

Library interposition

The dynamic linker allows users to intercept any function call an application makes to any shared library that the application uses. The dynamic linker will first load libraries specified in the LD_PRELOAD environment variable (the interposition libraries) and the linker will use these libraries before any other when it resolves symbols. This functionality offered by the linker is used traditionally for debugging purposes, just like we will do shortly.

Functions exported by the interposition library will get called instead of the functions in the shared objects (here pthreads) that the application normally uses. Thus, in the interposition library we can wrap around the "real" functions; or replace them outright.

Hooking pthreads

We need to know which thread acquires and releases what and when. At a minimum, we have to hook pthread_mutex_lock(), pthread_mutex_trylock() and pthread_mutex_unlock(). Other candidates might be pthread_mutex_init(), pthread_mutex_destroy(), pthread_cond_timedwait(), pthread_cond_wait(). But since we decided to tackle only mutexes, the first three might well be enough, depending on your strategy to communicate the resources' acquiring order to the debug library. Also, hooking pthread_mutex_destroy() is worthwhile because attempting to destroy a locked mutex is undefined behavior according to the standard (aside from being a programming error).

Below is the code used to hook our mutex functions, stripped of error checking code. We use the dlsym() function to dig out the real function, so that we can call it from within our wrapper function. The hooking function in the interposition library looks like this:

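    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <pthread.h>

    typedef int (*lp_pthread_mutex_func)(pthread_mutex_t *mutex);

    /* Look up the next definition of 'fname' in the library search order,
     * i.e. the real function that our wrapper shadows. */
    int lp_hookfunc(lp_pthread_mutex_func *fptr, const char *fname)
    {
        char *msg = NULL;

        dlerror(); /* Clear any stale error state before the lookup. */
        *fptr = dlsym(RTLD_NEXT, fname);
        msg = dlerror();
        if ((*fptr == NULL) || (msg != NULL)) {
            lp_report("dlsym %s failed : %s\n", fname, msg ? msg : "(no error message)");
            return -1;
        } else {
            return 0;
        }
    }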

We'll use it to hook the pthreads-exported functions we need. This will be done as soon as the linker loads the debugging library, well before the main() function of the application gets called:

    static lp_pthread_mutex_func next_pthread_mutex_lock = NULL; 
    static lp_pthread_mutex_func next_pthread_mutex_trylock = NULL; 
    static lp_pthread_mutex_func next_pthread_mutex_unlock = NULL; 

    void _init()
    {
        lp_hookfunc(&next_pthread_mutex_lock,    "pthread_mutex_lock");
        lp_hookfunc(&next_pthread_mutex_trylock, "pthread_mutex_trylock");
        lp_hookfunc(&next_pthread_mutex_unlock,  "pthread_mutex_unlock");
        /*And check for errors.*/
    }

Next, we'll compile the code in a shared object:

    $ gcc -g -Wall -fPIC -I. -c lp_hooks.c -o lp_hooks.o
    $ ld -o liblockpick.so -shared lp_hooks.o -ldl

Finally, we load this diagnostic shared object so that it takes over pthreads. At a Bourne shell prompt:

    $ LD_PRELOAD=./liblockpick.so ./deadlock

Now, each time the application calls pthread_mutex_lock(), the overriding/wrapping pthread_mutex_lock() function from the debugging library will be called instead of the real one. Having taken control over the locking and unlocking functions, we can start keeping track of what is going on. The overriding function looks like this:

    int pthread_mutex_lock(pthread_mutex_t *mutex)
    {
        int rc = EINVAL;

        if (mutex != NULL) {

            /* Call the real pthread_mutex_lock() */
            rc = next_pthread_mutex_lock(mutex);

            lp_lock_postcheck(mutex, rc);
        } else {
            lp_report("%s(): mutex* is NULL.\n", __FUNCTION__ );
        }

        return rc;
    }

Resource ordering & bookkeeping

One important issue to consider is how to pass the resource-ordering information down to the debug library. There are various solutions that would require making the application aware of the debugging library and thus would require modifications of the application itself. These solutions would eventually make the library less generic too.

This resource-ordering information is not needed to diagnose deadlock situations, but we need it if we want to check for out-of-order locking issues.

The code accompanying this article is strictly application-agnostic: the order given to mutexes is the order in which the library finds out about the existence of these mutexes. The mutex to be acquired first is the first one it finds out about. This is practical for the purposes of this article (we need to hook only three functions) but maybe not so practical for a real application.
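
A minimal sketch of such a first-come ranking follows. It reuses the names lp_mutex_idx() and LP_INVALID_IDX from the listings in this article, but the body, the fixed table size and the lack of internal locking are assumptions made for illustration; the accompanying code may well do this differently.

```c
#include <pthread.h>
#include <stddef.h>

#define LP_MAX_MUTEXES 32      /* table size: an assumption for this sketch */
#define LP_INVALID_IDX (-1)

/* Mutexes in the order in which we first heard about them. */
static const pthread_mutex_t *lp_known[LP_MAX_MUTEXES];
static int lp_known_count = 0;

/* Return the rank of 'mutex', registering it on first sight.
 * Not thread-safe as written: a real library would have to guard
 * the table with an internal lock of its own. */
int lp_mutex_idx(const pthread_mutex_t *mutex)
{
    int i;

    for (i = 0; i < lp_known_count; i++)
        if (lp_known[i] == mutex)
            return i;                   /* already known: rank is stable */

    if (lp_known_count == LP_MAX_MUTEXES)
        return LP_INVALID_IDX;          /* table full */

    lp_known[lp_known_count] = mutex;
    return lp_known_count++;
}
```

Only the address of the mutex is used as its identity, so the rank of a mutex depends on which lock call happens to reach the library first. That is exactly why this scheme suits a demonstration better than a real application.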

For each of the functions hooked, there are two moments when we do bookkeeping and proceed to check the sanity of the application: before the real call is made and after the real call returns. Making the most of the time before the call proceeds is important as the call might block indefinitely and thus we can detect deadlocks before actually placing the call.

Before the call, we can check:

Before the call, we also store the information about which mutex the thread is trying to acquire. After the call, we keep a record of which mutexes have been acquired or released by which thread.

    void lp_lock_precheck_diagnose(const pthread_mutex_t *mutex)
    {
        int rc = -1;
        /*Highest ranking mutex currently acquired by 'me'*/
        int maxmidx = LP_INVALID_IDX;
        int midx = LP_INVALID_IDX;
        pthread_t me = pthread_self();

        pthread_t             thr = LP_INVALID_THREAD;
        const pthread_mutex_t *mtx = NULL;

        /* Thread tries to lock a mutex it has already locked? */
        if ((rc=lp_is_mutex_locked(mutex, me)) != 0) {
            lp_report("Mutex M%lx is already locked by thread.\n", (unsigned long)mutex);
        }

        /* Is mutex order respected? */
        maxmidx = lp_max_idx_mutex_locked(me);
        midx = lp_mutex_idx(mutex);
        if (midx < maxmidx) {
            lp_report("Mutex M%lx will be locked out of order (idx=%d, crt max=%d).\n",
                    (unsigned long)mutex, midx, maxmidx);
        }

        /* Will deadlock? Check for a circular chain. */
        lp_report("Checking if it will deadlock...\n");
        thr = me;
        mtx = mutex;
        while ((thr=lp_thread_owning(mtx)) != LP_INVALID_THREAD) {
            lp_report("  Mutex M%lx is owned by T%lx.\n", (unsigned long)mtx, (unsigned long)thr);
            mtx = lp_mutex_wanted(thr);
            lp_report("  Thread T%lx waits for M%lx.\n", (unsigned long)thr, (unsigned long)mtx);

            if (mtx == LP_INVALID_MUTEX)
                break; /* no circular chain */

            if (0 != pthread_equal(thr, me)) {
                lp_report("  Deadlock condition detected.\n");
                break; /* the chain has come back to us: stop walking */
            }
        }
    }

    void lp_unlock_precheck_diagnose(const pthread_mutex_t *mutex)
    {
        int rc = -1;
        int maxmidx = LP_INVALID_IDX, midx = LP_INVALID_IDX;

        /* Thread tries to unlock a mutex it does not have? */
        if ((rc=lp_is_mutex_locked(mutex, pthread_self())) == 0) {
            lp_report("Attempt to release M%lx NOT locked by thread.\n", (unsigned long)mutex);
        }

        /* Are mutexes released in reverse order? */
        maxmidx = lp_max_idx_mutex_locked(pthread_self());
        midx = lp_mutex_idx(mutex);
        if (midx < maxmidx) {
            lp_report("Mutex M%lx will be released out of order (idx=%d, crt max=%d).\n",
                    (unsigned long)mutex, midx, maxmidx);
        }
    }

A sample of the results obtained at run-time can be seen below:

$ make test

Tx40571bb0: Mutex M8049e48 will be locked out of order (idx=1, crt max=2). 
Tx40571bb0: Checking if it will deadlock... 
Tx40571bb0:   Mutex M8049e48 is owned by T40370bb0. 
Tx40571bb0:   Thread T40370bb0 waits for M8049e30. 
Tx40571bb0:   Mutex M8049e30 is owned by T40571bb0. 
Tx40571bb0:   Thread T40571bb0 waits for M8049e48. 
Tx40571bb0:   Deadlock condition detected. 
Tx40571bb0: Mutexes: 
Tx40571bb0:   [00] M4001d180 [owned by:T40571bb0 02] 
Tx40571bb0:   [01] M08049e48 [owned by:T40370bb0 01] 
Tx40571bb0:   [02] M08049e30 [owned by:T40571bb0 02] 
Tx40571bb0:   [03] M08049e60 [owned by:T40772bb0 03] 
Tx40571bb0: Threads: 
Tx40571bb0:   [00] T4016f6c0 [owns:M0000000000000000000000000000000][wants:M00000000] 
Tx40571bb0:   [01] T40370bb0 [owns:M0100000000000000000000000000000][wants:M08049e30] 
Tx40571bb0:   [02] T40571bb0 [owns:M1010000000000000000000000000000][wants:M08049e48] 
Tx40571bb0:   [03] T40772bb0 [owns:M0001000000000000000000000000000][wants:M00000000] 
Tx40772bb0: Mutex M8049e60 is already locked by thread. 
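
The chain walk behind a report like the one above can be sketched in isolation. The version below is not the library's code: it assumes small integer ids for threads and mutexes and two hypothetical bookkeeping tables, owner_of[] and wanted_by[], just to show the idea of following owner, wanted mutex, owner, and so on.

```c
#define N_THREADS 4
#define N_MUTEXES 4
#define NONE (-1)

/* Bookkeeping tables indexed by small integer ids (an assumption of this
 * sketch; the article's library tracks pthread_t values and mutex pointers). */
static int owner_of[N_MUTEXES];   /* thread owning each mutex, or NONE    */
static int wanted_by[N_THREADS];  /* mutex each thread waits for, or NONE */

/* Mark every mutex as free and every thread as not waiting. */
void tables_reset(void)
{
    int i;

    for (i = 0; i < N_MUTEXES; i++)
        owner_of[i] = NONE;
    for (i = 0; i < N_THREADS; i++)
        wanted_by[i] = NONE;
}

/* Would thread 'me' close a circular wait if it now blocked on 'mtx'?
 * Returns 1 if the chain of owners leads back to 'me', 0 otherwise.
 * The step limit stops the walk if it runs into a cycle that does not
 * include 'me'. */
int would_deadlock(int me, int mtx)
{
    int steps;

    for (steps = 0; steps <= N_THREADS && mtx != NONE; steps++) {
        int thr = owner_of[mtx];

        if (thr == NONE)
            return 0;           /* mutex is free: the chain ends here */
        if (thr == me)
            return 1;           /* the chain has come back to us */
        mtx = wanted_by[thr];   /* what is that owner waiting for? */
    }
    return 0;
}
```

Filling the tables with the situation from the sample run (thread 1 owns mutex 1 and waits for mutex 2, which thread 2 owns) makes would_deadlock(2, 1) return 1, mirroring the "Deadlock condition detected" line.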

On Heisenberg's uncertainty principle

A word of caution: Heisenberg's uncertainty principle, in its general form, applies to software! In plain words, Heisenberg stated that observing a system modifies the behavior of the observed system. In our case, the debug library affects the behavior of the application in at least two ways:

Thus, it might happen that the debug library makes it easier to reproduce the deadlock. Or, on the contrary, it may make it harder to reproduce.



Aurelian is a software programmer by trade. Sometimes he has programmed for Windows, sometimes for Linux and sometimes for embedded systems. He discovered Linux in 1998 and has enjoyed using it ever since. He is currently settled with Debian.

Copyright © 2008, Aurelian Melinte. Released under the Open Publication License unless otherwise noted in the body of the article. Linux Gazette is not produced, sponsored, or endorsed by its prior host, SSC, Inc.

Searching for Text (Part II)

By René Pfeiffer

In the first part of this series, we took a look at various methods that help us to index text and to facilitate text searches. The approaches were focused on SQL databases. Provided you don't get nervous about abandoning SQL queries, I can show you another way to build an index and to query it for your favourite phrase. Prepare to meet Lucene!

Collecting and Converting Files

I've already said that organising your data is of premium importance. This starts with a smart directory structure (or database design) and ends with proper document summaries and tagging data or having sensible categories. No matter how you store your files, if you start to build an index you need to collect them somehow. A lot of documents are stored in directory trees (on file servers for example); some servers hold thousands or millions of files. Indexing this much documentation requires a good strategy - e.g., you don't want to keep rebuilding indexes for data that hasn't changed.

When it comes down to indexing you need plain text; this means you have to convert your file content. Since there is no such thing as plain text anymore, we also have to think about the character encoding. Will it be sufficient to use 7-bit US ASCII? Do we use ISO-8859-1 or even ISO-8859-15 because we'd like to have the Euro sign? It might be a good idea to use UTF-8. It is a variable-length character encoding for Unicode that 'feels' like 7-bit US ASCII as long as no special character is encountered and has some extra room for a lot of strange letters. This is the reason why a lot of people use it. Be aware that if you use a Unicode encoding in one place, you should use it everywhere. It makes life a lot easier.
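
To see what "variable-length" means in practice, here is a small sketch. The function names utf8_length() and is_ascii() are made up for this example, and the code assumes its input is valid UTF-8.

```c
#include <stddef.h>

/* Count Unicode code points in a NUL-terminated UTF-8 string by skipping
 * continuation bytes (those of the form 10xxxxxx). Assumes valid UTF-8. */
size_t utf8_length(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}

/* Return 1 if the string is plain 7-bit US-ASCII, 0 otherwise. */
int is_ascii(const char *s)
{
    for (; *s != '\0'; s++)
        if ((unsigned char)*s > 0x7F)
            return 0;
    return 1;
}
```

For pure ASCII text the number of code points equals the number of bytes, which is why UTF-8 'feels' like ASCII. The Euro sign, on the other hand, occupies three bytes (0xE2 0x82 0xAC) but counts as a single character.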

When looking at a pile of a hundred thousand files on a server you probably don't want all of them; the ones you want either contain text or can be converted into text. A picture might contain text fields (such as JPEG EXIF tags or PNG iTXt/tEXt chunks), but often it is useful to stick to text file formats. How do we choose? Well, the easiest way is to look at the file extension. This isn't very accurate: one format can hide behind several extensions (".jpg" and ".jpeg" both almost certainly mark JPEG files), and nothing guarantees that an extension matches the content. Determining a file type by analysing its content is not easy, though (the file utility does a good job if you choose to go that way). For now we will stick to extensions for the sake of simplicity.
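
A check by extension can be sketched in a few lines. This is not the filelist program discussed in this article; the function name has_interesting_extension() and the hard-coded list are assumptions for illustration (the real tool reads its list from a config file).

```c
#include <string.h>
#include <strings.h>   /* strcasecmp() (POSIX) */

/* Example list of extensions; hard-coded only to keep the sketch short. */
static const char *const interesting[] = {
    ".doc", ".htm", ".html", ".odp", ".ods", ".odt", ".pdf", ".txt", ".xls"
};

/* Return 1 if 'path' ends in one of the interesting extensions
 * (compared case-insensitively), 0 otherwise. */
int has_interesting_extension(const char *path)
{
    const char *slash = strrchr(path, '/');
    const char *name = slash ? slash + 1 : path;   /* last path component */
    const char *dot = strrchr(name, '.');
    size_t i;

    if (dot == NULL || dot == name)   /* no extension, or a dot file */
        return 0;

    for (i = 0; i < sizeof interesting / sizeof interesting[0]; i++)
        if (strcasecmp(dot, interesting[i]) == 0)
            return 1;
    return 0;
}
```

Only the last path component is inspected, so a directory name containing a dot does not confuse the check, and "NOTES.TXT" is treated like "notes.txt".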

So what do we do now? Well, we follow the UNIX tradition and split the problem into smaller, manageable parts. First we do a little walking and collecting - a filesystem walk to be exact. We need to create a list of interesting files for later indexing. Let's start with the syntax of the config file that lists all interesting extensions.

extension = [ ".doc", ".htm", ".html", ".odp", ".ods", ".odt", ".pdf", ".txt", ".xls" ];
nice      = 13;
output    = "filelist.txt";

The config option extension lists all interesting extensions. nice is the nice level for your program. A filesystem walk creates more than enough load, so we'll let other processes have the CPU first. output tells our program where to write the list of interesting files. It will be written to the file filelist.txt. The syntax of the config file originates from the C/C++ library libconfig. It parses the file for us and extracts the options. The parser isn't that difficult to write, but we have some more complex tasks ahead. Later in the article, I'll provide a link to the source code for filelist, the helper code files helper.h and helper.cc, the command line options configuration filelist.ggo and a Makefile to build everything. You will need the Boost filesystem and iostreams libraries as well as the gengetopt skeleton generator from the GNU project. gengetopt and especially the Boost libraries are extremely useful to make your life in C/C++ a lot easier.

Lucene and its Ports

The Apache Lucene project is a collection of tools for building software that features search functions. The core component is Lucene itself, which is a Java-based indexing and search library. The project offers additional code to build web search applications and to deal with metadata. Lucene has been ported to Perl (PLucene), Python (PyLucene), C (Lucy), and C++ (CLucene), so you can access a Lucene index with your favourite tool chain. The Java code also contains classes that facilitate the import of documents by using stream classes. Creating and maintaining an index is very easy, and the index format can be used across different platforms and accessed by various ports of Lucene without the need for conversion.

We'll focus on the C++ port of Lucene in order to index our documents.

Indexing with CLucene

CLucene introduces some concepts that should be understood before using the API. Every object that is to be indexed is called a document; in an ideal world this document consists of pure text to make the work of the indexer easier. CLucene analyses the content and uses different algorithms to extract useful words or tokens from the indexed document. The analysers are classes of their own. They contain the algorithms such as dividing by white spaces, using stop words, replacing special characters, and other methods.

Every document can be described by a list of fields with content. The fields can have arbitrary names. These names are used to access the content of the field, very similar to associative arrays or hashes.

Field      | Content
title      | My notes from the conference
author     | R. Pfeiffer
content    | Lots and lots of text, ...
timestamp  | 1207958005
type       | UTF-8 Unicode English text
...        |

This means that the document is a container for all kinds of information added by the fields. It is not necessary to put the whole content into the index, but it helps if your aim is to have a full text search. It is also useful to pack additional metadata into the document. The search covers all fields or a selection of them, so that you can tune your search later in order to reduce the number of hits.

The index as a whole resides in a directory and is completely managed by CLucene; there are no user-serviceable parts, all access is done through CLucene (or one of the other ports). Moving the index around is as easy as copying a file. You can also copy the index directory between different platforms and it should work without conversion.

The CLucene library works with Unicode. The strings are typed with wchar_t *, which means they consist of wide character literals (marked with an L such as L'x'). Therefore I suggest that you use Unicode when interfacing with CLucene.

Simple Indexing Strategy

How do we get from our list of interesting files to a properly-filled index? Simple, we need a little strategy.

  1. Walk through the list of files one by one
  2. Check if the file has been modified since the last indexing run
  3. Determine file type (by extension or more elaborate means)
  4. Convert file to plain text with a given character encoding
  5. Add file name, content, timestamps, file type, etc. to index

Now we know what our next piece of code should do. I will describe some important aspects of the task.

Converting Documents to Plain Text

I've already said that we need text; this means we have to convert PDF, PostScript, and anything that is not text into text. We should also ensure that the end result is suitably Unicode encoded (UTF-8 for example). In order to avoid doing this in our C++ code, we rely on external programs. This isn't very elegant, but it helps to maintain a flexible approach to future data formats that we wish to index. The external helper tools are defined by file extension in a configuration file.

// List of known file extensions and their converters to
// plain text
pdf=(pdftotext -q -eol unix -enc UTF-8 $IN - > $OUT)
ps=(pstotext $IN | iconv -f ISO-8859-1 -t UTF-8 -o $OUT -)
doc=(antiword $IN > $OUT)
html=(html2text -nobs -o $OUT $IN)
htm=(html2text -nobs -o $OUT $IN)
odp=(ooo_as_text $IN > $OUT)
ods=(ooo_as_text $IN > $OUT)
odt=(ooo_as_text $IN > $OUT)
php=(html2text -nobs -o $OUT $IN)
rtf=(unrtf --nopict --text $IN > $OUT)
txt=(cat $IN > $OUT)
xls=(py_xls2txt $IN > $OUT)
xml=(cat $IN > $OUT)

$IN is the file to be converted. $OUT is the output file, which will be a temporary file. The extension is to the left of the equal sign. The commands inside the parentheses will be executed by the indexer prior to feeding the content to the CLucene index. An alternative would be to use classes that understand file types, read them, and convert them; the Strigi project has code for this, which they call JStreams. JStreams provide a standardised interface for accessing the contents of different file types. The approach with external tools is a bit more generic. Note that all OpenOffice document formats can be converted with a single tool (ooo_as_text has not been released yet, but is part of the OOoPy project; kindly ask the author of OOoPy for a copy if it's not contained in the downloads).
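
To try a converter line by hand, one option is to let the shell do the substitution. The sketch below assumes that $IN and $OUT can be treated as ordinary environment variables; how the indexer itself expands them is not shown here, and converter_ext, converter_cmd and run_converter are hypothetical names.

```shell
#!/bin/sh
# converter_ext LINE: the part of a config line to the left of the equal sign.
converter_ext() {
    printf '%s\n' "${1%%=*}"
}

# converter_cmd LINE: the command template between the outer parentheses.
converter_cmd() {
    tmpl=${1#*"=("}                     # drop 'ext=('
    tmpl=${tmpl%")"}                    # drop the closing ')'
    printf '%s\n' "$tmpl"
}

# run_converter LINE INFILE OUTFILE: run the template through sh with IN and
# OUT set in its environment. The templates use $IN and $OUT unquoted, so
# paths containing whitespace would be split by the shell.
run_converter() {
    IN=$2 OUT=$3 sh -c "$(converter_cmd "$1")"
}

# Try the simplest converter on a throwaway file.
tmp=$(mktemp -d)
printf 'plain text\n' > "$tmp/in.txt"
run_converter 'txt=(cat $IN > $OUT)' "$tmp/in.txt" "$tmp/out.txt"
cat "$tmp/out.txt"
rm -rf "$tmp"
```

Running a single line like this is a quick way to find out whether a helper such as pdftotext or antiword is installed and produces usable text before a whole indexing run depends on it.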

The configuration file is parsed by a parser generated with Boost's Spirit library. The whole parser is defined by using templates. struct filter_grammar in helper.cc contains the full rules for the parser. Once you understand how the templates work, the Spirit library is a convenient way to build your own parsers.

Maintaining a Database of Timestamps

When you consider the documents stored on a file server, you will realise that most of them don't change very often. Maybe some change more often when someone works on them. Once you have a kind of "library", most of the documents will stay the same. So it's a good idea to keep track of the modification time of the files. By doing this we can decide to update the document depending on its last modification timestamp. This saves a lot of I/O when indexing a large collection of documents.

The indexer keeps track of the timestamps in a SQLite database. The table could also be replaced by a hash; I originally planned to do more in SQL, but never got around to it. indexer.cc has the creation statement.

CREATE TABLE IF NOT EXISTS fileaccess ( filename TEXT PRIMARY KEY, mtime INT(8) )

SQL is overkill for that, but that's why we use SQLite. :-) We use a transaction during the whole indexing process. The changes are not committed unless all documents are indexed without error.
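
The same idea can be tried from the shell without a database. Note that this swaps in a simpler technique than the indexer's per-file SQLite table: a single stamp file records when the last run happened, and find's -newer test selects everything modified after it. changed_since and the file names are assumptions for illustration.

```shell
#!/bin/sh
# changed_since DIR STAMP: list regular files under DIR modified after STAMP.
# Without a stamp file (first run) every file counts as changed.
changed_since() {
    if [ -e "$2" ]; then
        find "$1" -type f -newer "$2"
    else
        find "$1" -type f
    fi
}

# Demonstration with fixed modification times (touch -t CCYYMMDDhhmm).
work=$(mktemp -d)
mkdir "$work/docs"
touch -t 200801010000 "$work/docs/old.txt"
touch -t 200803010000 "$work/stamp"
touch -t 200804180000 "$work/docs/new.txt"
changed_since "$work/docs" "$work/stamp"      # lists only docs/new.txt
rm -rf "$work"
```

Touching the stamp file after a successful run records the new baseline. A per-file table can also notice files whose modification time has changed in either direction, which a single stamp file cannot.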

Creating and Writing to a CLucene Index

Before you can do anything with an index you have to open it. This is very similar to other resources such as files or sockets. However, before you can open or create it, you have to select an analyser object. This analyser defines how you wish to process the data fed to the index. Since we don't know exactly what we are indexing, we'll use a whitespace analyser. Then we open the index.

// This is the analyser we use.
WhitespaceAnalyzer analyser;

// Initialise CLucene index writer
lucene::index::IndexWriter index_repository( args_info.index_arg, &analyser, new_index, true );

args_info.index_arg is a string containing the directory where the index will live. &analyser is our analyser and new_index is a boolean flag that indicates whether we open an existing index or create a new one. The last argument is described with closeDir in CLucene's Doxygen documentation.

The index can now be filled with document objects. A CLucene document is simply a container for fields describing something.

// Fields we want to put into the index
lucene::document::Field *field_filename;
lucene::document::Field *field_file_content;
lucene::document::Field *field_mtime;
lucene::document::Field *field_type;

// Create Lucene document for adding to index directory.
file_document = new lucene::document::Document;
// Add fields to document object.
if ( file_has_content ) {
    file_document->add( *field_file_content );
}
file_document->add( *field_filename );
file_document->add( *field_mtime );
file_document->add( *field_type );
index_repository.addDocument( file_document, &analyser );

You can add as many fields as you like. Adding metadata is a good idea since you probably don't want to search for content all of the time. The addDocument() method adds the document to the index repository.

Since CLucene maintains the index repository, it's also a good idea to call the optimize() method if you change a lot of data.

// Optimising the index should be done after there were changes to the index.
index_repository.optimize();

// Close index.
index_repository.close();

That's the short introduction to CLucene, and these are the basics you need to get started. The library can do many more things.

Reading from a CLucene index

We won't read from the index (yet), but reading is as easy as writing. You open the index repository, send search queries and retrieve documents. A fragment of code performing a search and retrieving all hits looks like this.

using namespace lucene::index;
using namespace lucene::analysis;
using namespace lucene::util;
using namespace lucene::store;
using namespace lucene::document;
using namespace lucene::search;
using namespace lucene::queryParser;

wstring search_string = L"Where is it?";

lucene::index::IndexReader    *index_reader;
lucene::search::IndexSearcher *index_searcher;
Query                         *index_query;
Hits                          *index_hits;
WhitespaceAnalyzer            analyser;

index_reader   = IndexReader::open( args_info.index_arg );
index_searcher = new IndexSearcher(index_reader);
index_query    = QueryParser::parse( search_string.c_str(), L"content", &analyser );
index_hits     = index_searcher->search(index_query);
if ( index_hits->length() > 0 ) {
    for( long i=0; i < index_hits->length(); i++ ) {
        Document &doc = index_hits->doc(i);
        wcout << "FOUND: " << doc.get(L"filename") << endl;
    }
}

delete index_hits;
delete index_query;

delete index_searcher;

It is important to use the same analyser as the indexing process did. In our case this is the WhitespaceAnalyzer again. IndexReader::open() opens the index, QueryParser::parse() turns the search string into a Query object, and IndexSearcher::search() performs the search and returns a Hits object from which the matching Document objects can be retrieved. As you can see, all strings are wide strings, so using Unicode really is important.

If you need to debug the CLucene indices you have created, you might want to try Luke, the Lucene Index Toolbox. It is a Java tool that can display the contents of an index. You can browse the documents, look at the fields, and perform search queries.

The Code

Since this article is already much longer than anticipated, I'll just provide a link to the complete tar archive of all the code I've shown above. It also contains a Makefile to facilitate compiling. I used GCC/G++ 4.1.2 for development (and I'd like to see what Intel's compiler says about my code). If you use a Debian system you will need the following packages:

I compiled SQLite, libconfig++, and CLucene from source, because I wasn't happy with the packages in Debian Etch. The newer SQLite version is especially interesting, since it provides a new API to some functions (marked with ..._v2()). If you want to compile the Boost library as well, you have to add the path to its include files (which is /usr/local/include/boost-1_35 for the current version if installed from source). Since the Boost libraries consist mainly of templates, the compile process is fairly short despite the size of Boost's distribution.

Test Run with a little "Benchmark"

And now for the final question: Why all the fuss? How fast is it? Do we need to care which port we use? I can't answer these questions. All I can do is run the indexer over a list of files (this is not a benchmark, and has no statistical significance.) The directory looks like this:

rpfeiffer@miranda:/nfs/Bibliothek$ du -h --max-depth=1
1.3M    ./Lyrics
703M    ./Security
0       ./Biometrie
16M     ./Sysadmin
172K    ./Misc
403M    ./Teaching
12M     ./Programming
57M     ./Hardware
3.9M    ./Reports
7.3M    ./Networks
92K     ./Chaos
32K     ./Gfx
12K     ./UTF-8
23M     ./VoIP
1.8M    ./Science
1.2M    ./Manuals
1.2G    .

The documents reside on our file server in the office and are accessed via NFSv3 and Gigabit Ethernet. The machine running the indexer is a 2.13 GHz Core2 Duo with 2 GB RAM. filelist counts 539 interesting documents (by using the extensions I listed earlier). Let's try to run the indexer and create a new index. Note that the directory was cached in part due to my du command, and that the 539 documents may add up to less than the 1.2 GB because we only look for specific file extensions.

rpfeiffer@miranda:~/code$ time ./indexer -c ./indexer.cfg -i /var/tmp/i -n 1 -l ./filelist.txt 

real    1m48.767s
user    1m12.337s
sys     0m17.849s

The index looks like this:

rpfeiffer@miranda:/var/tmp/i$ ls -lh
total 5.2M
-rwxr-xr-x 1 rpfeiffer rpfeiffer    4 2008-04-18 23:56 deletable
-rwxr-xr-x 1 rpfeiffer rpfeiffer 5.2M 2008-04-18 23:56 _gk.cfs
-rwxr-xr-x 1 rpfeiffer rpfeiffer   28 2008-04-18 23:56 segments

So, the indexer did something and stored something to disk. An inspection with Luke shows familiar documents and content.

A Word about Alpha Code and Bugs

Please be aware that the code shown in this article is of alpha quality. The source contains some dead code and still needs some improvement (especially the execution of the external helper binaries). So far, it works, and it doesn't segfault that often anymore - but that's about it. It is built on solid and stable libraries, but it isn't meant to be in production status (yet). It's also a bit messy and should be cleaned up, because it took me a while to understand the libraries that I used. If you have suggestions, just send patches - that's what the GPL is for. If you have no code but a lot of good ideas, let's hear them! Preferably in an article for one of our next issues. ;-)

René was born in the year of Atari's founding and the release of the game Pong. From his early youth he took things apart to see how they work. He couldn't even pass construction sites without looking for electrical wires that might seem interesting. The interest in computing began when his grandfather bought him a 4-bit microcontroller with 256 bytes of RAM and a 4096-byte operating system, forcing him to learn assembler before any other language.

After finishing school he went to university in order to study physics. He then gained experience with a C64, a C128, two Amigas, DEC's Ultrix, OpenVMS and finally GNU/Linux on a PC in 1997. He has been using Linux ever since and still likes to take things apart and put them together again. Freedom of tinkering brought him close to the Free Software movement, where he puts some effort into the right to understand how things work. He is also involved with civil liberty groups focusing on digital rights.

Since 1999 he has been offering his skills as a freelancer. His main activities include system/network administration, scripting and consulting. In 2001 he started to give lectures on computer security at the Technikum Wien. Apart from staring into computer monitors, inspecting hardware and talking to network equipment he is fond of scuba diving, writing, or photographing with his digital camera. He would like to have a go at storytelling and roleplaying again as soon as he finds some more spare time on his backup devices.

Copyright © 2008, René Pfeiffer. Released under the Open Publication License unless otherwise noted in the body of the article. Linux Gazette is not produced, sponsored, or endorsed by its prior host, SSC, Inc.

Published in Issue 150 of Linux Gazette, May 2008

Joey's Notes: Guide to adding a new partition or drive to an existing system

By Joey Prestia

The basic steps involved in this process are:

  1. Determine what partitions need to be created and where.
  2. Create the partitions (I use fdisk here but any Linux disk partitioning tool should work)
  3. Re-read the partition table either with partprobe or by a reboot
  4. Make a filesystem on the partition, label it, and create the necessary mount points
  5. Add the appropriate entries to /etc/fstab so the partitions are mounted upon reboot
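
For steps 3 to 5, it can help to write the commands down before touching the disk. The following sketch only prints them and runs nothing; plan_partition is a hypothetical helper, and the device, label and mount point are the ones used later in this article. It passes the label to mkfs with -L, and it uses fsck pass 2 in the fstab entry, as fstab(5) recommends for non-root filesystems.

```shell
#!/bin/sh
# plan_partition DEVICE LABEL MOUNTPOINT: print, but do not run, the commands
# for steps 3 to 5. Review the output before typing anything as root.
plan_partition() {
    dev=$1
    label=$2
    mnt=$3
    printf 'partprobe\n'
    printf 'mkfs.ext3 -L %s %s\n' "$label" "$dev"
    printf 'mkdir %s\n' "$mnt"
    printf "echo 'LABEL=%s  %s  ext3  defaults  1 2' >> /etc/fstab\n" "$label" "$mnt"
    printf 'mount -a\n'
}

plan_partition /dev/sda5 /data /data
```

Step 2, the partitioning itself, is left out on purpose: fdisk is interactive, and its prompts deserve full attention.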


Imagine that we have a server running RHEL 4 and our supervisor comes over and wants a 10 GB partition created for the data processing department. This is in addition to what the server currently has allocated, so we can either create a partition out of unpartitioned space on the existing disk (experienced Linux system administrators will leave unpartitioned disk space for future expansion) if available, or we can add another drive. This scenario actually happens quite frequently in the production world, so this is a valuable skill to have even if you administer nothing more than your home machines.

We'll assume that your supervisor has given you the latitude of deciding which of the above options you'll use, so your first task is to check to see if space is available on your existing media. We'll run "fdisk -l" to see the size of the disk; the data we need is on the first line of output.

[root@station17 ~]# fdisk -l

Disk /dev/sda: 80.0 GB, 80000000000 bytes
255 heads, 63 sectors/track, 9726 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          38      305203+  83  Linux
/dev/sda2              39        7687    61440592+  83  Linux
/dev/sda3            7688        7942     2048287+  82  Linux swap

From this we can see the size of our drive is 80.0 GB. Now, we'll use "df -h" to calculate the size of the partitions that are on our system. We only need to be concerned with the rows that have a device label; the others (labeled with "none") don't concern us. The column labeled "Size" has the numbers we'll need to add up to get an overall size.

[root@station17 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2              58G  6.5G   49G  12%   /
/dev/sda1             289M   17M  258M   6%   /boot
none                 1013M     0 1013M   0%   /dev/shm

[root@station17 ~]# 

From this, we can see that /dev/sda2 is 58G and /dev/sda1 is 289M - a total of 58.3GB. Now we need to add in our swap size; "cat /proc/swaps" will tell us what size our swap partition is.

[ If you feel like using an actual system utility for this, "swapon -s" will do the same thing. -- Ben ]

[root@station17 ~]# cat /proc/swaps 
Filename                              Type            Size    Used    Priority
/dev/sda3                             partition       2048276    0       -1
[root@station17 ~]# 

Adding in the 2GB from this gives a rough estimate of 19.7 GB to work with. This overstates things somewhat, because "df" reports filesystem sizes in binary units while "fdisk" counts in decimal gigabytes; by fdisk's own numbers, the unused cylinders 7943-9726 hold about 14.7 GB. Either way, it's well over what we need. Now, let's move on to creating our partition: "fdisk /dev/sda" will open our drive's partition table for modification. Since we're already using 3 partitions on the drive, we'll have to make our 4th one an extended one - a container to house any additional partitions, including the one we are creating now. We'll want to accept the defaults on this extended partition, which will make the whole rest of the drive available for our new partitions. We'll be using an ext3 filesystem, so we also need to keep this in mind: the "mkfs" command reserves 5% of the blocks for root. Given all that, we'll make our new partition 11.5GB to compensate for the blocks reserved for root plus a little extra.
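
As a cross-check, the unpartitioned space can be computed from fdisk's own figures rather than from df. A minimal sketch, assuming the geometry shown above (8225280 bytes per cylinder, last used cylinder 7942, last cylinder 9726); cyl_gb is a hypothetical helper.

```shell
#!/bin/sh
# cyl_gb FIRST LAST [BYTES_PER_CYLINDER]: size in decimal GB of the cylinder
# range FIRST..LAST inclusive. The default of 8225280 bytes per cylinder is
# the "Units" value fdisk printed for this disk.
cyl_gb() {
    awk -v first="$1" -v last="$2" -v unit="${3:-8225280}" \
        'BEGIN { printf "%.2f\n", (last - first + 1) * unit / 1000000000 }'
}

cyl_gb 7943 9726     # cylinders after /dev/sda3 up to the end of the disk: 14.67
```

The result is in decimal gigabytes, the unit fdisk uses. df -h reports binary units and only sees the inside of each filesystem, which is why sums taken from the two tools do not line up exactly.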

[root@station17 ~]# fdisk /dev/sda

The number of cylinders for this disk is set to 9726.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
e
Selected partition 4
First cylinder (7943-9726, default 7943): 
Using default value 7943
Last cylinder or +size or +sizeM or +sizeK (7943-9726, default 9726): 
Using default value 9726

Here you can see where I selected "n" for a new partition and "e" to make an extended partition. I then accepted the defaults for both the starting cylinder and the ending cylinder.

Command (m for help): n
First cylinder (7943-9726, default 7943): 
Using default value 7943
Last cylinder or +size or +sizeM or +sizeK (7943-9726, default 9726): +11500M

Next, I hit "n" to create a new partition; then, when prompted for a starting cylinder, I hit 'enter' to accept the default. For the ending cylinder I entered "+11500M" to specify the size. The plus is important - without it, you will get an error. It's a good idea to hit "p" at this point to get "fdisk" to print the partition table. This will show what we have done before saving our changes.

Command (m for help): p

Disk /dev/sda: 80.0 GB, 80000000000 bytes
255 heads, 63 sectors/track, 9726 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          38      305203+  83  Linux
/dev/sda2              39        7687    61440592+  83  Linux
/dev/sda3            7688        7942     2048287+  82  Linux swap
/dev/sda4            7943        9726    14329980    5  Extended
/dev/sda5            7943        9341    11237436   83  Linux

If there are any mistakes, just quit "fdisk" with a "q" and no changes will be saved. This looks right - so let's write our changes with a "w".

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table.
The new table will be used at the next reboot.
Syncing disks.
[root@station17 ~]# 

This warning can be remedied by using the 'partprobe' command to force the kernel to reread the partition table. Remember - if this were a production machine, we wouldn't want to have to reboot it.

[root@station17 ~]# partprobe

At this point our 11.5G partition is /dev/sda5 and raw - it has neither a file system nor a label descriptor - so let's format it and give it a label. Giving the partition a label can be done at the same time that the file system is being created with the -L option, but I prefer to do it in a separate step.

[root@station17 ~]# mkfs.ext3 /dev/sda5
mke2fs 1.35 (28-Feb-2004)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
1406272 inodes, 2809359 blocks
140467 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=2877292544
86 block groups
32768 blocks per group, 32768 fragments per group
16352 inodes per group
Superblock backups stored on blocks: 
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208

Writing inode tables: done                            
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 34 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
[root@station17 ~]# 
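
The reserved-block figure in this output is easy to verify. The sketch below assumes the default 5% reserve and the 4096-byte block size reported above; reserved_blocks and blocks_mib are hypothetical helpers.

```shell
#!/bin/sh
# reserved_blocks TOTAL_BLOCKS [PERCENT]: blocks held back for the super user,
# rounded down to a whole block.
reserved_blocks() {
    awk -v total="$1" -v pct="${2:-5}" 'BEGIN { printf "%d\n", total * pct / 100 }'
}

# blocks_mib BLOCKS [BLOCK_SIZE]: convert a block count to MiB.
blocks_mib() {
    awk -v n="$1" -v bs="${2:-4096}" 'BEGIN { printf "%.1f\n", n * bs / 1048576 }'
}

reserved_blocks 2809359      # 140467, as reported by mke2fs
blocks_mib 140467            # 548.7 MiB that ordinary users cannot fill
```

If that is more than a pure data partition needs, the percentage can be changed later with "tune2fs -m".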

Now we'll give it a label ("/data"):

[root@station17 ~]# e2label /dev/sda5 /data
[root@station17 ~]# 

Next, we need to create a mount point in our filesystem and make sure that it's mounted at boot time. Let's create a directory on our system called /data.

[ The usual method of allocating new space is often much more complex than that - at least in the planning stages. In fact, creating a non-standard directory name in the root of the filesystem as suggested here is incorrect and violates the Filesystem Hierarchy Standard (FHS). As an example of a more typical situation, if an administrator finds that a shared machine's drive is running out of room, he may first examine the machine to see where the most activity/space consumption is occurring. Assuming that it's in the space assigned to users (i.e., "/home"), he would most likely back up the data in that subdirectory, restore it to the newly-created partition, delete "/home", and mount the new partition as "/home". This would recover all the space used by the original "/home" and leave it available for the rest of the system to use - and most users would not even realize that any change had been made. This approach doesn't require rebooting the machine either. -- Ben ]

[ I do understand that the partitioning is inconsistent with the FHS, but our RedHat course materials do instruct us to create directories in / for simplicity and to make backups easier. We are also taught to set up things such as specialized partitioning schemes this way here at the RedHat academy. -- Joey ]

[root@station17 ~]# mkdir /data
[root@station17 ~]# 

Now we put it in the file system table, '/etc/fstab', so it gets mounted on every boot.

[root@station17 ~]# vi /etc/fstab

# This file is edited by fstab-sync - see 'man fstab-sync' for details
LABEL=/                /                       ext3    defaults        1 1
LABEL=/data            /data                   ext3    defaults        1 1
LABEL=/boot            /boot                   ext3    defaults        1 2
none                   /dev/pts                devpts  gid=5,mode=620  0 0
none                   /dev/shm                tmpfs   defaults        0 0
none                   /proc                   proc    defaults        0 0
none                   /sys                    sysfs   defaults        0 0
LABEL=SWAP-sda3         swap                   swap    defaults        0 0
/dev/scd0   /media/cdrecorder   auto    pamconsole,exec,noauto,managed 0 0

I used the root partition as a guide in this sample. The label is in the first column, the mount point is in the second, then we have the file system type and the mount options. The last two numbers are the dump flag and the fsck pass number: the first tells 'dump' whether to back up the filesystem, and the second sets the order in which filesystems are checked at boot (0 disables the check). Copying the numbers and options from the root entry, as I have, works; note that fstab(5) recommends pass 1 for the root filesystem only and 2 for everything else, so "1 2" is the more conventional choice for /data. Write your changes and exit the editor. Then, to make sure that there were no errors, run "mount -a" to mount all the partitions listed in /etc/fstab. Any errors would be reported at this point.
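
Before running "mount -a", a quick sanity check of the edited file can catch a missing field. A minimal sketch, assuming the usual six whitespace-separated fields per entry; fstab_check is a hypothetical helper, and the pass-number hint follows the recommendation in fstab(5).

```shell
#!/bin/sh
# fstab_check: read fstab text on standard input. Lines that do not have six
# fields are reported and make the exit status 1; a non-root filesystem with
# fsck pass 1 only produces a hint.
fstab_check() {
    awk '
        /^[ \t]*(#|$)/ { next }             # skip comments and blank lines
        NF != 6 {
            printf "line %d: expected 6 fields, found %d\n", NR, NF
            bad = 1
            next
        }
        $2 != "/" && $6 == 1 {
            printf "line %d: %s uses fsck pass 1, normally reserved for /\n", NR, $2
        }
        END { exit bad + 0 }
    '
}

fstab_check <<'EOF'
# three of the entries from the listing above
LABEL=/                /                       ext3    defaults        1 1
LABEL=/data            /data                   ext3    defaults        1 1
LABEL=/boot            /boot                   ext3    defaults        1 2
EOF
```

To check the real file, feed it in with "fstab_check < /etc/fstab"; this only reads the file and needs no root privileges.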

[root@station17 ~]# mount -a

Since we didn't get any errors, let's do a "df -h" and see how everything looks.

[root@station17 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2              58G  6.6G   49G  12% /
/dev/sda1             289M   17M  258M   6% /boot
none                 1013M     0 1013M   0% /dev/shm
/dev/sda5              11G   59M   10G   1% /data

[root@station17 ~]# 

That's it - we are now ready to start using this new partition, keeping in mind we may have to modify permissions as needed for our users and groups. This is a very common task, one that all Linux users should become familiar with because you will almost certainly be faced with needing more room. This process is very similar to adding another disk - you would simply substitute your device labels as required.


Joey was born in Phoenix and started programming at the age of fourteen on a Timex Sinclair 1000. He was driven by hopes he might be able to do something with this early model computer. He soon became proficient in the BASIC and Assembly programming languages. Joey became a programmer in 1990 and added COBOL, Fortran, and Pascal to his repertoire of programming languages. Since then he has become obsessed with just about every aspect of computer science. He became enlightened and discovered RedHat Linux in 2002 when someone gave him RedHat version six. This started off a new passion centered around Linux. Currently Joey is completing his degree in Linux Networking and working on campus for the college's RedHat Academy in Arizona. He is also on the staff of the Linux Gazette as the Mirror Coordinator.

Copyright © 2008, Joey Prestia. Released under the Open Publication License unless otherwise noted in the body of the article. Linux Gazette is not produced, sponsored, or endorsed by its prior host, SSC, Inc.

Hands-on Linux Software RAID

By Amit Kumar Saha

What is Software RAID?

Software RAID is RAID implemented in software - no additional hardware such as a RAID controller is needed. Thus, software RAID is a good way to get some hands-on RAID experience. Also, software RAID is independent of proprietary management software - maintaining a software RAID works the same way on all machines that run Linux. However, there is a price to consider: performance. All RAID computations are done by the system CPU and every block has to be copied over the system's data bus (i.e. sda1 <-> IO controller <-> RAM, possibly CPU <-> IO controller <-> sdb1). (Thanks to René Pfeiffer of the Answer Gang for pointing that out.)

Enabling your Kernel to support RAID

I am using Ubuntu 7.10 with the "stock" kernel for my experiments. The test machine has an 80GB SATA HDD.

First, check whether the RAID support is enabled in your kernel:

cat /proc/mdstat

If you get a message saying:

cat: /proc/mdstat: No such file or directory

then you need to enable RAID support. There are two possibilities:

  1. RAID support was disabled while compiling the kernel and you will have to recompile it
  2. You will have to insert the multiple disk (md) support module manually. Check whether the "md*" modules exist under /lib/modules/$(uname -r)/kernel/drivers/md/ and insert the module as follows:
    $ sudo modprobe md-mod 
    (Thanks to Kapil of the Answer Gang for this one)

Now, you can verify whether RAID support is active:

amit@amit-desktop:~$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
unused devices: <none>

This means that now we have RAID support in the kernel.
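
The same check can be scripted. A minimal sketch, assuming the /proc/mdstat format shown above; has_personality is a hypothetical helper, and the sample file only stands in for the real /proc/mdstat so that the example runs anywhere.

```shell
#!/bin/sh
# has_personality LEVEL [FILE]: succeed if the "Personalities" line of FILE
# (default /proc/mdstat) lists LEVEL, e.g. raid0. The brackets are matched
# too, so raid1 does not match [raid10].
has_personality() {
    grep '^Personalities' "${2:-/proc/mdstat}" 2>/dev/null | grep -qF "[$1]"
}

# Stand-in for /proc/mdstat, for illustration only.
sample=$(mktemp)
printf '%s\n' 'Personalities : [linear] [raid0] [raid1]' 'unused devices: <none>' > "$sample"
if has_personality raid0 "$sample"; then
    echo "raid0 is available"
fi
rm -f "$sample"
```

Called with the level alone, the function reads the live /proc/mdstat and simply fails if the md module is not loaded.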

Tools for manipulating RAID arrays

Now that you have a RAID-enabled kernel, you will need some user-space tools to work with RAID.

The tools available to you are the slightly outdated 'raidtools' and the newer, better 'mdadm'. My focus in this article will be on 'mdadm'. For more information on using 'raidtools' and a comparison of the two, please refer to the How-To mentioned in the References.

Installing 'mdadm'

amit@amit-desktop:~$ sudo apt-get install mdadm
Reading package lists... Done
Building dependency tree  
Reading state information... Done
Recommended packages:
The following NEW packages will be installed:
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 220kB of archives.
After unpacking 627kB of additional disk space will be used.
Get:1 http://in.archive.ubuntu.com gutsy/main mdadm 2.6.2-1ubuntu2 [220kB]
Fetched 220kB in 48s (4515B/s)                                            
Preconfiguring packages ...
Selecting previously deselected package mdadm.
(Reading database ... 88932 files and directories currently installed.)
Unpacking mdadm (from .../mdadm_2.6.2-1ubuntu2_i386.deb) ...
Setting up mdadm (2.6.2-1ubuntu2) ...
Generating array device nodes... done.
Generating mdadm.conf... done.
Removing any system startup links for /etc/init.d/mdadm-raid ...
update-initramfs: deferring update (trigger activated)
* Starting MD monitoring service mdadm --monitor                        [ OK ]

Processing triggers for initramfs-tools ...
update-initramfs: Generating /boot/initrd.img-2.6.22-14-generic

Creating a RAID device

My disk setup now is as follows:

 Name      Flags       Part Type    FS Type                 [Label]     Size (MB)
sda1       Boot        Primary      NTFS                    []           20612.56      
sda5                   Logical      W95 FAT32                            20579.66
sda6                   Logical      W95 FAT32                            20587.88
sda7                   Logical      Linux ext3                           12000.69
sda8                   Logical      Linux swap / Solaris                  1019.94
sda9                   Logical      Linux                                 2048.10
sda10                  Logical      Linux                                 2048.10
sda11                  Logical      Linux                                 3446.40

I will now combine sda9 and sda10 into one large logical device, a RAID array. For the purpose of demonstration, and also since 0 is always a good point to start, we'll create a level-0 RAID.

[ Note the type of the partition. The Linux RAID kernel driver can automatically start a RAID device if the type of the partition is marked as 0xFD meaning "Linux RAID partition with autodetect using persistent superblock". -- René ]

Combining 2 consecutive partitions to form a RAID is not a smart thing to do, I was told by the Answer Gang. But until I find out why, I shall persist.

[ The purpose of having a RAID is to distribute the I/O load of any read/write operations over multiple disks. Hard disks are slow, and take a while to complete commands given to them. Depending on the I/O operation, a RAID will allow the system to let the disks in a RAID work in parallel. This is especially true when reading from a RAID0 or RAID1.
If you create a RAID device on the same physical device, the RAID driver doesn't notice. The problem you have then is that you put the poor drive under a lot of load, since the driver now thinks it can issue commands in parallel while in reality there is no parallelism. This means that the heads of the drive will probably move a lot - and this is a bad idea, as a friend of mine who does professional data recovery once explained to me.
So, it's OK to do this for educational purposes, but please, don't ever ever put live data on a production server into a RAID consisting of partitions on the same physical drive. -- René ]

Creating a Level-0 RAID

amit@amit-desktop:~$ sudo mdadm --create --verbose /dev/md0 --level=0 --raid-devices=2 /dev/sda9 /dev/sda10
[sudo] password for amit:
mdadm: chunk size defaults to 64K
mdadm: array /dev/md0 started.

Let us now check the RAID array we just created:

amit@amit-desktop:~$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid0 sda10[1] sda9[0]
     3999872 blocks 64k chunks
unused devices: <none>
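
To get a feeling for what the 64K chunk size means, here is a simplified model of RAID0 striping. It assumes equal-sized members and plain round-robin placement, and it is not the md driver's actual code; raid0_locate is a hypothetical helper. Member 0 corresponds to sda9 and member 1 to sda10 in the output above.

```shell
#!/bin/sh
# raid0_locate OFFSET_KIB [CHUNK_KIB] [MEMBERS]: for a position in the array,
# given in KiB from its start, print which member holds it and which chunk
# on that member it falls into.
raid0_locate() {
    awk -v off="$1" -v chunk="${2:-64}" -v n="${3:-2}" 'BEGIN {
        c = int(off / chunk)                # logical chunk number in the array
        printf "member=%d chunk_on_member=%d\n", c % n, int(c / n)
    }'
}

raid0_locate 0       # member=0 chunk_on_member=0
raid0_locate 64      # member=1 chunk_on_member=0
raid0_locate 128     # member=0 chunk_on_member=1
```

Consecutive chunks alternate between the members, so a large sequential read keeps both of them busy. With both members on one physical disk, as here, that means a lot of seeking between two regions of the same drive.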

Now, we'll create a filesystem on the new RAID device:

amit@amit-desktop:~$ sudo mkfs -t ext3 /dev/md0
[sudo] password for amit:
mke2fs 1.40.2 (12-Jul-2007)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
500960 inodes, 999968 blocks
49998 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=1027604480
31 block groups
32768 blocks per group, 32768 fragments per group
16160 inodes per group
Superblock backups stored on blocks:
       32768, 98304, 163840, 229376, 294912, 819200, 884736

Writing inode tables: done                           
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 33 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.

Mount the RAID device:

amit@amit-desktop:~$ sudo mkdir /media/RAID0
amit@amit-desktop:~$ mount /dev/md0 /media/RAID0/
mount: only root can do that
amit@amit-desktop:~$ sudo mount /dev/md0 /media/RAID0/

amit@amit-desktop:~$ df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/md0               3936940     73440   3663508   2% /media/RAID0

Let us now use 'mdadm' to get some details on the RAID array:

amit@amit-desktop:~$ sudo mdadm --query /dev/md0 --detail
/dev/md0: 3.81GiB raid0 2 devices, 0 spares. Use mdadm --detail for more detail.

amit@amit-desktop:~$ sudo mdadm --detail /dev/md0
       Version : 00.90.03
 Creation Time : Tue Mar 11 13:05:22 2008
    Raid Level : raid0
    Array Size : 3999872 (3.81 GiB 4.10 GB)
  Raid Devices : 2
 Total Devices : 2
Preferred Minor : 0
   Persistence : Superblock is persistent

   Update Time : Tue Mar 11 13:05:22 2008
         State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
 Spare Devices : 0

    Chunk Size : 64K

          UUID : f77bd177:706b589c:2a7af8c6:cbd32339 (local to host amit-desktop)
        Events : 0.1

   Number   Major   Minor   RaidDevice State
      0       8        9        0      active sync   /dev/sda9
      1       8       10        1      active sync   /dev/sda10
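
The Array Size line gives the same capacity in two units, which can be confusing at first: GiB counts in powers of 1024, GB in powers of 1000. A small sketch (the helper name size_units is made up) that converts the KiB figure both ways:

```shell
#!/bin/sh
# size_units: hypothetical helper. Takes a size in KiB and prints it
# in binary (GiB) and decimal (GB) units.
size_units() {
    awk -v kib="$1" 'BEGIN {
        bytes = kib * 1024
        printf "%.2f GiB %.2f GB\n", bytes / (1024 * 1024 * 1024), bytes / 1e9
    }'
}

size_units 3999872    # the "Array Size" reported by mdadm --detail
```

For 3999872 KiB, this gives 3.81 GiB and 4.10 GB, as in the mdadm output.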

Looking ahead

An experimental RAID test bed is now ready for us. In future articles, I shall try to share my experiments with this RAID setup. You may also consider visiting my blog posts on RAID.


  1. What is RAID?
  2. Software RAID How-To
  3. 'mdadm' manual page


Thanks to the Answer Gang (TAG) for the discussions on RAID a while back. Though I have not yet tried any of the cool suggestions, the next article shall have them tried, tested and appreciated. I also had the privilege of getting my article "live-edited" by the Answer Gang, which I believe was a limited period offer :-). Thanks, guys!

The author is a freelance technical writer. He mainly writes on the Linux kernel, Network Security and XML.

Copyright © 2008, Amit Kumar Saha. Released under the Open Publication License unless otherwise noted in the body of the article. Linux Gazette is not produced, sponsored, or endorsed by its prior host, SSC, Inc.

Published in Issue 150 of Linux Gazette, May 2008


By Shane Collinge

All HelpDex cartoons are at Shane's web site, www.shanecollinge.com.

Part computer programmer, part cartoonist, part Mars Bar. At night, he runs around in his brightly-coloured underwear fighting criminals. During the day... well, he just runs around in his brightly-coloured underwear. He eats when he's hungry and sleeps when he's sleepy.

Copyright © 2008, Shane Collinge. Released under the Open Publication License unless otherwise noted in the body of the article. Linux Gazette is not produced, sponsored, or endorsed by its prior host, SSC, Inc.


By Randall Munroe

More XKCD cartoons can be found here.


I'm just this guy, you know? I'm a CNU graduate with a degree in physics. Before starting xkcd, I worked on robots at NASA's Langley Research Center in Virginia. As of June 2007 I live in Massachusetts. In my spare time I climb things, open strange doors, and go to goth clubs dressed as a frat guy so I can stand around and look terribly uncomfortable. At frat parties I do the same thing, but the other way around.

Copyright © 2008, Randall Munroe. Released under the Open Publication License unless otherwise noted in the body of the article. Linux Gazette is not produced, sponsored, or endorsed by its prior host, SSC, Inc.
