From Kevin Monceaux on 27 May 1998
Dear Answer Guy,
I really enjoy "The Answer Guy" column, and I hope you can help me with this one. I'm running Linux 2.0.29. I've been using this version for quite a while now, and up until now everything has been fine. A couple of days ago a problem developed. What appears to be happening is that when programs are run they are not deallocating the memory they used. Upon first booting the system there are already almost 9 megs of RAM in use. I've run free to check the memory usage, run another command, such as ls, then run free again, and the free memory has decreased. I've noticed that if I run the same command, such as ls, again, the memory usage stays the same. It's only when commands that haven't been executed before are run that the amount of free memory decreases. It doesn't take long before I'm out of memory and have to reboot. Any suggestions you could give me with this problem would be greatly appreciated.
Thanks in advance,
If you suspect a memory leak, I highly recommend logging the output of 'free' or 'vmstat' before and after running a few commands, so that you have several snapshots to compare.
You can make a cron job to mail you a snapshot of this every hour or so. You might want to append the output of a ps command to each of these e-mail snapshots.
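As a rough illustration of such a snapshot job, here is a minimal Python sketch. The command list (including the 'ps aux' options), the function names, and the timeout are assumptions made for this example, not anything prescribed above. It relies on the usual cron behaviour of mailing a job's output to its owner, which depends on local mail being set up.

```python
import subprocess
import time

# Commands captured in each snapshot; the exact ps options are an assumption.
COMMANDS = [["free"], ["vmstat"], ["ps", "aux"]]


def run_command(argv):
    """Run one command and return its output, or a note if it cannot be run."""
    try:
        result = subprocess.run(argv, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.SubprocessError) as exc:
        return f"(could not run {' '.join(argv)}: {exc})\n"
    return result.stdout


def take_snapshot(commands=COMMANDS, run=run_command, now=time.ctime):
    """Build one timestamped text snapshot from the given commands."""
    parts = [f"=== snapshot taken {now()} ===\n"]
    for argv in commands:
        parts.append(f"--- {' '.join(argv)} ---\n")
        parts.append(run(argv))
    return "".join(parts)


if __name__ == "__main__":
    # cron normally mails whatever the job prints, so printing is enough here.
    print(take_snapshot())
```

Saved under a name of your choosing and run from an hourly crontab entry, this should produce one mail per hour with the three outputs stacked under a timestamp, which makes successive snapshots easy to compare side by side.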
Unfortunately the output of these commands isn't as easy to interpret as it should be. Linux normally uses most of the otherwise idle memory for file cache buffers, so the "free" figure shrinks as files and programs are read from disk even when nothing is leaking. In addition, large portions of the shared libraries and of the memory allocated to forked processes are shared (the memory manager uses "copy-on-write" and other techniques to minimize the use of physical memory). This makes it difficult to correlate the reported numbers with actual memory usage.
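To make the cache effect concrete, the following Python sketch separates memory held by buffers and cache from memory that is otherwise in use. It assumes the "Name: value kB" lines found in /proc/meminfo and silently skips any other lines; the function names and the sample figures are made up for illustration.

```python
def parse_meminfo(text):
    """Parse 'Name:   12345 kB' lines into a dict of kilobyte counts."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        # Keep only "Name: <number> kB" lines; skip anything else.
        if sep and len(fields) == 2 and fields[0].isdigit() and fields[1] == "kB":
            values[name.strip()] = int(fields[0])
    return values


def memory_breakdown(info):
    """Split total memory into cache, truly free, and everything else."""
    cache = info["Buffers"] + info["Cached"]
    in_use = info["MemTotal"] - info["MemFree"] - cache
    return {"in_use_kb": in_use, "cache_kb": cache, "free_kb": info["MemFree"]}


# Made-up sample figures for a machine with about 32 MB, for illustration only.
SAMPLE = """\
MemTotal:     31328 kB
MemFree:       1204 kB
MemShared:    14000 kB
Buffers:       9800 kB
Cached:       11500 kB
"""

if __name__ == "__main__":
    print(memory_breakdown(parse_meminfo(SAMPLE)))
```

In the sample, only about 1.2 MB is reported as free, yet roughly 21 MB of the "used" memory is cache that the kernel can give back when programs need it. A shrinking free figure alone is therefore not evidence of a leak.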
You can also use 'top' (a curses-based process viewer). It shows the current state of the system and can sort by memory (M) or CPU utilization (P). You want to isolate the specific process or processes causing the problem. Don't leave 'top' running unattended, however, since it is a bit of a resource hog in its own right.
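If you would rather compare logged 'ps' snapshots than watch 'top', a sketch along these lines can rank processes by how much their resident memory grew between two snapshots. It assumes three-column output such as that of 'ps -eo pid,rss,comm' (PID, resident size in kilobytes, command name); the function names and sample data are hypothetical.

```python
def parse_ps(text):
    """Parse 'PID RSS COMMAND' lines into {pid: (rss_kb, command)}."""
    table = {}
    for line in text.splitlines():
        fields = line.split(None, 2)
        # Skip the header row and anything that is not 'number number name'.
        if len(fields) == 3 and fields[0].isdigit() and fields[1].isdigit():
            table[int(fields[0])] = (int(fields[1]), fields[2].strip())
    return table


def rss_growth(before, after):
    """Return (growth_kb, pid, command) for PIDs in both snapshots, largest first."""
    growth = []
    for pid, (rss_after, command) in after.items():
        if pid in before:
            growth.append((rss_after - before[pid][0], pid, command))
    return sorted(growth, reverse=True)


# Made-up snapshots, for illustration only.
BEFORE = """\
  PID   RSS COMMAND
    1   300 init
   87  1200 httpd
  102   800 bash
"""
AFTER = """\
  PID   RSS COMMAND
    1   300 init
   87  5200 httpd
  102   810 bash
  140   400 ls
"""

if __name__ == "__main__":
    for grown, pid, command in rss_growth(parse_ps(BEFORE), parse_ps(AFTER)):
        print(pid, command, grown)
```

Resident size counts shared pages too, so treat the ranking as a hint about where to look rather than a precise measurement. A process whose figure keeps climbing across many snapshots is the one worth investigating first.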
If you do isolate this to a particular program, you'll want to see if there are updates available for it or for any of the libraries it uses. You may also want to consider getting a newer kernel, such as 2.0.33 or (if it's ready by the time you read this) 2.0.34.
Sorry I can't be more specific, but you'll have to narrow down the problem a bit before we can do more. Incidentally, you can start up in single user mode and manually start all of the daemons and processes that you normally run in your multi-user (initdefault) mode. Do this slowly, one command or daemon at a time, to see when the problem first appears. If it happens right away, then boot with the -b option to prevent the execution of any of your boot up scripts, and manually load any kernel modules you're using one at a time.