[ale] Something I thought I'd never see

Jeff Lightner jlightner at water.com
Wed Oct 11 11:14:07 EDT 2006


Thanks.

FYI - ps -ef doesn't give the STAT column that would show the "D".

Running "ps -eo pid,user,args,stat |grep D$" did give me all the
processes I'd previously identified, so your post led me to what I
wanted for future issues.

Also lsof provides all the open file handles (including libraries and
sockets) but the /proc/<PID>/fd did show the relevant one for this
issue.
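
A minimal sketch of a slightly more tolerant filter than "grep D$",
which would miss states carrying trailing flags such as "D+" or "Ds".
The helper name filter_d_state and the column order (pid, user, stat,
args) are assumptions for illustration, not what was run above.

```shell
# Hypothetical helper: keep rows whose third field (STAT) starts with "D".
# Expects lines of "pid user stat args..." on standard input.
filter_d_state() {
    awk '$3 ~ /^D/'
}
```

On a procps-based system it could be fed with
"ps -e -o pid= -o user= -o stat= -o args= | filter_d_state"; the empty
"=" headers suppress the header line.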

-----Original Message-----
From: ale-bounces at ale.org [mailto:ale-bounces at ale.org] On Behalf Of
Danny Cox
Sent: Wednesday, October 11, 2006 9:57 AM
To: Atlanta Linux Enthusiasts
Subject: Re: [ale] Something I thought I'd never see

Jeff,

On Wed, 2006-10-11 at 09:31 -0400, Jeff Lightner wrote:
<snip>

> However I'm wondering how I might have figured this out if I hadn't
> been able to narrow down the day except by running ps -ef and looking
> for oddities such as the ones I found?   This prompted the question
> above.   I often see what appear to me to be abnormally high load
> averages (as compared to what I'd think reasonable on the UNIX boxes
> I've worked on) but they don't seem to actually impact performance
> overall.   

	With a "ps ef" you'll occasionally see processes in 'D' state.
Usually, you'll only be able to catch one or two in that state, and the
next time you run ps, they'll be 'R'unning or 'S'leeping.

	'D' is described as a "short sleep" (an uninterruptible sleep, in
ps(1) terms).  It's present during the time the kernel is running on
behalf of the process doing disk I/O.  That's usually much less than a
second.

	So, if you're continually seeing processes stuck in 'D' state, that's
probably filesystem corruption, or a disk slowly dying.

	You can do an ls -l on /proc/<pid>/fd to see what files it has open.
One of those will be the problem child.  You can then determine the
filesystem in question.
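
	As a rough sketch of that step, assuming a Linux /proc: the helper
below (fd_targets is a made-up name) resolves each symlink under
/proc/<pid>/fd and prints where it points.

```shell
# Hypothetical helper: print "fd -> target" for each descriptor a process
# has open, by resolving the symlinks under /proc/<pid>/fd (Linux only).
fd_targets() {
    pid=$1
    for link in /proc/"$pid"/fd/*; do
        # Skip the literal pattern if the glob matched nothing.
        [ -e "$link" ] || [ -L "$link" ] || continue
        printf '%s -> %s\n' "${link##*/}" "$(readlink "$link")"
    done
}
```

	Running "fd_targets <pid>" and then "df" on a suspicious path from
the output reports the filesystem that path lives on.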

	You might also try using strace -p <pid> to trace the process.  It may
give the system call it's currently trying to use.  If it does, the
first argument in a read or write is the fd.  Then use
/proc/<pid>/fd/<fd> to determine the filesystem in question.
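
	A small sketch of picking the fd out of a line of strace output,
assuming the line starts with the call name, as in "read(5, ".  The
helper name fd_from_strace_line and the list of calls it recognizes are
assumptions; lines prefixed with "[pid N]" (from strace -f) are not
handled.

```shell
# Hypothetical helper: pull the descriptor number out of one line of
# strace output for a read- or write-style call, e.g. "read(5, ".
# Prints nothing if the line is some other call.
fd_from_strace_line() {
    printf '%s\n' "$1" |
        sed -nE 's/^(read|write|pread64|pwrite64)\(([0-9]+),.*/\2/p'
}
```

	The number it prints can be appended to /proc/<pid>/fd/ and passed to
readlink to get the file, and df on that file names the filesystem.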

	Good luck!

-- 
Daniel S. Cox
Internet Commerce Corporation


_______________________________________________
Ale mailing list
Ale at ale.org
http://www.ale.org/mailman/listinfo/ale


