To: krb5-bugs@MIT.EDU
Subject: kshd hanging tests
From: Ken Raeburn <raeburn@MIT.EDU>
Date: Mon, 23 Jun 2003 22:56:13 -0400
In a few recent nightly test runs on my Linux box (recently updated
to Athena 9.2, based on Red Hat 9), kshd has been hanging. The expect
process sits there waiting for it to die, so the test suite
(tests/dejagnu) is still running a few days later.

(1) kshd should probably be dying; if I catch it in this state again
I'll investigate further what it's doing.

(2) The test suite shouldn't wait forever for it to die; if it doesn't
die fairly quickly, kill it with -9, maybe wait a few seconds to
clean up the zombie process, and move on whether or not it died.
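
The policy in (2) can be sketched in C as follows. This is only an illustration of "wait briefly, kill with -9, wait briefly again, move on"; the real test suite drives kshd from expect, and the names wait_with_timeout and reap_or_kill, the 100 ms polling interval, and the return codes are all assumptions made for this example.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: poll for the child's exit for up to `seconds`.
 * Returns 1 if the child was reaped, 0 if it is still running,
 * -1 on a waitpid error other than EINTR. */
static int wait_with_timeout(pid_t pid, int seconds, int *status)
{
    struct timespec tick = { 0, 100 * 1000 * 1000 };   /* 100 ms */
    int i;

    for (i = 0; ; i++) {
        pid_t r = waitpid(pid, status, WNOHANG);
        if (r == pid)
            return 1;
        if (r < 0 && errno != EINTR)
            return -1;
        if (i >= seconds * 10)
            return 0;
        nanosleep(&tick, NULL);
    }
}

/* Hypothetical helper: give the child `grace` seconds to exit by itself,
 * then send SIGKILL and allow `cleanup` seconds to collect the zombie.
 * Returns 1 if it exited on its own, 2 if it had to be killed,
 * 0 if it still could not be reaped (the caller moves on anyway),
 * -1 on error. */
static int reap_or_kill(pid_t pid, int grace, int cleanup)
{
    int status;
    int r = wait_with_timeout(pid, grace, &status);

    if (r != 0)
        return r;
    kill(pid, SIGKILL);
    if (wait_with_timeout(pid, cleanup, &status) == 1)
        return 2;
    return 0;
}
```

A caller would use something like reap_or_kill(pid, 5, 3) and continue with the next test regardless of the result, perhaps recording a failure when the result is not 1.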

Both the krb5-current and krb5-1.3 tests were having this problem.

I don't know if it relates at all to the SIGCHLD/SIG_IGN/wait problem.

Ken
From: raeburn@mit.edu
Subject: CVS Commit
A typical stack trace:

#0 0xffffe002 in ?? ()
#1 0x420da75f in syslog () from /lib/tls/libc.so.6
#2 0x0804ad06 in cleanup (signumber=15) at krshd.c:567
#3 <signal handler called>
#4 0xffffe000 in ?? ()
#5 0x4202774e in sigaction () from /lib/tls/libc.so.6
#6 0x0804ac82 in cleanup (signumber=1) at krshd.c:548
#7 <signal handler called>
#8 0xffffe002 in ?? ()
#9 0x4202774e in sigaction () from /lib/tls/libc.so.6
#10 0x420daa21 in vsyslog () from /lib/tls/libc.so.6
#11 0x420da75f in syslog () from /lib/tls/libc.so.6
#12 0x0804b670 in doit (f=3, fromp=0xbfffda50) at krshd.c:1313
#13 0x0804ab87 in main (argc=11, argv=0xbfffdb34) at krshd.c:459
#14 0x420156a4 in __libc_start_main () from /lib/tls/libc.so.6

Yes, we're calling syslog from inside a signal handler. Yes, this is
bad. And from some poking about that I did earlier, it appears that
there's some locking code in vsyslog which may be deadlocking in the
nested call. And this usually seems to happen when logging the "shell
process completed" message.

This is a quick patch to switch off the signal handlers before logging
that message. I suspect the breakage happens earlier, though, so this
might not fix the bug, just maybe move it around a little.

* krshd.c (ignore_signals): Split out from cleanup().
(doit): Call it when the shell process has completed, before calling syslog.


To generate a diff of this commit:

cvs diff -r5.379 -r5.380 krb5/src/appl/bsd/ChangeLog
cvs diff -r5.100 -r5.101 krb5/src/appl/bsd/krshd.c
The patch appears to work -- the nightly tests of the branch are
consistently running to completion now. I think we can include this for
the 1.3 branch -- and if we do, those nightly tests may actually work,
too.
From: tlyu@mit.edu
Subject: CVS Commit
pullup from trunk


To generate a diff of this commit:

cvs diff -r5.375.2.3 -r5.375.2.4 krb5/src/appl/bsd/ChangeLog
cvs diff -r5.98.2.2 -r5.98.2.3 krb5/src/appl/bsd/krshd.c