Subject: sparc-solaris9 nightly build test failures: kadm5 api
There are intermittent failures in the RPC tests, but these failures in
the kadm5 api test suite seem to dominate in the recent runs:

Running ./api.1/lock.exp ...
FAIL: 3: eof while waiting for permanent
FAIL: 9.1: eof while waiting for released
FAIL: 10: eof while waiting for released

Sometimes there are more, sometimes 9.1 doesn't fail, but 3 and 10 in
api.1/lock.exp seem to be failing fairly reliably recently on dcl. This
is in all four combinations of 32-bit gcc and 64-bit cc compilers, and
static and shared libraries.

This problem exists on the 1.3 branch as of last night, as well as on
the trunk.

After some experimenting and syscall tracing, I'm finding that output
produced right before the child process exits may not be read properly
by expect. For example, a failure in test 8 had this sequence of calls:

5123/1: 93.4088 write(7, "\n", 1) = 1
5313/1: 93.4088 read(0, "\n", 1024) = 1
5313/1: 93.4102 open("/var/krbsnap/autobuild-static/work-20030923.0842/krb5-current/src/kadmin/testing/krb5-test-root/kdb5.kadm5.lock", O_RDWR|O_CREAT|O_EXCL, 0600) = 4
5313/1: 93.4109 write(1, " r e l e a s e d\n", 9) = 9
5313/1: 93.4113 llseek(4, 0, SEEK_CUR) = 0
5313/1: 93.4115 close(4) = 0
5313/1: 93.4121 _exit(0)
...
5123/1: 93.4283 poll(0xFFBFBAD0, 1, 60000) = 1
5123/1: fd=7 ev=POLLRDNORM|POLLRDBAND rev=POLLERR
5123/1: 93.4288 read(7, 0x000EEC20, 4096) Err#22 EINVAL

Here fd 7 had previously been used successfully for reading and writing
while the child process was alive. But after the process exits, the
output that had been sent is not available.
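
As a rough illustration of the parent side of this trace, the following C sketch polls a descriptor and keeps reading until end of file or an error. The name drain_fd is hypothetical and this is not Expect's actual code. It cannot recover data the kernel has already discarded; it only shows the poll/read sequence in which the EINVAL above surfaced.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from Expect): read from fd until the
 * buffer is full, the poll times out, end of file is seen, or read()
 * fails.  Returns the number of bytes collected, or -1 if poll() fails.
 */
ssize_t drain_fd(int fd, char *buf, size_t buflen, int timeout_ms)
{
    size_t got = 0;

    while (got < buflen) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int ready = poll(&pfd, 1, timeout_ms);
        ssize_t n;

        if (ready < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (ready == 0)
            break;              /* timed out with nothing to read */

        /*
         * Try the read even when only POLLERR or POLLHUP is reported,
         * as the traced process did, so pending data is not skipped.
         */
        n = read(fd, buf + got, buflen - got);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            break;              /* in the trace, EINVAL showed up here */
        }
        if (n == 0)
            break;              /* end of file */
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

With a well-behaved descriptor, data written before the writer closes is still returned by this loop. In the failing trace the read fails immediately instead, which is why the "released" line never reaches expect.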

This behavior *could* be new to (running the tests on) Solaris 9.

Oh yes: If I put a sleep(1) call just before the exit call in
lock-test.c, all the tests succeed, supporting the notion that it's a
timing issue relating to closing the pty and having the parent read the
pending data. But adding the sleep call doesn't fix the real problem,
it just hides it.
To: rt@krbdev.mit.edu
Subject: Re: [krbdev.mit.edu #1792] sparc-solaris9 nightly build test failures: kadm5 api
From: Tom Yu <tlyu@mit.edu>
Date: Wed, 24 Sep 2003 13:56:34 -0400
RT-Send-Cc:
>>>>> "Ken" == Ken Raeburn via RT <rt-comment@krbdev.mit.edu> writes:

Ken> This behavior *could* be new to (running the tests on) Solaris 9.

I just updated my machine to Solaris 9, and am seeing the same
failures, using expect-5.38.0. I may try updating my expect
installation later, just to be sure. An expect script that just has a
tight loop spawning echo and expecting on its output reveals similar
behavior, so it's either an expect bug or a Solaris 9 kernel bug.

---Tom
To: rt-comment@krbdev.mit.edu
Subject: Re: [krbdev.mit.edu #1792] sparc-solaris9 nightly build test failures: kadm5 api
From: Ken Raeburn <raeburn@MIT.EDU>
Date: Thu, 25 Sep 2003 17:40:35 -0400
RT-Send-Cc:
"Tom Yu via RT" <rt-comment@krbdev.mit.edu> writes:
> I just updated my machine to Solaris 9, and am seeing the same
> failures, using expect-5.38.0. I may try updating my expect
> installation later, just to be sure. An expect script that just has a
> tight loop spawning echo and expecting on its output reveals similar
> behavior, so it's either an expect bug or a Solaris 9 kernel bug.

It appears a 5.39.0 release has been out for a couple of months now,
though the history file doesn't indicate that anything's been fixed on
Solaris. I'll give it a try anyway.

Could you send (or attach in RT) the test script? If we can't figure
this out here, we should report it to Don Libes and see if he knows
anything.

Ken
To: rt@krbdev.mit.edu
Subject: Re: [krbdev.mit.edu #1792] sparc-solaris9 nightly build test failures: kadm5 api
From: Tom Yu <tlyu@mit.edu>
Date: Fri, 26 Sep 2003 11:34:45 -0400
RT-Send-Cc:
>>>>> "Ken" == Ken Raeburn via RT <rt-comment@krbdev.mit.edu> writes:

Ken> Could you send (or attach in RT) the test script? If we can't figure
Ken> this out here, we should report it to Don Libes and see if he knows
Ken> anything.

Sun knows about this problem. It's Bug ID 4927647 in their database,
"pty loses last output before close/exit". My test script follows:

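# Spawn echo repeatedly until its output is lost before eof arrives.
for { set i 1 } { 1 } { incr i } {
    spawn echo foobarbaz
    expect {
        "foobarbaz" {
            # Output arrived intact; reap the child and try again.
            puts "ok"
            expect eof
            wait
        }
        eof {
            # eof arrived before the output: the bug has been hit.
            puts "$i passes"
            wait
            exit 1
        }
    }
}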
From: tlyu@mit.edu
Subject: CVS Commit
* api.1/lock.exp: Work around a race condition in the Solaris 9
pty implementation: output sent to a pty slave immediately before
last close/exit can get lost on the way to the master. This is
Sun bug #4927647. The workaround consists of changing the tests
to always make lock-test wait to read a character prior to
exiting, so any output prior to the "wait" directive will not get
lost.
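
The handshake described in this commit can be sketched in C. This is an illustration only: it uses plain pipes instead of a pty so that it stays portable, and run_with_handshake is a hypothetical name, not code from lock-test.c or lock.exp. The point is the ordering: the child does not exit until the parent has read its output and sent one character back.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from lock-test.c): the child writes msg,
 * then waits for one byte from the parent before exiting.  The parent
 * reads the output first and only then releases the child.
 * Returns the number of bytes read into buf, or -1 on error.
 */
ssize_t run_with_handshake(const char *msg, char *buf, size_t buflen)
{
    int to_child[2], from_child[2];
    size_t want = strlen(msg), got = 0;
    int status;
    ssize_t w;
    pid_t pid;

    if (want >= buflen)
        return -1;
    if (pipe(to_child) < 0)
        return -1;
    if (pipe(from_child) < 0) {
        close(to_child[0]);
        close(to_child[1]);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(to_child[0]);
        close(to_child[1]);
        close(from_child[0]);
        close(from_child[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: emit the output, then block until told to exit. */
        char go;

        close(to_child[1]);
        close(from_child[0]);
        if (write(from_child[1], msg, want) != (ssize_t)want)
            _exit(1);
        if (read(to_child[0], &go, 1) != 1)
            _exit(1);
        _exit(0);
    }

    /* Parent: collect the child's output before letting it exit. */
    close(to_child[0]);
    close(from_child[1]);
    while (got < want) {
        ssize_t n = read(from_child[0], buf + got, want - got);

        if (n <= 0)
            break;
        got += (size_t)n;
    }
    buf[got] = '\0';

    /* The output is in hand, so the child may now exit. */
    w = write(to_child[1], "\n", 1);
    close(to_child[1]);
    close(from_child[0]);

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (w != 1 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return (ssize_t)got;
}
```

Unlike the sleep(1) experiment, this does not depend on timing: the child cannot reach its exit call until the parent has consumed the output.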


To generate a diff of this commit:

cvs diff -r1.57 -r1.58 krb5/src/lib/kadm5/unit-test/ChangeLog
cvs diff -r1.10 -r1.11 krb5/src/lib/kadm5/unit-test/api.1/lock.exp
From: tlyu@mit.edu
Subject: CVS Commit
pullup from trunk


To generate a diff of this commit:

cvs diff -r1.55.2.2 -r1.55.2.3 krb5/src/lib/kadm5/unit-test/ChangeLog
cvs diff -r1.10 -r1.10.2.1 krb5/src/lib/kadm5/unit-test/api.1/lock.exp