From tlyu@MIT.EDU Wed Dec 18 02:41:56 1996
Received: from MIT.EDU (SOUTH-STATION-ANNEX.MIT.EDU [18.72.1.2]) by rt-11.MIT.EDU (8.7.5/8.7.3) with SMTP id CAA25750 for ; Wed, 18 Dec 1996 02:41:56 -0500
Received: from TESLA-COIL.MIT.EDU by MIT.EDU with SMTP id AA03822; Wed, 18 Dec 96 02:41:55 EST
Received: by tesla-coil.MIT.EDU (5.x/4.7) id AA03497; Wed, 18 Dec 1996 02:41:52 -0500
Message-Id: <9612180741.AA03497@tesla-coil.MIT.EDU>
Date: Wed, 18 Dec 1996 02:41:52 -0500
From: tlyu@MIT.EDU
Reply-To: tlyu@MIT.EDU
To: krb5-bugs@MIT.EDU
Subject: gssrpc coredumps in recvmsg() [post-1.0]
X-Send-Pr-Version: 3.99

>Number: 301
>Category: krb5-libs
>Synopsis: gssrpc coredumps in recvmsg()
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: bjaspan
>State: closed
>Class: sw-bug
>Submitter-Id: unknown
>Arrival-Date: Wed Dec 18 02:42:01 EST 1996
>Last-Modified: Sun Feb 22 21:21:51 EST 1998
>Originator: Tom Yu
>Organization: mit
>Release: post-1.0
>Environment:
System: SunOS tesla-coil 5.4 Generic_101945-37 sun4m sparc
>Description:
I found this on the mainline; don't panic... Basically the rpc unit tests fail due to a coredump deep in the rpc library.

Core was generated by `./server -u'.
Program terminated with signal 11, Segmentation fault.
[spewage deleted]
#0 0xef5aa710 in recvmsg ()
(gdb) bt
#0 0xef5aa710 in recvmsg ()
#1 0xef791ad0 in svcudp_recv (xprt=0x220f8, msg=0xefffe980) at ../../../src/lib/rpc/svc_udp.c:188
#2 0xef78d688 in svc_getreqset (readfds=0xefffea28) at ../../../src/lib/rpc/svc.c:430
#3 0xef7906e8 in svc_run () at ../../../src/lib/rpc/svc_run.c:69
#4 0x11244 in main (argc=0, argv=0xefffebac) at ../../../../src/lib/rpc/unit-test/server.c:120
(gdb) up
#1 0xef791ad0 in svcudp_recv (xprt=0x220f8, msg=0xefffe980) at ../../../src/lib/rpc/svc_udp.c:188
188 rlen = recvmsg(xprt->xp_sock, &dummy, MSG_PEEK);
(gdb) info locals
dummy = {msg_name = 0x221", msg_namelen = 16, msg_iov = 0x0, msg_iovlen = 0, msg_accrights = 0x0, msg_accrightslen = 0}
su = (struct svcudp_data *) 0x22150
xdrs = (XDR *) 0x22158
rlen = 0
reply = 0x0
replylen = 0

Disassembling a bit...

0xef5aa66c : save %sp, -184, %sp
0xef5aa670 : st %i2, [ %fp + 0x4c ]
0xef5aa674 : mov 0xc, %l3
0xef5aa678 : ld [ %i1 + 0xc ], %o0
0xef5aa67c : cmp %o0, 1
0xef5aa680 : ble,a 0xef5aa70c
0xef5aa684 : ld [ %i1 + 8 ], %i3
0xef5aa688 : clr %i4
0xef5aa708 : nop
0xef5aa70c : ld [ %i1 + 8 ], %l5
0xef5aa710 : ld [ %i3 + 4 ], %i3
0xef5aa714 : ld [ %l5 ], %l5
0xef5aa718 : call 0xef5bcda0 <.L670+67820>

[disclaimer... I'm not that clueful about sparc assembler...]

Here I'm assuming %i0 gets the first arg, %i1 gets the second, etc., thus %i1 + 0xc is &msg->msg_iovlen, and %o0 gets msg->msg_iovlen == 0. I'm guessing that there are one or two delay slots after the ble instruction, so that %i3 gets msg->msg_iov, which is NULL. It makes sense, then, that the fault would occur on the "ld [ %i3 + 4 ], %i3".
(gdb) p $i0
$19 = 4
(gdb) p (char*)$i1
$21 = 0xefffe3f8 ""
(gdb) p $i2
$22 = 2
(gdb) p $i3
$23 = 0
(gdb) up
#1 0xef791ad0 in svcudp_recv (xprt=0x220f8, msg=0xefffe980) at ../../../src/lib/rpc/svc_udp.c:188
188 rlen = recvmsg(xprt->xp_sock, &dummy, MSG_PEEK);
(gdb) p &dummy
$14 = (struct msghdr *) 0xefffe3f8
(gdb) p xprt->xp_sock
$24 = 4

Anyway, this points rather strongly to the conclusion that recvmsg() doesn't check whether msg->msg_iov is a NULL pointer or whether msg->msg_iovlen is zero.
>How-To-Repeat:
Try running make check in lib/rpc.
>Fix:
* Pester Sun about recvmsg() being broken. (yeah, right) It is not clear that they're at fault; the manpage doesn't say what recvmsg() does if it gets a NULL msg->msg_iov.
* Fix svcudp_recv() so that it actually passes an iovec.
>Audit-Trail:

Responsible-Changed-From-To: krb5-unassigned->bjaspan
Responsible-Changed-By: tlyu
Responsible-Changed-When: Wed Dec 18 02:45:24 1996
Responsible-Changed-Why:
I think Barry is the one who made the change that caused this breakage...

From: "Barry Jaspan"
To: tlyu@MIT.EDU
Cc: krb5-bugs@MIT.EDU
Subject: Re: krb5-libs/301: gssrpc coredumps in recvmsg()
Date: Wed, 18 Dec 1996 11:57:43 -0500

This feels a lot like a bug in recvmsg(). However, it is easy enough for the code to call recvmsg with a non-zero iovec and ignore the results (it is passed the MSG_PEEK flag anyway).

State-Changed-From-To: open-analyzed
State-Changed-By: tlyu
State-Changed-When: Tue Feb 11 11:45:42 1997
State-Changed-Why:
I have a partial fix; udp transport gssrpc still fails on Solaris, but no longer coredumps. I've patched svc_udp.c to use a non-NULL iovec to avoid the coredump, but there are still problems because Solaris is broken and returns INADDR_ANY on a connected udp socket that was originally bound to INADDR_ANY, unlike almost every other sane socket implementation. This results in the bad channel bindings error.
I don't know whether this is fixed in Solaris 2.5, but if it isn't, we may want to mark the udp test case as an expected failure. Incidentally, this is probably the second bug I've found in Solaris udp socket code.

changed files: svc_udp.c 1.13, 1.14 (ezra's patch to explicitly include uio.h)

State-Changed-From-To: analyzed-closed
State-Changed-By: tlyu
State-Changed-When: Sun Feb 22 21:21:14 1998
State-Changed-Why:
Don't run the UDP gssrpc tests under Solaris 2.[12345]*
>Unformatted:
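
For reference, the workaround discussed above (handing recvmsg() a real iovec even when only peeking) might look roughly like the sketch below. This is not the actual svc_udp.c 1.13 change, whose contents are not shown in this report; peek_datagram() and peek_demo() are hypothetical names invented for illustration, and the self-check uses a local AF_UNIX datagram socketpair rather than the UDP transport so that it runs without any network setup.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper (an assumption, not the real patch): peek at the
 * next datagram on `sock` without consuming it, always passing recvmsg()
 * a one-element iovec instead of msg_iov == NULL / msg_iovlen == 0.
 * Returns whatever recvmsg() returns; the peeked byte is discarded.
 */
ssize_t
peek_datagram(int sock, struct sockaddr *from, socklen_t *fromlen)
{
    struct msghdr msg;
    struct iovec iov;
    char scratch;               /* one-byte landing area; contents ignored */
    ssize_t rlen;

    iov.iov_base = &scratch;
    iov.iov_len = sizeof(scratch);

    memset(&msg, 0, sizeof(msg));
    msg.msg_name = from;
    msg.msg_namelen = (from != NULL && fromlen != NULL) ? *fromlen : 0;
    msg.msg_iov = &iov;         /* never NULL */
    msg.msg_iovlen = 1;         /* never zero */

    rlen = recvmsg(sock, &msg, MSG_PEEK);
    if (rlen >= 0 && fromlen != NULL)
        *fromlen = msg.msg_namelen;
    return rlen;
}

/*
 * Self-check on a local datagram socketpair: send one datagram, peek at
 * it, then confirm the whole datagram is still queued afterwards.
 * Returns 0 on success, -1 on any failure.
 */
int
peek_demo(void)
{
    static const char payload[] = "hello";
    char buf[16];
    int sv[2];
    int ok = -1;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0)
        return -1;
    if (send(sv[0], payload, sizeof(payload), 0) == (ssize_t)sizeof(payload)
        && peek_datagram(sv[1], NULL, NULL) >= 0
        && recv(sv[1], buf, sizeof(buf), 0) == (ssize_t)sizeof(payload)
        && memcmp(buf, payload, sizeof(payload)) == 0)
        ok = 0;
    close(sv[0]);
    close(sv[1]);
    return ok;
}
```

Because MSG_PEEK leaves the datagram queued, the one-byte scratch buffer costs nothing: the later real receive still sees the complete message. A caller that wants the sender's address would pass a sockaddr buffer and its length instead of NULL.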