Date: Tue, 23 Dec 2008 10:21:37 -0500
From: "Jorgen Wahlsten" <jorgen@wahlsten.com>
To: krb5-bugs@MIT.EDU
Subject: Alignment problem in resolver test
Hello!
In compiling krb5, version 1.6.3, I ran into a core dump (bus error) during 'gmake check' on a Solaris 2.10/Sun Fire T200 build system, when it tried to run tests/resolve/resolve.

It seems there is an alignment problem with the automatic 'addrcopy' variable in src/tests/resolve/resolve.c. With the change below, the variable is forced to be aligned and the test runs successfully. I figured this was an easier test than changing the code to use an in_addr struct -- which is perhaps a more "correct" fix.

Thanks
/Jorgen Wahlsten


============================== cut here ==============================
--- resolve.c~  2003-07-22 14:02:34.000000000 -0400
+++ resolve.c   2008-12-23 10:09:31.475658000 -0500
@@ -77,7 +77,11 @@
 {
        char myname[MAXHOSTNAMELEN+1];
        char *ptr;
-       char addrcopy[4];
+       /* char addrcopy[4]; */
+       union {
+           char addrcopy[4];
+           int make_me_aligned;
+       } u;
        struct hostent *host;
        int quiet = 0;
 
@@ -123,10 +127,10 @@
            printf("Host address: %d.%d.%d.%d\n",
                   UC(ptr[0]), UC(ptr[1]), UC(ptr[2]), UC(ptr[3]));
 
-       memcpy(addrcopy, ptr, 4);
+       memcpy(u.addrcopy, ptr, 4);
 
        /* Convert back to full name */
-       if((host = gethostbyaddr(addrcopy, 4, AF_INET)) == NULL) {
+       if((host = gethostbyaddr(u.addrcopy, 4, AF_INET)) == NULL) {
                fprintf(stderr, "Error looking up IP address - fatal\n");
                exit(2);
        }
============================== cut here ==============================

Hi,

Why do you believe you need to use a struct in_addr? Every man page for
gethostbyaddr takes a char * - and the alignment is then I suppose
compiler specific... You did not indicate a compiler version - is this
gcc or Sun's compiler?

I do not have access to solaris 2.10 - but what does the manpage on
gethostbyaddr say? What does the prototype in /usr/include/netdb.h indicate?

I suspect that the problem is not in our code - but something in the OS
library. Is this a known issue with Solaris?

Now scanning the krb5 code - I see one other place in which gethostbyaddr
is used without a struct (gssapi code) - but it would appear that it is
using malloc - which should be suitably aligned for any variable type...

So - I guess we need to know what is broken - the compiler, the library,
or our code... Coding a work around without understanding the problem
is probably not best. Perhaps you can give us more information...

Ezra

Okay - I have been able to reproduce the problem on one of MIT's Solaris
machines - but only when using gcc....

This is gcc version 3.4.3 on a machine running Solaris 10 3/05
s10_74L2a SPARC.

I modified the code to allow me to vary the alignment of the memory.
Sure enough, gethostbyaddr segfaults in the nss library.

Now - why is this only w/ gcc and not sunsoft's compiler - I could not
tell you.

Looking at the opensolaris library source for libnss - I do see that
there is a path where the pointer to the char * is assigned a struct
in_addr *.

See IN6_INADDR_TO_V4MAPPED macro in /usr/include/netinet/in.h.
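
A minimal sketch of the hazard described above, not the libnss code itself: once a `char *` is reinterpreted as a `struct in_addr *`, a strict-alignment CPU such as SPARC can fault if the address is not suitably aligned. The helper names `aligned_for_in_addr` and `load_u32` are hypothetical; they show how one might test a pointer before such a cast, and how a byte-wise copy avoids the problem entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <netinet/in.h>

/* Hypothetical helper: nonzero if p is aligned well enough to be
   dereferenced as a struct in_addr on this platform. */
static int aligned_for_in_addr(const void *p)
{
    return ((uintptr_t)p % _Alignof(struct in_addr)) == 0;
}

/* Hypothetical helper: read a 32-bit value from bytes at any address.
   memcpy copies byte by byte, so no aligned load is required. */
static uint32_t load_u32(const void *bytes)
{
    uint32_t v;

    assert(bytes != NULL);
    memcpy(&v, bytes, sizeof(v));
    return v;
}
```

A caller that only has a `char *` of unknown alignment could use `load_u32` (or copy into a local `struct in_addr`) instead of casting the pointer.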

The error does not appear to be an optimization issue...

Now - why is gcc screwing up? I will need to investigate... I will have
to look at the assembly code to see what is going on....

So - can you please confirm that your problem is observed w/ gcc?
Ezra
Date: Tue, 30 Dec 2008 11:42:37 -0500
From: "Jorgen Wahlsten" <jorgen@wahlsten.com>
To: rt-comment@krbdev.mit.edu, rt@krbdev.mit.edu
Subject: Re: [krbdev.mit.edu #6308] Alignment problem in resolver test
RT-Send-Cc:
Hello Ezra.

Yes, I used a 32-bit compiled gcc 4.1.1.

I do not have Sun's C compiler and can not verify that it is working with it, but if it's important for you to verify that it is working with Sun's C compiler, I can try to get it. Let me know.

If I turn on optimization (-O2 or -O3) the program works, while using -O0/-O1 the program still core dumps.

Compiling resolve.c as a 32-bit binary or a 64-bit binary does not seem to matter. It core dumps no matter what.

I was thinking this had something to do with automatic variable alignment, and therefore changed:

char myname[MAXHOSTNAMELEN+1]

to

char myname[MAXHOSTNAMELEN+8]

(MAXHOSTNAMELEN is defined as 256 in netdb.h),

which makes the program work again (not core dump).

For what it's worth, I think Sun's C compiler *always* aligns the automatic variables on 8-byte boundaries, while GCC tries to fit the addrcopy[4] into the "padding" of myname[256+1] (7 bytes padding?), as an optimization. It does seem strange that this is done *without* optimization though.
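
One way to check this guess on a given compiler is to print where the two arrays actually land. This is a small sketch under assumptions: `report_layout` and `offset_mod` are hypothetical names, the array sizes mirror the ones discussed here (MAXHOSTNAMELEN taken as 256), and the result depends entirely on the compiler, flags, and target, so no particular output is implied.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: address of p modulo n. */
static unsigned offset_mod(const void *p, unsigned n)
{
    assert(n != 0);
    return (unsigned)((uintptr_t)p % n);
}

/* Print the addresses of two automatic char arrays sized like the
   ones in resolve.c, together with their offset from a 4-byte boundary. */
static void report_layout(void)
{
    char myname[256 + 1];   /* MAXHOSTNAMELEN+1, assuming 256 */
    char addrcopy[4];

    myname[0] = '\0';
    addrcopy[0] = '\0';
    printf("myname   at %p (mod 4 = %u)\n", (void *)myname,
           offset_mod(myname, 4));
    printf("addrcopy at %p (mod 4 = %u)\n", (void *)addrcopy,
           offset_mod(addrcopy, 4));
}
```

A nonzero "mod 4" value for addrcopy would mean the array is not on a 4-byte boundary in that particular build.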

Regardless of the differences between Sun's C compiler and GCC, I would be inclined to agree with you that this is a Solaris C library bug that is being triggered by using GCC in this particular case.

As for my comment about using an in_addr pointer, I think I merely looked at the EXAMPLE section in the solaris 2.10 gethostbyaddr man page. There is also a NOTE in that man page about INET_ADDR.

Let me know if you want any additional information.

Thanks,
Jorgen




On Sat, Dec 27, 2008 at 3:17 PM, Ezra Peisach via RT <rt-comment@krbdev.mit.edu> wrote:
Okay - I have been able to reproduce the problem on one MIT's solaris
machines - but only when using gcc....

This is gcc version 3.4.3 on a machine running  Solaris 10 3/05
s10_74L2a SPARC.

I modified the code to allow me to vary the alignment of the memory.
Sure enough, gethostbyaddr segfaults in the nss library.

Now - why is this only w/ gcc and not sunsoft's compiler - I could not
tell you.

Looking at the opensolaris library source for libnss - I do see that
there is a path where the pointer to the char * is assigned a struct
in_addr *.

See IN6_INADDR_TO_V4MAPPED macro in /usr/include/netinet/in.h.

The error does not appear to be an optimization issue...

Now - why is gcc screwing up? I will need to investigate... I will have
to look at the assembly code to see what is going on....

So - can you please confirm that your problem is observed w/ gcc?
Ezra



--
_______________________________________________________________________
Jorgen Wahlsten -- http://www.wahlsten.com/
AIM: JorgenWahlsten
YIM: jorgenwahlsten
ICQ: 171198501
Date: Tue, 30 Dec 2008 12:01:59 -0500
From: Ezra Peisach <epeisach@MIT.EDU>
To: rt-comment@krbdev.mit.edu
Subject: Re: [krbdev.mit.edu #6308] Alignment problem in resolver test
RT-Send-Cc:
Great - thanks for the insight... I will try to code around it next week...

Ezra
Jorgen Wahlsten via RT wrote:
> Hello Ezra.
>
> Yes, I used a 32-bit compiled gcc 4.1.1.
>
> I do not have Sun's C compiler and can not verify that it is working with
> it, but if it's important for you to verify that it is working with Sun's C
> compiler, I can try to get it. Let me know.
>
> If I turn on optimization (-O2 or -O3) the program works, while using
> -O0/-O1 the program still core dumps.
>
> Compiling resolve.c as a 32-bit binary or a 64-bit binary does not seem to
> matter. It core dumps no matter what.
>
> I was thinking this had something to do with automatic variable alignment,
> and therefore changed:
>
> char myname[MAXHOSTNAMELEN+1]
>
> to
>
> char myname[MAXHOSTNAMELEN+8]
>
> (MAXHOSTNAMELEN is defined as 256 in netdb.h),
>
> which makes the program work again (not core dump).
>
> For what it's worth, I think Sun's C compiler *always* aligns the automatic
> variables on 8-byte boundaries, while GCC tries to fit the addrcopy[4] into
> the "padding" of myname[256+1] (7 bytes padding?), as an optimization. It
> does seem strange that this is done *without* optimization though.
>
> Regardless, of the differences between Sun's C compiler and GCC, I would be
> inclined to agree with you that this is a solaris C library bug, that is
> being triggered by using GCC in this particular case.
>
> As for my comment about using an in_addr pointer, I think I merely looked at
> the EXAMPLE section in the solaris 2.10 gethostbyaddr man page. There is
> also a NOTE in that man page about INET_ADDR.
>
> Let me know if you want to any additional information.
>
> Thanks,
> Jorgen
>
>
>
>
> On Sat, Dec 27, 2008 at 3:17 PM, Ezra Peisach via RT <
> rt-comment@krbdev.mit.edu> wrote:
>
>
>> Okay - I have been able to reproduce the problem on one MIT's solaris
>> machines - but only when using gcc....
>>
>> This is gcc version 3.4.3 on a machine running Solaris 10 3/05
>> s10_74L2a SPARC.
>>
>> I modified the code to allow me to vary the alignment of the memory.
>> Sure enough, gethostbyaddr segfaults in the nss library.
>>
>> Now - why is this only w/ gcc and not sunsoft's compiler - I could not
>> tell you.
>>
>> Looking at the opensolaris library source for libnss - I do see that
>> there is a path where the pointer to the char * is assigned a struct
>> in_addr *.
>>
>> See IN6_INADDR_TO_V4MAPPED macro in /usr/include/netinet/in.h.
>>
>> The error does not appear to be an optimization issue...
>>
>> Now - why is gcc screwing up? I will need to investigate... I will have
>> to look at the assembly code to see what is going on....
>>
>> So - can you please confirm that your problem is observed w/ gcc?
>> Ezra
>>
>>
>
>
>
>
From: Ken Raeburn <raeburn@MIT.EDU>
To: rt@krbdev.mit.edu
Subject: Re: [krbdev.mit.edu #6308] Alignment problem in resolver test
Date: Tue, 30 Dec 2008 13:43:12 -0500
RT-Send-Cc:
On Dec 30, 2008, at 11:42, Jorgen Wahlsten via RT wrote:
> For what it's worth, I think Sun's C compiler *always* aligns the
> automatic
> variables on 8-byte boundaries, while GCC tries to fit the
> addrcopy[4] into
> the "padding" of myname[256+1] (7 bytes padding?), as an
> optimization. It
> does seem strange that this is done *without* optimization though.

I worked on GCC in a past job, so please forgive a brief diversion:

It's not an optimization you'd turn on or off. The "optimization" is
built into the code that decides how the stack frame is laid out. A
char array doesn't need any special alignment, so myname[] doesn't
have padding, it just ends, with the following byte probably at an odd
address; since addrcopy[] is also a char array without special
alignment needs, it's allowed to be put there.
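
The alignment requirements in question can be inspected with C11 `_Alignof`. The following is a sketch, not code from krb5: `union aligned_addr` copies the shape of the union in the proposed patch, and the function names are made up for illustration. The check in `check_union_alignment` assumes `struct in_addr` needs no stricter alignment than `int`, which holds on common platforms but is not guaranteed by the C standard.

```c
#include <assert.h>
#include <stddef.h>
#include <netinet/in.h>

/* Same shape as the union in the proposed patch: the int member
   raises the alignment requirement of the whole object. */
union aligned_addr {
    char addrcopy[4];
    int make_me_aligned;
};

/* A char array has the alignment of char, which is 1. */
static size_t align_of_char_array(void)
{
    return _Alignof(char[4]);
}

/* A union is aligned at least as strictly as its strictest member. */
static size_t align_of_union(void)
{
    return _Alignof(union aligned_addr);
}

/* Assumption: struct in_addr is no more strictly aligned than int. */
static void check_union_alignment(void)
{
    assert(align_of_union() >= _Alignof(struct in_addr));
}
```

Because the char array only needs alignment 1, the compiler is free to place it at any address; the union (or a real `struct in_addr`) does not leave that freedom.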

> Regardless, of the differences between Sun's C compiler and GCC, I
> would be
> inclined to agree with you that this is a solaris C library bug,
> that is
> being triggered by using GCC in this particular case.
>
> As for my comment about using an in_addr pointer, I think I merely
> looked at
> the EXAMPLE section in the solaris 2.10 gethostbyaddr man page.
> There is
> also a NOTE in that man page about INET_ADDR.

The POSIX spec I just pulled up says the address argument points to an
address, not a bunch of bytes, and in particular,

> The addr argument of gethostbyaddr() shall be an in_addr structure
> when type is AF_INET.

(Presumably they meant the pointed-to thing.)

So, converting it to an actual in_addr would probably be the right fix
(though it leaves us with IPv6 support still missing in those places).
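
A rough sketch of what that conversion could look like, assuming the POSIX prototype `gethostbyaddr(const void *, socklen_t, int)`. This is not the committed change; `to_in_addr` and `reverse_lookup` are hypothetical names, and on systems whose header declares the first parameter as `char *` a cast may be needed.

```c
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Copy 4 raw address bytes (for example h_addr from gethostbyname)
   into a struct in_addr, which the compiler aligns correctly. */
static struct in_addr to_in_addr(const char *ptr)
{
    struct in_addr addr;

    assert(ptr != NULL);
    memset(&addr, 0, sizeof(addr));
    memcpy(&addr.s_addr, ptr, sizeof(addr.s_addr));
    return addr;
}

/* Convert an IPv4 address back to a host entry; returns NULL if the
   lookup fails. IPv4 only, as noted in the message above. */
static struct hostent *reverse_lookup(const char *ptr)
{
    struct in_addr addr = to_in_addr(ptr);

    return gethostbyaddr(&addr, sizeof(addr), AF_INET);
}
```

In resolve.c terms, the `memcpy(addrcopy, ptr, 4)` followed by `gethostbyaddr(addrcopy, 4, AF_INET)` would become a copy into a `struct in_addr` followed by a call that passes its address and size.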

Ken
Date: Tue, 30 Dec 2008 14:32:04 -0500
From: Ezra Peisach <epeisach@MIT.EDU>
To: rt@krbdev.mit.edu
Subject: Re: [krbdev.mit.edu #6308] Alignment problem in resolver test
RT-Send-Cc:
Ken,

With regards to alignment - I believe that is why the "fix" to use a
union would force an alignment. I will use an in_addr to ensure alignment.
As I indicated in my initial analysis - there is one other place
gethostbyaddr is used w/o a struct in_addr - but the memory is
malloced. Am I correct that malloc will return memory with an alignment
that is compatible w/ any structure?

Ezra
To: rt@krbdev.MIT.EDU
Subject: Re: [krbdev.mit.edu #6308] Alignment problem in resolver test
From: Tom Yu <tlyu@MIT.EDU>
Date: Tue, 30 Dec 2008 14:40:56 -0500
RT-Send-Cc:
"Ezra Peisach via RT" <rt-comment@krbdev.mit.edu> writes:

> As I indicated in my initial analysis - there is one other place
> gethostbyaddr is used w/o a struct in_addr - but the memory is
> malloced. I am correct that malloc will return an alignment that is
> compatible w/ any structure - right?

It's defined to return memory with an alignment suitable for any
object.
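
As a small illustration of that guarantee (a sketch with a hypothetical helper name, not code from the krb5 tree), storage obtained from malloc for a `struct in_addr` can be checked against that type's alignment requirement:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <netinet/in.h>

/* Hypothetical helper: allocate storage for one IPv4 address.
   Returns NULL on allocation failure; the caller frees the result.
   The assert documents the guarantee that malloc'ed memory is
   suitably aligned for the object it was sized for. */
static struct in_addr *alloc_in_addr(void)
{
    struct in_addr *p = malloc(sizeof(*p));

    assert(p == NULL ||
           ((uintptr_t)p % _Alignof(struct in_addr)) == 0);
    return p;
}
```

This is why a malloc'ed buffer passed to gethostbyaddr does not run into the problem seen with the stack array.
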
From: Ken Raeburn <raeburn@MIT.EDU>
To: rt-comment@krbdev.mit.edu
Subject: Re: [krbdev.mit.edu #6308] Alignment problem in resolver test
Date: Tue, 30 Dec 2008 14:42:41 -0500
RT-Send-Cc:
On Dec 30, 2008, at 14:32, Ezra Peisach via RT wrote:
> With regards to alignment - I believe that is why the "fix" to use a
> union would force an alignment. I will use an in_addr to ensure
> alignment.
> As I indicated in my initial analysis - there is one other place
> gethostbyaddr is used w/o a struct in_addr - but the memory is
> malloced. I am correct that malloc will return an alignment that is
> compatible w/ any structure - right?

To the best of my knowledge, yes, it'll have to be aligned well enough.

Ken
From: epeisach@mit.edu
Subject: SVN Commit

Use a struct in_addr to ensure alignment of address - instead of
random alignment on the stack. Solaris 2.10 has issues if the address
is not aligned. The rest of the code in the tree uses a struct
in_addr or mallocs the address - which will be sufficiently aligned.




https://github.com/krb5/krb5/commit/b3a5ec5c59e2c854a9372e0b44bc50db4d755b84
Commit By: epeisach
Revision: 21794
Changed Files:
U trunk/src/tests/resolve/resolve.c
see also r21820