Date: Tue, 4 Jan 2005 08:57:32 +1100
From: Bojan Smojver <bojan@rexursive.com>
To: krb5-bugs@mit.edu
Subject: Memory leak in krb5-libs up to 1.3.6
I'm not sure if you're aware of this bug report:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=131769

It started off as a PostgreSQL bug report, but Tom Lane (of PostgreSQL
development) investigated and concluded that it is in fact a krb5 bug. The test
case is included on the bug report page and is very easy to try on any
Fedora Core box with PostgreSQL and krb5-libs installed.

If you need any more info, please let me know.

--
Bojan
To: rt@krbdev.mit.edu
Subject: Re: [krbdev.mit.edu #2862] Memory leak in krb5-libs up to 1.3.6
From: Tom Yu <tlyu@mit.edu>
Date: Tue, 04 Jan 2005 15:00:51 -0500
RT-Send-Cc:
>>>>> "rt" == Bojan Smojver via RT <rt-comment@krbdev.mit.edu> writes:

Bojan> I'm not sure if you're aware of this bug report:

Ok, I've looked at it briefly. The leak should be fixed in 1.4, which
is in beta-test. Does the segfault still happen in 1.3.6, or 1.3.5
with double-free fixes? I think the segfault in the redhat bug report
results from attempting to close an invalid ccache handle.

---Tom
Date: Wed, 5 Jan 2005 08:51:25 +1100
From: Bojan Smojver <bojan@rexursive.com>
To: rt-comment@krbdev.mit.edu
Cc: krb5-prs@mit.edu, krb5-bugs@mit.edu
Subject: Re: [krbdev.mit.edu #2862] Memory leak in krb5-libs up to 1.3.6
RT-Send-Cc:
Quoting Tom Yu via RT <rt-comment@krbdev.mit.edu>:

> Ok, I've looked at it briefly. The leak should be fixed in 1.4, which
> is in beta-test. Does the segfault still happen in 1.3.6, or 1.3.5
> with double-free fixes? I think the segfault in the redhat bug report
> results from attempting to close an invalid ccache handle.

I didn't notice any segfaults recently, only memory leaks. But, I'll test again
and I'll let you know.

--
Bojan
Date: Mon, 10 Jan 2005 09:37:16 +1100
From: Bojan Smojver <bojan@rexursive.com>
To: rt-comment@krbdev.mit.edu
Cc: krb5-prs@mit.edu
Subject: Re: [krbdev.mit.edu #2862] Memory leak in krb5-libs up to 1.3.6
RT-Send-Cc:
Quoting Tom Yu via RT <rt-comment@krbdev.mit.edu>:

> Ok, I've looked at it briefly. The leak should be fixed in 1.4, which
> is in beta-test. Does the segfault still happen in 1.3.6, or 1.3.5
> with double-free fixes? I think the segfault in the redhat bug report
> results from attempting to close an invalid ccache handle.

I've had three types of errors pop up in the httpd log file, some of them thanks
to the new glibc in FC3, which detects double-free and invalid pointers to free:

---------------------------------------
*** glibc detected *** free(): invalid pointer: 0x08d68670 ***
[Mon Jan 10 08:08:26 2005] [notice] child pid 14565 exit signal Abort (6)
*** glibc detected *** double free or corruption (out): 0x08d46d38 ***
[Mon Jan 10 08:08:49 2005] [notice] child pid 14630 exit signal Abort (6)
*** glibc detected *** double free or corruption (out): 0x08d6ca60 ***
[Mon Jan 10 08:09:13 2005] [notice] child pid 14496 exit signal Abort (6)
[Mon Jan 10 08:09:49 2005] [notice] child pid 14494 exit signal Segmentation fault (11)
---------------------------------------

You'll also notice another segfault, which is probably a result of closing the
invalid ccache handle, as you explained above. This all happened within 10000
requests, which is considered a "small test". I normally run millions of
requests when I want to stress test the whole setup.

If you want to run inside gdb to produce stack traces or some other info, let me
know.

--
Bojan
To: rt@krbdev.mit.edu
Subject: Re: [krbdev.mit.edu #2862] Memory leak in krb5-libs up to 1.3.6
From: Tom Yu <tlyu@mit.edu>
Date: Mon, 10 Jan 2005 15:49:07 -0500
RT-Send-Cc:
>>>>> "Bojan" == Bojan Smojver via RT <rt-comment@krbdev.mit.edu> writes:

Bojan> ---------------------------------------
Bojan> *** glibc detected *** free(): invalid pointer: 0x08d68670 ***
Bojan> [Mon Jan 10 08:08:26 2005] [notice] child pid 14565 exit signal Abort (6)
Bojan> *** glibc detected *** double free or corruption (out): 0x08d46d38 ***
Bojan> [Mon Jan 10 08:08:49 2005] [notice] child pid 14630 exit signal Abort (6)
Bojan> *** glibc detected *** double free or corruption (out): 0x08d6ca60 ***
Bojan> [Mon Jan 10 08:09:13 2005] [notice] child pid 14496 exit signal Abort (6)
Bojan> [Mon Jan 10 08:09:49 2005] [notice] child pid 14494 exit signal Segmentation fault (11)
Bojan> ---------------------------------------

Do the above errors occur when running with one of the 1.4 beta
releases, or with the 1.3.6 release?

Bojan> You'll also notice another segfault, which is probably a result
Bojan> of closing the invalid ccache handle, as you explained
Bojan> above. This all happened within 10000 requests, which is
Bojan> considered a "small test". I normally run millions of requests
Bojan> when I want to stress test the whole setup.

Do the invalid free() and double-free errors occur when you change the
code to not attempt to close an invalid ccache handle?

Bojan> If you want to run inside gdb to produce stack traces or some
Bojan> other info, let me know.

Stack traces would be useful.

---Tom
From: bojan@rexursive.com
Date: Tue, 11 Jan 2005 09:53:36 +1100
To: rt-comment@krbdev.mit.edu
Subject: Re: [krbdev.mit.edu #2862] Memory leak in krb5-libs up to 1.3.6
RT-Send-Cc:
Quoting Tom Yu via RT <rt-comment@krbdev.mit.edu>:

> Do the above errors occur when running with one of the 1.4 beta
> releases, or with the 1.3.6 release?

This is Fedora Core 3 1.3.6-2 RPM.

> Do the invalid free() and double-free errors occur when you change the
> code to not attempt to close an invalid ccache handle?

Well, this is not actually my code (I never call any krb5 functions from my own
code). It is whatever PostgreSQL 7.4.6 does when it closes the connection. So, I
honestly wouldn't know.

> Stack traces would be useful.

OK. I'll run the whole thing inside gdb and I'll also make sure I have debugging
symbols installed. This should give you a clearer picture where things are going
wrong. It may take a few days for me to do that due to some other stuff I'm
working on.

--
Bojan
To: rt@krbdev.mit.edu
Subject: Re: [krbdev.mit.edu #2862] Memory leak in krb5-libs up to 1.3.6
From: Tom Yu <tlyu@mit.edu>
Date: Mon, 10 Jan 2005 20:11:58 -0500
RT-Send-Cc:
>>>>> "Bojan" == Bojan Smojver via RT <rt-comment@krbdev.mit.edu> writes:

Bojan> Quoting Tom Yu via RT <rt-comment@krbdev.mit.edu>:
>> Do the above errors occur when running with one of the 1.4 beta
>> releases, or with the 1.3.6 release?

Bojan> This is Fedora Core 3 1.3.6-2 RPM.

krb5-1.3.x and earlier are known to have thread-safety issues. Also,
I believe that the memory leak in the ccache code is still present in
krb5-1.3.6.

>> Do the invalid free() and double-free errors occur when you change the
>> code to not attempt to close an invalid ccache handle?

Bojan> Well, this is not actually my code (I never call any krb5
Bojan> functions from my own code). It is whatever PostgreSQL 7.4.6
Bojan> does when it closes the connection. So, I honestly wouldn't know.

I have looked at the code in postgresql-7.4.5 (I don't have 7.4.6
handy), as well as at the krb5 ccache code, and it seems that the
failure you're seeing is "impossible". Since you appear to be running
a multi-threaded application, I strongly suggest that you try out the
krb5-1.4 beta release. Releases prior to krb5-1.4 are known to have
thread-safety issues, which may be part of your problem.

>> Stack traces would be useful.

Bojan> OK. I'll run the whole thing inside gdb and I'll also make sure
Bojan> I have debugging symbols installed. This should give you a
Bojan> clearer picture where things are going wrong. It may take a few
Bojan> days for me to do that due to some other stuff I'm working on.

Does the new glibc give stack traces on double-free and other
memory-management error conditions if debugging symbols are available?
Or do you need to use something like valgrind?

---Tom
Date: Tue, 11 Jan 2005 16:52:49 +1100
From: Bojan Smojver <bojan@rexursive.com>
To: rt-comment@krbdev.mit.edu
Subject: Re: [krbdev.mit.edu #2862] Memory leak in krb5-libs up to 1.3.6
RT-Send-Cc:
Quoting Tom Yu via RT <rt-comment@krbdev.mit.edu>:

> I have looked at the code in postgresql-7.4.5 (I don't have 7.4.6
> handy), as well as at the krb5 ccache code, and it seems that the
> failure you're seeing is "impossible". Since you appear to be running
> a multi-threaded application, I strongly suggest that you try out the
> krb5-1.4 beta release. Releases prior to krb5-1.4 are known to have
> thread-safety issues, which may be part of your problem.

Aha! I'm running this inside Apache 2.0.52. I have two different versions of
Apache - one that runs the worker model (multi-threaded) and one that runs
prefork (not multi-threaded). I'll compare and let you know.

> Does the new glibc give stack traces on double-free and other
> memory-management error conditions if debugging symbols are available?
> Or do you need to use something like valgrind?

Normally I just load postgresql-debuginfo and krb5-debuginfo RPMS, which contain
debugging symbols, and then run the binary (Apache) inside gdb. Once the thing
crashes, I do a backtrace. I'm not an expert in this, so maybe there are better
ways...


--
Bojan
Subject: Re: [krbdev.mit.edu #2862] Memory leak in krb5-libs up to 1.3.6
From: Bojan Smojver <bojan@rexursive.com>
To: rt-comment@krbdev.mit.edu
Date: Tue, 11 Jan 2005 21:28:54 +1100
RT-Send-Cc:
On Tue, 2005-01-11 at 00:53 -0500, Bojan Smojver via RT wrote:

> Aha! I'm running this inside Apache 2.0.52. I have two different versions of
> Apache - one that runs the worker model (multi-threaded) and one that runs
> prefork (not multi-threaded). I'll compare and let you know.

worker -> segfaults
prefork -> doesn't segfault

So, this is all thread-safety related, which means that 1.4 is the
right fix. The memory still leaks, though, and it would be nice if that
were fixed in 1.3.x.

Do you still want those stack traces now that we know that segfaults are
related to the design issues?

--
Bojan
To: rt@krbdev.mit.edu
Subject: Re: [krbdev.mit.edu #2862] Memory leak in krb5-libs up to 1.3.6
From: Tom Yu <tlyu@mit.edu>
Date: Tue, 11 Jan 2005 14:24:05 -0500
RT-Send-Cc:
>>>>> "Bojan" == Bojan Smojver via RT <rt-comment@krbdev.mit.edu> writes:

Bojan> So, this is all thread-safety related, which means that 1.4 is
Bojan> the right fix. The memory still leaks, though, and it would be
Bojan> nice if that were fixed in 1.3.x.

I took a closer look at
postgresql-7.4.5/src/interfaces/libpq/fe-auth.c and discovered that a
lot of krb5 state is kept as file-scope static variables. The problem
is therefore mostly postgresql's fault, since it looks like libpq
itself isn't thread-safe.

---Tom
Subject: Re: [krbdev.mit.edu #2862] Memory leak in krb5-libs up to 1.3.6
From: Bojan Smojver <bojan@rexursive.com>
To: rt-comment@krbdev.mit.edu
Date: Wed, 12 Jan 2005 20:30:14 +1100
RT-Send-Cc:
On Tue, 2005-01-11 at 14:24 -0500, Tom Yu via RT wrote:
> >>>>> "Bojan" == Bojan Smojver via RT <rt-comment@krbdev.mit.edu> writes:
>
> Bojan> So, this is all thread-safety related, which means that 1.4 is
> Bojan> the right fix. The memory still leaks, though, and it would be
> Bojan> nice if that were fixed in 1.3.x.
>
> I took a closer look at
> postgresql-7.4.5/src/interfaces/libpq/fe-auth.c and discovered that a
> lot of krb5 state is kept as file-scope static variables. The problem
> is therefore mostly postgresql's fault, since it looks like libpq
> itself isn't thread-safe.

OK. I'll simply mention thread-safety issues in the documentation of my
software for now.

I'm guessing 1.4 is the way to go for memory leaks as well. So, all I
need to do is convince the Fedora folks to jump on the 1.4 bandwagon, which
shouldn't be difficult if it's backwards compatible with the 1.3 series.

--
Bojan
Closing. Known thread-safety issues in libpq and libkrb5; known leak in
cc_file.c in 1.3.6 fixed in 1.4.
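
The thread-safety problem discussed in this ticket (krb5 state kept in
file-scope static variables) can be illustrated with a minimal C sketch. This
is not the libpq or libkrb5 code: cache_handle, shared_open, shared_close,
connection, conn_open, conn_close and the invalid_closes counter are all
hypothetical names invented for the illustration. The sketch runs sequentially
so that the outcome is deterministic, and it counts an invalid close where
unguarded code would free a stale pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a credential-cache handle (not the krb5 type). */
typedef struct {
    int id;
} cache_handle;

/* Counts close attempts that found no valid handle. Code without such a
 * check would call free() on a stale pointer at that point. */
int invalid_closes = 0;

/* Unsafe pattern: one file-scope handle shared by every connection. */
static cache_handle *shared_cache = NULL;

void shared_open(void)
{
    if (shared_cache == NULL) {
        shared_cache = malloc(sizeof *shared_cache);
        assert(shared_cache != NULL);
        shared_cache->id = 1;
    }
}

void shared_close(void)
{
    if (shared_cache == NULL) {
        invalid_closes++;   /* state was already torn down by another user */
        return;
    }
    free(shared_cache);
    shared_cache = NULL;
}

/* Safer pattern: each connection owns its own handle. */
typedef struct {
    cache_handle *cache;
} connection;

int conn_open(connection *c)
{
    c->cache = malloc(sizeof *c->cache);
    if (c->cache == NULL)
        return -1;
    c->cache->id = 1;
    return 0;
}

void conn_close(connection *c)
{
    if (c->cache == NULL) {
        invalid_closes++;
        return;
    }
    free(c->cache);
    c->cache = NULL;        /* a repeated close is detected, not a double free */
}
```

If two users call shared_open() and then each call shared_close(), the second
close finds the state already gone and invalid_closes becomes 1. Two connection
values opened and closed independently add nothing to the counter. With real
threads the shared variant would also race on the pointer itself, which would
be consistent with the failures showing up only under the threaded worker
model.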