Skip Menu |
 

From: Richard Basch <basch@alum.mit.edu>
Subject: kinit -kt KDB: Cannot find/read stored master key
Date: Sat, 17 Mar 2018 19:39:48 -0400
To: krb5-bugs@mit.edu, krbdev@mit.edu
I have found automated jobs that are executed on a KDC using "kinit -kt KDB:" may sometimes fail with:

kinit: Cannot find/read stored master key while setting up KDB key tab for realm XXX

However,if the script is retried, it invariably works. I suspect there is a transient locking condition which may sporadically cause a failure. The k5stash file path is local and the “ctime” has not changed anytime within the intervals of the run.

FYI - KDB: offers a great way to authenticate using a Kerberos-internal principal (e.g. kadmin/admin) to prove it is the KDC infrastructure, without having to create secondary files which can be copied out-of-band or for which their distribution cannot be deterministically sync’d with respect to Kerberos iprop propagation. For most use-cases, I prefer keytabs but to prove Kerberos infrastructure identity, I prefer not to create extra keytabs and to rotate the keys aggressively to mitigate impact from any unauthorized extraction of Kerberos’ keys.
Download (untitled) / with headers
text/plain 1.4KiB
[I moderated through the first copy to krb5-bugs, but not the second
copy, and I didn't moderate through to krbdev. I prefer if messages
go to just one project list, so that the follow-up conversation
doesn't get fragmented.]

The cause of this bug is obscured by an error handling flaw in kinit
-t KDB:. When kinit calls kinit_kbd_init(), it destroys and
recreates the krb5_context, but doesn't reset the global variable
errctx which is used by extended_com_err_fn(). That can be fixed
with the following patch:

diff --git a/src/clients/kinit/kinit.c b/src/clients/kinit/kinit.c
index a518284..3fdae28 100644
--- a/src/clients/kinit/kinit.c
+++ b/src/clients/kinit/kinit.c
@@ -718,6 +718,7 @@ k5_kinit(struct k_opts *opts, struct k5_data *k5)
#ifndef _WIN32
if (strncmp(opts->keytab_name, "KDB:", 4) == 0) {
ret = kinit_kdb_init(&k5->ctx, k5->me->realm.data);
+ errctx = k5->ctx;
if (ret) {
com_err(progname, ret,
_("while setting up KDB keytab for realm
%s"),

Without that fix, we just have the error string for
KRB5_KDB_CANTREAD_STORED, not the extended error message set in
krb5_db_def_fetch_mkey(). So all we know for now is that
krb5_db_def_fetch_mkey_keytab() didn't succeed. It's hard to
theorize beyond that. kt_file.c does contain locking calls, but they
are straightforward blocking POSIX file locks which shouldn't fail.
Subject: Re: [krbdev.mit.edu #8651] kinit -kt KDB: Cannot find/read stored master key
From: Richard Basch <basch@alum.mit.edu>
Date: Sun, 18 Mar 2018 12:52:15 -0400
CC: Richard Basch <richard.basch@gs.com>
To: rt-comment@KRBDEV-PROD-APP-1.mit.edu
RT-Send-Cc:
Download (untitled) / with headers
text/plain 2.2KiB
I ran a tight loop and simulated the issue.

In my test, the following conditions existed:
- The stash file is a non-standard location (and defined in the KRB5_KDC_PROFILE config (key_stash_file).
- Standard location stash file did not exist (not even as a symlink)

All my test essentially did was:
while true ; do
rm -f /tmp/ktrace.out /tmp/strace.out
KRB5_KDC_PROFILE=… KRB5_CONFIG=… KRB5_TRACE=/tmp/ktrace.out strace -f -o /tmp/strace.out kinit -kt KDB: kadmin/admin || break
done
The final run (after thousands of success) showed it looking for the STANDARD location stash file.


Show quoted text
> On Mar 17, 2018, at 9:51 PM, Greg Hudson via RT <rt-comment@KRBDEV-PROD-APP-1.mit.edu> wrote:
>
> [I moderated through the first copy to krb5-bugs, but not the second
> copy, and I didn't moderate through to krbdev. I prefer if messages
> go to just one project list, so that the follow-up conversation
> doesn't get fragmented.]
>
> The cause of this bug is obscured by an error handling flaw in kinit
> -t KDB:. When kinit calls kinit_kbd_init(), it destroys and
> recreates the krb5_context, but doesn't reset the global variable
> errctx which is used by extended_com_err_fn(). That can be fixed
> with the following patch:
>
> diff --git a/src/clients/kinit/kinit.c b/src/clients/kinit/kinit.c
> index a518284..3fdae28 100644
> --- a/src/clients/kinit/kinit.c
> +++ b/src/clients/kinit/kinit.c
> @@ -718,6 +718,7 @@ k5_kinit(struct k_opts *opts, struct k5_data *k5)
> #ifndef _WIN32
> if (strncmp(opts->keytab_name, "KDB:", 4) == 0) {
> ret = kinit_kdb_init(&k5->ctx, k5->me->realm.data);
> + errctx = k5->ctx;
> if (ret) {
> com_err(progname, ret,
> _("while setting up KDB keytab for realm
> %s"),
>
> Without that fix, we just have the error string for
> KRB5_KDB_CANTREAD_STORED, not the extended error message set in
> krb5_db_def_fetch_mkey(). So all we know for now is that
> krb5_db_def_fetch_mkey_keytab() didn't succeed. It's hard to
> theorize beyond that. kt_file.c does contain locking calls, but they
> are straightforward blocking POSIX file locks which shouldn't fail.
> _______________________________________________
> krb5-bugs mailing list
> krb5-bugs@mit.edu
> https://mailman.mit.edu/mailman/listinfo/krb5-bugs
I tried running the same script (inside "make testrealm", and gwithout
the env var settings since that's already taken care of) and
unfortunately couldn't reproduce the issue, with either master or 1.15.

Knowing that it was trying to read the stash file from the unconfigured
location is interesting; it suggests perhaps an earlier failure to read
$KRB5_KDC_CONFIG or something. But I still can't come up with a theory
as to what the bug might be.
Subject: Re: [krbdev.mit.edu #8651] kinit -kt KDB: Cannot find/read stored master key
From: Richard Basch <basch@alum.mit.edu>
Date: Sun, 18 Mar 2018 15:44:59 -0400
To: rt-comment@KRBDEV-PROD-APP-1.mit.edu
RT-Send-Cc:
Download (untitled) / with headers
text/plain 1.3KiB
There was one other small nuance in my test… I used a custom kdc.conf, specified as a bash sub-command:

KRB5_KDC_PROFILE=<(printf …) …

That said, the strace showed it reading from /dev/fd/63 with data returned such as the database_name, so there is no obvious failure on the shell to provide the fd or that the fd was inaccessible. (The only reason I use this syntax to generate a custom KRB5_KDC_PROFILE is to circumvent defining the kadm5_dict_file and loading some of the kadm5 hooks which don’t apply for the kinit operations - this method significantly improves the performance, when it doesn’t sporadically fail.)


Show quoted text
> On Mar 18, 2018, at 3:31 PM, Greg Hudson via RT <rt-comment@KRBDEV-PROD-APP-1.mit.edu> wrote:
>
> I tried running the same script (inside "make testrealm", and gwithout
> the env var settings since that's already taken care of) and
> unfortunately couldn't reproduce the issue, with either master or 1.15.
>
> Knowing that it was trying to read the stash file from the unconfigured
> location is interesting; it suggests perhaps an earlier failure to read
> $KRB5_KDC_CONFIG or something. But I still can't come up with a theory
> as to what the bug might be.
> _______________________________________________
> krb5-bugs mailing list
> krb5-bugs@mit.edu
> https://mailman.mit.edu/mailman/listinfo/krb5-bugs
From: Richard Basch <basch@alum.mit.edu>
Subject: Re: [krbdev.mit.edu #8651] kinit -kt KDB: Cannot find/read stored master key
Date: Sun, 18 Mar 2018 15:51:18 -0400
To: rt-comment@KRBDEV-PROD-APP-1.mit.edu
RT-Send-Cc:
Download (untitled) / with headers
text/plain 1.5KiB
This might indicate it did a double read of the config file where the data was already consumed… second read may have failed.
But why sporadically?

Show quoted text
> On Mar 18, 2018, at 3:44 PM, Richard Basch <basch@alum.mit.edu> wrote:
>
> There was one other small nuance in my test… I used a custom kdc.conf, specified as a bash sub-command:
>
> KRB5_KDC_PROFILE=<(printf …) …
>
> That said, the strace showed it reading from /dev/fd/63 with data returned such as the database_name, so there is no obvious failure on the shell to provide the fd or that the fd was inaccessible. (The only reason I use this syntax to generate a custom KRB5_KDC_PROFILE is to circumvent defining the kadm5_dict_file and loading some of the kadm5 hooks which don’t apply for the kinit operations - this method significantly improves the performance, when it doesn’t sporadically fail.)
>
>
>> On Mar 18, 2018, at 3:31 PM, Greg Hudson via RT <rt-comment@KRBDEV-PROD-APP-1.mit.edu> wrote:
>>
>> I tried running the same script (inside "make testrealm", and gwithout
>> the env var settings since that's already taken care of) and
>> unfortunately couldn't reproduce the issue, with either master or 1.15.
>>
>> Knowing that it was trying to read the stash file from the unconfigured
>> location is interesting; it suggests perhaps an earlier failure to read
>> $KRB5_KDC_CONFIG or something. But I still can't come up with a theory
>> as to what the bug might be.
>> _______________________________________________
>> krb5-bugs mailing list
>> krb5-bugs@mit.edu
>> https://mailman.mit.edu/mailman/listinfo/krb5-bugs
>
From: Richard Basch <basch@alum.mit.edu>
Subject: Re: [krbdev.mit.edu #8651] kinit -kt KDB: Cannot find/read stored master key
Date: Sun, 18 Mar 2018 15:53:29 -0400
To: rt-comment@KRBDEV-PROD-APP-1.mit.edu
RT-Send-Cc:
Download (untitled) / with headers
text/plain 1.7KiB
I see a second open of the fd in the strace… apparently, in some circumstances, it is resulting in a double-read.

Show quoted text
> On Mar 18, 2018, at 3:51 PM, Richard Basch <basch@alum.mit.edu> wrote:
>
> This might indicate it did a double read of the config file where the data was already consumed… second read may have failed.
> But why sporadically?
>
>> On Mar 18, 2018, at 3:44 PM, Richard Basch <basch@alum.mit.edu> wrote:
>>
>> There was one other small nuance in my test… I used a custom kdc.conf, specified as a bash sub-command:
>>
>> KRB5_KDC_PROFILE=<(printf …) …
>>
>> That said, the strace showed it reading from /dev/fd/63 with data returned such as the database_name, so there is no obvious failure on the shell to provide the fd or that the fd was inaccessible. (The only reason I use this syntax to generate a custom KRB5_KDC_PROFILE is to circumvent defining the kadm5_dict_file and loading some of the kadm5 hooks which don’t apply for the kinit operations - this method significantly improves the performance, when it doesn’t sporadically fail.)
>>
>>
>>> On Mar 18, 2018, at 3:31 PM, Greg Hudson via RT <rt-comment@KRBDEV-PROD-APP-1.mit.edu> wrote:
>>>
>>> I tried running the same script (inside "make testrealm", and gwithout
>>> the env var settings since that's already taken care of) and
>>> unfortunately couldn't reproduce the issue, with either master or 1.15.
>>>
>>> Knowing that it was trying to read the stash file from the unconfigured
>>> location is interesting; it suggests perhaps an earlier failure to read
>>> $KRB5_KDC_CONFIG or something. But I still can't come up with a theory
>>> as to what the bug might be.
>>> _______________________________________________
>>> krb5-bugs mailing list
>>> krb5-bugs@mit.edu
>>> https://mailman.mit.edu/mailman/listinfo/krb5-bugs
>>
>
Download (untitled) / with headers
text/plain 1.1KiB
The profile library is designed to automatically pick up changes to
the profile each time a profile_get_* call is made. This feature is
of dubious usefulness, because our code base is inconsistent about
how long it caches profile results for, but I've been reluctant to
simplify it out.

To do this, it calls stat() and compares the mtime to the mtime from
the previous stat() result. It does compare fractional timestamps,
so you would think the double-read would happen all the time (since
stat() on /dev/fd/NN appears to just give the current time).
However, there is also a guard against issuing too many stat() system
calls; at prof_file.c:316 we compare the current time to the last
time we read the filed or called stat(), and only issue a new stat()
call if the second has ticked over.

So I think in your scenario a failure only happens if the second
ticks over between the initial profile open and the profile read to
get the stash filename.

I think we could fix this scenario from our side by disabling the
stat-and-reread behavior for things that aren't regular files as
determined by S_ISREG(). From your side, you could of course work
around the issue by using a regular file.
From: Richard Basch <basch@alum.mit.edu>
Subject: Re: [krbdev.mit.edu #8651] kinit -kt KDB: Cannot find/read stored master key
Date: Sun, 18 Mar 2018 16:46:31 -0400
To: Greg Hudson via RT <rt-comment@krbdev.mit.edu>
RT-Send-Cc:
Download (untitled) / with headers
text/plain 1.5KiB
I already implementing the workaround to use a file-based config after discovering the double-read.
I think the test for S_ISREG seems appropriate (just don’t use lstat since you want to follow symlinks).


Show quoted text
> On Mar 18, 2018, at 4:05 PM, Greg Hudson via RT <rt-comment@krbdev.mit.edu> wrote:
>
> The profile library is designed to automatically pick up changes to
> the profile each time a profile_get_* call is made. This feature is
> of dubious usefulness, because our code base is inconsistent about
> how long it caches profile results for, but I've been reluctant to
> simplify it out.
>
> To do this, it calls stat() and compares the mtime to the mtime from
> the previous stat() result. It does compare fractional timestamps,
> so you would think the double-read would happen all the time (since
> stat() on /dev/fd/NN appears to just give the current time).
> However, there is also a guard against issuing too many stat() system
> calls; at prof_file.c:316 we compare the current time to the last
> time we read the filed or called stat(), and only issue a new stat()
> call if the second has ticked over.
>
> So I think in your scenario a failure only happens if the second
> ticks over between the initial profile open and the profile read to
> get the stash filename.
>
> I think we could fix this scenario from our side by disabling the
> stat-and-reread behavior for things that aren't regular files as
> determined by S_ISREG(). From your side, you could of course work
> around the issue by using a regular file.
I think I was wrong about stat("/dev/fd/NN") returning the current
time as mtime for a pipe. I think to manifest this problem, kinit
has to initially open() and stat() the profile before printf has
finished writing to the pipe. (It will then block reading the data
from printf, but it already recorded the timestamp, and it doesn't
stat() again afterwards.) That's much less likely than the clock
second rolling over between profile_init and profile_get.

I was able to produce a failure using many repeated invocations of
KRB5_KDC_PROFILE=<(printf ...) kinit -k -t KDB:, but it took a long
time. To produce a failure in just one try, I wrote a simple test
program:

int
main(int argc, char **argv)
{
profile_t profile;
char *str;

if (profile_init_path(argv[1], &profile) != 0)
abort();
if (profile_get_string(profile, "a", "b", NULL, "no", &str) != 0)
abort();
printf("%s\n", str);
profile_release_string(str);
return 0;
}

and ran it with:

./a.out <(sleep 1; printf "[a]\nb=yes\n")
From: ghudson@mit.edu
Subject: git commit

Avoid rereading non-regular profile files

If a profile file is a special device such as a pipe, then reloading
it will cause us to discard its contents. Repurpose the first
PROFILE_FILE flag bit to indicate that the file should not be
reloaded. In profile_update_file_data_locked(), set the flag on the
first load if stat() doesn't indicate a regular file, and check the
flag before testing whether we need to perform any subsequent reload.

Later on in profile_update_file_data_locked(), specifically unset the
DIRTY flag rather than unsetting all flags other than SHARED, so that
we don't clear the NO_RELOAD flag.

https://github.com/krb5/krb5/commit/2c26419ad7ea4800995180292ce39db863672fa9
Author: Greg Hudson <ghudson@mit.edu>
Commit: 2c26419ad7ea4800995180292ce39db863672fa9
Branch: master
src/util/profile/prof_file.c | 9 ++++++++-
src/util/profile/prof_int.h | 5 +----
2 files changed, 9 insertions(+), 5 deletions(-)