Date: Tue, 25 Sep 2012 16:33:22 -0500
Subject: kdb5_util dump is racy
From: Nico Williams <nico@cryptonector.com>
To: krb5-bugs@mit.edu

kdb5_util dump does the following:
  1. unlink(2) the dump file
  2. fopen(3C) the dump file to create and open it
  3. krb5_lock_file() the dump file
  4. iterate principals and write entries to the dump file
  5. iterate policies and write entries to the dump file
  6. open(2) the .dump_ok file with O_WRONLY|O_CREAT|O_TRUNC
  7. write(2) a single 0-valued byte to the .dump_ok file
Quite clearly this is racy: N>1 dump processes will all get past the
krb5_lock_file() successfully since they'll be locking *different* files,
with all but one locking files that have been unlinked!
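
To make the race concrete, here is a minimal sketch that replays the unlink/create/lock steps twice on the same path within one process. It is not kdb5_util code: both_dumpers_get_lock() is a hypothetical helper, and flock(2) is used as a stand-in for krb5_lock_file(), on the assumption that the lock is tied to the opened file rather than to its name.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* expose flock() under strict -std= modes on glibc */
#endif
#include <fcntl.h>
#include <sys/file.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: mimic two racing dumpers, each doing
 * unlink + create + lock on `path`.
 * Returns 1 if both hold an exclusive lock at once on different inodes
 * (the race), 0 if the second lock was refused or the inodes match,
 * -1 on error.
 */
int both_dumpers_get_lock(const char *path)
{
    struct stat st_a, st_b;
    int fd_a, fd_b, result = -1;

    /* Dumper A: unlink, create, lock. */
    unlink(path);
    fd_a = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd_a < 0)
        return -1;
    if (flock(fd_a, LOCK_EX | LOCK_NB) != 0 || fstat(fd_a, &st_a) != 0)
        goto out_a;

    /* Dumper B: same steps. Its unlink detaches A's locked file from the
     * name, so the open below creates a brand-new file. */
    unlink(path);
    fd_b = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd_b < 0)
        goto out_a;
    if (fstat(fd_b, &st_b) != 0)
        goto out_b;

    /* B's non-blocking lock attempt succeeds even though A still holds
     * its lock, because the two locks are on different inodes. */
    result = (flock(fd_b, LOCK_EX | LOCK_NB) == 0 &&
              st_a.st_ino != st_b.st_ino);

out_b:
    close(fd_b);
out_a:
    close(fd_a);
    unlink(path);
    return result;
}
```

On a local filesystem, calling both_dumpers_get_lock() with a scratch path should report that both "dumpers" hold their exclusive lock simultaneously, which is exactly the situation in which the lock protects nothing.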
It gets worse: the policy iteration actually tries to get an exclusive
lock on the policy DB. Before the change to use blocking locks this
meant that all but one of the racing dump processes might fail. But if
events happen in just the wrong order, the dump file will look valid when
in fact it is missing all policy entries.
Assuming that mixed versions of MIT Kerberos on a KDC must be supported
during a transition, the best fix would be to create/truncate the .dump_ok
file first, then lock it, then create a tmp file for the dump, dump,
rename the tmp file into place, write the NUL to the .dump_ok file,
and finally unlock the .dump_ok file.
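
The proposed sequence could be sketched roughly as follows. This is an illustration under stated assumptions, not a patch: safe_dump() is a hypothetical helper, the "<dump>.dump_ok" and "<dump>.XXXXXX" file names are assumed for the example, flock(2) stands in for krb5_lock_file(), and the `data` string stands in for the principal and policy iteration.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* expose flock()/mkstemp() under strict -std= modes */
#endif
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: write `data` as the dump at `dump_path`.
 * Returns 0 on success, -1 on failure. */
int safe_dump(const char *dump_path, const char *data)
{
    char ok_path[4096], tmp_path[4096];
    const char nul = 0;
    size_t len = strlen(data);
    int okfd, tmpfd, ret = -1;

    if (snprintf(ok_path, sizeof(ok_path), "%s.dump_ok", dump_path)
            >= (int)sizeof(ok_path) ||
        snprintf(tmp_path, sizeof(tmp_path), "%s.XXXXXX", dump_path)
            >= (int)sizeof(tmp_path))
        return -1;

    /* 1. Create/truncate the .dump_ok file first. */
    okfd = open(ok_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (okfd < 0)
        return -1;

    /* 2. Lock it. This file is never unlinked, so every dumper
     *    contends on the same inode. */
    if (flock(okfd, LOCK_EX) != 0)
        goto out;

    /* 3. Write the dump to a temporary file next to the target. */
    tmpfd = mkstemp(tmp_path);
    if (tmpfd < 0)
        goto out;
    if (write(tmpfd, data, len) != (ssize_t)len || fsync(tmpfd) != 0) {
        close(tmpfd);
        unlink(tmp_path);
        goto out;
    }
    close(tmpfd);

    /* 4. Rename the temporary file into place; readers see either the
     *    old dump or the complete new one. */
    if (rename(tmp_path, dump_path) != 0) {
        unlink(tmp_path);
        goto out;
    }

    /* 5. Write the single NUL byte to the .dump_ok file. */
    if (write(okfd, &nul, 1) == 1)
        ret = 0;

out:
    /* 6. Unlock and close the .dump_ok file. */
    flock(okfd, LOCK_UN);
    close(okfd);
    return ret;
}
```

Because the lock lives on the .dump_ok file rather than on a file that each dumper unlinks and recreates, concurrent callers serialize on step 2 instead of racing through it.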
(There is no obvious reason that iterating policies should require an
exclusive lock, but that's fodder for a separate bug report.)