Subject: Credential cache API does not support atomic reinitialization
If a group of processes is using a credential cache for client
operations which require service tickets, and another process
reinitializes the cache, there is a brief window of time where the
former group of processes will fail because the cache is empty.

This failure window occurs because the ccache API does not provide any
way to initialize a cache and place credentials in it atomically.
There is a workaround for FILE ccaches on Unix-like systems where you
create a ccache in the same directory and rename it into place, but it
violates the ccache abstraction barrier.
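The FILE workaround can be sketched with plain POSIX calls: write the complete new cache to a temporary file in the same directory, then rename() it over the old path. This is only a sketch that treats the cache as opaque bytes; replace_file_atomically() is a hypothetical helper, not a libkrb5 function, and real code would populate the temporary cache through the ccache API instead of writing raw bytes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of libkrb5): replace the file at `path`
 * with `len` bytes from `data` so that readers see either the old or the
 * new contents, never an empty or partial file.  Returns 0 or -1.
 */
static int replace_file_atomically(const char *path, const char *data,
                                   size_t len)
{
    char tmp[4096];
    int fd, n;

    assert(path != NULL && data != NULL);

    /* The temporary file must be in the same directory (same filesystem)
     * as the target, or rename() cannot replace the target atomically. */
    n = snprintf(tmp, sizeof(tmp), "%s.XXXXXX", path);
    if (n < 0 || (size_t)n >= sizeof(tmp))
        return -1;
    fd = mkstemp(tmp);          /* created with mode 0600 */
    if (fd < 0)
        return -1;

    /* Fully populate the new file before it becomes visible. */
    while (len > 0) {
        ssize_t w = write(fd, data, len);
        if (w <= 0)
            goto fail;
        data += w;
        len -= (size_t)w;
    }
    if (fsync(fd) != 0)
        goto fail;
    if (close(fd) != 0) {
        unlink(tmp);
        return -1;
    }

    /* The old contents stay visible until this single step. */
    if (rename(tmp, path) != 0) {
        unlink(tmp);
        return -1;
    }
    return 0;

fail:
    close(fd);
    unlink(tmp);
    return -1;
}
```

The caller has to know that the cache is a file and where it lives, which is exactly the abstraction violation described above.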

Here are several candidate solutions along with reasons why some of them
might not work:

1. Provide a krb5_cc_init_with_cred() which is defined to atomically
reinitialize the ccache and place one specified credential in it,
presumably a TGT. This would be easy for a caller to use, but would
still leave open a window where the TGT is present but config entries
(such as "FAST is available") are not. So it's not a complete solution.

2. Define krb5_cc_lock() and krb5_cc_unlock() to lock the ccache across
processes (currently it just locks a mutex in the handle for most
types), and provide an API to initialize a ccache with the lock held.
This might be difficult to implement for some ccache types, such as
KEYRING.

3. Provide a way for krb5_cc_new_unique() to make a ccache within the
same atomic-replacement domain as an existing ccache, perhaps via the
currently-unused hint parameter, and then make krb5_cc_move() try to use
atomic replacement when possible.

4. Define krb5_cc_move() to be atomic in all cases (or perhaps only when
the source ccache type is MEMORY, although it shouldn't really matter),
whether the implementation uses locks or atomic replacement.
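For a file-backed cache, the cross-process locking in option 2 might look roughly like the following sketch, which uses POSIX fcntl() advisory locks. The helper names are hypothetical and the cache is again treated as opaque bytes. Unlike atomic replacement, this only helps if every reader also takes the lock.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Take a whole-file advisory lock of the given type (F_RDLCK or F_WRLCK),
 * blocking until it is granted. */
static int lock_whole_file(int fd, short type)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;     /* l_start = 0, l_len = 0: entire file */
    return fcntl(fd, F_SETLKW, &fl);
}

/* Hypothetical: empty and refill the file while holding an exclusive
 * lock, so cooperating readers never observe the empty state. */
static int locked_rewrite(const char *path, const char *data, size_t len)
{
    int fd, ret = -1;

    assert(path != NULL && data != NULL);
    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (lock_whole_file(fd, F_WRLCK) == 0 && ftruncate(fd, 0) == 0 &&
        write(fd, data, len) == (ssize_t)len)
        ret = 0;
    close(fd);                  /* closing the descriptor drops the lock */
    return ret;
}

/* Hypothetical: read the file under a shared lock. */
static ssize_t locked_read(const char *path, char *buf, size_t cap)
{
    int fd;
    ssize_t n = -1;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (lock_whole_file(fd, F_RDLCK) == 0)
        n = read(fd, buf, cap);
    close(fd);
    return n;
}
```

A ccache type with no lockable backing object has no direct equivalent of this, which is the implementation difficulty noted for KEYRING.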

There are some related credential cache concurrency issues, such as the
behavior of iteration and the IAKERB state machine across cache
reinitialization. Those are separate issues; this ticket is only about
supporting atomic reinitialization in the API.
On IRC, Nico suggested a fifth option, which I think is a little
outside the box but still worth considering. My interpretation is:

5. On krb5_cc_initialize(), do nothing externally visible (or at
least no more visible than creating a temporary file). When the
first non-config credential entry is stored, then atomically replace
the ccache as it is visible to other handles.
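To make the intended behavior of option 5 concrete, here is a toy single-process model. It is not libkrb5 code: the types and toy_* functions are invented for illustration, and a struct assignment stands in for whatever atomic replacement the real ccache type would use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 8

struct entry {
    const char *name;
    bool is_config;
};

struct cache_contents {
    struct entry entries[MAX_ENTRIES];
    size_t count;
};

/* Toy handle: `visible` is what other handles see; `staged` is private
 * to this handle until the first non-config entry is stored. */
struct handle {
    struct cache_contents *visible;
    struct cache_contents staged;
    bool staging;
};

/* Models krb5_cc_initialize() under option 5: nothing visible changes. */
static void toy_initialize(struct handle *h)
{
    assert(h != NULL && h->visible != NULL);
    h->staged.count = 0;
    h->staging = true;
}

/* Models storing an entry.  Config entries stored before the first
 * credential stay private; the first non-config entry publishes
 * everything staged so far in one step. */
static int toy_store(struct handle *h, const char *name, bool is_config)
{
    struct cache_contents *dst = h->staging ? &h->staged : h->visible;

    if (dst->count == MAX_ENTRIES)
        return -1;
    dst->entries[dst->count].name = name;
    dst->entries[dst->count].is_config = is_config;
    dst->count++;

    if (h->staging && !is_config) {
        *h->visible = h->staged;    /* stands in for atomic replacement */
        h->staging = false;
    }
    return 0;
}
```

In this model the old contents remain visible through initialization and any config stores, and other handles switch to the new contents only when the first credential arrives.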

There are two caveats to this approach:

A. For this approach to work, the first non-config credential entry
to be stored must be the TGT, or the sole service ticket for non-TGT
ccaches, and important config entries must be written before that
credential is written.

This property is likely already true for callers that create a cache
from scratch, such as kinit and
gss_acquire_cred_with_password(). But when the cache is created by a
copy operation, as it is with gss_store_cred_into(), this property
relies on iteration (from the source ccache) preserving order, at
least loosely. Our implementation of the MEMORY ccache type
currently reverses order during iteration; that would need to be
changed, and other ccache types such as KEYRING would need to be
inventoried to determine if they preserve order.
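The ordering problem can be shown with a toy model. The sketch below assumes the reversal comes from inserting new entries at the head of a linked list; that is an assumption made for illustration, not a description of the actual MEMORY ccache code, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct node {
    const char *name;
    struct node *next;
};

/* Store by inserting at the head, so iteration yields newest first. */
static int list_store(struct node **head, const char *name)
{
    struct node *n = malloc(sizeof(*n));

    if (n == NULL)
        return -1;
    n->name = name;
    n->next = *head;
    *head = n;
    return 0;
}

/* Copy a cache the way a store-cred operation might: iterate the source
 * and store each entry in the destination.  Returns the name of the
 * first entry written to the destination, or NULL on error or if the
 * source is empty. */
static const char *copy_by_iteration(const struct node *src,
                                     struct node **dst)
{
    const char *first = NULL;

    assert(dst != NULL);
    for (; src != NULL; src = src->next) {
        if (list_store(dst, src->name) != 0)
            return NULL;
        if (first == NULL)
            first = src->name;
    }
    return first;
}

static void list_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

If the source was filled in the order config entry, TGT, service ticket, the copy writes the service ticket first, so under option 5 the destination would become visible before the TGT and config entries are in it.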

B. This approach requires the cache to be created via a single
handle, up through storing the first non-config entry. For example, a
caller who creates a cache, initializes it, closes it, and then
passes the name to some other agent to fill in credentials will no
longer work. A review of known Kerberos applications like kstart
would be indicated to make sure no one does this.
Simo suggests an option 6, applying only to collection types:

6. Create a new ccache within the collection to hold the new creds, and remove the old one once the new one is populated.  If the old one was the primary, set the primary to the new one before removing the old one.

Caveats of this approach:

A. krb5_cc_cache_match() needs to prefer the old one until the new one is populated.  If collection iteration respects creation order, this is automatic, but (as far as I know) no collection type currently provides that guarantee.  Otherwise krb5_cc_cache_match() needs a way to detect whether a cache is populated.

B. We need a way to remove the old one.  A krb5_cc_cache_match() before initializing the new cache could work, more or less.  (There's a danger that the old cache is destroyed by another process doing the same refresh, and then the cache name is reused for another purpose, and then we wind up destroying unrelated creds.  Unlikely if cache names are sufficiently random.)

C. Resetting the primary to reflect the update could race with another process setting the primary (either to refresh the same cache, or for another purpose entirely).

D. Nico has suggested that cache names within a collection should use the principal name rather than a random name.  That itself requires some changes to the API, but is also incompatible with this approach unless the unpopulated cache can temporarily use a different name and then be renamed.
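The sequence in option 6, together with the "populated" check from caveat A, can be modeled in a few lines. This is a single-process toy with invented names (toy_add, toy_match, toy_refresh). It ignores the races in caveats B and C and only shows the ordering of steps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_CACHES 4

struct toy_cache {
    const char *name;
    const char *principal;
    bool in_use;
    bool populated;
};

struct toy_collection {
    struct toy_cache caches[MAX_CACHES];
    int primary;                /* index of the primary cache, or -1 */
};

/* Add an empty (unpopulated) cache; returns its index or -1 if full. */
static int toy_add(struct toy_collection *c, const char *name,
                   const char *principal)
{
    for (int i = 0; i < MAX_CACHES; i++) {
        if (!c->caches[i].in_use) {
            c->caches[i] = (struct toy_cache){ name, principal, true, false };
            return i;
        }
    }
    return -1;
}

/* Caveat A: only populated caches are eligible to match. */
static int toy_match(const struct toy_collection *c, const char *principal)
{
    for (int i = 0; i < MAX_CACHES; i++) {
        const struct toy_cache *cc = &c->caches[i];
        if (cc->in_use && cc->populated &&
            strcmp(cc->principal, principal) == 0)
            return i;
    }
    return -1;
}

/* Option 6: find the old cache, create and populate a new one, move the
 * primary if needed, then remove the old one. */
static int toy_refresh(struct toy_collection *c, const char *principal,
                       const char *new_name)
{
    int old, fresh;

    assert(c != NULL);
    old = toy_match(c, principal);      /* caveat B: look up before adding */
    fresh = toy_add(c, new_name, principal);
    if (fresh < 0)
        return -1;
    /* ... new credentials would be stored here ... */
    c->caches[fresh].populated = true;
    if (old >= 0) {
        if (c->primary == old)
            c->primary = fresh;         /* caveat C: racy in real life */
        c->caches[old].in_use = false;
    }
    return fresh;
}
```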
 
However we do this, it would be good if callers needed only minimal effort to atomically refresh creds for a client principal.

One approach is a gic option to atomically store creds obtained by krb5_get_init_creds_*(), to be used instead of krb5_get_init_creds_opt_set_out_ccache().  This option could perhaps accept an optional string argument to name the collection or ccache to refresh, and use the default cache or collection otherwise.