My concern was within restore_dump()... it calls krb5_db_put_principal()
which in turn calls ulog_lock for each record (this was a major performance
hit especially with a database with about a half-million principals). Under
Kerberos 1.10, I found it only took 2-3 minutes to load the database (and
another 2-3 to send it over the wire), but I was seeing inordinately long
load times which I could trace back to the ulog_lock/unlock for every
record. I will have to review the master branch carefully, but if that code
also exists in the master branch, see how long it takes to do a dump/restore
for a database with about 500,000 principals (and compare it with the time
it took under Kerberos 1.10.x).
And I agree with you that I might have introduced the performance issue with
my first patch (when I moved the iproprole setting, which is why I set it to
IPROP_NULL as it was in the original state before I patched it).
I am confused how you say the third patch is broken, as by this point, the
ulog is mapped and I am setting the ulog at the end of the restore. During
the restore, you do not want to be touching the ulog at all (e.g.
IPROP_NULL) because the database being restored is "on the side". Once it is
promoted, one might argue the first thing should be to set it to
IPROP_SLAVE, then set the information, but that is not what was in the code
originally either.
The goal should be not to allow any ulog processing or locking during the
restore (which is to a temp db anyway), and then set the ulog state after
the restore.
I will play around with the master branch (what I supplied was based on the
1.11 branch and empirical testing, but the two branches have deviated enough
at this point).
Show quoted text
-----Original Message-----
From: Greg Hudson via RT [mailto:rt-comment@krbdev.mit.edu]
Sent: Wednesday, August 28, 2013 6:07 PM
To: basch@alum.mit.edu
Subject: [krbdev.mit.edu #7695] krb5-1.11.3/1.10.6 - full resync may fail
and still result in ulog being updated
Please send follow-ups by replying (without quoting) to a message you
receive from rt@krbdev.mit.edu, not by forwarding them to
krbdev@mit.edu. We've been scooping these follow-ups out of the krbdev
moderation queue and appending them to the ticket.
I believe the performance problem you noted was actually introduced by
your first patch. If you look at the code you moved in that patch,
you'll see that one of the things it does is set log_ctx->iproprole =
IPROP_NULL before restore_dump(). The other part of the third patch
should be unnecessary, because ulog_lock() does nothing if iproprole is
IPROP_NULL.
Your third patch also seems broken, in that setting log_ctx->iproprole =
IPROP_NULL before restore_dump() will prevent the ulog from being
updated afterwards. On master
(
https://github.com/krb5/krb5/commit/825fa2be6f119677a09acccb109ab976cfc601f8) I worked around this by setting iproprole to IPROP_SLAVE instead
of IPROP_NULL, but that only works because in master, kdb5.c only calls
ulog_lock() if iproprole is IPROP_MASTER.