Date: Wed, 28 Aug 2013 19:54:47 -0400
From: Richard Basch
Subject: RE: [krbdev.mit.edu #7695] krb5-1.11.3/1.10.6 - full resync may fail and still result in ulog being updated
To: rt-comment@krbdev.mit.edu
Message-ID: <07b001cea449$fa048090$ee0d81b0$@mit.edu>

Agreed. I'll fix my 1.11 branch. But the conditional dump also needs fixing; it is seriously broken (the fix was one of the three Git commits I supplied). I had a QA environment where I restored a dump, but the old from_master had a later serial number than anything present in the new ulog.
The conditional dump code does not do a proper sanity check, so it decided the from_master was OK to use. I fixed the logic to search through the ulog for the corresponding entry and so determine whether the dump header really does match a ulog entry.

Before searching, I perform a quick check of whether the latest dump matches the latest ulog header. This dovetails with the hierarchical tree propagation patches I submitted earlier, but the check is agnostic as to whether that patch is applied. It is the only optimization that is "safe" and guaranteed. For instance, if a ulog is reinitialized, the new ulog may not have any entries to match against, but the slaves should not have to wait until an update is performed before getting a database resync from the master solely because of this condition (that was the theoretical driver behind retaining the optimization).

I did not confirm whether this might result in a second full resync after the first update, but the issue would already have existed with or without conditional dumps. So I wasn't worried: for this section of code it seemed the logical action, and if the theoretical double-sync issue exists, it should be fixed elsewhere.

The only other possible optimization would have been to match against the first sno entry in the ulog header, since that may be the only other condition where a dump is not required. I did not do that initially because at busy sites, by the time you have made the dump and sent it, the other side may already need another one. It is a fringe case: accepting the first entry without a ulog history and not accepting it can both end up requiring another full resync anyway. I opted to simply let it do another dump if the entry wasn't in the actual ulog history.
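
To make the check described above concrete, here is a minimal sketch in C. It is only an illustration under stated assumptions: the types log_stamp and update_log and the function dump_matches_ulog are hypothetical simplifications, not the real krb5 kdb_log structures and not the code from the submitted commits. It assumes a dump header and a ulog entry match only when both the serial number and the timestamp agree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified types; NOT the real krb5 ulog structures. */
typedef struct {
    uint32_t sno;       /* update serial number */
    uint32_t seconds;   /* timestamp of the update */
    uint32_t useconds;
} log_stamp;

typedef struct {
    const log_stamp *entries;  /* update history, oldest first */
    size_t num_entries;        /* 0 right after the ulog is reinitialized */
    log_stamp last;            /* latest serial number/time in the ulog header */
} update_log;

/* Assumption: a match requires serial number and timestamp to agree. */
static bool stamp_equal(const log_stamp *a, const log_stamp *b)
{
    return a->sno == b->sno && a->seconds == b->seconds &&
           a->useconds == b->useconds;
}

/*
 * Return true if the existing dump (identified by its header stamp) can be
 * reused, false if a fresh dump is required.
 */
bool dump_matches_ulog(const update_log *log, const log_stamp *dump)
{
    size_t i;

    assert(log != NULL && dump != NULL);

    /* Quick check: the dump matches the latest ulog header.  This also
     * covers a reinitialized ulog that has no entries to search yet. */
    if (stamp_equal(&log->last, dump))
        return true;

    /* Otherwise the dump is usable only if its stamp really appears in the
     * ulog history.  A dump with a later serial number than anything in the
     * ulog (the restored-dump case) is not found and is rejected. */
    for (i = 0; i < log->num_entries; i++) {
        if (stamp_equal(&log->entries[i], dump))
            return true;
    }
    return false;
}
```

With this shape, a stale from_master whose serial number is higher than anything in the new ulog is rejected because no entry matches it, rather than being accepted on the strength of a serial-number comparison alone.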