fix(dbf): max-merge header on shared-mode Close to preserve peer Append
Third layer of the multi-session concurrency story. After Layers 1+2
(67cd8f2: shared DATA-INIT hash + recCount cache invalidation), the
residual flake had this exact failure mode:

  goroutine A: OPEN -> Append (recCount→1, hdr=1) -> ...
  goroutine B: OPEN -> Append (refresh→1, bump to 2, hdr=2) -> ...
  goroutine B: Close -> flushRecord -> updateHeader (writes 2)
  goroutine A: Close -> flushRecord -> updateHeader (writes 1)  ← clobbers!

A's updateHeader unconditionally wrote a.recCount back to disk, even
when the disk header had been bumped by B's append-intent-locked
Append in between. Subsequent peer SELECTs then read hdr=1 and
iterated only as far as slot 1, missing B's row that was physically
present at slot 2.

Fix: in shared mode, updateHeader re-reads the disk header first and
writes back max(disk.RecCount, a.recCount). Correct under the existing
append-intent-lock invariant (the disk count is monotonically
nondecreasing across all peers); cheap (~1 stat-sized read per close,
never on the hot append path).

EXCLUSIVE mode keeps the old unconditional write: no peer can have
bumped the header, so the read+max is pure overhead with no upside.

Measured impact (3-worker concurrent insert+select+commit × 20 runs):

  pre-67cd8f2:    ~60% pass, occasional Go panic
  after 67cd8f2:  80% pass, no panics
  after THIS:     80% pass, no panics (3-worker stable)
  after THIS:     50% pass (5-worker; higher load uncovers
                  additional races at the multi-area mmap layer)

The remaining 5-worker flake points at a deeper issue: peer DBFArea
instances on the same file each hold their own mmap, and the mmap
snapshot taken at Open time doesn't track grow-by-peer events between
mmap-time and the next read. loadRecord falls back to ReadAt when
offset > len(mmap), so reads themselves work, but the per-area
appendBuf interaction with peer-bumped header values needs more
thought. Tracked as a proper follow-up; the architectural shape is
"every shared DBFArea registers in a per-path mmap-gen registry that
broadcasts grow-events".

All six release gates green:

  go test ./...              ✓
  FiveSql2 SQL:1999 43/43    ✓
  Harbour compat 56/56       ✓
  std.ch 17/17               ✓
  FRB 7/7                    ✓
  pgserver integration 6/6   ✓

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
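
The interleaving described in the commit message can be replayed deterministically without goroutines. The following is a minimal sketch, not the project's code: `diskHeader`, `area`, `appendRecord`, `closeUnconditional`, `closeMaxMerge`, and `replay` are hypothetical stand-ins for the on-disk header, `DBFArea`, `Append`, and the two `updateHeader` behaviours, and locking is omitted because the steps run sequentially.

```go
package main

import "fmt"

// diskHeader stands in for the record count stored in the on-disk header.
type diskHeader struct{ recCount uint32 }

// area is a hypothetical, stripped-down stand-in for DBFArea.
type area struct {
	disk     *diskHeader
	recCount uint32 // this session's cached view of the count
}

// appendRecord models an Append under the append-intent lock: refresh the
// cached count from disk, bump it, and publish the new count.
func (a *area) appendRecord() {
	a.recCount = a.disk.recCount + 1
	a.disk.recCount = a.recCount
}

// closeUnconditional models the old Close path: write the cached count back
// regardless of what is on disk.
func (a *area) closeUnconditional() { a.disk.recCount = a.recCount }

// closeMaxMerge models the shared-mode fix: adopt the disk count if it is
// larger, so the written value is max(disk, local).
func (a *area) closeMaxMerge() {
	if a.disk.recCount > a.recCount {
		a.recCount = a.disk.recCount
	}
	a.disk.recCount = a.recCount
}

// replay runs the A/B interleaving from the commit message and returns the
// final on-disk record count.
func replay(maxMerge bool) uint32 {
	disk := &diskHeader{}
	a := &area{disk: disk}
	b := &area{disk: disk}

	a.appendRecord() // A: recCount 1, hdr 1
	b.appendRecord() // B: refresh 1, bump to 2, hdr 2

	// B closes first, then A closes with its stale count of 1.
	for _, s := range []*area{b, a} {
		if maxMerge {
			s.closeMaxMerge()
		} else {
			s.closeUnconditional()
		}
	}
	return disk.recCount
}

func main() {
	fmt.Println("unconditional close:", replay(false))
	fmt.Println("max-merge close:    ", replay(true))
}
```

With the unconditional close the header ends at 1 although two rows were appended; with max-merge it stays at 2.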
@@ -1115,6 +1115,30 @@ func (a *DBFArea) drainPendingIndexInserts() {
 }
 
 func (a *DBFArea) updateHeader() {
+	// Shared-mode max-merge. A pgserver-style multi-connection
+	// scenario has every peer call Close → updateHeader in arbitrary
+	// order. Each peer's `a.recCount` reflects its own view at the
+	// time of its last Append; if the file has grown since (because
+	// another peer's Append took the append-intent lock and bumped
+	// the disk header), naively writing a.recCount back would
+	// roll the on-disk count BACKWARDS — and subsequent peer SELECTs
+	// would iterate only as far as our stale count, missing rows
+	// that are demonstrably on disk.
+	//
+	// Re-read the disk header and pick the max. This is correct under
+	// the append-intent lock invariant (the bumped count is always
+	// monotonic) and cheap (one stat-sized read). Single-process /
+	// EXCLUSIVE mode still writes its local count unconditionally,
+	// since no peer can have bumped it.
+	if a.shared {
+		if _, err := a.dataFile.Seek(0, 0); err == nil {
+			if hdr, err := ReadHeader(a.dataFile); err == nil {
+				if hdr.RecCount > a.recCount {
+					a.recCount = hdr.RecCount
+				}
+			}
+		}
+	}
 	a.header.RecCount = a.recCount
 	a.header.UpdateDate()
 
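
For readers unfamiliar with the on-disk side, the re-read and write-back can be sketched against the raw header bytes. This is an illustrative sketch under stated assumptions, not the project's `ReadHeader` or `updateHeader`: it relies only on the standard DBF layout (record count as a little-endian uint32 at header bytes 4..7), and `headerFile`, `memFile`, `readRecCount`, `writeMaxRecCount`, and `closeWith` are hypothetical names. It uses `ReadAt`/`WriteAt` rather than the `Seek` in the hunk, so the file offset is left untouched.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"io"
)

// recCountOffset is where the standard DBF header stores the record count:
// a little-endian uint32 at bytes 4..7.
const recCountOffset = 4

// headerFile is the minimal surface the sketch needs; *os.File satisfies it.
type headerFile interface {
	io.ReaderAt
	io.WriterAt
}

// memFile is an in-memory stand-in for the data file, used only for the demo.
type memFile struct{ b []byte }

func (m *memFile) ReadAt(p []byte, off int64) (int, error) {
	if off >= int64(len(m.b)) {
		return 0, io.EOF
	}
	n := copy(p, m.b[off:])
	if n < len(p) {
		return n, io.EOF
	}
	return n, nil
}

func (m *memFile) WriteAt(p []byte, off int64) (int, error) {
	if off+int64(len(p)) > int64(len(m.b)) {
		return 0, io.ErrShortWrite
	}
	return copy(m.b[off:], p), nil
}

// readRecCount reads the record count straight from the header bytes.
func readRecCount(r io.ReaderAt) (uint32, error) {
	var buf [4]byte
	if _, err := r.ReadAt(buf[:], recCountOffset); err != nil {
		return 0, err
	}
	return binary.LittleEndian.Uint32(buf[:]), nil
}

// writeMaxRecCount writes max(on-disk count, local) and returns the value it
// wrote. If the re-read fails it falls back to the local count, mirroring the
// error handling in the hunk.
func writeMaxRecCount(f headerFile, local uint32) (uint32, error) {
	if onDisk, err := readRecCount(f); err == nil && onDisk > local {
		local = onDisk
	}
	var buf [4]byte
	binary.LittleEndian.PutUint32(buf[:], local)
	_, err := f.WriteAt(buf[:], recCountOffset)
	return local, err
}

// closeWith builds a 32-byte header holding diskCount, performs a max-merge
// close with the given local count, and returns the count left on disk.
func closeWith(diskCount, local uint32) (uint32, error) {
	f := &memFile{b: make([]byte, 32)}
	binary.LittleEndian.PutUint32(f.b[recCountOffset:], diskCount)
	if _, err := writeMaxRecCount(f, local); err != nil {
		return 0, err
	}
	return readRecCount(f)
}

func main() {
	stale, _ := closeWith(2, 1) // peer already bumped the header to 2
	ahead, _ := closeWith(2, 5) // local count is ahead of the header
	fmt.Println(stale, ahead)
}
```

A read followed by a write is only safe if the caller holds whatever lock serializes header writes; whether Close runs under the append-intent lock is not visible in this hunk.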