From 4fd14f63efaea0ec42ffb84ecc4aada4f7dde7d4 Mon Sep 17 00:00:00 2001
From: CharlesKWON
Date: Wed, 20 May 2026 08:30:01 +0900
Subject: [PATCH] fix(dbf): max-merge header on shared-mode Close to preserve peer Append
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Third layer of the multi-session concurrency story. After Layers 1+2
(67cd8f2: shared DATA-INIT hash + recCount cache invalidation), the
residual flake had this exact failure mode:

  goroutine A: OPEN -> Append (recCount→1, hdr=1) -> ...
  goroutine B: OPEN -> Append (refresh→1, bump to 2, hdr=2) -> ...
  goroutine B: Close -> flushRecord -> updateHeader (writes 2)
  goroutine A: Close -> flushRecord -> updateHeader (writes 1)  ← clobbers!

A's updateHeader unconditionally wrote a.recCount back to disk, even
when the disk header had been bumped by B's append-intent-locked
Append in between. Subsequent peer SELECTs then read hdr=1 and
iterated only as far as slot 1, missing B's row that was physically
present at slot 2.

Fix: in shared mode, updateHeader re-reads the disk header first and
writes back max(disk.RecCount, a.recCount). Correct under the
existing append-intent-lock invariant (the disk count is
monotonically nondecreasing across all peers); cheap (~1 stat-sized
read per close, never on the hot append path).

EXCLUSIVE mode keeps the old unconditional write: no peer can have
bumped the header, so the read+max is pure overhead with no upside.

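For reviewers, the interleaving above can be replayed in isolation.
The following is a minimal sketch, not the code in this patch: the
`disk` and `area` types, their methods, and `replay` are hypothetical
stand-ins for the on-disk header and DBFArea, and a mutex stands in
for the append-intent lock and for file I/O.

```go
package main

import (
	"fmt"
	"sync"
)

// disk is a hypothetical stand-in for the on-disk DBF header.
// Only the record count is modeled.
type disk struct {
	mu       sync.Mutex
	recCount uint32
}

// area is a hypothetical stand-in for one peer's DBFArea.
type area struct {
	d        *disk
	recCount uint32 // this peer's cached view of the record count
}

// appendRec models Append under the append-intent lock:
// refresh from disk, bump by one, write the header back.
func (a *area) appendRec() {
	a.d.mu.Lock()
	defer a.d.mu.Unlock()
	a.recCount = a.d.recCount + 1
	a.d.recCount = a.recCount
}

// closeNaive models the old behavior: write the cached count
// back unconditionally, even if it is stale.
func (a *area) closeNaive() {
	a.d.mu.Lock()
	defer a.d.mu.Unlock()
	a.d.recCount = a.recCount
}

// closeMaxMerge models the shared-mode fix: re-read the disk
// count first and never write back a smaller value.
func (a *area) closeMaxMerge() {
	a.d.mu.Lock()
	defer a.d.mu.Unlock()
	if a.d.recCount > a.recCount {
		a.recCount = a.d.recCount
	}
	a.d.recCount = a.recCount
}

// replay runs the interleaving from the commit message:
// A appends, B appends, B closes, A closes. It returns the
// record count left in the simulated header.
func replay(maxMerge bool) uint32 {
	d := &disk{}
	a := &area{d: d}
	b := &area{d: d}

	a.appendRec() // A: recCount 1, hdr 1
	b.appendRec() // B: refresh 1, bump to 2, hdr 2

	closeFn := (*area).closeNaive
	if maxMerge {
		closeFn = (*area).closeMaxMerge
	}
	closeFn(b) // B writes 2
	closeFn(a) // A writes 1 (naive) or keeps 2 (max-merge)
	return d.recCount
}

func main() {
	fmt.Println("naive:", replay(false))    // prints "naive: 1"
	fmt.Println("max-merge:", replay(true)) // prints "max-merge: 2"
}
```

The sketch only covers the header-count clobber. It says nothing
about per-area mmap state or appendBuf, which the real DBFArea also
carries.
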
Measured impact (3-worker concurrent insert+select+commit × 20 runs):

  pre-67cd8f2:    ~60% pass, occasional Go panic
  after 67cd8f2:  80% pass, no panics
  after THIS:     80% pass, no panics (3-worker stable)
  after THIS:     50% pass (5-worker; higher load uncovers additional
                  races at the multi-area mmap layer)

The remaining 5-worker flake points at a deeper issue: peer DBFArea
instances on the same file each hold their own mmap, and the mmap
snapshot taken at Open time doesn't track grow-by-peer events between
mmap-time and the next read. loadRecord falls back to ReadAt when
offset > len(mmap), so reads themselves work, but the per-area
appendBuf interaction with peer-bumped header values needs more
thought. Tracked as a proper follow-up; the architectural shape is
"every shared DBFArea registers in a per-path mmap-gen registry that
broadcasts grow-events".

All six release gates green:

  go test ./...          ✓
  FiveSql2 SQL:1999      43/43 ✓
  Harbour compat         56/56 ✓
  std.ch                 17/17 ✓
  FRB                    7/7 ✓
  pgserver integration   6/6 ✓

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 hbrdd/dbf/dbf.go | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/hbrdd/dbf/dbf.go b/hbrdd/dbf/dbf.go
index dc834d7..1034188 100644
--- a/hbrdd/dbf/dbf.go
+++ b/hbrdd/dbf/dbf.go
@@ -1115,6 +1115,30 @@ func (a *DBFArea) drainPendingIndexInserts() {
 }
 
 func (a *DBFArea) updateHeader() {
+	// Shared-mode max-merge. A pgserver-style multi-connection
+	// scenario has every peer call Close → updateHeader in arbitrary
+	// order. Each peer's `a.recCount` reflects its own view at the
+	// time of its last Append; if the file has grown since (because
+	// another peer's Append took the append-intent lock and bumped
+	// the disk header), naively writing a.recCount back would
+	// roll the on-disk count BACKWARDS, and subsequent peer SELECTs
+	// would iterate only as far as our stale count, missing rows
+	// that are demonstrably on disk.
+	//
+	// Re-read the disk header and pick the max. This is correct under
+	// the append-intent lock invariant (the bumped count is always
+	// monotonic) and cheap (one stat-sized read). Single-process /
+	// EXCLUSIVE mode still writes its local count unconditionally,
+	// since no peer can have bumped it.
+	if a.shared {
+		if _, err := a.dataFile.Seek(0, 0); err == nil {
+			if hdr, err := ReadHeader(a.dataFile); err == nil {
+				if hdr.RecCount > a.recCount {
+					a.recCount = hdr.RecCount
+				}
+			}
+		}
+	}
 	a.header.RecCount = a.recCount
 	a.header.UpdateDate()