fix(dbf): max-merge header on shared-mode Close to preserve peer Append

Third layer of the multi-session concurrency story. After Layers 1+2
(67cd8f2: shared DATA-INIT hash + recCount cache invalidation), the
residual flake had this exact failure mode:

    goroutine A: OPEN -> Append (recCount→1, hdr=1) -> ...
    goroutine B: OPEN -> Append (refresh→1, bump to 2, hdr=2) -> ...
    goroutine B: Close -> flushRecord -> updateHeader (writes 2)
    goroutine A: Close -> flushRecord -> updateHeader (writes 1)  ← clobbers!

A's updateHeader unconditionally wrote a.recCount back to disk, even
when the disk header had been bumped by B's append-intent-locked
Append in between. Subsequent peer SELECTs then read hdr=1 and
iterated only as far as slot 1, missing B's row that was physically
present at slot 2.

Fix: in shared mode, updateHeader re-reads the disk header first and
writes back max(disk.RecCount, a.recCount). Correct under the existing
append-intent-lock invariant (the disk count is monotonically
nondecreasing across all peers); cheap (~1 stat-sized read per close,
never on the hot append path).

EXCLUSIVE mode keeps the old unconditional write: no peer can have
bumped the header, so the read+max is pure overhead with no upside.

Measured impact (3-worker concurrent insert+select+commit × 20 runs):

    pre-67cd8f2:    ~60% pass, occasional Go panic
    after 67cd8f2:  80% pass, no panics
    after THIS:     80% pass, no panics (3-worker stable)
    after THIS:     50% pass (5-worker; higher load uncovers
                    additional races at the multi-area mmap layer)

The remaining 5-worker flake points at a deeper issue: peer DBFArea
instances on the same file each hold their own mmap, and the mmap
snapshot taken at Open time doesn't track grow-by-peer events between
mmap-time and the next read. loadRecord falls back to ReadAt when
offset > len(mmap), so reads themselves work, but the per-area
appendBuf interaction with peer-bumped header values needs more
thought. Tracked as a proper follow-up; the architectural shape is
"every shared DBFArea registers in a per-path mmap-gen registry that
broadcasts grow-events".

All six release gates green:

    go test ./...            ✓
    FiveSql2 SQL:1999 43/43  ✓
    Harbour compat 56/56     ✓
    std.ch 17/17             ✓
    FRB 7/7                  ✓
    pgserver integration 6/6 ✓

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
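
For readers who want to replay the clobber without a real DBF file, here is a minimal in-memory sketch of the interleaving described in the message. Everything in it is hypothetical: `header`, `area`, `closeUnconditional`, `closeMaxMerge`, and `replay` are illustration-only stand-ins, not the hbrdd/dbf types, and the append-intent lock is assumed rather than modelled (the steps run sequentially in the order the trace gives).

```go
package main

import "fmt"

// header stands in for the on-disk DBF header; only the record count matters here.
type header struct{ recCount uint32 }

// area is a hypothetical stand-in for a DBFArea: a local count plus a shared flag.
type area struct {
	recCount uint32
	shared   bool
}

// closeUnconditional models the old behaviour: write the local count as-is.
func (a *area) closeUnconditional(disk *header) {
	disk.recCount = a.recCount
}

// closeMaxMerge models the fix: in shared mode, re-read and keep the larger count.
func (a *area) closeMaxMerge(disk *header) {
	if a.shared && disk.recCount > a.recCount {
		a.recCount = disk.recCount
	}
	disk.recCount = a.recCount
}

// replay runs the interleaving from the message and returns the final disk count.
func replay(closeFn func(*area, *header)) uint32 {
	disk := &header{}
	a := &area{shared: true}
	b := &area{shared: true}

	a.recCount = disk.recCount + 1 // A appends: 0 -> 1
	disk.recCount = a.recCount
	b.recCount = disk.recCount + 1 // B refreshes to 1, then appends: -> 2
	disk.recCount = b.recCount

	closeFn(b, disk) // B closes first
	closeFn(a, disk) // A closes last, still holding its stale count of 1
	return disk.recCount
}

func main() {
	fmt.Println("unconditional:", replay((*area).closeUnconditional)) // prints "unconditional: 1"
	fmt.Println("max-merge:", replay((*area).closeMaxMerge))          // prints "max-merge: 2"
}
```

The sketch only covers the close-ordering race fixed here; it says nothing about the 5-worker mmap issue, which involves state this model does not have.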
2026-05-20 08:30:01 +09:00
parent 67cd8f2306
commit 4fd14f63ef
1 changed files with 24 additions and 0 deletions
--- a/hbrdd/dbf/dbf.go
+++ b/hbrdd/dbf/dbf.go
@@ -1115,6 +1115,30 @@ func (a *DBFArea) drainPendingIndexInserts() {
 }

 func (a *DBFArea) updateHeader() {
+	// Shared-mode max-merge. A pgserver-style multi-connection
+	// scenario has every peer call Close → updateHeader in arbitrary
+	// order. Each peer's `a.recCount` reflects its own view at the
+	// time of its last Append; if the file has grown since (because
+	// another peer's Append took the append-intent lock and bumped
+	// the disk header), naively writing a.recCount back would
+	// roll the on-disk count BACKWARDS — and subsequent peer SELECTs
+	// would iterate only as far as our stale count, missing rows
+	// that are demonstrably on disk.
+	//
+	// Re-read the disk header and pick the max. This is correct under
+	// the append-intent lock invariant (the bumped count is always
+	// monotonic) and cheap (one stat-sized read). Single-process /
+	// EXCLUSIVE mode still writes its local count unconditionally,
+	// since no peer can have bumped it.
+	if a.shared {
+		if _, err := a.dataFile.Seek(0, 0); err == nil {
+			if hdr, err := ReadHeader(a.dataFile); err == nil {
+				if hdr.RecCount > a.recCount {
+					a.recCount = hdr.RecCount
+				}
+			}
+		}
+	}
 	a.header.RecCount = a.recCount
 	a.header.UpdateDate()