Files
five/hbrdd/dbf/dbf.go
CharlesKWON cde86730b8 fix(compiler,hbrt,hbrdd,cli): pre-1.0 audit — 13 critical fixes
Senior-engineer / QA audit landed 13 silent-miscompile and data-
integrity fixes spanning the whole compiler+runtime+storage stack.
Each fix is paired with either an integration test in the suite or
a focused regression check; all 6 release gates stay green:
go test ./..., FiveSql2 43/43, Harbour compat 56/56, std.ch 17/17,
FRB 7/7, examples 65/71.

Compiler
--------

* genpc IF/ELSEIF jumpEnd2 patching (compiler/genpc/genpc.go).
  Per-ELSEIF branch terminators were stashed into `_ = jumpEnd2`
  and never patched — the relative offset stayed 0 and the runtime
  walked the next ELSEIF's PcOpJumpFalse opcode as if it were
  jump-offset data. Bytecode-level corruption in pcode mode. Now
  collected into a slice and patched at end-of-IF. Verified via
  Grade(95..50) cases 11a-e added to tests/frb/test_frb_pcode_sweep.

* countLocalsInStmts / scanBodyLocals missing bodies
  (compiler/gengo/gen_util.go, compiler/gengo/gengo.go). Frame-size
  counter skipped WATCH/TIMEOUT/PARALLEL FOR bodies, so a LOCAL
  declared inside one of those constructs got a slot index past
  the runtime's allocated count — silent NIL reads or out-of-range
  stomps.

* emitMethodDeclStandalone nested LOCAL (compiler/gengo/gen_class.go).
  Same bug class but on the *method* side. Pre-fix repro:

      METHOD Stomp(n) CLASS T
         LOCAL a := 1, b := 2
         IF n > 0
            LOCAL c := 30, d := 40, e := 50, f := 60
            Inner( n )
            IF c != 30 .OR. d != 40 .OR. e != 50 .OR. f != 60 ...

  printed `c, d, e, f = 5, NIL, NIL, NIL` because Inner's frame
  collided with Stomp's underallocated slot range. Now counts
  body-nested LOCALs into the frame and pre-allocates indices via
  scanBodyLocals.

* genpc unsupported-AST diagnostic surface (compiler/genpc/genpc.go,
  hbrt/pcode.go, cmd/five/main.go, hbrtl/frb.go). The `default`
  cases in emitStmt / emitExpr silently emitted PushNil / no-op
  for nodes the pcode generator doesn't implement (ClassDecl,
  MethodDecl, xBase commands, concurrency primitives, …). Added
  `PcodeModule.Warnings []string` populated by noteUnsupported,
  surfaced on stderr from the build pipeline. Users now see
  "pcode: AST node not supported in --pcode/FRB-pcode mode: stmt
  *ast.GoBlockStmt" instead of getting a silently broken module.

Runtime
-------

* class.go Send/tryBinaryOp t.self defer-restore (hbrt/class.go).
  Restoration was a plain `t.self = oldSelf` after `fn(t)`. Any
  panic in the method body skipped the line, so the next BEGIN
  SEQUENCE / RECOVER handler ran with the THROWING object's Self
  — `::field` resolved against the wrong receiver. Wrapped both
  restore sites in `defer func() { t.self = oldSelf }()`.
  Verified: pre-fix RECOVER saw "THROWER", post-fix "OUTER".

* hbfunc.go HB_FUNC parameter Frame() (hbrt/hbfunc.go). The
  RegisterDynamicFunc wrapper called `fn(ctx)` without ever
  calling Frame, so `ctx.ParC(1)` / `ctx.Local(n)` read through
  `t.curFrame.localBase + n - 1` against the *caller's* frame.
  Every #pragma BEGINDUMP HB_FUNC taking parameters silently
  returned "" / 0 / "" for them — masked by ParNIDef-style
  defaults. Wrapper now does `t.Frame(t.pendingParams, 0); defer
  t.EndProc()` before dispatch.

* pcode codeblock closure capture (hbrt/pcinterp.go, hbrt/pcode.go,
  hbrt/thread.go, compiler/genpc/genpc.go). PcOpPushBlock recorded
  `nDetached` but never copied enclosing locals; free vars in the
  block body fell through to memvar lookup → NIL. Wired full
  capture pipeline:
  - New opcodes PcOpPushDetached (0x59) / PcOpPopDetached (0x5A).
  - PushBlock now reads per-slot source-local indices and
    snapshots into bb.Detached at construction time.
  - New detachedMap in genpc auto-promotes any free var that
    resolves to an enclosing-frame local into a capture slot.
  - emitAssignAsExpr leaves the assigned value on the eval stack
    so SeqExpr items like `{|v| acc += v, acc }` work.
  - Thread tracks curBlock with paired Set/restore in the block's
    Fn wrapper for nested-block evaluation.
  Mutating capture (acc += v across successive Evals) now works.

* vm.NewThread statics + waFactory propagation (hbrt/vm.go).
  GoLaunch / GoLaunchBlock call NewThread directly. Previously
  the statics map and WA factory were applied only in Run(), so
  goroutine-spawned PRG code panicked on STATIC access ("static
  index out of range") and crashed dereferencing nil WA on any
  DB call. Both now happen inside NewThread under the same lock
  as TID assignment.

Data layer
----------

* dbf concurrent Append lock (hbrdd/dbf/dbf.go,
  hbrdd/dbf/locks_posix.go, hbrdd/dbf/locks_windows.go). Append
  bumped a local recCount with no file-system serialization. Two
  shared-mode processes both wrote at the same RecordOffset; one
  record silently overwrote the other. Added an append-intent
  byte-range lock at offset 0x7FFFFFFE + bounded retry, on-disk
  header refresh inside the locked region, and immediate header
  write so peers refresh past our slot.

* indexer negative numeric key encoding (hbrdd/dbf/indexer.go +
  new hbrdd/dbf/encode_numeric_test.go). `%20.10f` formats `-100`
  as `"     -100.0000000000"` and `99` as `"        99.0000000000"`.
  ASCII ' ' (0x20) < '-' (0x2D), so `99` lex-compared LESS than
  `-100` — every NTX/CDX index over a column that ever held a
  negative number returned wrong rows for SEEK / range scans.
  Replaced with a 1-byte sign prefix + 21-byte zero-padded
  magnitude (negatives use digit-complement) so byte order
  matches numeric order across signs and magnitudes. Format
  change: existing indexes built with the old encoding must be
  REINDEXed. Three unit tests pin the order.

* dbf Append index maintenance hooks (hbrdd/dbf/dbf.go,
  hbrdd/dbf/indexer.go). Append never inserted into open NTX/CDX
  indexes — the audit's canonical scenario `SET INDEX TO …;
  APPEND BLANK; REPLACE …; dbSeek …` silently missed the new
  record. Added optional IndexWriter interface, queue the new
  recNo in pendingIdxInserts, drain after flushRecord by calling
  InsertKey on every open writer-supporting engine. NTX
  participates (its existing rebuild-on-insert is correct);
  CDX online maintenance is deferred to a follow-up — those
  indexes still need REINDEX. Verified: post-fix SEEK("Charlie")
  after APPEND BLANK + REPLACE finds the new record.

* dbf PACK crash-safety (hbrdd/dbf/dbf.go). The old in-place
  rewrite read record N, overwrote slot M<N, then truncated.
  Power loss after partial loop left a file with overwritten
  prefix and no original copies of the records already advanced
  past — silent data loss. Rewrote to:
    1) drop mmap, build `<file>.pack.tmp` with all surviving
       records,
    2) Sync(),
    3) close original handle + os.Rename(tmp, orig) (atomic on
       same FS),
    4) reopen + re-mmap.
  TestComp_Pack passes; readers always see either the pre-PACK
  or post-PACK contents, never a half-state.

* mem RDD torn reads (hbrdd/mem/memrdd.go). The comment claimed
  in-place PutValue was safe because hbrt.Value "fits in a
  single machine word + pointer". hbrt.Value is 24 bytes (3
  words) — a concurrent reader could observe new type tag with
  stale scalar/ptr and type-confuse on the next AsXxx() call.
  Switched mu to sync.RWMutex; GetValue takes RLock,
  Append/PutValue/Delete/Recall take Lock. `go test -race
  ./hbrdd/mem/` clean.

Files touched
-------------

  compiler/gengo/gen_class.go, gen_util.go, gengo.go
  compiler/genpc/genpc.go
  hbrt/class.go, hbfunc.go, pcinterp.go, pcode.go, thread.go, vm.go
  hbrdd/dbf/dbf.go, indexer.go, locks_posix.go, locks_windows.go
  hbrdd/dbf/encode_numeric_test.go  (new)
  hbrdd/mem/memrdd.go
  cmd/five/main.go
  hbrtl/frb.go
  tests/frb/test_frb_pcode_sweep.prg

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 05:29:56 +09:00

1170 lines
30 KiB
Go

// Copyright (c) 2026 Charles KWON OhJun (charleskwonohjun@gmail.com)
// All rights reserved.
// DBFArea — the core DBF file driver.
// Byte-compatible with Harbour/Clipper DBF files.
//
// Harbour equivalent: DBFAREA in dbf1.c
// Inherits from BaseArea (WAAREA), implements Area + RecordManager + Locker.
//
// Reference:
// /mnt/d/harbour-core/src/rdd/dbf1.c
// docs/dbf-engine-spec.md
package dbf
import (
"five/hbrt"
"five/hbrdd"
"fmt"
"os"
"strings"
)
// DBFArea implements the DBF database driver.
// Harbour: struct DBFAREA in dbf1.c
type DBFArea struct {
hbrdd.BaseArea // embed WAAREA defaults
// File handles
dataFile *os.File
filePath string
shared bool
readOnly bool
// Header
header Header
fieldDescs []FieldDesc
offsets []uint16 // field byte offsets within record
// Record buffer — COW: recBuf may point into mmap (read-only) or ownBuf (writable)
recBuf []byte // current record view (mmap slice or ownBuf)
ownBuf []byte // owned writable buffer (allocated once)
recNo uint32 // current record number (1-based)
dirty bool // record buffer modified
recOwned bool // true = recBuf is ownBuf (writable), false = mmap slice (read-only)
// State
recCount uint32
ghost bool // at phantom record (after APPEND)
recLoaded bool // false = recBuf stale, need loadRecord()
// RecCount cache — skip the Seek-to-end syscall when nothing this
// process did has changed and no external invalidation has fired.
// See RecCount() + InvalidateRecCountCache().
recCountCached bool
recCountGen uint64
// Append batch buffer — accumulates records for single write at flush
appendBuf []byte // buffered appended records (not yet written to disk)
appendStart uint32 // first recNo in appendBuf (1-based)
// Pending index inserts for records that were appended but not
// yet indexed. flushRecord walks this set after the body bytes
// reach disk and calls InsertKey on every open index that
// implements IndexWriter — so a follow-up dbSeek finds the new
// record without a manual REINDEX. The set is populated by
// Append() and drained by flushRecord(); typed as a slice (not
// map) because append order matches the desired index walk.
pendingIdxInserts []uint32
// mmap for zero-copy record reads
mmapData []byte
// Memo file (FPT)
memoFile *FPTFile
// Index integration (NTX/CDX)
idxState *indexState
// File locking state (byte-range locks via fcntl)
fileLocked bool // FLOCK() held
lockedRecs map[uint32]bool // records locked by DBRLOCK()
// Field position cache — UPPER(name) → 1-based index.
// Built lazily on first FieldPosCache() call.
// SQLite: "column affinity binding" — O(1) vs O(n) linear scan.
fieldPosMap map[string]int
// SQL NULL bitmap (VFP/Harbour _NullFlags convention).
// nullFieldsIdx: descriptor index of the hidden _NullFlags field,
// or -1 if the table has no nullable columns.
// nullBitOf: user-field descriptor index → bit position within
// the _NullFlags byte range.
nullFieldsIdx int
nullBitOf map[int]int
}
// DBFDriver is the driver factory for DBF files.
type DBFDriver struct{}
func (d *DBFDriver) Name() string { return "DBF" }
func (d *DBFDriver) Open(params hbrdd.OpenParams) (hbrdd.Area, error) {
return openDBF(d, params)
}
func (d *DBFDriver) Create(params hbrdd.CreateParams) (hbrdd.Area, error) {
return createDBF(d, params)
}
func init() {
hbrdd.RegisterDriver(&DBFDriver{})
// Register aliases used by Harbour
hbrdd.RegisterDriver(&dbfAliasDriver{name: "DBFNTX"})
hbrdd.RegisterDriver(&dbfAliasDriver{name: "DBFCDX"})
hbrdd.RegisterDriver(&dbfAliasDriver{name: "DBFFPT"})
// SIX compatible drivers — same DBF engine, different semantics handled at higher level
hbrdd.RegisterDriver(&dbfAliasDriver{name: "SIXCDX"})
hbrdd.RegisterDriver(&dbfAliasDriver{name: "DBFNSX"})
hbrdd.RegisterDriver(&dbfAliasDriver{name: "DBFSIX"})
// Transfer format aliases
hbrdd.RegisterDriver(&dbfAliasDriver{name: "DBFDBT"})
// Bitmap/Rushmore RDD variants
hbrdd.RegisterDriver(&dbfAliasDriver{name: "BMDBFNTX"})
hbrdd.RegisterDriver(&dbfAliasDriver{name: "BMDBFCDX"})
hbrdd.RegisterDriver(&dbfAliasDriver{name: "BMDBFNSX"})
}
// dbfAliasDriver wraps DBFDriver with a different name.
type dbfAliasDriver struct {
name string
}
func (d *dbfAliasDriver) Name() string { return d.name }
func (d *dbfAliasDriver) Open(params hbrdd.OpenParams) (hbrdd.Area, error) {
return openDBF(&DBFDriver{}, params)
}
func (d *dbfAliasDriver) Create(params hbrdd.CreateParams) (hbrdd.Area, error) {
return createDBF(&DBFDriver{}, params)
}
// fptPathFromDBF returns the FPT memo file path for a given DBF path.
func fptPathFromDBF(dbfPath string) string {
base := dbfPath
if idx := strings.LastIndex(base, "."); idx >= 0 {
base = base[:idx]
}
return base + ".fpt"
}
// hasMemoField checks if any field descriptor is a MEMO type.
func hasMemoField(fields []FieldDesc) bool {
for _, f := range fields {
if f.Type == 'M' || f.Type == 'm' {
return true
}
}
return false
}
// --- Open ---
// Harbour: hb_dbfOpen in dbf1.c
func openDBF(drv *DBFDriver, params hbrdd.OpenParams) (*DBFArea, error) {
path := params.Path
if !hasExtension(path) {
path += ".dbf"
}
flag := os.O_RDWR
if params.ReadOnly {
flag = os.O_RDONLY
}
f, err := os.OpenFile(path, flag, 0)
if err != nil {
return nil, fmt.Errorf("open %s: %w", path, err)
}
area := &DBFArea{
dataFile: f,
filePath: path,
shared: params.Shared,
readOnly: params.ReadOnly,
}
area.BaseArea = hbrdd.BaseArea{}
// Step 1: Read header (32 bytes)
hdr, err := ReadHeader(f)
if err != nil {
f.Close()
return nil, err
}
area.header = *hdr
// Step 2: Read field descriptors
fieldCount := hdr.FieldCount()
if fieldCount <= 0 {
f.Close()
return nil, fmt.Errorf("invalid field count: %d", fieldCount)
}
fields, err := ReadFieldDescs(f, fieldCount)
if err != nil {
f.Close()
return nil, err
}
area.fieldDescs = fields
// Step 3: Build field offsets
area.offsets = BuildFieldOffsets(fields)
// Step 4: Validate record length
expectedRecLen := area.offsets[len(fields)]
if uint16(expectedRecLen) != hdr.RecordLen {
// Allow minor discrepancy (some DBF writers add extra bytes)
if uint16(expectedRecLen) > hdr.RecordLen {
f.Close()
return nil, fmt.Errorf("field offsets (%d) exceed record length (%d)",
expectedRecLen, hdr.RecordLen)
}
}
// Step 5: Allocate record buffer
area.ownBuf = make([]byte, hdr.RecordLen)
area.recBuf = area.ownBuf
area.recOwned = true
// Step 6: Set record count (shared mode: recalculate from file size)
if params.Shared {
fileSize, _ := f.Seek(0, 2)
area.recCount = uint32((fileSize - int64(hdr.HeaderLen)) / int64(hdr.RecordLen))
} else {
area.recCount = hdr.RecCount
}
// Step 7: Build FieldInfo for BaseArea. System fields (notably
// the hidden _NullFlags column carrying the SQL NULL bitmap) are
// held in fieldDescs for storage but filtered out of the public
// FieldInfo slice — user-visible counts stay stable.
fieldInfos := make([]hbrdd.FieldInfo, 0, fieldCount)
for _, fd := range fields {
if fd.Flags&FieldFlagSystem != 0 {
continue
}
fieldInfos = append(fieldInfos, hbrdd.FieldInfo{
Name: fd.GetName(),
Type: fd.Type,
Len: int(fd.Len),
Dec: int(fd.Dec),
Flags: fd.Flags,
})
}
area.InitFields(fieldInfos)
area.buildNullIndex()
// Step 8: Auto-open FPT if memo fields exist
if hasMemoField(fields) {
fptPath := fptPathFromDBF(path)
if fpt, err := OpenFPT(fptPath); err == nil {
area.memoFile = fpt
}
// If FPT doesn't exist, memo reads return empty string
}
// Step 9: mmap DBF for zero-copy record reads
area.mmapDBF()
// Step 10: Position at first record
area.FEof = (area.recCount == 0)
if area.recCount > 0 {
area.GoTo(1)
}
return area, nil
}
// --- Create ---
func createDBF(drv *DBFDriver, params hbrdd.CreateParams) (*DBFArea, error) {
path := params.Path
if !hasExtension(path) {
path += ".dbf"
}
f, err := os.Create(path)
if err != nil {
return nil, fmt.Errorf("create %s: %w", path, err)
}
// Build field descriptors
fieldDescs := make([]FieldDesc, len(params.Fields))
for i, fi := range params.Fields {
fieldDescs[i].SetName(fi.Name)
fieldDescs[i].Type = fi.Type
fieldDescs[i].Len = byte(fi.Len)
fieldDescs[i].Dec = byte(fi.Dec)
fieldDescs[i].Flags = fi.Flags
}
// If any user field is nullable, append the hidden _NullFlags
// system column. Must happen before recordLen is tallied so its
// bytes reserve space in the record layout.
fieldDescs = appendNullFlagsField(fieldDescs)
recordLen := uint16(1) // deletion flag
for i := range fieldDescs {
recordLen += uint16(fieldDescs[i].Len)
}
// Build header
hasMemo := hasMemoField(fieldDescs)
headerLen := uint16(HeaderSize + len(fieldDescs)*FieldDescSize + 1) // +1 for terminator
version := byte(VersionDBF3)
if hasMemo {
version = VersionFPT
}
hdr := Header{
Version: version,
RecCount: 0,
HeaderLen: headerLen,
RecordLen: recordLen,
}
hdr.UpdateDate()
// Write header
if err := WriteHeader(f, &hdr); err != nil {
f.Close()
return nil, err
}
// Write field descriptors
if err := WriteFieldDescs(f, fieldDescs); err != nil {
f.Close()
return nil, err
}
// Write header terminator
f.Write([]byte{HeaderTerminator})
// Write EOF marker
f.Write([]byte{EOFMarker})
f.Seek(0, 0)
area := &DBFArea{
dataFile: f,
filePath: path,
header: hdr,
fieldDescs: fieldDescs,
offsets: BuildFieldOffsets(fieldDescs),
ownBuf: make([]byte, recordLen),
recOwned: true,
recCount: 0,
}
area.recBuf = area.ownBuf
fieldInfos := make([]hbrdd.FieldInfo, len(params.Fields))
copy(fieldInfos, params.Fields)
area.InitFields(fieldInfos)
area.buildNullIndex()
area.FEof = true
// Auto-create FPT if memo fields exist
if hasMemo {
fptPath := fptPathFromDBF(path)
fpt, err := CreateFPT(fptPath, FPTDefaultBlock)
if err != nil {
f.Close()
return nil, fmt.Errorf("create memo file: %w", err)
}
area.memoFile = fpt
}
return area, nil
}
// --- Area interface ---
func (a *DBFArea) Driver() hbrdd.Driver { return &DBFDriver{} }
func (a *DBFArea) Close() error {
a.unmapDBF() // unmap before writing
if a.dirty {
a.flushRecord()
}
a.dataFile.WriteAt([]byte{EOFMarker}, a.header.EOFOffset())
a.updateHeader()
// Release any held byte-range locks before closing the fd — POSIX
// drops them implicitly on close, but being explicit avoids races
// with other workareas sharing the same underlying file.
a.releaseAllLocks()
if a.memoFile != nil {
a.memoFile.Close()
a.memoFile = nil
}
err := a.dataFile.Close()
a.BaseArea.Close()
return err
}
// MemoFile returns the FPT memo file, or nil if no memo fields.
func (a *DBFArea) MemoFile() *FPTFile { return a.memoFile }
// FullPath returns the on-disk path of the DBF. For dbInfo(DBI_FULLPATH).
func (a *DBFArea) FullPath() string { return a.filePath }
// IsShared returns true if the area was opened shared.
// For dbInfo(DBI_SHARED).
func (a *DBFArea) IsShared() bool { return a.shared }
// IsReadOnly returns true if the area was opened read-only.
// For dbInfo(DBI_ISREADONLY).
func (a *DBFArea) IsReadOnly() bool { return a.readOnly }
// FieldPosCache returns the 1-based field position for a field name.
// Uses a lazily-built hash map for O(1) lookup instead of O(n) linear scan.
// SQLite: "column affinity binding" — critical for SQL engines that call
// FieldPos() hundreds of thousands of times per query.
func (a *DBFArea) FieldPosCache(name string) int {
if a.fieldPosMap == nil {
a.fieldPosMap = make(map[string]int, len(a.fieldDescs))
for i, fd := range a.fieldDescs {
// Trim null bytes + spaces from the [11]byte field name
n := 0
for n < 11 && fd.Name[n] != 0 {
n++
}
fname := strings.ToUpper(strings.TrimSpace(string(fd.Name[:n])))
a.fieldPosMap[fname] = i + 1
}
}
if pos, ok := a.fieldPosMap[name]; ok {
return pos
}
return 0
}
func (a *DBFArea) Flush() error {
if a.dirty {
if err := a.flushRecord(); err != nil {
return err
}
}
a.unmapDBF()
a.dataFile.WriteAt([]byte{EOFMarker}, a.header.EOFOffset())
a.updateHeader()
err := a.dataFile.Sync()
a.mmapDBF() // re-mmap after flush
return err
}
// --- Movement ---
func (a *DBFArea) RecNo() uint32 { return a.recNo }
func (a *DBFArea) RecCount() (uint32, error) {
if a.shared {
// Shared-mode recount — file size may have grown from another
// process's Append. Skip the syscall on an opt-in cache window
// controlled by recCountCacheGen: callers that don't need
// cross-process freshness (e.g. SqlScan's one-shot row-count
// estimate on a workarea we opened this session) can leave the
// cache warm. Invalidate on our own Append and dbCloseAll.
if a.recCountCached && a.recCountGen == recCountCacheGen {
return a.recCount, nil
}
size, err := a.dataFile.Seek(0, 2)
if err != nil {
return a.recCount, err
}
a.recCount = uint32((size - int64(a.header.HeaderLen)) / int64(a.header.RecordLen))
a.recCountCached = true
a.recCountGen = recCountCacheGen
}
return a.recCount, nil
}
// recCountCacheGen — monotonic generation counter. Bumped by
// InvalidateRecCountCache() so callers that know they've performed
// cross-process-visible writes (or want a fresh sample) can force
// the next RecCount() to re-stat. Default semantics are "fresh is
// not required"; the cache is a hot-path optimization for workloads
// that don't share the file with another writer.
var recCountCacheGen uint64 = 1
// InvalidateRecCountCache bumps the generation counter so every DBFArea's
// cached count becomes stale and the next RecCount() call re-queries the
// filesystem.
func InvalidateRecCountCache() {
recCountCacheGen++
}
func (a *DBFArea) Deleted() bool {
a.loadRecord()
if len(a.recBuf) > 0 {
return a.recBuf[0] == RecordDeleted
}
return false
}
// GoTo positions the cursor at a specific record number.
// Harbour: hb_dbfGoTo in dbf1.c — lazy read: record loaded on first access.
func (a *DBFArea) GoTo(recNo uint32) error {
if a.dirty {
a.flushRecord()
}
a.FFound = false
if recNo == 0 || recNo > a.recCount {
a.recNo = a.recCount + 1
a.FEof = true
a.FBof = (recNo == 0)
a.recLoaded = false
a.recBuf = a.ownBuf
a.recOwned = true
for i := range a.recBuf {
a.recBuf[i] = ' '
}
return nil
}
// Read record — COW: mmap slice reference (zero-copy), fallback to file read
offset := a.header.RecordOffset(recNo)
recLen := int(a.header.RecordLen)
if a.mmapData != nil && int(offset)+recLen <= len(a.mmapData) {
// Zero-copy: recBuf points into mmap (read-only until PutValue)
a.recBuf = a.mmapData[offset : offset+int64(recLen)]
a.recOwned = false
} else if _, err := a.dataFile.ReadAt(a.ownBuf, offset); err != nil {
a.FEof = true
return fmt.Errorf("read record %d: %w", recNo, err)
} else {
a.recBuf = a.ownBuf
a.recOwned = true
}
a.recNo = recNo
a.recLoaded = true
a.FEof = false
a.FBof = false
a.dirty = false
a.ghost = false
return nil
}
// GoTop positions at the first record.
// Harbour: hb_dbfGoTop
func (a *DBFArea) GoTop() error {
// Use index order if active
if a.idxState != nil && a.idxState.current >= 0 {
a.FTop = true
a.FBottom = false
return a.GoTopIndexed()
}
a.FTop = true
a.FBottom = false
if a.recCount == 0 {
a.FEof = true
a.recNo = 1
return nil
}
if err := a.GoTo(1); err != nil {
return err
}
// Skip filtered/deleted records
return a.skipFilter(1)
}
// GoBottom positions at the last record.
// Harbour: hb_dbfGoBottom
func (a *DBFArea) GoBottom() error {
// Use index order if active
if a.idxState != nil && a.idxState.current >= 0 {
a.FTop = false
a.FBottom = true
return a.GoBottomIndexed()
}
a.FTop = false
a.FBottom = true
if a.recCount == 0 {
a.FEof = true
a.recNo = 1
return nil
}
if err := a.GoTo(a.recCount); err != nil {
return err
}
return a.skipFilter(-1)
}
// Skip moves the cursor by count records.
// Harbour: hb_dbfSkip
func (a *DBFArea) Skip(count int64) error {
// Use index order if active
if a.idxState != nil && a.idxState.current >= 0 {
a.FTop = false
a.FBottom = false
return a.SkipIndexed(count)
}
a.FTop = false
a.FBottom = false
if count == 0 {
// Skip 0 = re-evaluate filter at current position
return a.skipFilter(1)
}
if count > 0 {
for i := int64(0); i < count; i++ {
if a.FEof {
break
}
newRec := a.recNo + 1
if newRec > a.recCount {
if a.dirty {
a.flushRecord()
}
a.FEof = true
a.recNo = a.recCount + 1
break
}
if err := a.GoTo(newRec); err != nil {
return err
}
if err := a.skipFilter(1); err != nil {
return err
}
}
} else {
for i := int64(0); i > count; i-- {
if a.recNo <= 1 {
a.FBof = true
break
}
if err := a.GoTo(a.recNo - 1); err != nil {
return err
}
if err := a.skipFilter(-1); err != nil {
return err
}
}
}
return nil
}
// mmapDBF / unmapDBF live in mmap_posix.go (Linux/Darwin, syscall.Mmap)
// and mmap_windows.go (CreateFileMapping / MapViewOfFile). Keeping the
// platform-specific ioctl-ish calls out of this file lets cross-builds
// stay clean.
// loadRecord reads the current record — from mmap or file fallback.
func (a *DBFArea) loadRecord() {
if a.recLoaded || a.FEof || a.recNo == 0 || a.recNo > a.recCount {
return
}
offset := a.header.RecordOffset(a.recNo)
a.dataFile.ReadAt(a.recBuf, offset)
a.recLoaded = true
}
// --- Field access ---
func (a *DBFArea) GetValue(fieldIndex int) (hbrt.Value, error) {
a.loadRecord()
if fieldIndex < 0 || fieldIndex >= len(a.fieldDescs) {
return hbrt.MakeNil(), fmt.Errorf("field index out of range: %d", fieldIndex)
}
if a.FEof {
return hbrt.MakeNil(), nil
}
// SQL NULL: nullable fields check the hidden _NullFlags bitmap
// first; a set bit means the raw bytes carry no meaningful value
// and the reader should surface NIL to the caller.
if a.isFieldNull(fieldIndex) {
return hbrt.MakeNil(), nil
}
fd := &a.fieldDescs[fieldIndex]
// MEMO field: read from FPT and return string
if (fd.Type == 'M' || fd.Type == 'm') && a.memoFile != nil {
blockVal := GetFieldValue(a.recBuf, a.offsets[fieldIndex], fd)
blockNo := uint32(blockVal.AsNumInt())
if blockNo == 0 {
return hbrt.MakeString(""), nil
}
data, err := a.memoFile.ReadMemo(blockNo)
if err != nil {
return hbrt.MakeString(""), nil
}
return hbrt.MakeString(string(data)), nil
}
return GetFieldValue(a.recBuf, a.offsets[fieldIndex], fd), nil
}
func (a *DBFArea) PutValue(fieldIndex int, val hbrt.Value) error {
a.loadRecord()
// COW: promote read-only mmap slice to writable owned buffer
if !a.recOwned {
copy(a.ownBuf, a.recBuf)
a.recBuf = a.ownBuf
a.recOwned = true
}
if a.readOnly {
return fmt.Errorf("table is read-only")
}
if fieldIndex < 0 || fieldIndex >= len(a.fieldDescs) {
return fmt.Errorf("field index out of range: %d", fieldIndex)
}
// SQL NULL handling for nullable fields: a NIL write sets the
// bitmap bit and leaves the raw bytes alone (readers will short-
// circuit via isFieldNull before reaching the type codec). A
// non-NIL write clears the bit so the raw value surfaces again.
if _, nullable := a.nullBitOf[fieldIndex]; nullable {
if val.IsNil() {
a.setFieldNull(fieldIndex, true)
a.dirty = true
return nil
}
a.setFieldNull(fieldIndex, false)
}
fd := &a.fieldDescs[fieldIndex]
// MEMO field: write string to FPT, store block number in DBF
if (fd.Type == 'M' || fd.Type == 'm') && a.memoFile != nil && val.IsString() {
data := []byte(val.AsString())
blockNo, err := a.memoFile.WriteMemo(data)
if err != nil {
return fmt.Errorf("write memo: %w", err)
}
putMemoRef(a.recBuf[a.offsets[fieldIndex]:a.offsets[fieldIndex]+uint16(fd.Len)], fd.Len, blockNo)
a.dirty = true
return nil
}
PutFieldValue(a.recBuf, a.offsets[fieldIndex], fd, val)
a.dirty = true
return nil
}
// --- Record operations ---
// Append adds a new blank record.
// Harbour: hb_dbfAppend — writes are deferred until Flush/Close/GoTo.
func (a *DBFArea) Append() error {
if a.readOnly {
return fmt.Errorf("table is read-only")
}
// Batch consecutive APPENDs: save current dirty record to appendBuf instead of writing to disk.
if a.dirty && a.ghost {
// Previous was also an APPEND — accumulate in batch buffer (no syscall)
recLen := int(a.header.RecordLen)
if a.appendBuf == nil {
a.appendStart = a.recNo
a.appendBuf = make([]byte, 0, recLen*256) // pre-alloc for ~256 records
}
a.appendBuf = append(a.appendBuf, a.recBuf[:recLen]...)
} else if a.dirty {
a.flushRecord()
}
// Unmap — file will grow
if a.mmapData != nil {
a.unmapDBF()
}
// Shared mode: serialize concurrent appends across processes and
// re-read the header so we increment past whatever any peer has
// already committed. Without this, two processes racing
// `dbAppend` both bumped their *local* recCount from the stale
// value they cached at Open time and wrote to the same byte
// range — one record silently overwrote the other.
//
// We also immediately write the updated header to disk *inside*
// the locked region so the next peer to acquire the lock sees
// our reservation. The record body itself is still flushed
// lazily by flushRecord; subsequent peers will pick up the
// reserved slot via the bumped RecCount before our record body
// reaches disk.
if a.shared {
if err := a.lockAppendIntent(); err != nil {
return err
}
defer a.unlockAppendIntent()
if _, err := a.dataFile.Seek(0, 0); err == nil {
if hdr, err := ReadHeader(a.dataFile); err == nil {
if hdr.RecCount > a.recCount {
a.recCount = hdr.RecCount
}
}
}
}
a.recCount++
a.recNo = a.recCount
a.header.RecCount = a.recCount
if a.shared {
// Persist the bumped RecCount immediately so a concurrent
// appender refreshes past our slot in its own locked region.
a.header.UpdateDate()
if _, err := a.dataFile.Seek(0, 0); err == nil {
_ = WriteHeader(a.dataFile, &a.header)
}
}
// Promote to owned buffer for writing
a.recBuf = a.ownBuf
a.recOwned = true
for i := range a.recBuf {
a.recBuf[i] = ' '
}
a.FEof = false
a.FBof = false
a.dirty = true
a.ghost = true
a.recLoaded = true
// Queue the new record for index maintenance. flushRecord drains
// the queue after the body bytes are durable — by that point the
// user has had a chance to PutValue into the appended record
// (the common APPEND BLANK → REPLACE → dbSeek pattern), and the
// key expression resolves against the final field values.
if a.idxState != nil && len(a.idxState.indexes) > 0 {
a.pendingIdxInserts = append(a.pendingIdxInserts, a.recNo)
}
return nil
}
// Delete marks the current record as deleted.
func (a *DBFArea) Delete() error {
if a.readOnly || a.FEof {
return nil
}
if !a.recOwned {
copy(a.ownBuf, a.recBuf)
a.recBuf = a.ownBuf
a.recOwned = true
}
a.recBuf[0] = RecordDeleted
a.dirty = true
return nil
}
// Recall undeletes the current record.
func (a *DBFArea) Recall() error {
if a.readOnly || a.FEof {
return nil
}
if !a.recOwned {
copy(a.ownBuf, a.recBuf)
a.recBuf = a.ownBuf
a.recOwned = true
}
a.recBuf[0] = RecordActive
a.dirty = true
return nil
}
// Pack removes all deleted records.
// Harbour: hb_dbfPack — requires exclusive access.
//
// Crash-safe: writes the surviving records into `<file>.pack.tmp`,
// fsyncs, then atomic-renames over the original. Power loss /
// kill -9 between Reads and Writes of the old in-place rewrite
// could leave the file with overwritten prefix and a tail that
// looked legitimate but contained no original copies of the
// records we'd already advanced past — silent data loss.
func (a *DBFArea) Pack() error {
if a.readOnly {
return fmt.Errorf("table is read-only")
}
if a.shared {
return fmt.Errorf("PACK requires exclusive access")
}
if a.dirty {
a.flushRecord()
}
// Temporarily disable index to avoid indexed navigation during PACK
var savedIdx *indexState
if a.idxState != nil {
savedIdx = a.idxState
a.idxState = nil
}
// Drop mmap so we can do clean reads + so the source file can
// be replaced atomically below.
if a.mmapData != nil {
a.unmapDBF()
}
origPath := a.dataFile.Name()
tmpPath := origPath + ".pack.tmp"
// Best-effort cleanup of any leftover tmp from a prior crash.
_ = os.Remove(tmpPath)
tmpFile, err := os.OpenFile(tmpPath, os.O_RDWR|os.O_CREATE|os.O_TRUNC, 0644)
if err != nil {
return fmt.Errorf("PACK: create temp: %w", err)
}
// Write the same header to the temp; RecCount is patched at the
// end once we know the survivor count.
if _, err := tmpFile.Seek(0, 0); err == nil {
_ = WriteHeader(tmpFile, &a.header)
}
// Pad up to HeaderLen with the original header bytes (field
// descriptors etc.) so the new file has a structurally
// identical envelope. We replay from the source.
hdrBytes := make([]byte, a.header.HeaderLen)
if _, err := a.dataFile.ReadAt(hdrBytes, 0); err == nil {
_, _ = tmpFile.WriteAt(hdrBytes, 0)
}
outRec := uint32(0)
buf := make([]byte, a.header.RecordLen)
for recNo := uint32(1); recNo <= a.recCount; recNo++ {
offset := a.header.RecordOffset(recNo)
if _, err := a.dataFile.ReadAt(buf, offset); err != nil {
tmpFile.Close()
os.Remove(tmpPath)
return err
}
if buf[0] != RecordDeleted {
outRec++
outOffset := a.header.RecordOffset(outRec)
if _, err := tmpFile.WriteAt(buf, outOffset); err != nil {
tmpFile.Close()
os.Remove(tmpPath)
return err
}
}
}
a.recCount = outRec
a.header.RecCount = outRec
// Write EOF marker on the temp.
eofOff := int64(a.header.HeaderLen) + int64(outRec)*int64(a.header.RecordLen)
if _, err := tmpFile.WriteAt([]byte{EOFMarker}, eofOff); err != nil {
tmpFile.Close()
os.Remove(tmpPath)
return err
}
// Patch the survivor count + date into the temp header.
a.header.UpdateDate()
if _, err := tmpFile.Seek(0, 0); err == nil {
_ = WriteHeader(tmpFile, &a.header)
}
// Make the survivor durable before the rename so a crash between
// rename and process exit doesn't leave a truncated temp.
if err := tmpFile.Sync(); err != nil {
tmpFile.Close()
os.Remove(tmpPath)
return err
}
tmpFile.Close()
// Close the current handle, then atomic-rename. POSIX rename
// guarantees readers/writers observe either the old or the
// new contents — never an in-progress half-state.
a.dataFile.Close()
if err := os.Rename(tmpPath, origPath); err != nil {
os.Remove(tmpPath)
return fmt.Errorf("PACK: rename temp: %w", err)
}
// Reopen the (now repacked) file in the same mode and re-mmap.
flag := os.O_RDWR
if a.readOnly {
flag = os.O_RDONLY
}
newF, err := os.OpenFile(origPath, flag, 0644)
if err != nil {
return fmt.Errorf("PACK: reopen: %w", err)
}
a.dataFile = newF
a.mmapDBF()
// Reposition (natural order, no index yet)
if a.recCount > 0 {
a.GoTo(1)
} else {
a.FEof = true
a.recNo = 1
}
// Rebuild indexes (record numbers changed after PACK)
if savedIdx != nil {
a.idxState = savedIdx
if err := a.OrderListRebuild(); err != nil {
return err
}
}
return nil
}
// Zap removes all records.
func (a *DBFArea) Zap() error {
if a.readOnly || a.shared {
return fmt.Errorf("ZAP requires exclusive access")
}
// Save index state
var savedIdx *indexState
if a.idxState != nil {
savedIdx = a.idxState
a.idxState = nil
}
a.recCount = 0
a.header.RecCount = 0
// Truncate to header + EOF
a.dataFile.Truncate(int64(a.header.HeaderLen) + 1)
a.dataFile.WriteAt([]byte{EOFMarker}, int64(a.header.HeaderLen))
a.updateHeader()
a.FEof = true
a.recNo = 1
// Rebuild indexes (empty after ZAP)
if savedIdx != nil {
a.idxState = savedIdx
a.OrderListRebuild() // rebuilds empty indexes
}
return nil
}
// --- Internal ---
func (a *DBFArea) flushRecord() error {
if !a.dirty || a.FEof {
return nil
}
// Flush any accumulated append batch first (single write for all buffered records)
if len(a.appendBuf) > 0 {
offset := a.header.RecordOffset(a.appendStart)
recLen := int(a.header.RecordLen)
// Append current dirty record to batch too
all := append(a.appendBuf, a.recBuf[:recLen]...)
_, err := a.dataFile.WriteAt(all, offset)
a.appendBuf = nil
a.appendStart = 0
if err == nil {
a.dirty = false
a.drainPendingIndexInserts()
}
return err
}
offset := a.header.RecordOffset(a.recNo)
_, err := a.dataFile.WriteAt(a.recBuf, offset)
if err == nil {
a.dirty = false
a.drainPendingIndexInserts()
}
return err
}
// drainPendingIndexInserts pushes any queued APPENDed records into
// every open IndexWriter so a follow-up dbSeek finds them. Engines
// that don't implement IndexWriter (CDX today) are skipped — those
// indexes remain stale and must be REINDEXed manually. Errors from
// InsertKey are swallowed because a partial index update is the
// strict-best worst case; reporting them up the flush stack would
// abort the user's transaction over a recoverable index issue.
func (a *DBFArea) drainPendingIndexInserts() {
if len(a.pendingIdxInserts) == 0 || a.idxState == nil {
a.pendingIdxInserts = nil
return
}
for _, recNo := range a.pendingIdxInserts {
for i, idx := range a.idxState.indexes {
writer, ok := idx.(IndexWriter)
if !ok {
continue
}
if i >= len(a.idxState.keyExprs) {
continue
}
key := a.evalKeyExpr(a.idxState.keyExprs[i], recNo)
_ = writer.InsertKey(key, recNo)
}
}
a.pendingIdxInserts = nil
}
func (a *DBFArea) updateHeader() {
a.header.RecCount = a.recCount
a.header.UpdateDate()
a.dataFile.Seek(0, 0)
WriteHeader(a.dataFile, &a.header)
}
// skipFilter advances cursor past deleted/filtered records.
// Harbour: hb_dbfSkipFilter in dbf1.c
// direction: 1 = forward, -1 = backward
func (a *DBFArea) skipFilter(direction int64) error {
if direction == 0 {
direction = 1
}
setDel := hbrdd.IsSetDeleted != nil && hbrdd.IsSetDeleted()
if !setDel {
return nil
}
for !a.FEof && !a.FBof {
if !a.Deleted() {
break
}
// Move to next/prev record
if direction > 0 {
newRec := a.recNo + 1
if newRec > a.recCount {
a.FEof = true
a.recNo = a.recCount + 1
break
}
if err := a.GoTo(newRec); err != nil {
return err
}
} else {
if a.recNo <= 1 {
a.FBof = true
if err := a.GoTo(1); err != nil {
return err
}
break
}
if err := a.GoTo(a.recNo - 1); err != nil {
return err
}
}
}
return nil
}
// --- Helpers ---
func hasExtension(path string) bool {
for i := len(path) - 1; i >= 0; i-- {
if path[i] == '.' {
return true
}
if path[i] == '/' || path[i] == '\\' {
return false
}
}
return false
}