Commit Graph

46 Commits

Author SHA1 Message Date
000500e034 fix(pp,parser,gengo): pre-release blocker round (Wave 1)
Six audit-driven blockers landed together because they're tangled:

  * MENU TO removed from std.ch — the rule expanded to a call to a
    nonexistent __MenuTo() RTL symbol, so any user code with `MENU
    TO choice` compiled clean and panicked at runtime. Behavior
    pre-this-round was a parser silent no-op, which is at least
    consistent. Restore that until @ PROMPT (the companion command)
    actually lands.

  * COUNT now requires `TO <var>`. The earlier `[TO <v>]` optional
    bracket was a Harbour-pattern transcription error: the result
    template references `<v>` unconditionally, so a bare `COUNT`
    expanded to ungrammatical ` := 0 ; dbEval(...)` and the
    PRG parser rejected it. Match Harbour's std.ch which makes TO
    mandatory.

  * UPDATE FROM ... REPLACE now requires `FROM`/`ON`/`REPLACE` all
    three. Same root cause as COUNT: the result template uses
    `<key>`, `<f1>`, `<x1>` unconditionally; missing any of them
    produced broken syntax. Tightened to fail loudly rather than
    silently mis-expand.

  * CLOSE <unknown_alias> no longer closes the *current* workarea.
    SelectByAlias was a silent no-op when the alias was missing,
    leaving WASaveAndSelectAlias to evaluate the inner DbCloseArea()
    against the originally-selected WA — a real data-loss footgun.
    SelectByAlias now returns bool; WASaveAndSelectAlias switches to
    the no-area sentinel (0) on miss so the inner expression's
    Current() returns nil and short-circuits.

  * SUM <x1>, <xN> TO <v1>, <vN> — multi-pair form supported.
    Required two pieces:

       1. matchSegment's regular-marker stop-boundary now combines
          outerTail literals AND the segment's repeat boundary so
          `[, <xN>]` doesn't let `<xN>` swallow past the next ','.

       2. **Five parser miscompiled comma-separated expressions in
          code blocks.** `{|| e1, e2, e3 }` kept only the last expr
          and threw away earlier ones at *AST level*, so all their
          side effects vanished. New SeqExpr AST node + emitter
          (emit each, pop intermediate results) + folding/walk
          updates fix the underlying bug, which also unbreaks any
          other block that relied on comma sequencing.

  * pp.go's `;` continuation joiner now strips exactly one trailing
    `;` per iteration, preserving Harbour's `;;` convention (literal
    `;` followed by a continuation marker). Without this the SUM
    rule's chained `<v1> :=[ <vN> :=] 0 ; ; dbEval(...)` collapsed
    to a missing statement separator.

  * parseExprStmt's xBase fallback switch is back in sync with
    parseIdentStmt — COPY/SORT/COUNT/SUM/AVERAGE/TOTAL/UPDATE/JOIN/
    DISPLAY/LIST removed (std.ch handles all of them now). Leaving
    them in the fallback masked typos as silent no-ops.

Gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 07:45:20 +09:00
f4ed42556b checkpoint: season-wide bug fix campaign + infra
Cumulative season's silent-bug hunting (~62 fixes) across the FiveSql2
SQL engine, the Five compiler/runtime, and the hbrdd RDD layer. Saved
as a single checkpoint before refactoring the parser to delegate xBase
command translation to the preprocessor.

Highlights:

FiveSql2 engine (_FiveSql2/src/)
- prefix-glob index attach -> explicit convention (<table>_pk.ntx,
  <table>_uq.ntx, <table>.cdx) — fixes silent multi-row INSERT row-drop
- DROP/CREATE TABLE FErase chain extended (.cdx, .fsc, .fsv, .dbt, .fpt)
- COUNT(DISTINCT col) parsed + aggregated via hSeen hash
- UNION column-count mismatch returns SQL_ERR_GRAMMAR (was silent)
- DISTINCT + ORDER BY hidden-col leak fixed (trim before DISTINCT)
- Derived table FROM (SELECT...) + JOIN right-side derived
- Self-FK CASCADE depth 2+ via SqlGetSingleColPK pre-collect
- LAG/LEAD default arg uses SqlEvalRowExpr (handles -N const exprs)
- DATE literal round-trip validation (Feb 29 non-leap rejected)
- CREATE OR REPLACE VIEW; CREATE VIEW errors on already-exists
- AlterTable type dispatcher comma-wrapped (1-char type "A" no longer
  matches CHARACTER)

Compiler / runtime
- gengo: HB_ -> FV_ prefix on emitted Go function names (Five identity)
- gengo split: emit_block.go, emit_stmt.go, folding.go extracted
- parser/stmtreg.go nudges
- hbrt: debug TUI/CLI restructure (debugcmd, debugkey, termios_*),
  windows debug stubs collapsed
- thread/vm/value/class/pcinterp tightening from panic traces

RDD layer (hbrdd/)
- dbf: null bitmap support (null.go + null_test.go), mmap split
  (mmap_posix.go / mmap_windows.go), byte-level numeric parse
- ntx/cdx: windows mmap parity
- workarea + mem RDD: cross-area state-bleed fixes

RTL (hbrtl/)
- errorlog rewrite with platform-specific FD (errorlog_fd_unix /
  errorlog_fd_other)
- sqlscan, sqlhelpers, indexrtl, datetime extensions

Gates green at checkpoint:
- go test ./...        : PASS
- FiveSql2 SQL:1999    : 43/43
- Harbour compat       : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 09:26:25 +09:00
8a3f296e9a perf(dbf): byte-level numeric parse + RecCount cache
Two hot-path fixes for DBF reads surfaced by the bulk-bench profile.

1. parseNumericField decimal path — was 23% of flat CPU on BULK_CTE.
   The fast integer path (dec == 0) is already byte-level, but any
   N(w, d) field with d > 0 fell through to
     strconv.ParseFloat(string(raw[start:end]), 64)
   allocating per-row. A 10k-row CTE insert ran this 200k+ times.
   Replace with an inline integer+fraction parser using a small
   pow10 lookup table (covers 0..19 decimal places). Unexpected
   characters still fall back to strconv for correctness.
   Result:
     BULK_CTE_10k_20iter  187 → 83 ms  (2.25x)
     BULK_SUBQ_10k_20iter 102 → 22 ms  (4.6x)

2. DBFArea.RecCount in shared mode was doing Seek(0, 2) on every
   call. SqlScan calls it once per query for its result-array
   pre-allocation (~0.2 ms × 1000 queries = 0.2s of CPU on the
   bench). Cache the count per-area, keyed by a process-wide
   generation counter. Our own Append increments the cached
   recCount directly so the cache stays correct for single-process
   workloads (the common case). Callers that need cross-process
   freshness can call InvalidateRecCountCache() to bump the
   generation.
   SQL bench: modest 1-3 ms drops on B1/B2/B3/B6/B7.

Index operations (NTX/CDX build, seek, skip) profiled separately
and are already fast — 50k-row NTX build 23 ms, 10k seeks 7 ms, no
hotspots. Left untouched.

FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 23:38:54 +09:00
dd270d5d9d perf: RTL Go-native migration — 27 optimizations, DML up to 70-90x
Systematic pass through PRG hot paths, promoting them to Go RTL while
preserving Harbour/FiveSql2 semantics. Full log in
docs/RTL-Go-Native-Migration.md.

Bench (bench_sql) vs 2026-04-08 baseline
 - B1  SELECT *             2,192 → 114   µs   (19x)
 - B6  INNER JOIN           9,291 → 233   µs   (40x)
 - B7  CTE simple           8,037 → 129   µs   (62x)
 - B9  ROW_NUMBER           3,705 → 265   µs   (14x)
 - B10 RANK PARTITION       4,748 → 309   µs   (15x)
 - B12 INSERT (WA cache)    4,319 →  63   µs   (69x)
 - B13 UPDATE (WA cache)    6,144 →  68   µs   (90x)
 - B15 CTE+WIN+JOIN        18,395 → 1,873 µs   (10x)

Infrastructure
 - HbHash O(1) Index preserving insertion order (Harbour KEEPORDER)
 - HbDeepClone Go RTL (scalar-sharing, immutable hash keys)
 - MEMRDD auto-imported via gengo; all Five programs get mem:name driver
 - SQL plan + pcode caches (s_hPlanCache, s_hDmlPcodeCache)
 - Opt-in SqlWACacheEnable — dbUseArea/Close/Commit batched for DML

SQL engine
 - FiveSql2 lexer ported to Go (byte FSM) with combined automatic
   template parameterization (literals → ?, concat queries share plan)
 - Go RTL: SqlDistinct, SqlGroupRows, SqlWindowPartitions,
   SqlWindowSortPartition, SqlWindowAssignRank, SqlComputeAggSimple,
   SqlBulkInsert, SqlBulkUpdate, SqlExprHasAgg, SqlEvalHaving
 - CTE / subquery / driving-table materialize paths use MEMRDD
 - SqlCoerce/SqlCmp/SqlIsTrue helpers moved from PRG to Go
 - SqlBulkUpdate defers Flush when WA cache active (APFS fsync was
   dominant B13 cost — 1.6ms/call → gone)

Correctness fixes uncovered during migration
 - ASort default path now sorts dates/logicals/timestamps (was no-op)
 - ORDER BY default NULL placement matches PRG SqlRowCompare across
   Go fast path; explicit NULLS FIRST/LAST honored by both paths
 - SqlBulkUpdate respects EXCLUSIVE vs SHARED mode record locks
 - SqlCmp/SqlCmpEq normalize NumInt vs Double (caught by test 6b)

Verification
 - go test ./...              ALL PASS
 - FiveSql2 test_sql1999      43/43
 - tests/compat_harbour       56/56 (+5 new: ASort dates/logicals,
                              AScan int cross-type)
 - Regression test test_null_order.prg for ORDER BY NULL ordering

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 20:20:14 +09:00
af9e965bc6 perf(dbf): byte-level numeric field parser — zero alloc for int fields
parseNumericField was allocating on every call — `string(raw)` to
convert the record-buffer slice to a string, plus the implicit
allocation from TrimSpace's return value. For a 50k-row scan reading
two numeric fields, that's 100k+ small string allocations per scan,
all of which promptly became garbage.

Rewritten to walk the raw byte slice directly:
  - Find the trimmed range by byte indexing (no alloc).
  - Parse integer-typed fields (dec == 0) digit-by-digit into int64.
  - Only fall back to strconv.ParseFloat + string allocation for
    genuinely fractional data (dec > 0 or embedded `.`).

This also lifts the raw RDD baseline in our bench (6.8ms → 6.2ms)
because FieldGet hits this same parser. Every scan path benefits,
not just the FiveSql2 hot loop.

Measured (50k rows, 3-run steady state):

                       Before    After
  No WHERE              10.0ms   9.1ms
  Numeric WHERE          7.8ms   6.9ms   ← now 1.11x raw
  String WHERE           7.9ms   (see next commit)
  Raw RDD baseline       6.8ms   6.2ms   ← also faster

Validation:
  - hbrdd/dbf tests PASS (including integer/float field roundtrips)
  - FiveSql2 43/43
  - Harbour compat 51/51

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 14:02:42 +09:00
d74014a235 feat(rdd): dbInfo / dbOrderInfo — implement the stubs
Replaces the `return NIL` stubs with real implementations that read
from the current workarea. Covers the info codes actually used by
downstream code (FiveSql2 TSqlIndex, standalone callers):

DBINFO:
  DBI_ISDBF, DBI_CANPUTREC, DBI_FULLPATH, DBI_TABLEEXT, DBI_MEMOEXT,
  DBI_SHARED, DBI_ISREADONLY, DBI_GETRECSIZE, DBI_DBVERSION,
  DBI_RDDVERSION, DBI_BOF, DBI_EOF, DBI_FOUND, DBI_FCOUNT, DBI_ALIAS,
  DBI_POSITIONED

DBORDERINFO:
  DBOI_EXPRESSION, DBOI_NAME, DBOI_NUMBER, DBOI_POSITION,
  DBOI_ORDERCOUNT, DBOI_KEYCOUNT, DBOI_KEYCOUNTRAW

Unknown info codes still return NIL (Harbour's forgiving fallback).

New accessors on DBFArea (FullPath, IsShared, IsReadOnly) expose the
private filePath/shared/readOnly fields to the hbrtl layer without
plumbing them through the generic Area interface.

Unblocks TSqlIndex:FindExclusive's original DBI_FULLPATH/DBI_SHARED
scan — though the short-circuit there stays in place for now since
it's a correctness workaround that no longer masks a crash thanks
to the recent gengo PushMemvar fallback.

Validation:
  - FiveSql2 43/43 (0 warnings)
  - Harbour compat 51/51
  - go test ./... ALL PASS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 10:42:18 +09:00
ed33af41c5 perf: FieldPos O(1) cache + xbase import detection for function-call PRGs
Two SQLite-style optimizations for RDD and SQL workloads:

1. FieldPos() O(1) column binding cache

   Before: FieldPos(name) linear scan — O(n) per call with string
           comparison. In SQL engines that call FieldPos per row per
           column, this is hundreds of thousands of calls.

   After:  DBFArea builds a map[UPPER(name)]→pos on first lookup.
           All subsequent lookups are O(1) hash. SQLite calls this
           "column affinity binding" — positions resolved at prepare,
           not per row.

   Implementation:
     - hbrdd/dbf/dbf.go: DBFArea.FieldPosCache(name) method
     - hbrtl/procinfo.go: FieldPos RTL uses fieldPosCacher interface
     - Lazy init: only pays for tables that get queried

2. hbrdd import auto-detection for function-call style PRGs

   Before: compiler only added hbrdd import when PRG used xBase commands
           (USE, SKIP, INDEX...). Pure function-call style like
           `dbUseArea(.T.,,"t")`, `FieldPut(1, val)` was missed —
           generated Go failed to compile ("undefined: hbrdd").

   After:  scanStmtsForXBase walks ExprStmt bodies too, detecting
           CallExpr to any of the ~40 xBase RTL function names.
           FIELD->NAME alias expressions also trigger the import.

   Resolves: small PRGs that use only dbUseArea/FieldGet/FieldPut.

Benchmark notes (50k records):
  Raw RDD scan:              7 ms    (baseline)
  FiveSql2 SELECT WHERE:   157 ms    (unchanged — bottleneck is
                                      not FieldPos, it's PRG-level
                                      expression tree walk per row)
  compat_harbour 51/51:    PASS
  FiveSql2 43/43:          100%

The FieldPos cache helps heavy field-name-based code paths but the
primary FiveSql2 bottleneck is the PRG interpreter walking expression
ASTs per row (needs bytecode compilation to close the gap).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 07:42:00 +09:00
7cc729f394 perf(index): compiled key evaluator — UDF INDEX 2.7x faster
Eliminate MacroEval overhead for INDEX ON with UDF/complex expressions.

Before: gengo passed KeyExpr as a string → indexer called MacroEval()
        per record (50k × string parse + symbol lookup + function call).

After:  gengo emits a Go closure (_keyFunc) that inlines the AST of
        the key expression as direct Go code. The indexer calls the
        closure directly — zero string parsing, zero runtime symbol
        lookup for the hot loop.

Three code paths in the closure, depending on expression type:
  1. UDF call:          FindSymbol("FULLNAME") + Function(0)
                        (symbol lookup once per closure creation, not per record)
  2. Field reference:   GetValue(fieldIndex) inline
                        (no MacroEval, no FIELD-> alias resolution)
  3. UPPER/LOWER(expr): strings.ToUpper/Lower inline
                        (no RTL function call overhead)

Architecture (Go compiler design principle):
  Compile time knows the AST → emit native code.
  Don't serialize to string → re-parse at runtime 50k times.

Benchmark (50k records, 3 UDF indexes):
                  before    after     Harbour     ratio
  3 UDF INDEX    163.0ms   60.0ms    55.0ms      Five/HB = 1.09x
  SEEK 10k         7.6ms    7.6ms    14.0ms      Five 1.8x faster
  SCAN 50k         3.4ms    3.4ms     4.0ms      Five 15% faster
  TOTAL          233.0ms  130.0ms   147.0ms      Five 12% faster overall

UDF INDEX build went from 3x SLOWER than Harbour to nearly EQUAL.
SEEK/SCAN remain faster than Harbour (mmap + NTX optimizations).

Changes:
  hbrdd/driver.go     KeyFunc field in OrderCreateParams
  hbrdd/dbf/indexer.go  compiled path using KeyFunc before MacroEval fallback
  compiler/gengo/gengo.go  emitIndexKeyExpr: field-aware AST→Go emitter
                           for INDEX ON key expressions

Correctness: Harbour vs Five UDF diff = 0 (25-line output match)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 02:36:37 +09:00
66882c30bd fix(cdx): Harbour-compatible layout — compound root, RCHB sig, leaf format
Align Five's CDX file layout with Harbour's expectations:
- Compound root header at 0, compound leaf at 1024, tags at 1536+
- "RCHB" signature at offset 20 in compound root
- IgnoreCase/collation flags at offset 503-505
- Compound leaf: LeftPtr/RightPtr = 0xFFFFFFFF, recBits=16 fixed
- Tags sorted alphabetically in compound directory B-tree
- Tag IndexOpt: TypeCompact | TypeCompound (0x60)

Status of Harbour cross-read verification:
- CHAR-only CDX tags: layout matches Harbour byte-for-byte
- Numeric tags: Harbour uses IEEE double (8-byte) key encoding,
  Five uses DBF ASCII key bytes — causes DBFCDX/1012 corruption
  when Harbour reads Five-created CDX with numeric tags
- Five reading Harbour CDX: works perfectly (existing)
- Five reading Five CDX: works perfectly

Remaining: numeric key encoding for full Harbour write-compatibility.
CLAUDE.md updated to reflect this single remaining limitation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 01:33:52 +09:00
4d5621c21a feat: CDX compound index write + {||} parsing + zero known constraints
All 3 remaining known constraints resolved. CLAUDE.md now shows zero.

1. CDX compound index WRITE support (was read-only)

   New file: hbrdd/cdx/build.go (~400 LOC)
   - CreateOrAddTag() builds Harbour-compatible CDX files
   - Bit-packed leaf pages (RecBits/DupBits/TrlBits compression)
   - Interior nodes with big-endian RecNo/ChildPage
   - Compound root directory (structural B-tree of tag names)
   - Append-safe: preserves existing tags when adding new ones
   - Linked leaf pages (LeftPtr/RightPtr for sequential scan)

   Pipeline: INDEX ON expr TAG tagname TO file
   - ast.IndexCmd gains TagName field
   - Parser captures TAG name (was discarded)
   - gengo passes TagName to OrderCreateParams
   - indexer.go routes to cdx.CreateOrAddTag when TAG specified

   Verified: 3 tags (BYNAME/BYCITY/BYAGE), OrdSetFocus by name,
   SEEK, GoTop/GoBottom, close+reopen with SET INDEX TO

2. {||} empty code block parsing in function arguments

   Parser's parseArrayOrBlock() called parseExpr() unconditionally
   after closing |, failing when body was empty ({||}).
   Fix: check for RBRACE after closing | and emit NIL literal body.
   {=>} empty hash already worked.

3. Semicolon IF...ENDIF — already worked (removed from constraints)

Tests:
  go test ./...        14 packages ALL PASS
  FiveSql2             43/43 100%
  compat_harbour       51/51

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 22:58:09 +09:00
ad544a5528 fix: Windows cross-compilation support (GOOS=windows)
- debugcli.go/debugtui.go: add //go:build !windows tag
- debugcli_windows.go/debugtui_windows.go: no-op stubs
- cdx/cdx.go: extract mmap to platform-specific files
- cdx/mmap_posix.go: syscall.Mmap/Munmap
- cdx/mmap_windows.go: no-op (falls back to read)
- ntx/ntx.go, ntx/build.go: same mmap extraction
- ntx/mmap_posix.go, ntx/mmap_windows.go: platform split

Builds verified: linux/amd64, windows/amd64, darwin/arm64, darwin/amd64

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 12:23:52 +09:00
3ed246c47e feat(rdd): Windows LockFileEx implementation — real byte-range locks
Replace the no-op Windows lock stub with actual kernel32 LockFileEx /
UnlockFileEx calls via syscall.LazyDLL (zero external dependency).

- LOCKFILE_EXCLUSIVE_LOCK | LOCKFILE_FAIL_IMMEDIATELY for non-blocking
  semantics matching Clipper FLOCK() → .F.
- Same lock region layout as POSIX: header region for FLOCK, record
  offsets for DBRLOCK — compatible across platforms
- Handles returned as syscall.Handle from os.File.Fd()

Note: full Windows cross-compile still blocked by unrelated issues
(mmap in cdx/ntx, termios in debugcli.go). The lock code itself
compiles cleanly with //go:build windows.

Also updates gap-analysis.md to reflect Windows lock status.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 11:57:33 +09:00
fc1dca9551 feat(rdd): real POSIX file/record locking + gap analysis doc
Replaces the FLOCK/DBRLOCK/DBRUNLOCK no-op stubs with actual
fcntl(F_SETLK) byte-range advisory locks, matching Harbour's
hb_fsLockLarge implementation.

Before: rtlDbRLock always returned .T. regardless of contention.
        Multi-process writers could silently corrupt records.

After:  Non-blocking POSIX byte-range locks per file descriptor.
        Cross-process exclusion verified by a subprocess-spawning
        Go test that witnesses BUSY vs OK transitions.

New files:
  hbrdd/dbf/locks_posix.go    fcntl F_WRLCK/F_UNLCK wrappers
  hbrdd/dbf/locks_windows.go  stub (TODO: LockFileEx)
  hbrdd/dbf/lock_multi_test.go   cross-process verification
  docs/gap-analysis.md        honest Harbour parity assessment

Modified:
  hbrdd/dbf/dbf.go
    - DBFArea gains fileLocked bool + lockedRecs map
    - Close() calls releaseAllLocks() before dropping the fd
  hbrtl/database.go
    - rtlDbRLock / rtlDbRUnlock now delegate to DBFArea.LockRecord /
      UnlockRecord instead of returning fixed .T./NIL
    - New rtlFLock / rtlDbUnlock for FLOCK() / DBUNLOCK()
  hbrtl/register.go
    - FLOCK and DBUNLOCK symbols registered (were missing entirely)
  compiler/analyzer/analyzer.go
    - FLOCK / DBUNLOCK added to RTL known-function set

Lock region layout (non-overlapping on purpose):
  FLOCK region       [0, HeaderLen+1)
  Record N region    [RecordOffset(N), RecordLen)

So a workarea can hold FLOCK and multiple DBRLOCK simultaneously
on the same fd without conflict.

Design rationale (captured in locks_posix.go header):
  * POSIX fcntl, not flock(2) — byte-range + NFS-safe
  * Non-blocking F_SETLK — matches Clipper FLOCK() → .F. semantics
  * Released explicitly on Close to avoid workarea-sharing races
  * Windows falls back to no-op (TODO: LockFileEx)

Verification:
  go test ./hbrdd/dbf/ -run TestFLockBlocksAcrossProcesses  PASS
  go test ./hbrdd/dbf/ -run TestRLockBlocksAcrossProcesses  PASS
  go test ./...                                             ALL PASS
  FiveSql2 43/43                                            100%
  compat_harbour 51/51                                      100%

The gap-analysis doc (docs/gap-analysis.md) is a running inventory
of what works vs what's still missing vs Harbour 3.2, written for
users evaluating Five for production — not a sales pitch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 17:58:03 +09:00
6c5374778a perf(rdd): index build 38% faster — sort.Interface + fast path for numeric/UPPER
Benchmark (50k records, 4 indexes on Apple M-series):
             before   after   Δ
  INDEX     53.7ms  33.3ms  -38%  (now 10% faster than Harbour 37.3ms)
  TOTAL    156.2ms 133.0ms  -15%

Fixes:

1. sort.Slice(reflection) → concrete sort.Interface
   Benchmarked in isolation on 200k KeyRecords:
   sort.Slice(closure):  50.0ms
   sort.Sort(interface): 30.4ms  (40% faster, no reflection)

   - indexer.go: add keyRecordAsc/Desc concrete types
   - Branch hoist descending check out of Less()

2. buildOnePage zero allocation
   Was allocating a temp padded []byte per key (~50k allocs per index).
   Now writes padded key directly into the page buffer via padCopy.

3. bulkBuildBTree separator reuse
   sepKey can alias the source KeyRecord.Key when it's already keyLen-sized
   (true for all slab-allocated keys), avoiding ~n/maxItem small allocations.
   Pre-size the children slice.

4. Fast path extended to numeric fields and UPPER/LOWER
   Previously only bare CHAR field references hit the zero-alloc fast path.
   Now:
     - Numeric fields (N/F type) copy DBF bytes directly
       (same-length ASCII compare matches numeric order for non-negatives)
     - UPPER(field) / LOWER(field) wrappers on CHAR fields apply ASCII
       case folding inline during byte copy

   Per-index timing on the micro benchmark:
               before   after
     NAME       7.7ms   7.5ms  (fast path, unchanged)
     CITY       6.0ms   6.2ms  (fast path, unchanged)
     AGE       14.1ms   7.1ms  -50%  (was slow path)
     UPPER(NM) 17.0ms   7.9ms  -54%  (was slow path)

5. Slow path single-pass scan
   When an expression is too complex for fast path, we still avoid the
   double GoTo per record. The evaluation loop now sequentially walks
   records with one GoTo each, restoring the original position only at
   the end, and shares a single slab for padded keys.

Also fixes a hbrt bug surfaced while writing the benchmark:

6. Date + Numeric promoted to Date
   Plus()/Minus() previously required the integer side to be NumInt.
   Modulus returns a promoted type, so `SToD("...") + (i % 365)` panicked.
   Now accepts any Numeric on either side and truncates the fractional
   part before adding Julian days.

   - hbrt/ops_arith.go: Date±Numeric (was Date±NumInt only)

Tests:
  go test ./...        — ALL PASS (17 packages)
  FiveSql2 43/43       — 100%
  compat_harbour 51/51 — 100%
  Harbour vs Five diff — 0 lines differ (281-line RDD parity test)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 17:24:49 +09:00
e95afad4ee feat: Harbour RDD parity — NTX/CDX 100% compatible, FIELD-> works
Five RDD engine now matches Harbour DBFNTX and DBFCDX byte-for-byte
in ordering, seek, navigation, and field access. Verified against
Harbour 3.2.0dev with a 281-line comparison test covering:
  - Natural/NAME/CITY/AGE/SALARY/UPPER ordering
  - SEEK (exact/not-found), GoTop/GoBottom per order
  - DELETE/RECALL with SET DELETED
  - CDX compound index read with 5 tags (BYNAME, BYCITY, BYAGE, BYSAL, BYUNAME)
  - Reverse traversal

Fixes:

1. FIELD->NAME returned NIL
   GetAliasField returned interface{} but runtime expected hbrt.Value,
   so the type assertion in PushAliasField failed and pushed NIL.
   - workarea.go: change return type to hbrt.Value, handle FIELD/_FIELD
     as current-workarea alias, add SetAliasField
   - gengo.go: emit SetAliasField() for alias->field := value in both
     statement and expression contexts

2. OrdSetFocus(n) silently switched to natural order
   v.AsString() returns "" for a numeric Value, so OrderListFocus("")
   set current=-1.
   - indexrtl.go: convert numeric param via fmt.Sprintf("%d", ...)

3. CDX compound tag order mismatched Harbour
   Five decoded the structural B-tree which is alphabetical, but
   Harbour sorts tags by TagBlock (file offset = creation order).
   - cdx/cdx.go: sort tagEntries by offset ascending after decoding,
     matching hb_cdxIndexLoadAvailTags in dbfcdx1.c

4. OutStd()/OutErr() not registered — caused panic on call
   - hbrtl/console.go: add rtlOutStd/rtlOutErr implementations
   - hbrtl/register.go: register OUTSTD and OUTERR
   - analyzer.go: add OUTSTD/OUTERR to RTL known-functions

5. FIELD keyword triggered "undeclared variable" warnings
   - analyzer.go: add FIELD, _FIELD, M, MEMVAR as builtin constants

Tests:
  go test ./...        — ALL PASS (17 packages)
  FiveSql2 43/43       — 100%
  compat_harbour 51/51 — 100%
  Harbour diff         — 0 lines differ (281-line comparison)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 16:37:47 +09:00
486e466592 feat: FiveSql2 43/43, @byref, mutable closure, RTL 479, DateTime fix
Major changes since last commit:
- FiveSql2 SQL:1999 engine (10,458 LOC) — 43/43 ALL PASS
- 21 compiler/runtime bugs fixed (short-circuit AND/OR, FOR LOOP, etc.)
- @byref pass-by-reference via RefCell pattern
- Mutable closure capture (EnsureLocalRef + RefCell sharing)
- RTL: 400 → 479 functions (+79: file, string, datetime, hash, UTF-8)
- DateTime/Timestamp fully working (hb_DateTime, hb_Hour/Min/Sec, display)
- Reserved word guard (39 keywords blocked from function calls)
- AEval arg order fix (element before index)
- Closure capture redecl fix (unique _cap_ names per block)
- Hash/string indexing in ArrayPush/ArrayPop
- Harbour compat test suite: 51/51
- 4 docs: Porting Report, Implementation Plan, Optimization Plan, Commercialization

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:35:37 +09:00
279a16a88c refactor: pure Go — recursion→iteration, COW records, zero alloc
CDX Seek iterative (cdx.go):
- Converted recursive seekPage → iterative loop
- Single buf reused across all B-tree levels (was: make per level)
- Internal node: binary search (was: linear O(n))
- Eliminates 3 heap allocations per CDX SEEK

DBF Copy-on-Write records (dbf.go):
- GoTo: recBuf = mmap slice reference (zero-copy read)
- PutValue/Delete/Recall: promote to ownBuf before write
- Eliminates memcpy per GoTo for read-only SCAN operations
- recOwned flag tracks COW state

NTX build.go:
- setKeyEntry: write directly to page (no temp make([]byte))
- padCopy: copy+fill (no pre-fill entire buffer)

CDX DecodeLeafKeys slab (cdx.go):
- Single slab allocation for all keys per page

82/82 stress PASS. All unit tests PASS.

50K SEEK random: 63ms (Harbour 67ms — FASTER!)
50K DELSCAN: 2ms (Harbour 12ms — 6x FASTER!)
CDX SCOPE: 2ms (Harbour 4ms — 2x FASTER!)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 22:56:20 +09:00
0102c3c94e perf: CGo review — slab alloc, compareKeys simplify, zero-alloc padCopy
From CGo expert review (verdict: stay pure Go, CGo would be slower):

CDX DecodeLeafKeys slab allocation (cdx.go):
- Single make() for all keys + prevKey (was 30+ allocs per page)
- Keys are slices into pre-allocated slab (zero copy)

NTX compareKeys simplified (ntx.go):
- bytes.Compare already returns normalized -1/0/+1
- Removed redundant normalization branches

NTX build.go zero-alloc:
- padCopy: copy+fill instead of make+fill+copy
- setKeyEntry: write directly to page data (no temp buffer)

82/82 stress PASS. 14 packages ALL PASS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 22:44:56 +09:00
197720f869 fix: Go code review — 7 critical issues resolved
From senior Go developer review:

C7 CRITICAL: pagePool data race (ntx.go)
- Moved global pagePool[8] + pagePoolIdx into per-Index struct
- Eliminates race condition across goroutines using separate indexes

C8 CRITICAL: Page.data dangling pointer after remap (ntx.go)
- remapFile() now clears pagePool data slices (pointed into old mmap)
- Prevents segfault from stale mmap references

C4 HIGH: pop() bounds check restored (thread.go)
- Removed performance optimization that eliminated underflow detection
- Stack underflow now produces clear error instead of index -1 panic

C1 HIGH: intExpLen overflow on MinInt64 (value.go)
- Added special case: MinInt64 returns 20 (length of -9223372036854775808)
- Prevents -v overflow in negation

C11 CRITICAL: GoTo ReadAt error handling (dbf.go)
- ReadAt failure now returns error and sets EOF
- Previously silently used stale record buffer (data corruption risk)

C14 HIGH: LEN() inline missing Hash case (gengo.go)
- Added _v.IsHash() → len(Keys) branch

C15 HIGH: EMPTY() inline missing Date case (gengo.go)
- Added _v.IsDate() && _v.AsJulian() == 0 check

82/82 stress PASS. 14 packages ALL PASS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 22:26:34 +09:00
05ccef05e2 perf: EndProcFast — eliminate defer recover() from RTL hot paths
Problem: every RTL function calls defer t.EndProc() which does recover().
50K SEEK loop = 250K recover() calls = ~12ms wasted.

Solution: EndProcFast() skips recover (only needs endFrame restore).
Applied to ALL RTL functions in strings.go, rdd.go, missing.go, database.go.
EndProc() with recover kept for generated PRG code (needs BEGIN SEQUENCE).

Analysis (50K sequential SEEK breakdown):
  Go NTX Seek direct: 7ms (faster than Harbour 27ms!)
  PRG VM overhead:    38ms (Frame + RTL calls + key generation)
  Key generation:     25ms (Str+LTrim+PadL+PadR = 5 RTL Frame/EndProc per iter)

With EndProcFast: RTL overhead reduced ~30%.

CDX SCOPE: 2ms (Harbour 4ms — 2x FASTER!)
82/82 stress PASS. 14 packages ALL PASS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 21:43:39 +09:00
9644b5469a perf: BoltDB BCE pattern — inline page access, eliminate bounds checks
NTX Page accessors (ntx.go):
- keyOffset/KeyChild/KeyRecNo: removed redundant bounds checks
- Use open-ended slice (data[off:]) for BCE — compiler proves safety
- pageKeyFind: inline offset table + key access in hot loop
  (was: compareKeys → KeyValue → keyOffset → LittleEndian)
  (now: compareKeys(data[off:off+kl]) — single slice expression)

CDX Seek (cdx.go):
- Binary search with leftmost match (correctly finds first duplicate)
- Cache hit path: skip DecodeLeafKeys entirely

50K NTX SEEK random: 67ms = Harbour 67ms (EQUAL!)
82/82 stress PASS. CDX 18/18. All unit tests PASS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 20:27:37 +09:00
b72623f79c perf: CDX binary search + leaf cache hit + DBF/NTX zero-copy
CDX Seek (cdx.go — ported from rddfive/cdx_engine.c):
- Linear search → binary search on decoded leaf keys (O(N) → O(log N))
- Leftmost match: continues searching left after match (duplicate key correctness)
- Leaf cache hit: skip decode if same page (SEEK loop optimization)

NTX zero-copy Page (ntx.go — BoltDB pattern):
- Page.data: []byte slice into mmap (was [1024]byte copy)
- cachedLoadPage: p.data = mmap[offset:offset+1024] (no memcpy!)
- pagePool: 8-slot ring for Page struct reuse

DBF mmap (dbf.go):
- GoTo: copy from mmap instead of file.ReadAt syscall
- Unmap before Append/Close/Flush (file growth), re-mmap after

Results (50K, ext4, Harbour comparison):
┌──────────────┬──────────┬──────────┬──────────────┐
│              │ Harbour  │ Five     │              │
├──────────────┼──────────┼──────────┼──────────────┤
│ CDX SEEK     │ 27ms     │ 49ms     │ 1.8x (was 6.5x!)│
│ CDX SEEK ID  │ 17ms     │ 24ms     │ 1.4x (was 8.4x!)│
│ CDX SCAN     │ 5ms      │ 4ms      │  FASTER    │
│ CDX SCOPE    │ 4ms      │ 3ms      │  FASTER    │
│ NTX SCAN     │ 4ms      │ 3ms      │  FASTER    │
│ NTX DELSCAN  │ 12ms     │ 3ms      │  4x FASTER │
│ NTX SEEK rnd │ 67ms     │ 69ms     │ ≈ equal      │
└──────────────┴──────────┴──────────┴──────────────┘

82/82 stress PASS. CDX 18/18 cross-read PASS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 20:18:54 +09:00
1d9b364df8 perf: BoltDB-style zero-copy Page — NTX SEEK 2x, SCAN 5x faster
Page.data changed from [1024]byte (copied) to []byte (mmap slice reference).
Inspired by BoltDB's zero-copy page access pattern.

cachedLoadPage: returns slice into mmap memory (no 1024-byte copy!)
- Before: copy(p.data[:], mmap[offset:offset+1024]) — memcpy per page
- After:  p.data = mmap[offset:offset+1024] — pointer assignment only

pagePool: reuses Page structs (8-slot ring) to reduce GC pressure.

Benchmark (ext4, home dir) — Harbour comparison:
┌──────────────────┬──────────┬──────────┬──────────┐
│ 50K              │ Harbour  │ Five     │          │
├──────────────────┼──────────┼──────────┼──────────┤
│ SEEK seq         │ 23ms     │ 43ms     │ 1.9x     │
│ SEEK random      │ 63ms     │ 65ms     │ ≈ equal! │
│ SCAN             │ 5ms      │ 3ms      │ FASTER!  │
│ DUPKEY scan      │ 23ms     │ 12ms     │ FASTER!  │
│ DELSCAN          │ 17ms     │ 2ms      │ 8.5x!    │
│ PACK             │ 16ms     │ 21ms     │ 1.3x     │
└──────────────────┴──────────┴──────────┴──────────┘

82/82 stress PASS. All unit tests PASS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 20:07:06 +09:00
3fa553d3ed perf: DBF mmap — zero-copy record reads, SCAN 2-3x faster
syscall.Mmap on DBF open for zero-copy GoTo:
- GoTo: copy from mmap slice instead of file.ReadAt syscall
- Eliminates kernel context switch per record read
- Unmap before Append (file grows), Close, Flush
- Re-mmap after Flush

Benchmark (ext4, home dir):
  10K SCAN: 4ms → 2ms (Harbour 1ms = 2x)
  10K DUPKEY: 6ms → 4ms (Harbour 4ms = 1x!)
  10K DELSCAN: 6ms → 2ms (Harbour 2ms = 1x!)
  50K SCAN: 26ms → 15ms (42% faster)
  50K DELSCAN: 29ms → 17ms (Harbour 17ms = 1x!)
  50K DUPKEY: 41ms → 29ms (Harbour 23ms = 1.3x)
  CDX SCAN: 13ms → 4ms (Harbour 6ms — FASTER!)
  CDX SCOPE: 9ms → 3ms (Harbour 4ms — FASTER!)

82/82 stress PASS. All unit tests PASS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 19:06:21 +09:00
30dfc0728d perf: CDX SCAN 4ms, SCOPE 3ms — faster than Harbour
CDX optimizations (from rddfive/cdx_engine.c):
- Byte-level leaf decode (vs bit-by-bit extractBits)
- Leaf page decode cache in Tag struct
- Zero-alloc internal node traversal (direct mmap slice read)

NTX/CDX + DBF:
- loadRecord() helper for future lazy-load optimization
- recLoaded flag in DBFArea (currently always true for safety)

Benchmark (50K, ext4):
  CDX SCAN:  276ms → 4ms  (Harbour 6ms — Five is FASTER!)
  CDX SCOPE: 238ms → 3ms  (Harbour 4ms — Five is FASTER!)
  CDX SEEK:  362ms → 185ms (49% improvement)
  NTX SCAN:  24ms → 14ms  (42% improvement)

82/82 stress test PASS. CDX 18/18 cross-read PASS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 15:44:10 +09:00
96d72a456c perf: CDX zero-alloc internal node seek — SEEK 45% faster
Internal node traversal: read directly from mmap/buf slice
- No DecodeIntKeys allocation (was nKeys+1 IntKeyEntry structs)
- No key byte slice copy (compare directly against buf)
- Big-endian child/recNo read inline

CDX 50K benchmark:
  SEEK NAME: 362ms → 199ms (45% faster)
  SEEK ID:   320ms → 184ms (42% faster)
  SCAN:      14ms (unchanged — leaf cache handles this)
  SCOPE:     20ms → 14ms

Harbour comparison:
  SEEK: 27ms (Harbour) vs 199ms (Five) = 7.4x
  SCAN: 6ms (Harbour) vs 14ms (Five) = 2.3x

CDX cross-read: 18/18 PASS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 14:05:20 +09:00
40935b6103 perf: CDX byte-level decode + leaf cache — SCAN 20x faster
Ported from rddfive/cdx_engine.c cdx_leaf_decode_all():
- Replaced bit-by-bit extractBits loop with byte-level shift/mask
- Read reqByte as little-endian integer, extract recNo/dup/trl via masks
- 10x+ faster than per-bit extraction

Leaf page decode cache:
- Tag.cachedLeafOff/cachedLeafKeys: avoid re-decoding same leaf page
- SkipNext/SkipPrev use getLeafKeys() with cache
- GoTop/Seek populate cache on first decode

CDX 50K benchmark (ext4):
┌──────────────┬──────────┬──────────┬──────────┐
│ CDX 50K      │ Harbour  │ Before   │ After    │
├──────────────┼──────────┼──────────┼──────────┤
│ SCAN 50K     │ 6ms      │ 276ms    │ 14ms     │ ← 20x faster
│ SCOPE 35K    │ 4ms      │ 238ms    │ 20ms     │ ← 12x faster
│ SEEK NAME    │ 27ms     │ 362ms    │ 239ms    │
│ SEEK ID      │ 18ms     │ 320ms    │ 195ms    │
└──────────────┴──────────┴──────────┴──────────┘

CDX cross-read: 18/18 PASS. All unit tests PASS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 13:37:57 +09:00
1b41384675 fix: CDX mmap + internal node format (BE key-first) — 50K works
CDX internal node format fix:
- Was: [child LE][recNo LE][key] (NTX-style)
- Now: [key][recNo BE][child BE] (correct CDX format)
- Fixes GoTop/Seek/Scan for large CDX files (50K+ records)

CDX mmap:
- syscall.Mmap on OpenIndex for zero-copy reads
- idx.readAt() helper: mmap slice or file fallback
- All ReadAt calls in Tag navigation replaced
- Close: munmap

CDX 50K benchmark (all counts correct):
  SEEK NAME 50K: 362ms (f=50000)
  SCAN 50K: 276ms (c=50000)
  SCOPE 35K: 238ms (c=35000)
  SEEK ID 50K: 320ms (f=50000)

CDX is slower than NTX due to bit-packed leaf decompression per page.
Cross-read test: 18/18 still PASS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 13:32:06 +09:00
a9600ad45c perf: proper 3-level bulk build — INDEX 50K: 180ms → 28ms (6.4x)
bulkBuildBTree: distributes sorted keys as [M leaf] [sep] [M leaf] [sep] ...
- Separator exists ONLY in parent, never in leaf (proper B-tree)
- Works for any depth (tested 10 to 50000 keys, all correct)
- Edge case: absorb trailing 1-key into previous leaf

Eliminated per-key insertion fallback (rebuildWithInsert).
All sizes now use O(N) bulk build instead of O(N log N) insertion.

Benchmark on ext4 (home dir):
┌──────────────┬──────────┬──────────┬───────┐
│ 50K Items    │ Harbour  │ Five     │ Ratio │
├──────────────┼──────────┼──────────┼───────┤
│ APPEND 50K   │ 61ms     │ 124ms    │ 2x    │
│ INDEX NAME   │ 6ms      │ 28ms     │ 4.7x  │
│ INDEX CITY   │ 5ms      │ 36ms     │ 7.2x  │
│ SEEK 50K seq │ 23ms     │ 97ms     │ 4.2x  │
│ SEEK 50K rnd │ 63ms     │ 122ms    │ 1.9x  │
│ SCAN 50K     │ 5ms      │ 24ms     │ 4.8x  │
│ DUPKEY 50K   │ 23ms     │ 38ms     │ 1.7x  │
│ PACK 50K     │ 16ms     │ 20ms     │ 1.25x │
└──────────────┴──────────┴──────────┴───────┘

All counts correct: 50000/50000/40000

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 12:08:27 +09:00
c5ed5612fb perf: mmap zero-copy page access — Go-native optimization
Replaced LRU page cache with syscall.Mmap:
- OpenIndex: mmap entire file read-only (MAP_SHARED)
- cachedLoadPage: copy from mmap slice (no syscall per page)
- Close: munmap + file close
- insertKeyBTree: munmap before modify, mmapFile after complete
- remapFile: re-mmap after file size changes

Results on ext4 (50K records):
- SEEK random: 188ms → 138ms (26% improvement)
- SCAN: 35ms → 23ms (34% improvement)
- DUPKEY: 53ms → 41ms (23% improvement)
- INDEX: 180ms (unchanged — per-key insertion, no mmap during build)

Go-native approach:
- syscall.Mmap instead of C-style LRU cache
- OS page cache handles eviction automatically
- Simpler code (60 lines removed, 30 added)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 11:42:44 +09:00
103f0d8b64 perf: NTX LRU page cache (256 slots) — reduces syscalls
LRU page cache ported from rddfive/ntx_engine.c:
- 256-slot cache with MRU fast-path (O(1) for repeated access)
- LRU eviction when all slots full
- cachedLoadPage replaces LoadPage for all navigation
- invalidateCache called before insertKeyBTree (pages modified)

10K benchmark improvement (ext4 home dir):
- SCAN FWD: 6ms → 5ms
- SEEK NUM: 18ms → 14ms (22% improvement)
- DUPKEY SCAN: 9ms → 8ms
- All counts correct: 10000/10000/8000

50K benchmark:
- SCAN: 35ms → 31ms
- DUPKEY: 50ms → 40ms (20% improvement)
- DELSCAN: 41ms → 33ms (20% improvement)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 11:35:26 +09:00
dadb97ee88 fix: 3-level NTX correctness + CDX SET INDEX TO string quoting
NTX 3-level tree (build.go):
- Hybrid approach: bulk build for ≤2 levels, insertKeyBTree for 3+
- rebuildWithInsert: creates proper B-tree via per-key insertion
- 5000-key test: Count=5000 Found=5000 (was 5004/4868)

CDX SET INDEX TO (gengo.go):
- Strip surrounding quotes from string literal in OrderListAdd
- Was: idx.OrderListAdd("\"path\"") → file not found
- Now: idx.OrderListAdd("path") → correct

All tests:
- 14 packages ALL PASS
- 82/82 NTX stress test
- 18/18 CDX cross-read
- 50K benchmark: all counts correct

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 11:04:07 +09:00
adede5cd69 perf: REPLACE remove Flush + bulk build + deferred write = 1600x faster
Critical fix: REPLACE was calling area.Flush() after every field write!
- gengo gen_cmd.go: removed Flush() from emitReplaceCmd
- Harbour defers write until DBCOMMIT/CLOSE/GoTo, not per-REPLACE

Combined with bulk build + deferred APPEND:
- B1 APPEND 10K:  72,228ms → 30ms  (2,400x improvement!)
- B2 INDEX NAME:  34ms → 5ms       (6.8x improvement)
- Harbour comparison: Five 30ms vs Harbour 27ms (1.1x)

Also: OrderCreate flushes dirty record + EOF + header before index build

Benchmark on ext4 (home dir):
┌─────────────┬──────────┬────────┬───────┐
│ Benchmark   │ Harbour  │ Five   │ Ratio │
├─────────────┼──────────┼────────┼───────┤
│ APPEND 10K  │ 27ms     │ 30ms   │ 1.1x  │
│ INDEX NAME  │ 2ms      │ 5ms    │ 2.5x  │
│ INDEX CITY  │ 0ms      │ 7ms    │ -     │
│ SEEK 10K    │ 6ms      │ 25ms   │ 4.2x  │
│ SCAN FWD    │ 1ms      │ 6ms    │ 6x    │
│ SCAN BWD    │ 0ms      │ 6ms    │ -     │
│ PACK        │ 4ms      │ 3ms    │ 0.75x │
└─────────────┴──────────┴────────┴───────┘

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 09:22:05 +09:00
b1e868f01e perf: NTX bulk build + APPEND deferred write (from rddfive C port)
NTX Bulk Build (build.go — ported from rddfive/ntx_engine.c):
- pageBuffer: dynamic memory buffer for all pages
- Phase 1: Build leaf pages in sequential memory (zero disk I/O)
- Phase 2: Build interior levels from cached leaf data (zero I/O)
- Separator promotion: remove last key from leaf only (not interior)
- Single bulk WriteAt for all pages at end
- INDEX ON 10K: 34ms → 5-8ms (4-6x improvement)

NTX Seek (ntx.go):
- Always descend to leaf on match (find first occurrence)
- fStop flag tracks path match, verified at leaf

APPEND Buffering (dbf.go):
- Append marks dirty without immediate disk write
- flushRecord writes record data only (no header/EOF per record)
- Close/Flush writes EOF marker + header once

Results: 14 packages ALL PASS, 82/82 stress test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 09:10:18 +09:00
3fe8021e9e fix: NTX Seek descent + SET DELETED seek + BOF — 82/82 stress test PASS
NTX Seek (ntx.go):
- Always descend to leaf even on internal match (Harbour behavior)
  Prevents SEEK returning internal separator instead of first leaf entry
  Fixes duplicate key SEEK (NYC=9→10, Paris=8→10)
- fStop flag tracks path match, verified at leaf with key comparison
- Handle fStop at page end: ascend via nextKey to find actual match

SET DELETED + SEEK (indexer.go):
- When SEEK finds a deleted record with SET DELETED ON:
  Skip forward through matching deleted records
  If all matching records deleted → return not found (EOF)
  Fixes H04: deleted record now correctly returns .F.

BOF (indexer.go + dbf.go):
- Set a.FBof AFTER a.GoTo returns (GoTo resets FBof=false at line 393)
- Fixes infinite loop in DO WHILE !BOF() ... SKIP -1

Results:
- Unit tests: 14 packages ALL PASS
- 77-item thorough test: 77/77 (100%)
- 82-item stress test: 82/82 (100%) — Harbour identical

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:33:37 +09:00
9b9f87fd88 fix: NTX B-tree — proper pageSplit from Harbour + BOF detection
NTX Build (build.go):
- pageSplit: exact port of Harbour hb_ntxPageSplit
  - NewPage = LEFT half (lower keys), OldPage = RIGHT half (offset-swapped)
  - Proper offset table initialization for all pages
  - setKeyEntry/copyKeyEntry helpers for clean data writing
- insertKeyBTree: new root creation matches Harbour exactly
  - child[0] = newPage (left), child[1] = old root (right)

NTX Traversal (ntx.go):
- prevKey: guard iKey < keyCount before checking KeyChild
  (prevents infinite loop at rightmost child position)

BOF Detection (indexer.go):
- Set a.FBof AFTER GoTo returns (GoTo line 393 resets FBof=false)
- Previously: set FBof before GoTo → immediately cleared

Results: Unit tests ALL PASS, Stress test 82 items 79/82 match (96%)
Remaining 3 diffs: duplicate key count edge case + SET DELETED seek

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:13:42 +09:00
d2c17c7898 refactor: NTX B-tree rewrite — proper insertion with page splitting
Major rewrite based on Harbour dbfntx1.c analysis:

NTX B-tree traversal (ntx.go):
- nextKey: rewritten to match hb_ntxTagNextKey exactly
  - Advance iKey, check right child, descend via goLeftmost
  - Walk up stack on page exhaustion, truncate stackLevel
- prevKey: rewritten to match hb_ntxTagPrevKey
  - Check left child (only if iKey < keyCount), descend via goRightmost
  - Walk up stack for BOF detection
- goRightmost: internal nodes get iKey=keyCount (rightmost child),
  leaf nodes get iKey=keyCount-1 (last key) — matches Harbour

NTX B-tree build (build.go):
- CreateIndex: proper B-tree insertion (insert keys one by one)
- insertKeyBTree: search → insert at leaf → propagate splits up
- pageInsertKey: Harbour-style offset swapping (not data moving)
- pageSplit: collect all entries, split at midpoint, promote separator
- Proper offset table initialization for all pages

Unit tests: all 5 RDD packages PASS
Stress test: partial progress (Seek issues with split pages)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 07:49:31 +09:00
b7028791d6 fix: 5 seek/dbf bugs — 77/77 thorough Harbour compatibility
1. SOFTSEEK: use idx.CurRecNo() for positioning (was checking recNo > 0)
   - SEEK with SET SOFTSEEK ON now positions at next higher key
   - SEEK command reads SET SOFTSEEK at runtime (was compile-time only)
   - rtlDbSeek defaults to GetSetSoftSeek() when no explicit param

2. SET DELETED ON + INDEX: SkipIndexed skips deleted records
   - GoTopIndexed: skip deleted record at top position
   - SkipIndexed: inner loop continues past deleted records

3. Compound key (CITY+NAME): field name TrimSpace before lookup
   - evalKeyExprInner: TrimSpace on fieldName after FIELD-> strip
   - Fixed "CITY " != "CITY" mismatch from + operator splitting

4. SET INDEX TO filename: treated as string, not variable
   - gengo uses exprToString for SET INDEX TO (was emitExpr)
   - Prevents identifier being resolved as local variable

5. hasXBaseCommands: recursive scan into nested blocks
   - BEGIN SEQUENCE, IF, FOR, DO WHILE, SWITCH bodies now scanned
   - Fixes missing hbrdd import for DB commands inside blocks

Thorough test: 77 items (14 sections) covering exact/partial/soft seek,
SET DELETED, duplicate keys, numeric keys, compound keys, empty/single
table, state consistency, order switching, full traversal — all identical.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 14:08:51 +09:00
c04c9aeaa8 feat: INDEX ON with UDF support — user functions in key expressions
Core change:
- dbf.KeyEvalFunc: global callback set by gengo before OrderCreate
- evalKeyExprInner default case: calls KeyEvalFunc for unknown functions
- Final fallback: any unresolvable expression → KeyEvalFunc → MacroEval
- valueToKeyBytes: converts MacroEval result to index key bytes
- gengo: sets dbf.KeyEvalFunc = t.MacroEval before OrderCreate, clears after

Examples that now work:
  INDEX ON MyFunc(FIELD->NAME) TO idx    // UDF in key expression
  INDEX ON CityKey(FIELD->CITY, NAME) TO idx  // multi-param UDF
  INDEX ON Left(MyFunc(NAME), 15) TO idx // nested built-in + UDF

Also fixed:
- SET ORDER TO n: int→string via hbrt.NtoS (was empty string)
- CDX compound leaf decoder: proper bit-packed tag name extraction
- CDX compound recNo = direct byte offset (not page number)

All existing tests pass, NTX 47/47 + CDX 20/20 Harbour compat maintained.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 13:36:21 +09:00
7e2a159b88 feat: CDX support + ORDSCOPE + cross-read Harbour compatibility
CDX Integration:
- IndexEngine interface: common for NTX Index and CDX Tag
- OrderListAdd: auto-detects .cdx/.ntx extension, opens CDX tags
- decodeCompoundLeaf: proper bit-packed tag directory decoding
  (was stub falling through to scanCompoundLeaves with wrong names)
- CDX Tag: added KeyLen(), KeyExpr(), ForExpr(), IsDescending(), Close()
- CDX compound recNo = direct byte offset (not page number)

ORDSCOPE:
- SetScope/ClearScope/SetScopeTop/SetScopeBottom on DBFArea
- GoTopIndexed: seeks to scopeTop, validates within scopeBottom
- GoBottomIndexed: seeks to scopeBottom boundary
- SkipIndexed: stops at scope boundaries (top and bottom)
- OrdScope RTL function registered (nScope: 0=TOP, 1=BOTTOM)
- scopeKeyFromValue: converts Value to padded key bytes

Index Order Management:
- OrderListFocus: handles numeric order ("2" → order 2)
- SET ORDER TO n: gengo emits hbrt.NtoS for int-to-string conversion
- IndexOrd/OrdCount/OrdName/OrdKey: real implementations (were stubs)
- OrderCount/CurrentOrder/OrderName/OrderKeyExpr accessors on DBFArea
- ClearScope on order switch (prevents stale scope)

Cross-read test: Harbour-created CDX → Five reads, 20/20 items match:
  NAME/CITY/ID seek, ORDSCOPE count, GoTop/GoBottom all identical

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 12:21:26 +09:00
af146f03f7 fix: NTX duplicate key sort — RecNo tiebreak for Harbour compatibility
sort.Slice is unstable: equal keys had random record order.
Harbour NTX B-tree orders equal keys by ascending RecNo.
Added RecNo tiebreak to sort comparator.

Result: 47/47 (100%) Harbour compatibility on rdd_compat test.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 08:38:45 +09:00
6e78d12cc2 fix: 3 RDD compat bugs — FIELD->, AsNumInt Double, PACK/ZAP with index
Bug 1: FIELD->NAME in INDEX ON expression
- evalKeyExprInner: strip FIELD->/alias-> prefix before field lookup
- exprToString: handle AliasExpr (FIELD->NAME → "FIELD->NAME")

Bug 2: AsNumInt() on Double returned IEEE 754 raw bits
- Value.AsNumInt(): check tDouble and convert via Float64frombits
- Fixed array index crash when index is result of % modulo

Bug 3: PACK/ZAP crash with open indexes
- OrderListRebuild: fully implemented (was TODO stub)
  Saves index info, closes all, sets idxState=nil, recreates
- OrderCreate: set current=-1 during key evaluation (natural GoTo)
- PACK/ZAP: save/restore idxState, rebuild after operation
- Register __DBPACK, __DBZAP, DBRECALL symbol aliases

Harbour vs Five: 45/47 match (96%), 2 diffs are duplicate-key sort order

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 04:41:19 +09:00
21fd9dc65c feat: SET DELETED filtering, SEEK/LOCATE/CONTINUE, SET command codegen
- skipFilter: skip deleted records in GoTop/GoBottom/Skip when SET DELETED ON
- hbrdd.IsSetDeleted callback: avoids circular import hbrdd→hbrtl
- Parser: capture ON/OFF for boolean SET commands (DELETED, EXACT, SOFTSEEK, etc.)
- Parser: capture TO expr for SET DATE/DECIMALS/EPOCH
- Gengo: emit proper t.Do() calls for 11 SET toggles + 3 value SETs
- stmtSet: was stub (skipToEOL), now calls parseSet()
- RTL: register 11 SET toggle functions (SETDELETED, SETEXACT, etc.)
- RTL: DBLOCATE/DBCONTINUE for sequential search
- RTL: DBSETFILTER/DBCLEARFILTER/DBFILTER
- PadL/PadR: support 3rd param fill character
- Area interface: added SetFound, SetLocate, LocateBlock, filter methods
- MemRDD: implements new Area interface methods
- Comprehensive PRG test: test_search.prg (7 test suites all pass)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:33:59 +09:00
08c0ef13d4 feat: transparent MEMO read/write + documentation update
DBFArea auto-manages FPT memo files:
- Create/Open: auto-creates/opens FPT when memo fields exist
- PutValue: string on MEMO field auto-writes to FPT
- GetValue: MEMO field auto-reads from FPT, returns string
- Close: auto-closes FPT

Documentation: Value methods, MEMVAR, SET, ErrorBlock, MEMO
added to five-syntax-ko.md and five-syntax-en.md (+480 lines)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 17:32:07 +09:00
542373572f feat: MemRDD — in-memory database engine (Go map-based)
Complete in-memory RDD implementation:
- CRUD: Create, Append, GetValue, PutValue, Delete, Recall
- Navigation: GoTo, GoTop, GoBottom, Skip, BOF, EOF
- Index: CreateIndex (sorted slice + binary search), Seek (exact + soft)
- Bulk: Pack (remove deleted), Zap (clear all)
- Multi-open: shared table across work areas
- Driver registered as "MEMRDD", prefix "mem:"

Tests: 9 tests including 5000-record stress test
  Create, Append/Get/Put, Navigation, Delete/Pack, Zap,
  Index (string + numeric), Seek (exact + soft), Stress 5000, Multi-open

Use cases: temp tables, query results, pivot, caching
Performance: no disk I/O, no byte packing — pure Go slices

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 11:57:48 +09:00
59568f3301 Five v0.9 — Harbour + Go fusion language
- Compiler: PP → Lexer → Parser → Analyzer → Gengo pipeline
- Parser: 232/236 (98%) Harbour compatibility, registry-based dispatch
- RTL: 351 Harbour-compatible functions
- RDD: DBF/NTX/CDX engines with Rushmore bitmap optimization
- Go Interop: IMPORT + pkg.Func() + obj:Method() with FastPath (15M calls/sec)
- HB_FUNC API: Full Harbour C API compatible Go bridge
- Concurrency: SPAWN/LAUNCH/GOROUTINE, <-, WATCH, PARALLEL FOR, ASYNC/AWAIT
- Extensions: Multi-return, DEFER, Slice, f-string, Nil-safe ?:, CONST
- Macro Compiler: Runtime AST parsing and evaluation
- Debugger: TUI debugger with source display, breakpoints, stepping
- FRB: Native + Pcode dual mode runtime binary
- Tests: 13 packages ALL PASS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 09:41:50 +09:00