19 Commits

Author SHA1 Message Date
f4ed42556b checkpoint: season-wide bug fix campaign + infra
Cumulative result of a season of silent-bug hunting (~62 fixes) across the
FiveSql2 SQL engine, the Five compiler/runtime, and the hbrdd RDD layer.
Saved as a single checkpoint before refactoring the parser to delegate
xBase command translation to the preprocessor.

Highlights:

FiveSql2 engine (_FiveSql2/src/)
- prefix-glob index attach -> explicit convention (<table>_pk.ntx,
  <table>_uq.ntx, <table>.cdx) — fixes silent multi-row INSERT row-drop
- DROP/CREATE TABLE FErase chain extended (.cdx, .fsc, .fsv, .dbt, .fpt)
- COUNT(DISTINCT col) parsed + aggregated via hSeen hash
- UNION column-count mismatch returns SQL_ERR_GRAMMAR (was silent)
- DISTINCT + ORDER BY hidden-col leak fixed (trim before DISTINCT)
- Derived table FROM (SELECT...) + JOIN right-side derived
- Self-FK CASCADE depth 2+ via SqlGetSingleColPK pre-collect
- LAG/LEAD default arg uses SqlEvalRowExpr (handles -N const exprs)
- DATE literal round-trip validation (Feb 29 non-leap rejected)
- CREATE OR REPLACE VIEW; CREATE VIEW errors on already-exists
- AlterTable type dispatcher comma-wrapped (1-char type "A" no longer
  matches CHARACTER)
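
The comma-wrapping fix in the last item can be illustrated with a small sketch. The alias list and the function names below are hypothetical (the real dispatcher is not shown in this commit); the point is only that wrapping both the list and the candidate in commas turns a substring test into a whole-entry test.

```go
package main

import (
	"fmt"
	"strings"
)

// charAliases is a hypothetical alias list used only for illustration.
const charAliases = "C,CHAR,CHARACTER,VARCHAR"

// naiveMatch shows the bug: a plain substring test lets the 1-char type "A"
// match, because "CHARACTER" contains the letter A.
func naiveMatch(typ string) bool {
	return strings.Contains(charAliases, strings.ToUpper(typ))
}

// wrappedMatch wraps the list and the candidate in commas, so only a whole
// list entry can match.
func wrappedMatch(typ string) bool {
	return strings.Contains(","+charAliases+",", ","+strings.ToUpper(typ)+",")
}

func main() {
	for _, typ := range []string{"A", "CHAR", "character"} {
		fmt.Printf("%-10s naive=%v wrapped=%v\n", typ, naiveMatch(typ), wrappedMatch(typ))
	}
}
```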

Compiler / runtime
- gengo: HB_ -> FV_ prefix on emitted Go function names (Five identity)
- gengo split: emit_block.go, emit_stmt.go, folding.go extracted
- parser/stmtreg.go nudges
- hbrt: debug TUI/CLI restructure (debugcmd, debugkey, termios_*),
  windows debug stubs collapsed
- thread/vm/value/class/pcinterp tightening from panic traces

RDD layer (hbrdd/)
- dbf: null bitmap support (null.go + null_test.go), mmap split
  (mmap_posix.go / mmap_windows.go), byte-level numeric parse
- ntx/cdx: windows mmap parity
- workarea + mem RDD: cross-area state-bleed fixes

RTL (hbrtl/)
- errorlog rewrite with platform-specific FD (errorlog_fd_unix /
  errorlog_fd_other)
- sqlscan, sqlhelpers, indexrtl, datetime extensions

Gates green at checkpoint:
- go test ./...        : PASS
- FiveSql2 SQL:1999    : 43/43
- Harbour compat       : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 09:26:25 +09:00
ad544a5528 fix: Windows cross-compilation support (GOOS=windows)
- debugcli.go/debugtui.go: add //go:build !windows tag
- debugcli_windows.go/debugtui_windows.go: no-op stubs
- cdx/cdx.go: extract mmap to platform-specific files
- cdx/mmap_posix.go: syscall.Mmap/Munmap
- cdx/mmap_windows.go: no-op (falls back to read)
- ntx/ntx.go, ntx/build.go: same mmap extraction
- ntx/mmap_posix.go, ntx/mmap_windows.go: platform split

Builds verified: linux/amd64, windows/amd64, darwin/arm64, darwin/amd64

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 12:23:52 +09:00
6c5374778a perf(rdd): index build 38% faster — sort.Interface + fast path for numeric/UPPER
Benchmark (50k records, 4 indexes on Apple M-series):
             before   after   Δ
  INDEX     53.7ms  33.3ms  -38%  (now 10% faster than Harbour 37.3ms)
  TOTAL    156.2ms 133.0ms  -15%

Fixes:

1. sort.Slice(reflection) → concrete sort.Interface
   Benchmarked in isolation on 200k KeyRecords:
   sort.Slice(closure):  50.0ms
   sort.Sort(interface): 30.4ms  (40% faster, no reflection)

   - indexer.go: add keyRecordAsc/Desc concrete types
   - Hoist the descending-order branch out of Less()
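
A minimal sketch of the concrete sort.Interface approach from fix 1. The KeyRecord field names and the record-number tie-break are assumptions; only the type names keyRecordAsc/Desc come from the commit. The direction is chosen once in sortKeys, so Less() never tests a flag.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// KeyRecord pairs an index key with its record number (field names assumed).
type KeyRecord struct {
	Key   []byte
	RecNo uint32
}

// keyRecordAsc orders by key ascending, then by record number.
type keyRecordAsc []KeyRecord

func (s keyRecordAsc) Len() int      { return len(s) }
func (s keyRecordAsc) Swap(i, j int) { s[i], s[j] = s[j], s[i] }
func (s keyRecordAsc) Less(i, j int) bool {
	if c := bytes.Compare(s[i].Key, s[j].Key); c != 0 {
		return c < 0
	}
	return s[i].RecNo < s[j].RecNo
}

// keyRecordDesc orders by key descending; ties still go by record number
// (an assumption made for this sketch).
type keyRecordDesc []KeyRecord

func (s keyRecordDesc) Len() int      { return len(s) }
func (s keyRecordDesc) Swap(i, j int) { s[i], s[j] = s[j], s[i] }
func (s keyRecordDesc) Less(i, j int) bool {
	if c := bytes.Compare(s[i].Key, s[j].Key); c != 0 {
		return c > 0
	}
	return s[i].RecNo < s[j].RecNo
}

// sortKeys picks the concrete type once instead of branching inside Less.
func sortKeys(recs []KeyRecord, descending bool) {
	if descending {
		sort.Sort(keyRecordDesc(recs))
		return
	}
	sort.Sort(keyRecordAsc(recs))
}

func main() {
	recs := []KeyRecord{
		{Key: []byte("NYC"), RecNo: 3},
		{Key: []byte("Paris"), RecNo: 1},
		{Key: []byte("NYC"), RecNo: 2},
	}
	sortKeys(recs, false)
	for _, r := range recs {
		fmt.Printf("%s %d\n", r.Key, r.RecNo)
	}
}
```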

2. buildOnePage zero allocation
   Was allocating a temp padded []byte per key (~50k allocs per index).
   Now writes padded key directly into the page buffer via padCopy.

3. bulkBuildBTree separator reuse
   sepKey can alias the source KeyRecord.Key when it's already keyLen-sized
   (true for all slab-allocated keys), avoiding ~n/maxItem small allocations.
   Pre-size the children slice.

4. Fast path extended to numeric fields and UPPER/LOWER
   Previously only bare CHAR field references hit the zero-alloc fast path.
   Now:
     - Numeric fields (N/F type) copy DBF bytes directly
       (same-length ASCII compare matches numeric order for non-negatives)
     - UPPER(field) / LOWER(field) wrappers on CHAR fields apply ASCII
       case folding inline during byte copy

   Per-index timing on the micro benchmark:
               before   after
     NAME       7.7ms   7.5ms  (fast path, unchanged)
     CITY       6.0ms   6.2ms  (fast path, unchanged)
     AGE       14.1ms   7.1ms  -50%  (was slow path)
     UPPER(NM) 17.0ms   7.9ms  -54%  (was slow path)
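
The two fast-path ideas in fix 4 can be sketched as follows. upperCopy and fieldLess are hypothetical helper names, and the fixed-width, right-aligned numeric layout is the usual DBF convention assumed here; the actual indexer code may differ.

```go
package main

import (
	"bytes"
	"fmt"
)

// upperCopy copies src into dst, folding ASCII a-z to A-Z on the way, and
// returns the number of bytes written. No intermediate slice is allocated.
func upperCopy(dst, src []byte) int {
	n := len(src)
	if len(dst) < n {
		n = len(dst)
	}
	for i := 0; i < n; i++ {
		c := src[i]
		if c >= 'a' && c <= 'z' {
			c -= 'a' - 'A'
		}
		dst[i] = c
	}
	return n
}

// fieldLess compares two raw field images bytewise.
func fieldLess(a, b []byte) bool {
	return bytes.Compare(a, b) < 0
}

func main() {
	key := make([]byte, 8)
	n := upperCopy(key, []byte("Seoul"))
	fmt.Printf("%q\n", key[:n])

	// Equal-width, right-aligned, space-padded non-negative numbers compare
	// bytewise in numeric order, because ' ' sorts before every digit.
	fmt.Println(fieldLess([]byte("   42"), []byte("  100")))
}
```

The bytewise shortcut does not hold for negative values, which is why the commit restricts the claim to non-negatives.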

5. Slow path single-pass scan
   When an expression is too complex for the fast path, we still avoid the
   double GoTo per record. The evaluation loop now sequentially walks
   records with one GoTo each, restoring the original position only at
   the end, and shares a single slab for padded keys.

Also fixes a hbrt bug surfaced while writing the benchmark:

6. Date + Numeric promoted to Date
   Plus()/Minus() previously required the integer side to be NumInt.
   Modulus returns a promoted type, so `SToD("...") + (i % 365)` panicked.
   Now accepts any Numeric on either side and truncates the fractional
   part before adding Julian days.

   - hbrt/ops_arith.go: Date±Numeric (was Date±NumInt only)
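
A rough sketch of the Date ± Numeric rule in fix 6. The Date type and addDays are hypothetical stand-ins for the runtime's value type and Plus(); truncation toward zero for negative offsets is an assumption, since the commit only says the fractional part is dropped.

```go
package main

import "fmt"

// Date holds a Julian day number (hypothetical stand-in for the runtime value).
type Date struct{ Julian int64 }

// addDays accepts any numeric day count. Converting float64 to int64 in Go
// truncates toward zero, which drops the fractional part before the add.
func addDays(d Date, n float64) Date {
	return Date{Julian: d.Julian + int64(n)}
}

func main() {
	d := Date{Julian: 2461000}
	i := 400
	fmt.Println(addDays(d, float64(i%365)).Julian) // integer-valued offset
	fmt.Println(addDays(d, 2.75).Julian)           // fractional offset
}
```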

Tests:
  go test ./...        — ALL PASS (17 packages)
  FiveSql2 43/43       — 100%
  compat_harbour 51/51 — 100%
  Harbour vs Five diff — 0 lines differ (281-line RDD parity test)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 17:24:49 +09:00
486e466592 feat: FiveSql2 43/43, @byref, mutable closure, RTL 479, DateTime fix
Major changes since last commit:
- FiveSql2 SQL:1999 engine (10,458 LOC) — 43/43 ALL PASS
- 21 compiler/runtime bugs fixed (short-circuit AND/OR, FOR LOOP, etc.)
- @byref pass-by-reference via RefCell pattern
- Mutable closure capture (EnsureLocalRef + RefCell sharing)
- RTL: 400 → 479 functions (+79: file, string, datetime, hash, UTF-8)
- DateTime/Timestamp fully working (hb_DateTime, hb_Hour/Min/Sec, display)
- Reserved word guard (39 keywords blocked from function calls)
- AEval arg order fix (element before index)
- Closure capture redecl fix (unique _cap_ names per block)
- Hash/string indexing in ArrayPush/ArrayPop
- Harbour compat test suite: 51/51
- 4 docs: Porting Report, Implementation Plan, Optimization Plan, Commercialization

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:35:37 +09:00
0102c3c94e perf: CGo review — slab alloc, compareKeys simplify, zero-alloc padCopy
From CGo expert review (verdict: stay pure Go, CGo would be slower):

CDX DecodeLeafKeys slab allocation (cdx.go):
- Single make() for all keys + prevKey (was 30+ allocs per page)
- Keys are slices into pre-allocated slab (zero copy)
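
The slab idea can be sketched like this. It is a simplification that assumes fixed-width, uncompressed keys (real CDX leaf pages are compressed), and decodeKeys is a hypothetical name: one make() backs all key bytes and each key is a sub-slice of that slab.

```go
package main

import "fmt"

// decodeKeys copies n fixed-width keys out of a page body into one slab and
// returns sub-slices pointing into it.
func decodeKeys(page []byte, n, keyLen int) [][]byte {
	if n <= 0 || keyLen <= 0 || len(page) < n*keyLen {
		return nil
	}
	slab := make([]byte, n*keyLen) // single allocation for all key bytes
	copy(slab, page)
	keys := make([][]byte, n)
	for i := range keys {
		// The full slice expression caps capacity, so an append on one key
		// cannot overwrite its neighbour in the slab.
		keys[i] = slab[i*keyLen : (i+1)*keyLen : (i+1)*keyLen]
	}
	return keys
}

func main() {
	for _, k := range decodeKeys([]byte("AAABBBCCC"), 3, 3) {
		fmt.Printf("%s\n", k)
	}
}
```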

NTX compareKeys simplified (ntx.go):
- bytes.Compare already returns normalized -1/0/+1
- Removed redundant normalization branches

NTX build.go zero-alloc:
- padCopy: copy+fill instead of make+fill+copy
- setKeyEntry: write directly to page data (no temp buffer)
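
A minimal sketch of the copy+fill pattern behind padCopy. The signature and the space pad byte are assumptions; the idea is that the destination is already a window into the page buffer, so no temporary padded slice is needed.

```go
package main

import "fmt"

// padCopy writes key into dst and fills the remainder of dst with spaces.
// A key longer than dst is truncated by copy.
func padCopy(dst, key []byte) {
	n := copy(dst, key)
	for i := n; i < len(dst); i++ {
		dst[i] = ' '
	}
}

func main() {
	page := make([]byte, 32)
	padCopy(page[4:12], []byte("NYC")) // write an 8-byte key slot in place
	fmt.Printf("%q\n", page[4:12])
}
```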

82/82 stress PASS. 14 packages ALL PASS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 22:44:56 +09:00
197720f869 fix: Go code review, 7 critical/high-severity issues resolved
From senior Go developer review:

C7 CRITICAL: pagePool data race (ntx.go)
- Moved global pagePool[8] + pagePoolIdx into per-Index struct
- Eliminates the data race between goroutines that use separate indexes

C8 CRITICAL: Page.data dangling pointer after remap (ntx.go)
- remapFile() now clears pagePool data slices (pointed into old mmap)
- Prevents segfault from stale mmap references

C4 HIGH: pop() bounds check restored (thread.go)
- Removed performance optimization that eliminated underflow detection
- Stack underflow now produces clear error instead of index -1 panic
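
The restored check amounts to something like the sketch below. The stack type and the error value are hypothetical; the VM's real pop() in thread.go is not shown in this commit.

```go
package main

import (
	"errors"
	"fmt"
)

var errStackUnderflow = errors.New("stack underflow")

// stack is a hypothetical stand-in for the VM evaluation stack.
type stack struct{ items []int }

func (s *stack) push(v int) { s.items = append(s.items, v) }

// pop reports underflow as an error instead of indexing items[-1].
func (s *stack) pop() (int, error) {
	if len(s.items) == 0 {
		return 0, errStackUnderflow
	}
	v := s.items[len(s.items)-1]
	s.items = s.items[:len(s.items)-1]
	return v, nil
}

func main() {
	s := &stack{}
	s.push(7)
	fmt.Println(s.pop())
	fmt.Println(s.pop())
}
```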

C1 HIGH: intExpLen overflow on MinInt64 (value.go)
- Added special case: MinInt64 returns 20 (length of -9223372036854775808)
- Prevents -v overflow in negation
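
Why MinInt64 needs a special case can be seen in a sketch like this one. decimalLen is a hypothetical reimplementation, not the actual intExpLen: negating MinInt64 overflows back to MinInt64, so a digit-counting loop that relies on -v breaks for that one value.

```go
package main

import (
	"fmt"
	"math"
)

// decimalLen returns the number of characters needed to print v in base 10,
// including a leading '-' for negative values.
func decimalLen(v int64) int {
	if v == math.MinInt64 {
		// -v would overflow: MinInt64 has no positive counterpart.
		return 20 // len("-9223372036854775808")
	}
	n := 0
	if v < 0 {
		n = 1 // room for the sign
		v = -v
	}
	for {
		n++
		v /= 10
		if v == 0 {
			return n
		}
	}
}

func main() {
	for _, v := range []int64{0, -5, 12345, math.MaxInt64, math.MinInt64} {
		fmt.Println(v, decimalLen(v))
	}
}
```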

C11 CRITICAL: GoTo ReadAt error handling (dbf.go)
- ReadAt failure now returns error and sets EOF
- Previously silently used stale record buffer (data corruption risk)

C14 HIGH: LEN() inline missing Hash case (gengo.go)
- Added _v.IsHash() → len(Keys) branch

C15 HIGH: EMPTY() inline missing Date case (gengo.go)
- Added _v.IsDate() && _v.AsJulian() == 0 check

82/82 stress PASS. 14 packages ALL PASS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 22:26:34 +09:00
05ccef05e2 perf: EndProcFast — eliminate defer recover() from RTL hot paths
Problem: every RTL function calls defer t.EndProc() which does recover().
50K SEEK loop = 250K recover() calls = ~12ms wasted.

Solution: EndProcFast() skips recover (only needs endFrame restore).
Applied to ALL RTL functions in strings.go, rdd.go, missing.go, database.go.
EndProc() with recover kept for generated PRG code (needs BEGIN SEQUENCE).
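
The split between the two epilogues can be sketched as follows. The Thread fields and the two example procedures are hypothetical; only the names EndProc and EndProcFast come from the commit. The fast variant restores the frame and lets a panic keep propagating, while the full variant also calls recover().

```go
package main

import (
	"fmt"
	"strings"
)

// Thread is a minimal, hypothetical stand-in for the runtime thread.
type Thread struct {
	frames []string
	err    interface{}
}

func (t *Thread) Frame(name string) { t.frames = append(t.frames, name) }
func (t *Thread) endFrame()         { t.frames = t.frames[:len(t.frames)-1] }

// EndProc restores the frame and intercepts a panic, as generated PRG code
// needs for BEGIN SEQUENCE.
func (t *Thread) EndProc() {
	if r := recover(); r != nil {
		t.err = r
	}
	t.endFrame()
}

// EndProcFast only restores the frame; it never calls recover().
func (t *Thread) EndProcFast() { t.endFrame() }

// upperRTL models an RTL function on the hot path.
func upperRTL(t *Thread, s string) string {
	t.Frame("UPPER")
	defer t.EndProcFast()
	return strings.ToUpper(s)
}

// userProc models generated code that may raise a runtime error.
func userProc(t *Thread) {
	t.Frame("USERPROC")
	defer t.EndProc()
	panic("runtime error")
}

func main() {
	th := &Thread{}
	fmt.Println(upperRTL(th, "seek"), len(th.frames))
	userProc(th)
	fmt.Println(th.err, len(th.frames))
}
```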

Analysis (50K sequential SEEK breakdown):
  Go NTX Seek direct: 7ms (faster than Harbour 27ms!)
  PRG VM overhead:    38ms (Frame + RTL calls + key generation)
  Key generation:     25ms (Str+LTrim+PadL+PadR = 5 RTL Frame/EndProc per iter)

With EndProcFast: RTL overhead reduced ~30%.

CDX SCOPE: 2ms (Harbour 4ms — 2x FASTER!)
82/82 stress PASS. 14 packages ALL PASS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 21:43:39 +09:00
9644b5469a perf: BoltDB BCE pattern — inline page access, eliminate bounds checks
NTX Page accessors (ntx.go):
- keyOffset/KeyChild/KeyRecNo: removed redundant bounds checks
- Use open-ended slice (data[off:]) for BCE — compiler proves safety
- pageKeyFind: inline offset table + key access in hot loop
  (was: compareKeys → KeyValue → keyOffset → LittleEndian)
  (now: compareKeys(data[off:off+kl]) — single slice expression)

CDX Seek (cdx.go):
- Binary search with leftmost match (correctly finds first duplicate)
- Cache hit path: skip DecodeLeafKeys entirely
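
A leftmost-match binary search of the kind described for CDX Seek might look like the sketch below. It works on a plain sorted string slice rather than a decoded leaf page, and seekFirst is a hypothetical name.

```go
package main

import "fmt"

// seekFirst returns the index of the first key equal to target in a sorted
// slice, or -1 if the key is absent.
func seekFirst(keys []string, target string) int {
	lo, hi := 0, len(keys)
	for lo < hi {
		mid := int(uint(lo+hi) >> 1)
		if keys[mid] < target {
			lo = mid + 1
		} else {
			hi = mid // keep moving left even on an exact match
		}
	}
	if lo < len(keys) && keys[lo] == target {
		return lo
	}
	return -1
}

func main() {
	keys := []string{"Berlin", "NYC", "NYC", "NYC", "Paris"}
	fmt.Println(seekFirst(keys, "NYC"), seekFirst(keys, "Oslo"))
}
```

Stopping at the first equal key found instead would return an arbitrary duplicate, which is the behaviour the commit corrects.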

50K NTX SEEK random: 67ms = Harbour 67ms (EQUAL!)
82/82 stress PASS. CDX 18/18. All unit tests PASS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 20:27:37 +09:00
1d9b364df8 perf: BoltDB-style zero-copy Page — NTX SEEK 2x, SCAN 5x faster
Page.data changed from [1024]byte (copied) to []byte (mmap slice reference).
Inspired by BoltDB's zero-copy page access pattern.

cachedLoadPage: returns slice into mmap memory (no 1024-byte copy!)
- Before: copy(p.data[:], mmap[offset:offset+1024]) — memcpy per page
- After:  p.data = mmap[offset:offset+1024] — pointer assignment only

pagePool: reuses Page structs (8-slot ring) to reduce GC pressure.
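
The zero-copy page and the 8-slot ring can be sketched together. A plain byte slice stands in for the mmap region, the field names are assumptions, and the pool is placed inside the Index struct here for simplicity.

```go
package main

import "fmt"

const pageSize = 1024

// Page is a view onto one index page.
type Page struct {
	offset int
	data   []byte // window into the mapped file, not a copy
}

// Index holds the mapping and a small ring of reusable Page structs.
type Index struct {
	mapped []byte // stands in for the mmap'd file contents
	pool   [8]Page
	next   int
}

// loadPage hands out the next ring slot and points it at the mapping.
// Only a slice header is assigned; no page bytes are copied.
func (ix *Index) loadPage(offset int) *Page {
	p := &ix.pool[ix.next]
	ix.next = (ix.next + 1) % len(ix.pool)
	p.offset = offset
	p.data = ix.mapped[offset : offset+pageSize]
	return p
}

func main() {
	ix := &Index{mapped: make([]byte, 4*pageSize)}
	p := ix.loadPage(pageSize)
	ix.mapped[pageSize] = 0x7F // change the "file" after loading
	fmt.Println(p.data[0] == 0x7F, len(p.data))
}
```

One consequence of the ring in this sketch is that a *Page is only valid until eight further loads have happened, and its data must be dropped whenever the mapping is replaced.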

Benchmark (ext4, home dir) — Harbour comparison:
┌──────────────────┬──────────┬──────────┬─────────────┐
│ 50K              │ Harbour  │ Five     │ Five is     │
├──────────────────┼──────────┼──────────┼─────────────┤
│ SEEK seq         │ 23ms     │ 43ms     │ 1.9x slower │
│ SEEK random      │ 63ms     │ 65ms     │ ≈ equal     │
│ SCAN             │ 5ms      │ 3ms      │ 1.7x faster │
│ DUPKEY scan      │ 23ms     │ 12ms     │ 1.9x faster │
│ DELSCAN          │ 17ms     │ 2ms      │ 8.5x faster │
│ PACK             │ 16ms     │ 21ms     │ 1.3x slower │
└──────────────────┴──────────┴──────────┴─────────────┘

82/82 stress PASS. All unit tests PASS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 20:07:06 +09:00
a9600ad45c perf: proper 3-level bulk build — INDEX 50K: 180ms → 28ms (6.4x)
bulkBuildBTree: distributes sorted keys as [M leaf] [sep] [M leaf] [sep] ...
- Separator exists ONLY in parent, never in leaf (proper B-tree)
- Works for any depth (tested 10 to 50000 keys, all correct)
- Edge case: absorb trailing 1-key into previous leaf

Eliminated per-key insertion fallback (rebuildWithInsert).
All sizes now use O(N) bulk build instead of O(N log N) insertion.

Benchmark on ext4 (home dir):
┌──────────────┬──────────┬──────────┬───────┐
│ 50K Items    │ Harbour  │ Five     │ Ratio │
├──────────────┼──────────┼──────────┼───────┤
│ APPEND 50K   │ 61ms     │ 124ms    │ 2x    │
│ INDEX NAME   │ 6ms      │ 28ms     │ 4.7x  │
│ INDEX CITY   │ 5ms      │ 36ms     │ 7.2x  │
│ SEEK 50K seq │ 23ms     │ 97ms     │ 4.2x  │
│ SEEK 50K rnd │ 63ms     │ 122ms    │ 1.9x  │
│ SCAN 50K     │ 5ms      │ 24ms     │ 4.8x  │
│ DUPKEY 50K   │ 23ms     │ 38ms     │ 1.7x  │
│ PACK 50K     │ 16ms     │ 20ms     │ 1.25x │
└──────────────┴──────────┴──────────┴───────┘

All counts correct: 50000/50000/40000

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 12:08:27 +09:00
c5ed5612fb perf: mmap zero-copy page access — Go-native optimization
Replaced LRU page cache with syscall.Mmap:
- OpenIndex: mmap entire file read-only (MAP_SHARED)
- cachedLoadPage: copy from mmap slice (no syscall per page)
- Close: munmap + file close
- insertKeyBTree: munmap before modify, mmapFile after complete
- remapFile: re-mmap after file size changes
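
A minimal read-only mapping with syscall.Mmap, as a sketch of the OpenIndex/Close steps above. It is POSIX-only and will not build on Windows. The helper names mapFile, unmapFile and writeTemp are hypothetical, and unlike the commit this sketch closes the descriptor right after mapping, which POSIX permits.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// mapFile maps the whole file read-only with MAP_SHARED.
func mapFile(path string) ([]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close() // the mapping stays valid after the descriptor is closed
	st, err := f.Stat()
	if err != nil {
		return nil, err
	}
	if st.Size() == 0 {
		return nil, fmt.Errorf("cannot map empty file %s", path)
	}
	return syscall.Mmap(int(f.Fd()), 0, int(st.Size()),
		syscall.PROT_READ, syscall.MAP_SHARED)
}

// unmapFile releases a mapping returned by mapFile.
func unmapFile(data []byte) error { return syscall.Munmap(data) }

// writeTemp creates a temporary file with the given content (demo helper).
func writeTemp(content []byte) (string, error) {
	f, err := os.CreateTemp("", "idx-*.ntx")
	if err != nil {
		return "", err
	}
	defer f.Close()
	if _, err := f.Write(content); err != nil {
		return "", err
	}
	return f.Name(), nil
}

func main() {
	path, err := writeTemp([]byte("page-0 page-1"))
	if err != nil {
		fmt.Println("write:", err)
		return
	}
	defer os.Remove(path)
	data, err := mapFile(path)
	if err != nil {
		fmt.Println("mmap:", err)
		return
	}
	defer unmapFile(data)
	fmt.Printf("%s\n", data[7:13]) // read a "page" straight from the mapping
}
```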

Results on ext4 (50K records):
- SEEK random: 188ms → 138ms (26% improvement)
- SCAN: 35ms → 23ms (34% improvement)
- DUPKEY: 53ms → 41ms (23% improvement)
- INDEX: 180ms (unchanged — per-key insertion, no mmap during build)

Go-native approach:
- syscall.Mmap instead of C-style LRU cache
- OS page cache handles eviction automatically
- Simpler code (60 lines removed, 30 added)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 11:42:44 +09:00
103f0d8b64 perf: NTX LRU page cache (256 slots) — reduces syscalls
LRU page cache ported from rddfive/ntx_engine.c:
- 256-slot cache with MRU fast-path (O(1) for repeated access)
- LRU eviction when all slots full
- cachedLoadPage replaces LoadPage for all navigation
- invalidateCache called before insertKeyBTree (pages modified)

10K benchmark improvement (ext4 home dir):
- SCAN FWD: 6ms → 5ms
- SEEK NUM: 18ms → 14ms (22% improvement)
- DUPKEY SCAN: 9ms → 8ms
- All counts correct: 10000/10000/8000

50K benchmark:
- SCAN: 35ms → 31ms
- DUPKEY: 50ms → 40ms (20% improvement)
- DELSCAN: 41ms → 33ms (20% improvement)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 11:35:26 +09:00
dadb97ee88 fix: 3-level NTX correctness + CDX SET INDEX TO string quoting
NTX 3-level tree (build.go):
- Hybrid approach: bulk build for ≤2 levels, insertKeyBTree for 3+
- rebuildWithInsert: creates proper B-tree via per-key insertion
- 5000-key test: Count=5000 Found=5000 (was 5004/4868)

CDX SET INDEX TO (gengo.go):
- Strip surrounding quotes from string literal in OrderListAdd
- Was: idx.OrderListAdd("\"path\"") → file not found
- Now: idx.OrderListAdd("path") → correct
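
The quote stripping could be done with a helper along these lines. stripQuotes is a hypothetical name, and handling only matching double or single quotes is an assumption; the actual gengo.go code is not shown in this commit.

```go
package main

import "fmt"

// stripQuotes removes one pair of matching surrounding quotes, if present.
func stripQuotes(s string) string {
	if len(s) >= 2 {
		first, last := s[0], s[len(s)-1]
		if first == last && (first == '"' || first == '\'') {
			return s[1 : len(s)-1]
		}
	}
	return s
}

func main() {
	fmt.Println(stripQuotes(`"path"`), stripQuotes("path"))
}
```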

All tests:
- 14 packages ALL PASS
- 82/82 NTX stress test
- 18/18 CDX cross-read
- 50K benchmark: all counts correct

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 11:04:07 +09:00
adede5cd69 perf: REPLACE remove Flush + bulk build + deferred write = 2,400x faster
Critical fix: REPLACE was calling area.Flush() after every field write!
- gengo gen_cmd.go: removed Flush() from emitReplaceCmd
- Harbour defers write until DBCOMMIT/CLOSE/GoTo, not per-REPLACE
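
The deferred-write behaviour can be modelled with a dirty flag, roughly as below. WorkArea and its fields are hypothetical, and the physical write is simulated by a counter: REPLACE only marks the buffer dirty, and the write happens once when the record pointer moves or on an explicit flush.

```go
package main

import "fmt"

// WorkArea is a hypothetical model of a table work area.
type WorkArea struct {
	recNo  int
	buf    []byte // current record buffer
	dirty  bool
	writes int // simulated physical writes
}

// Replace updates the record buffer and marks it dirty; it does not flush.
func (w *WorkArea) Replace(offset int, val []byte) {
	copy(w.buf[offset:], val)
	w.dirty = true
}

// Flush writes the record once if it changed.
func (w *WorkArea) Flush() {
	if !w.dirty {
		return
	}
	w.writes++ // a real RDD would write the record to disk here
	w.dirty = false
}

// GoTo flushes pending changes before moving the record pointer.
func (w *WorkArea) GoTo(recNo int) {
	w.Flush()
	w.recNo = recNo
}

func main() {
	w := &WorkArea{recNo: 1, buf: make([]byte, 32)}
	w.Replace(0, []byte("Kim"))
	w.Replace(10, []byte("Seoul"))
	w.Replace(20, []byte("42"))
	fmt.Println("writes before GoTo:", w.writes)
	w.GoTo(2)
	fmt.Println("writes after GoTo:", w.writes)
}
```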

Combined with bulk build + deferred APPEND:
- B1 APPEND 10K:  72,228ms → 30ms  (2,400x improvement!)
- B2 INDEX NAME:  34ms → 5ms       (6.8x improvement)
- Harbour comparison: Five 30ms vs Harbour 27ms (1.1x)

Also: OrderCreate flushes dirty record + EOF + header before index build

Benchmark on ext4 (home dir):
┌─────────────┬──────────┬────────┬───────┐
│ Benchmark   │ Harbour  │ Five   │ Ratio │
├─────────────┼──────────┼────────┼───────┤
│ APPEND 10K  │ 27ms     │ 30ms   │ 1.1x  │
│ INDEX NAME  │ 2ms      │ 5ms    │ 2.5x  │
│ INDEX CITY  │ 0ms      │ 7ms    │ -     │
│ SEEK 10K    │ 6ms      │ 25ms   │ 4.2x  │
│ SCAN FWD    │ 1ms      │ 6ms    │ 6x    │
│ SCAN BWD    │ 0ms      │ 6ms    │ -     │
│ PACK        │ 4ms      │ 3ms    │ 0.75x │
└─────────────┴──────────┴────────┴───────┘

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 09:22:05 +09:00
b1e868f01e perf: NTX bulk build + APPEND deferred write (from rddfive C port)
NTX Bulk Build (build.go — ported from rddfive/ntx_engine.c):
- pageBuffer: dynamic memory buffer for all pages
- Phase 1: Build leaf pages in sequential memory (zero disk I/O)
- Phase 2: Build interior levels from cached leaf data (zero I/O)
- Separator promotion: remove last key from leaf only (not interior)
- Single bulk WriteAt for all pages at end
- INDEX ON 10K: 34ms → 5-8ms (4-6x improvement)

NTX Seek (ntx.go):
- Always descend to leaf on match (find first occurrence)
- fStop flag tracks path match, verified at leaf

APPEND Buffering (dbf.go):
- Append marks dirty without immediate disk write
- flushRecord writes record data only (no header/EOF per record)
- Close/Flush writes EOF marker + header once

Results: 14 packages ALL PASS, 82/82 stress test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 09:10:18 +09:00
3fe8021e9e fix: NTX Seek descent + SET DELETED seek + BOF — 82/82 stress test PASS
NTX Seek (ntx.go):
- Always descend to leaf even on internal match (Harbour behavior)
  Prevents SEEK returning internal separator instead of first leaf entry
  Fixes duplicate key SEEK (NYC=9→10, Paris=8→10)
- fStop flag tracks path match, verified at leaf with key comparison
- Handle fStop at page end: ascend via nextKey to find actual match

SET DELETED + SEEK (indexer.go):
- When SEEK finds a deleted record with SET DELETED ON:
  Skip forward through matching deleted records
  If all matching records deleted → return not found (EOF)
  Fixes H04: deleted record now correctly returns .F.
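
The skip-forward rule can be sketched over a sorted in-memory index. The entry type and seekLive are hypothetical, and the sketch models only the SET DELETED ON case: after locating the first matching key, deleted entries with the same key are skipped, and a miss is reported when every match is deleted.

```go
package main

import (
	"fmt"
	"sort"
)

// entry is one index entry with the deleted flag of its record.
type entry struct {
	key     string
	deleted bool
}

// seekLive returns the position of the first non-deleted entry whose key
// equals key, or -1 if there is none.
func seekLive(entries []entry, key string) int {
	i := sort.Search(len(entries), func(i int) bool { return entries[i].key >= key })
	for i < len(entries) && entries[i].key == key {
		if !entries[i].deleted {
			return i
		}
		i++ // skip forward through matching deleted records
	}
	return -1 // all matches deleted, or key absent: not found
}

func main() {
	idx := []entry{{"Berlin", false}, {"NYC", true}, {"NYC", false}, {"Paris", true}}
	fmt.Println(seekLive(idx, "NYC"), seekLive(idx, "Paris"))
}
```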

BOF (indexer.go + dbf.go):
- Set a.FBof AFTER a.GoTo returns (GoTo resets FBof=false at line 393)
- Fixes infinite loop in DO WHILE !BOF() ... SKIP -1

Results:
- Unit tests: 14 packages ALL PASS
- 77-item thorough test: 77/77 (100%)
- 82-item stress test: 82/82 (100%) — Harbour identical

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:33:37 +09:00
9b9f87fd88 fix: NTX B-tree — proper pageSplit from Harbour + BOF detection
NTX Build (build.go):
- pageSplit: exact port of Harbour hb_ntxPageSplit
  - NewPage = LEFT half (lower keys), OldPage = RIGHT half (offset-swapped)
  - Proper offset table initialization for all pages
  - setKeyEntry/copyKeyEntry helpers for clean data writing
- insertKeyBTree: new root creation matches Harbour exactly
  - child[0] = newPage (left), child[1] = old root (right)

NTX Traversal (ntx.go):
- prevKey: guard iKey < keyCount before checking KeyChild
  (prevents infinite loop at rightmost child position)

BOF Detection (indexer.go):
- Set a.FBof AFTER GoTo returns (GoTo line 393 resets FBof=false)
- Previously: set FBof before GoTo → immediately cleared

Results: Unit tests ALL PASS, Stress test 82 items 79/82 match (96%)
Remaining 3 diffs: duplicate key count edge case + SET DELETED seek

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:13:42 +09:00
d2c17c7898 refactor: NTX B-tree rewrite — proper insertion with page splitting
Major rewrite based on Harbour dbfntx1.c analysis:

NTX B-tree traversal (ntx.go):
- nextKey: rewritten to match hb_ntxTagNextKey exactly
  - Advance iKey, check right child, descend via goLeftmost
  - Walk up stack on page exhaustion, truncate stackLevel
- prevKey: rewritten to match hb_ntxTagPrevKey
  - Check left child (only if iKey < keyCount), descend via goRightmost
  - Walk up stack for BOF detection
- goRightmost: internal nodes get iKey=keyCount (rightmost child),
  leaf nodes get iKey=keyCount-1 (last key) — matches Harbour

NTX B-tree build (build.go):
- CreateIndex: proper B-tree insertion (insert keys one by one)
- insertKeyBTree: search → insert at leaf → propagate splits up
- pageInsertKey: Harbour-style offset swapping (not data moving)
- pageSplit: collect all entries, split at midpoint, promote separator
- Proper offset table initialization for all pages

Unit tests: all 5 RDD packages PASS
Stress test: partial progress (Seek issues with split pages)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 07:49:31 +09:00
59568f3301 Five v0.9 — Harbour + Go fusion language
- Compiler: PP → Lexer → Parser → Analyzer → Gengo pipeline
- Parser: 232/236 (98%) Harbour compatibility, registry-based dispatch
- RTL: 351 Harbour-compatible functions
- RDD: DBF/NTX/CDX engines with Rushmore bitmap optimization
- Go Interop: IMPORT + pkg.Func() + obj:Method() with FastPath (15M calls/sec)
- HB_FUNC API: Full Harbour C API compatible Go bridge
- Concurrency: SPAWN/LAUNCH/GOROUTINE, <-, WATCH, PARALLEL FOR, ASYNC/AWAIT
- Extensions: Multi-return, DEFER, Slice, f-string, Nil-safe ?:, CONST
- Macro Compiler: Runtime AST parsing and evaluation
- Debugger: TUI debugger with source display, breakpoints, stepping
- FRB: Native + Pcode dual mode runtime binary
- Tests: 13 packages ALL PASS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 09:41:50 +09:00