Implements hybrid execution model: keep AST tree-walk for SQL:2013+
features (Window, Recursive CTE, JOIN, aggregates) while compiling
simple SELECT hot paths to Go + pcode. See docs/FiveSql2-Hybrid-Plan.md
for the full architecture rationale (why not SQLite-style VDBE).
Hot path (single table, no joins/groups/aggregates):
- TryBuildFieldPositions: resolves SELECT column list to FieldPos
array once per query (bails to PRG loop on any complex expr).
- TryCompileWhere + SqlExprToPrg: walks WHERE AST, emits equivalent
PRG source, runs it through PcCompile to get a PcodeFunc.
- SqlScan RTL: Go-native scan loop — GoTop/EOF/Skip/GetValue
direct, ExecPcode per row for WHERE, result array pre-alloc.
WHERE compiler scope:
- ND_LIT numeric/logical/string (string literals AllTrim'd to match
SqlCmpEq CHAR-padding semantics; rejects embedded quotes/newlines)
- ND_COL: CHAR fields auto-wrapped with AllTrim(FieldGet(n)) based
on dbStruct() lookup cached once per query in aCompileStruct
- ND_BIN: = <> != < <= > >= AND OR + - * /
- ND_UNI: NOT -
- Anything else (ND_FN, ND_CASE, ND_SUB, ND_PAR, LIKE, IN, IS NULL,
BETWEEN, dates) returns NIL → falls back to PRG tree-walk.
Bench (50k rows, ~/tmp ext4):
Before After Speedup
Numeric WHERE ~150ms 11.7ms ~13x
String WHERE 119.3ms 10.5ms 11.4x
No WHERE - 14.6ms -
Raw RDD baseline 6.8ms 6.8ms 1.0x
Remaining gap to raw RDD (~1.5x) is structural: Value boxing, result
array construction, per-row ExecPcode frame overhead. Would need a
Value-pool or SoA refactor to close further.
Side fixes bundled:
- TSqlIndex:FindExclusive short-circuited. Originally called
dbInfo(DBI_FULLPATH)/DBI_SHARED which are unresolved symbols in
Five (dbInfo is a stub, DBI_* never defined). Panic'd with
"local variable index out of range: 0" whenever a standalone PRG
had a workarea Used before calling five_SQL. 43-test masked the
bug because it only reached FindExclusive with no open workareas.
Restore the scan once dbInfo lands in hbrtl.
- cmd/five/main.go: FIVE_KEEP_BUILD=1 env var keeps the temp Go
project around for debugging gengo output.
Validation:
- FiveSql2 43/43
- Harbour compat 51/51
- go test ./... ALL PASS
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Expose Five's existing FRB bytecode compiler for single-expression
compilation, enabling prepared-statement-style caching in dynamic
query engines (FiveSql2, scripting layers, rule engines).
1. genpc.CompileExpr(ast.Expr) *hbrt.PcodeFunc
- New public API that compiles a single expression to a
standalone pcode function
- Reuses genpc's mature emitExpr (no new emit logic)
- ExecPcode manages the frame around the generated code
2. hbrtl.PcCompile(cPrgExpr) -> pFunc
- RTL entry point for runtime compilation
- Wraps the expression in a FUNCTION stub, uses the full PRG
parser pipeline (pp + parser + genpc), extracts the compiled
pcode function, returns it as an opaque pointer
- Callers pay parse+compile cost ONCE per expression
3. hbrtl.PcEval(pFunc) -> xValue
- RTL entry point for runtime execution
- Calls hbrt.ExecPcode; the pcode's RetValue opcode sets retVal,
which our EndProc preserves as PcEval's return value
- ~1.2x slower than direct FieldGet (pcode interpreter overhead),
but eliminates AST tree-walk per row for complex expressions
Usage (FiveSql2 hot path, planned):
pc := PcCompile("FieldGet(4) > 50000") // parse+compile once
WHILE !Eof()
IF PcEval(pc) // ~10us per row
AAdd(aRows, ...)
ENDIF
dbSkip()
ENDDO
Benchmark (50k records, WHERE salary > 50000):
Raw FieldGet: 7.9 ms (baseline)
FieldPos+Get: 10.2 ms (with O(1) FieldPos cache)
PcEval bytecode: 10.1 ms (interpreted bytecode)
MacroEval: parse+eval per row — orders of magnitude slower
Tests:
go test ./... ALL PASS (14 packages)
FiveSql2 43/43 100%
compat_harbour 51/51
PcCompile/PcEval verified on 50k-row scan
FiveSql2 engine integration deferred — requires careful PRG-level
refactoring to thread pcode pointers through the plan structure.
The Go-level infrastructure is now in place for that work.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two SQLite-style optimizations for RDD and SQL workloads:
1. FieldPos() O(1) column binding cache
Before: FieldPos(name) linear scan — O(n) per call with string
comparison. In SQL engines that call FieldPos per row per
column, this is hundreds of thousands of calls.
After: DBFArea builds a map[UPPER(name)]→pos on first lookup.
All subsequent lookups are O(1) hash. SQLite calls this
"column affinity binding" — positions resolved at prepare,
not per row.
Implementation:
- hbrdd/dbf/dbf.go: DBFArea.FieldPosCache(name) method
- hbrtl/procinfo.go: FieldPos RTL uses fieldPosCacher interface
- Lazy init: only pays for tables that get queried
2. hbrdd import auto-detection for function-call style PRGs
Before: compiler only added hbrdd import when PRG used xBase commands
(USE, SKIP, INDEX...). Pure function-call style like
`dbUseArea(.T.,,"t")`, `FieldPut(1, val)` was missed —
generated Go failed to compile ("undefined: hbrdd").
After: scanStmtsForXBase walks ExprStmt bodies too, detecting
CallExpr to any of the ~40 xBase RTL function names.
FIELD->NAME alias expressions also trigger the import.
Resolves: small PRGs that use only dbUseArea/FieldGet/FieldPut.
Benchmark notes (50k records):
Raw RDD scan: 7 ms (baseline)
FiveSql2 SELECT WHERE: 157 ms (unchanged — bottleneck is
not FieldPos, it's PRG-level
expression tree walk per row)
compat_harbour 51/51: PASS
FiveSql2 43/43: 100%
The FieldPos cache helps heavy field-name-based code paths but the
primary FiveSql2 bottleneck is the PRG interpreter walking expression
ASTs per row (needs bytecode compilation to close the gap).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Release-blocking compatibility issues discovered during the 258-test
pre-release validation suite (100 syntax + 44 RDD + 114 RTL).
1. PCount() always returned 0 in PRG code
Root cause: ParamCount() returned t.pendingParams, which is
overwritten by every nested Function() call. By the time the
PCount() RTL's Frame() executes, pendingParams is already 0.
Fix: Frame() now stores pendingParams in frame.paramCount.
PCount() RTL uses CallerParamCount() which reads callSP-2
(the PRG caller's frame), while RTL functions still use
ParamCount() (reads pendingParams before their own Frame).
Verified: PCount(1,2,3)=3, PCount(1)=1, PCount()=0
2. Break("string") panicked instead of being caught by RECOVER USING
Root cause: Generated SEQUENCE code only caught *HbError panics.
Break() panics with BreakValue (a different type), which fell
through to EndProc's "runtime error" message and re-panic.
Fix (two parts):
a) gengo emitBeginSequence: recover closure now catches any
panic (interface{}), then dispatches via type switch:
- *HbError → extract .Error() string
- hasValue interface (BreakValue) → extract .GetValue()
- other → static "error" string
b) hbrtl/error.go: BreakValue gets GetValue() method for
duck-type detection without import cycles
c) hbrt/thread.go EndProc: BreakValue type name check added
so it re-panics silently (no stderr noise)
3. SET INDEX TO a, b, c only opened the last file
Root cause: Parser's parseSet() called parseExpr() once for
INDEX setting, stopping at the first comma. Remaining file
names were consumed by the "eat rest of line" loop.
Fix: Parser now collects comma-separated identifiers into a
single string literal "a,b,c". gengo splits on comma and
calls OrderListAdd() for each file.
Verified: SET INDEX TO si_name, si_city → OrdCount=2
All tests pass:
go test ./... 14 packages OK
FiveSql2 43/43 100%
compat_harbour 51/51
Syntax test 100/100
RDD test 44/44
RTL test 114/114
Windows cross-compile OK
Linux cross-compile OK
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replaces the FLOCK/DBRLOCK/DBRUNLOCK no-op stubs with actual
fcntl(F_SETLK) byte-range advisory locks, matching Harbour's
hb_fsLockLarge implementation.
Before: rtlDbRLock always returned .T. regardless of contention.
Multi-process writers could silently corrupt records.
After: Non-blocking POSIX byte-range locks per file descriptor.
Cross-process exclusion verified by a subprocess-spawning
Go test that witnesses BUSY vs OK transitions.
New files:
hbrdd/dbf/locks_posix.go fcntl F_WRLCK/F_UNLCK wrappers
hbrdd/dbf/locks_windows.go stub (TODO: LockFileEx)
hbrdd/dbf/lock_multi_test.go cross-process verification
docs/gap-analysis.md honest Harbour parity assessment
Modified:
hbrdd/dbf/dbf.go
- DBFArea gains fileLocked bool + lockedRecs map
- Close() calls releaseAllLocks() before dropping the fd
hbrtl/database.go
- rtlDbRLock / rtlDbRUnlock now delegate to DBFArea.LockRecord /
UnlockRecord instead of returning fixed .T./NIL
- New rtlFLock / rtlDbUnlock for FLOCK() / DBUNLOCK()
hbrtl/register.go
- FLOCK and DBUNLOCK symbols registered (were missing entirely)
compiler/analyzer/analyzer.go
- FLOCK / DBUNLOCK added to RTL known-function set
Lock region layout (non-overlapping on purpose):
FLOCK region [0, HeaderLen+1)
Record N region [RecordOffset(N), RecordLen)
So a workarea can hold FLOCK and multiple DBRLOCK simultaneously
on the same fd without conflict.
Design rationale (captured in locks_posix.go header):
* POSIX fcntl, not flock(2) — byte-range + NFS-safe
* Non-blocking F_SETLK — matches Clipper FLOCK() → .F. semantics
* Released explicitly on Close to avoid workarea-sharing races
* Windows falls back to no-op (TODO: LockFileEx)
Verification:
go test ./hbrdd/dbf/ -run TestFLockBlocksAcrossProcesses PASS
go test ./hbrdd/dbf/ -run TestRLockBlocksAcrossProcesses PASS
go test ./... ALL PASS
FiveSql2 43/43 100%
compat_harbour 51/51 100%
The gap-analysis doc (docs/gap-analysis.md) is a running inventory
of what works vs what's still missing vs Harbour 3.2, written for
users evaluating Five for production — not a sales pitch.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Five RDD engine now matches Harbour DBFNTX and DBFCDX byte-for-byte
in ordering, seek, navigation, and field access. Verified against
Harbour 3.2.0dev with a 281-line comparison test covering:
- Natural/NAME/CITY/AGE/SALARY/UPPER ordering
- SEEK (exact/not-found), GoTop/GoBottom per order
- DELETE/RECALL with SET DELETED
- CDX compound index read with 5 tags (BYNAME, BYCITY, BYAGE, BYSAL, BYUNAME)
- Reverse traversal
Fixes:
1. FIELD->NAME returned NIL
GetAliasField returned interface{} but runtime expected hbrt.Value,
so the type assertion in PushAliasField failed and pushed NIL.
- workarea.go: change return type to hbrt.Value, handle FIELD/_FIELD
as current-workarea alias, add SetAliasField
- gengo.go: emit SetAliasField() for alias->field := value in both
statement and expression contexts
2. OrdSetFocus(n) silently switched to natural order
v.AsString() returns "" for a numeric Value, so OrderListFocus("")
set current=-1.
- indexrtl.go: convert numeric param via fmt.Sprintf("%d", ...)
3. CDX compound tag order mismatched Harbour
Five decoded the structural B-tree which is alphabetical, but
Harbour sorts tags by TagBlock (file offset = creation order).
- cdx/cdx.go: sort tagEntries by offset ascending after decoding,
matching hb_cdxIndexLoadAvailTags in dbfcdx1.c
4. OutStd()/OutErr() not registered — caused panic on call
- hbrtl/console.go: add rtlOutStd/rtlOutErr implementations
- hbrtl/register.go: register OUTSTD and OUTERR
- analyzer.go: add OUTSTD/OUTERR to RTL known-functions
5. FIELD keyword triggered "undeclared variable" warnings
- analyzer.go: add FIELD, _FIELD, M, MEMVAR as builtin constants
Tests:
go test ./... — ALL PASS (17 packages)
FiveSql2 43/43 — 100%
compat_harbour 51/51 — 100%
Harbour diff — 0 lines differ (281-line comparison)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. SOFTSEEK: use idx.CurRecNo() for positioning (was checking recNo > 0)
- SEEK with SET SOFTSEEK ON now positions at next higher key
- SEEK command reads SET SOFTSEEK at runtime (was compile-time only)
- rtlDbSeek defaults to GetSetSoftSeek() when no explicit param
2. SET DELETED ON + INDEX: SkipIndexed skips deleted records
- GoTopIndexed: skip deleted record at top position
- SkipIndexed: inner loop continues past deleted records
3. Compound key (CITY+NAME): field name TrimSpace before lookup
- evalKeyExprInner: TrimSpace on fieldName after FIELD-> strip
- Fixed "CITY " != "CITY" mismatch from + operator splitting
4. SET INDEX TO filename: treated as string, not variable
- gengo uses exprToString for SET INDEX TO (was emitExpr)
- Prevents identifier being resolved as local variable
5. hasXBaseCommands: recursive scan into nested blocks
- BEGIN SEQUENCE, IF, FOR, DO WHILE, SWITCH bodies now scanned
- Fixes missing hbrdd import for DB commands inside blocks
Thorough test: 77 items (14 sections) covering exact/partial/soft seek,
SET DELETED, duplicate keys, numeric keys, compound keys, empty/single
table, state consistency, order switching, full traversal — all identical.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CDX Integration:
- IndexEngine interface: common for NTX Index and CDX Tag
- OrderListAdd: auto-detects .cdx/.ntx extension, opens CDX tags
- decodeCompoundLeaf: proper bit-packed tag directory decoding
(was stub falling through to scanCompoundLeaves with wrong names)
- CDX Tag: added KeyLen(), KeyExpr(), ForExpr(), IsDescending(), Close()
- CDX compound recNo = direct byte offset (not page number)
ORDSCOPE:
- SetScope/ClearScope/SetScopeTop/SetScopeBottom on DBFArea
- GoTopIndexed: seeks to scopeTop, validates within scopeBottom
- GoBottomIndexed: seeks to scopeBottom boundary
- SkipIndexed: stops at scope boundaries (top and bottom)
- OrdScope RTL function registered (nScope: 0=TOP, 1=BOTTOM)
- scopeKeyFromValue: converts Value to padded key bytes
Index Order Management:
- OrderListFocus: handles numeric order ("2" → order 2)
- SET ORDER TO n: gengo emits hbrt.NtoS for int-to-string conversion
- IndexOrd/OrdCount/OrdName/OrdKey: real implementations (were stubs)
- OrderCount/CurrentOrder/OrderName/OrderKeyExpr accessors on DBFArea
- ClearScope on order switch (prevents stale scope)
Cross-read test: Harbour-created CDX → Five reads, 20/20 items match:
NAME/CITY/ID seek, ORDSCOPE count, GoTop/GoBottom all identical
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bug 1: FIELD->NAME in INDEX ON expression
- evalKeyExprInner: strip FIELD->/alias-> prefix before field lookup
- exprToString: handle AliasExpr (FIELD->NAME → "FIELD->NAME")
Bug 2: AsNumInt() on Double returned IEEE 754 raw bits
- Value.AsNumInt(): check tDouble and convert via Float64frombits
- Fixed array index crash when index is result of % modulo
Bug 3: PACK/ZAP crash with open indexes
- OrderListRebuild: fully implemented (was TODO stub)
Saves index info, closes all, sets idxState=nil, recreates
- OrderCreate: set current=-1 during key evaluation (natural GoTo)
- PACK/ZAP: save/restore idxState, rebuild after operation
- Register __DBPACK, __DBZAP, DBRECALL symbol aliases
Harbour vs Five: 45/47 match (96%), 2 diffs are duplicate-key sort order
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- skipFilter: skip deleted records in GoTop/GoBottom/Skip when SET DELETED ON
- hbrdd.IsSetDeleted callback: avoids circular import hbrdd→hbrtl
- Parser: capture ON/OFF for boolean SET commands (DELETED, EXACT, SOFTSEEK, etc.)
- Parser: capture TO expr for SET DATE/DECIMALS/EPOCH
- Gengo: emit proper t.Do() calls for 11 SET toggles + 3 value SETs
- stmtSet: was stub (skipToEOL), now calls parseSet()
- RTL: register 11 SET toggle functions (SETDELETED, SETEXACT, etc.)
- RTL: DBLOCATE/DBCONTINUE for sequential search
- RTL: DBSETFILTER/DBCLEARFILTER/DBFILTER
- PadL/PadR: support 3rd param fill character
- Area interface: added SetFound, SetLocate, LocateBlock, filter methods
- MemRDD: implements new Area interface methods
- Comprehensive PRG test: test_search.prg (7 test suites all pass)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>