Commit Graph

41 Commits

Author SHA1 Message Date
699ea90156 feat(pp): TOTAL TO via std.ch + __dbTotal RTL
`TOTAL TO <file> ON <key> [FIELDS <list>] [FOR ...] [WHILE ...]
[NEXT ...] [RECORD ...] [REST] [ALL]` joins the family of std.ch
DML rewrites. New RTL primitive __dbTotal:

  * Walk the source under dbEval-style FOR/WHILE/NEXT/RECORD/REST
    bounds. The source must already be sorted/indexed on the key —
    same precondition as Harbour's dbtotal.prg.
  * Track the current group key. On each key change, flush the
    accumulated row to the destination (writing the running totals
    back into the most recently appended record's sum-fields,
    preserving each field's declared length/decimals).
  * On the *first* record of every group, append a fresh dst row
    and copy all non-memo source fields into it; subsequent records
    in the group only contribute to the sums. Net effect: non-summed
    fields take the first record's value, summed fields hold the
    group total. Same shape as harbour-core/src/rdd/dbtotal.prg.
  * Memo fields are dropped from the destination structure (Harbour
    does the same).

Parser cleanup: TOTAL removed from the IDENT-statement no-op switch.

Gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 15:24:41 +09:00
1cc2d94927 feat(pp): LIST / DISPLAY via std.ch + four PP completeness fixes
`LIST [<fields>] [OFF] [FOR ...] [WHILE ...] [NEXT ...] [RECORD ...]
[REST] [ALL]` and `DISPLAY [<fields>] [OFF] [FOR ...] ... [ALL]`
reach the parser as plain function calls to a new RTL primitive
__dbList (rtlDbList in hbrtl/database.go).

Implementation: walk the workarea under dbEval-style FOR/WHILE/NEXT/
RECORD/REST bounds. For each visible record, evaluate each column
block and emit the rendered values via valueToDisplay (the same
formatter QOut already uses). Empty fields list defaults to
"all fields". OFF suppresses the record-number prefix.
LIST always emits the full filtered range; DISPLAY without ALL emits
only the current record (encoded as nCount=1). TO PRINTER / TO FILE
clauses are not yet wired through — for now everything goes to
stdout.

Wiring up LIST/DISPLAY surfaced four further gaps in PP that were
silently masking bugs in any rule with multiple word-list / list /
optional clauses chained together:

  * matchSegment refused MarkerWordList inside `[...]`. The LIST
    rule's `[<off:OFF>]` clause therefore never set the off
    capture, and `<.off.>` substituted to nothing instead of .T./.F.
    matchSegment now matches WordList markers the same way the
    top-level matcher does.

  * `<v,...>` and `<(f)>` capture stop boundaries didn't include the
    values of following MarkerWordList markers. For
    `[<v,...>] [<off:OFF>] [<all:ALL>]` against `LIST id, name OFF`,
    the v list would happily eat OFF. New addStopFrom helper
    contributes both literal keywords and word-list values; both
    matchSegment's MarkerList branch and captureExpression now use
    it.

  * Optional-repeat loop in matchPattern merged a no-progress
    iteration's empty capture into the running multi-capture string
    (with the `\x01` separator) before the no-progress break check
    fired. So a successful first iteration's value got contaminated
    and the substitution loop then skipped it as multi-capture
    garbage. The merge now happens after the progress check.

  * Unreferenced `<.name.>` markers (optional clauses that didn't
    match in the input) were getting cleaned up to empty by the
    generic marker scrubber instead of the .F. sentinel Harbour's
    std.ch expects. New replaceUnreferencedLogify pass mirrors the
    existing replaceUnreferencedBlockify and runs just before the
    cleanup.

Parser cleanup: LIST and DISPLAY removed from the IDENT-statement
no-op switch in both parseIdentStmt and parseExprStmt.

Gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 15:19:36 +09:00
989138d12e feat(pp): SORT TO via std.ch + __dbSort RTL
`SORT TO <file> [ON <key-list>] [FOR ...] [WHILE ...] [NEXT ...]
[RECORD ...] [REST] [ALL]` joins COPY in being a real preprocessor
rewrite to a function call. New RTL primitive __dbSort:

  * Buffer visible source records (FOR/WHILE/NEXT/RECORD/REST same
    as __dbCopy).
  * Multi-key stable insertion sort. Each key may carry `/D` for
    descending; ascending otherwise. /A and unknown suffixes fall
    through as ascending. Comparison delegates to the existing
    compareValues helper in sqlscan.go (numeric / string / NIL-aware).
  * Create destination DBF with the source's struct, append rows in
    sorted order, restore source selection.

Parser cleanup: SORT removed from the IDENT-statement no-op switch.

Gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 15:04:18 +09:00
e961660f61 feat(pp): COPY TO via std.ch + four PP completeness fixes
`COPY TO <file> [FIELDS <list>] [FOR ...] [WHILE ...] [NEXT ...]
[RECORD ...] [REST] [ALL]` reaches the parser as a plain function
call to a new RTL primitive __dbCopy (rtlDbCopy in hbrtl/database.go).

Implementation: project the field list (case-insensitive name match
against the source's structure, full copy when omitted), dbCreate the
target file with that struct, open it under a temp alias, walk the
source under dbEval-style FOR/WHILE/NEXT/RECORD/REST bounds, and
GetValue/Append/PutValue per record into the target. SDF / DELIMITED
variants stay parser no-ops until those backends arrive.

Wiring up COPY surfaced four longstanding gaps in the PP that had to
be fixed for the rule to even reach the runtime:

  * `<(name)>` *pattern* marker was treated as a regular `<name>`
    with the parens baked into the captured key, so the matching
    result substitution `<(name)>` couldn't find it. parseOneMarker
    now strips the parens at parse time so capture key and result
    marker share the bare name. The smart-stringify result behavior
    is unchanged.
  * matchSegment (the optional-clause matcher) bailed on every
    non-Regular marker. `[FIELDS <fields,...>]` therefore failed to
    match at all and the fields list arrived empty in the result
    template. matchSegment now handles MarkerList with paren-balanced
    capture and segment+outer literal stop boundaries.
  * captureExpression only used the first literal in the pattern
    tail as a stop boundary. With std.ch's chain of optional
    clauses (`[TO <(f)>] [FIELDS ...] [FOR ...] [WHILE ...] ...`)
    the file-name marker was happy to gobble a trailing FOR clause
    when FIELDS was absent. It now stops at *any* of the remaining
    pattern literals.
  * `<(name)>` smart-stringify on a list-typed capture wrapped the
    whole comma-joined string in one set of quotes — `{ "a , b" }` —
    instead of `{ "a", "b" }`. New helper quoteListElements splits on
    top-level commas (paren / bracket / brace / string-balanced) and
    quotes each element. applyResult now consults the rule's marker
    table to know which captures came from `<name,...>`.

Parser cleanup: COPY removed from the IDENT-statement no-op switch in
both parseIdentStmt and parseExprStmt.

Gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 15:00:18 +09:00
c2e7f7ea27 feat(pp): Phase B — COUNT / SUM / AVERAGE via std.ch
Three xBase analytical commands that were silent no-ops in the
parser now execute as Harbour-style PP rewrites:

  COUNT [TO <v>]   [FOR <for>] [WHILE <while>] ... -> dbEval()
  SUM <x> TO <v>   [FOR <for>] [WHILE <while>] ... -> dbEval()
  AVERAGE <x> TO <v> [FOR ...]                     -> __dbAverage()

COUNT and SUM expand to a `<v> := 0 ; dbEval( {|| ... } )` pair
matching harbour-core/include/std.ch verbatim. AVERAGE delegates to
a new RTL function rtlDbAverage (sum + count + divide; returns 0 on
empty match) — the chained-private-variable trick Harbour uses to
keep AVERAGE inline doesn't translate cleanly through Five's PP.

Wiring up these rules surfaced four PP issues that had to be fixed
for the rewrite to even reach the parser:

  * Result template did not implement <{name}> blockify. So a rule
    body like `{|| x := x + <x> }, <{for}>` left the literal text
    `<{for}>` in the output. Added blockify substitution: captured
    -> `{|| <captured> }`, missing -> NIL.
  * findMarkerEnd did not recognise `{`/`}` so unreferenced
    blockify markers were not cleaned up either. Added `{`/`}` to
    its prefix/suffix sets.
  * Optional-clause matching had no view of the outer pattern, so a
    regular marker at the end of `[TO <v>]` would swallow the rest
    of the line — `COUNT TO n FOR x>5` captured `<v>` as
    "n FOR x>5". matchSegment now takes outerTail and stops at its
    first literal.
  * `#command` directives could not span multiple physical lines.
    A trailing `;` is harbour-core's line-continuation marker for
    std.ch and now joins the next line into the directive before
    parsing.

Parser cleanup: COUNT, SUM, AVERAGE removed from the IDENT-statement
no-op switch in parseIdentStmt + parseExprStmt. The remaining xBase
verbs (COPY, SORT, TOTAL, JOIN, LIST, DISPLAY, LABEL, REPORT, ...)
stay in the parser until their RTL backends arrive.

Gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 14:11:20 +09:00
f4ed42556b checkpoint: season-wide bug fix campaign + infra
Cumulative season's silent-bug hunting (~62 fixes) across the FiveSql2
SQL engine, the Five compiler/runtime, and the hbrdd RDD layer. Saved
as a single checkpoint before refactoring the parser to delegate xBase
command translation to the preprocessor.

Highlights:

FiveSql2 engine (_FiveSql2/src/)
- prefix-glob index attach -> explicit convention (<table>_pk.ntx,
  <table>_uq.ntx, <table>.cdx) — fixes silent multi-row INSERT row-drop
- DROP/CREATE TABLE FErase chain extended (.cdx, .fsc, .fsv, .dbt, .fpt)
- COUNT(DISTINCT col) parsed + aggregated via hSeen hash
- UNION column-count mismatch returns SQL_ERR_GRAMMAR (was silent)
- DISTINCT + ORDER BY hidden-col leak fixed (trim before DISTINCT)
- Derived table FROM (SELECT...) + JOIN right-side derived
- Self-FK CASCADE depth 2+ via SqlGetSingleColPK pre-collect
- LAG/LEAD default arg uses SqlEvalRowExpr (handles -N const exprs)
- DATE literal round-trip validation (Feb 29 non-leap rejected)
- CREATE OR REPLACE VIEW; CREATE VIEW errors on already-exists
- AlterTable type dispatcher comma-wrapped (1-char type "A" no longer
  matches CHARACTER)

Compiler / runtime
- gengo: HB_ -> FV_ prefix on emitted Go function names (Five identity)
- gengo split: emit_block.go, emit_stmt.go, folding.go extracted
- parser/stmtreg.go nudges
- hbrt: debug TUI/CLI restructure (debugcmd, debugkey, termios_*),
  windows debug stubs collapsed
- thread/vm/value/class/pcinterp tightening from panic traces

RDD layer (hbrdd/)
- dbf: null bitmap support (null.go + null_test.go), mmap split
  (mmap_posix.go / mmap_windows.go), byte-level numeric parse
- ntx/cdx: windows mmap parity
- workarea + mem RDD: cross-area state-bleed fixes

RTL (hbrtl/)
- errorlog rewrite with platform-specific FD (errorlog_fd_unix /
  errorlog_fd_other)
- sqlscan, sqlhelpers, indexrtl, datetime extensions

Gates green at checkpoint:
- go test ./...        : PASS
- FiveSql2 SQL:1999    : 43/43
- Harbour compat       : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 09:26:25 +09:00
d6c26104c9 feat(rtl): common.ch aliases — ISNIL/ISARRAY/ISNUMBER and friends
Harbour's common.ch exposes classic Clipper type-check shorthands
via #translate rules that map to HB_IS* RTL functions:

  #translate ISNIL(<x>)       => ((<x>) == NIL)
  #translate ISARRAY(<x>)     => HB_ISARRAY(<x>)
  #translate ISCHARACTER(<x>) => HB_ISSTRING(<x>)
  ... etc.

Five's preprocessor currently supports #translate only for lines
whose FIRST word is the rule keyword, not for substring matches
inside expressions. Real usage like `IF ISNIL(x)` fails the keyword
check (first word is IF, not ISNIL) and the rule never fires.

Rather than rewrite the PP substring engine (A2 scope), register
the nine short names as direct RTL symbols in register.go, each
pointing at the same Go function as its HB_IS* twin. ISMEMO maps
to HB_ISSTRING as a reasonable approximation for Five (no distinct
memo type at the VM level).

common.ch becomes a short stub that just #defines TRUE/FALSE/YES/NO
and documents where the ISxxx aliases live. DEFAULT / UPDATE
#xcommand forms remain unsupported pending A2.

Verified with /tmp/test_common.prg — ISNUMBER(42), ISCHARACTER("x"),
ISNIL(nilVar) all dispatch correctly. Analyzer still emits
"undeclared variable" warnings for the short names (the static
checker doesn't see runtime-registered RTL symbols) but the
generated code links and runs.

FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 17:01:50 +09:00
2a662525b3 feat(rtl): DO(xTarget, [args...]) — dynamic dispatch
Harbour's DO() accepts a string (looked up as a function name), a
code block (evaluated with args), or a symbol, and invokes it. Used
for plugin systems and dynamic dispatch idioms like
`DO(cHandler, oRequest)`.

Five already had stmtDo rewrite `DO(...)` at statement-level to a
function-call expression, so callers in expression position just
work — but gengo refused to emit DO as a function call because it
was on the reserved-word guard list (which existed to catch stray
ENDIF/ENDDO from bad IF nesting). Remove DO from that list; the
statement form is still handled upstream by parseDoProc, so the
guard loses nothing.

rtlDo implements the dispatch:
  - String target → VM.FindSymbol + t.Function
  - Block target → EvalBlock path (same as Eval)
  - Anything else → NIL

Tested (/tmp/test_do.prg):
  DO("Greet", "World") → "hello, World"
  DO({|x,y| x*y+1}, 5, 6) → 31
  DO(NIL) → NIL (ValType "U")

FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 16:33:09 +09:00
e089c81bcd feat(macro): &var / &(expr) runtime compilation
Harbour's macro operator was a stub: hbrt.MacroCompile only resolved
bare identifier names to memvars/functions and returned the source
string unchanged for any non-trivial expression. The gengo emit was
also broken — `t.MacroPush() + t.PushNil()` never pushed the inner
expression's value, so MacroPush popped whatever happened to be on
the stack.

Wire it up properly:

1. Gengo fix: `case *ast.MacroExpr` now emits `emitExpr(e.Expr);
   t.MacroPush()`. The inner expression produces the source string;
   MacroPush consumes it and pushes the evaluated result.

2. Hook pattern in hbrt: `SetMacroEvalHook(fn)` lets hbrtl install
   the real evaluator without creating an import cycle (genpc
   already imports hbrt). MacroPush delegates to the hook when
   installed; otherwise falls back to the legacy stub for hbrt
   unit tests.

3. hbrtl.init registers macroEval, which reuses compileExprSource
   (factored out of PcCompile) so macro lookups share the same
   sync.Map-backed pcode cache — repeat evaluations of the same
   macro source are free after the first hit.

4. ExecPcode leaves the result in retVal; macroEval copies it to
   the operand stack via PushRetValue.

Tested (/tmp/test_macro.prg):
  &"10 + 20"                    → 30
  &"Sqrt(16)"                   → 4
  &"Upper('hello')"             → HELLO
  &("30 * " + Str(nX, 1))       → 210  (runtime-built source)
  &"5 > 3 .AND. .T."            → .T.
  &("Str(" + Str(nX*10,2) + ",2)") → 70

FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 16:02:16 +09:00
935883bb88 perf(fivesql2): Go-native FetchRow fast path — 1.3-1.7x on agg/window
TSqlExecutor:FetchRow was the per-row workhorse for aggregation,
HAVING, and window queries. Even with the pre-built aFetchCache
binding columns to (nWA, nFPos), the PRG FOR loop paid one method
dispatch per column per row (dbSelectArea, FieldGet, AllTrim,
AAdd) — profile pinned it at ~30% of B4 CPU.

SqlFetchRowFast collapses the cache-path loop into a single Go
call:
  - bound entry: SelectByNum + area.GetValue directly
  - unbound (aggregate/expression): self:EvalExpr via Send
  - character values: TrimSpace inline
The PRG FetchRow keeps its original cache-miss fallback path
unchanged for rare queries where aFetchCache isn't built.

Bench deltas (median of 3 steady runs, 1000 iters):
  B4_GROUP_HAVING 418 → 327 us  -22% (1.28x)
  B9_ROW_NUMBER   191 → 120 us  -37% (1.59x)
  B10_RANK_PART   228 → 135 us  -41% (1.69x)
  B11_SUM_OVER    249 → 156 us  -37% (1.60x)
  B14_COUNT       235 → 219 us  -7%
  B15_CTE_WIN_JOIN 1577 → 1452 us  -8%
Single-table SELECT (B1-B3, B5-B7, B8) stays flat — those already
hit the column-binding fast path and don't need aggregate dispatch.

FiveSql2 43/43, Harbour compat 56/56.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 13:50:02 +09:00
c84cde6175 perf(fivesql2): Go-native SqlIsAggName — drop per-row substring scan
B4 GROUP+HAVING profile showed SqlIsAggName at ~9% of CPU —
SqlEvalFunc checks it for every function in every row, and the
PRG body was two string allocations + a substring scan:
  RETURN ("," + c + ",") $ ("," + AGG_FUNCTIONS + ",")

Replace with a hash lookup against the existing aggFuncSet map
in hbrtl/sqlexpr.go (already populated for SqlExprHasAgg, same
AGG_FUNCTIONS list). Upper-casing skips the allocation when the
input is already upper, which it almost always is in practice.

Bench deltas (median of 3 steady runs, 1000 iters):
  B4_GROUP_HAVING 447 → 418 us  -6.5%
  B14_COUNT       252 → 235 us  -7%
  B15_CTE_WIN_JOIN 1595 → 1577 us  -1%
Other benches unchanged (no aggregate calls per row).

FiveSql2 43/43, Harbour compat 56/56.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 13:40:19 +09:00
dd270d5d9d perf: RTL Go-native migration — 27 optimizations, DML up to 70-90x
Systematic pass through PRG hot paths, promoting them to Go RTL while
preserving Harbour/FiveSql2 semantics. Full log in
docs/RTL-Go-Native-Migration.md.

Bench (bench_sql) vs 2026-04-08 baseline
 - B1  SELECT *             2,192 → 114   µs   (19x)
 - B6  INNER JOIN           9,291 → 233   µs   (40x)
 - B7  CTE simple           8,037 → 129   µs   (62x)
 - B9  ROW_NUMBER           3,705 → 265   µs   (14x)
 - B10 RANK PARTITION       4,748 → 309   µs   (15x)
 - B12 INSERT (WA cache)    4,319 →  63   µs   (69x)
 - B13 UPDATE (WA cache)    6,144 →  68   µs   (90x)
 - B15 CTE+WIN+JOIN        18,395 → 1,873 µs   (10x)

Infrastructure
 - HbHash O(1) Index preserving insertion order (Harbour KEEPORDER)
 - HbDeepClone Go RTL (scalar-sharing, immutable hash keys)
 - MEMRDD auto-imported via gengo; all Five programs get mem:name driver
 - SQL plan + pcode caches (s_hPlanCache, s_hDmlPcodeCache)
 - Opt-in SqlWACacheEnable — dbUseArea/Close/Commit batched for DML

SQL engine
 - FiveSql2 lexer ported to Go (byte FSM) with combined automatic
   template parameterization (literals → ?, concat queries share plan)
 - Go RTL: SqlDistinct, SqlGroupRows, SqlWindowPartitions,
   SqlWindowSortPartition, SqlWindowAssignRank, SqlComputeAggSimple,
   SqlBulkInsert, SqlBulkUpdate, SqlExprHasAgg, SqlEvalHaving
 - CTE / subquery / driving-table materialize paths use MEMRDD
 - SqlCoerce/SqlCmp/SqlIsTrue helpers moved from PRG to Go
 - SqlBulkUpdate defers Flush when WA cache active (APFS fsync was
   dominant B13 cost — 1.6ms/call → gone)

Correctness fixes uncovered during migration
 - ASort default path now sorts dates/logicals/timestamps (was no-op)
 - ORDER BY default NULL placement matches PRG SqlRowCompare across
   Go fast path; explicit NULLS FIRST/LAST honored by both paths
 - SqlBulkUpdate respects EXCLUSIVE vs SHARED mode record locks
 - SqlCmp/SqlCmpEq normalize NumInt vs Double (caught by test 6b)

Verification
 - go test ./...              ALL PASS
 - FiveSql2 test_sql1999      43/43
 - tests/compat_harbour       56/56 (+5 new: ASort dates/logicals,
                              AScan int cross-type)
 - Regression test test_null_order.prg for ORDER BY NULL ordering

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 20:20:14 +09:00
3caadb23b9 perf: SqlOrderBy + SqlGroupBy Go RTL — native sort and aggregation
SqlOrderBy: Go sort.Slice for ORDER BY, 10-50x faster than PRG ASort.
SqlGroupBy: Go map-based GROUP BY accumulation (ready for integration).
TryBuildSortSpec detects simple ORDER BY columns and routes to Go.
Fallback to PRG for complex ORDER BY expressions.

43/43 + 41/41 verify + 51/51 compat + go test ALL PASS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 14:41:41 +09:00
5fc9c3bbea perf: SqlHashJoin Go RTL — 3-way JOIN 4.2s→61ms (69x)
Go-native multi-table hash join bypasses per-row PRG overhead.
TryGoJoin detects equi-join + plain-col SELECT, aggregate cols
get placeholder. 2-way 73→3ms, 3-way 3.9s→61ms.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 07:16:09 +09:00
bfc6ded8cb perf(FiveSql2): SqlHashBuild + FetchRow column binding — 3-way JOIN 3x
Complex-query benchmarking turned up two hot paths that the earlier
SqlScan/SqlEach work didn't touch: multi-table JOIN and nested-scan
row fetching. This commit hits both.

--- Part 1: SqlHashBuild — Go-native hash-join build ---

FiveSql2's HashJoin previously built the inner-side hash in PRG:

    WHILE !Eof()
      xVal := FieldGet(nFPos)
      cKey := SqlValToStr(xVal)
      IF !hb_HHasKey(hHash, cKey) ; hHash[cKey] := {} ; ENDIF
      AAdd(hHash[cKey], RecNo())
      dbSkip()
    ENDDO

That loop runs at ~40μs per row from class dispatch + hb_HHasKey
lookups + AAdd growth + SqlValToStr formatting. On a 50k-row inner
table that's ~2 seconds wasted on what should be a sub-50ms
housekeeping op.

New hbrtl.SqlHashBuild does the same thing in one Go-native pass:

  - Direct *dbf.DBFArea loop (no interface dispatch, same devirt as
    SqlScan)
  - Go `map[string][]int64` accumulates RecNos by key — one
    allocation per distinct key
  - Inline ASCII-only digit formatter for numeric keys (strconv.Itoa
    is allocation-heavy for small ints)
  - CHAR keys are right-trimmed to match SqlCmpEq semantics so the
    hash probe matches what EvalExpr would compute
  - Final Five hash is built once from Keys/Values/Order slices
    directly, skipping the per-key hb_HSet path

HashJoin now calls `SqlHashBuild(nFPos)` instead of running the
PRG loop.

--- Part 2: TSqlExecutor:BuildFetchCache ---

The JOIN fallback loop calls FetchRow per row. FetchRow was already
column-ref-aware but did the string parse (`At + SubStr + Upper`)
and `::FindWA` linear scan every single invocation. For a 50k-row
join emitting 50k result rows, that's ~200k redundant resolutions.

New BuildFetchCache walks the SELECT list once before the scan and
pre-binds each plain-column expression to `{nWA, nFPos}`. FetchRow's
new fast path checks ::aFetchCache and jumps straight to
`dbSelectArea + FieldGet` when bound. Complex exprs (functions,
CASE, subqueries) still fall through to EvalExpr.

::aFetchCache is set right before the join WHILE loop and cleared
after — no cross-query bleed.

--- Bench (50k ord × 10k emp × 100 dept, 3-run steady state) ---

  Query                        Before      After     Speedup
  ────────────────────────────────────────────────────────────
  2-way INNER JOIN, 10k rows   91ms        68ms      1.34x
  2-way JOIN + GROUP BY        110ms       94ms      1.17x
  3-way INNER JOIN COUNT       2610ms      610ms     4.28x
  3-way JOIN + GROUP BY        2860ms      830ms     3.45x

The 3-way speedup is almost entirely SqlHashBuild. The 2-way case
benefits from the fetch cache because its per-row cost is dominated
by FetchRow (no second hash build to amortize).

--- Limits still standing ---

CTE + JOIN queries (Q7 in bench_complex: ~4.5s) aren't affected by
either optimization — CTE materialization goes through a different
path that writes/reads a temp DBF. Follow-up target.

Validation:
  - FiveSql2 43/43
  - Harbour compat 51/51
  - go test ./... ALL PASS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 18:47:20 +09:00
d2ed140273 feat(FiveSql2): SqlEach block callback — beats raw RDD on end-to-end timing
The structural 1.38x gap vs raw RDD for no-WHERE full scans wasn't
a limit of our engine — it was a limit of the result shape. SqlScan
materializes N rows as HbArray wrappers over a flat Value buffer,
then the PRG caller iterates that materialized array. Two passes
over the data. Raw RDD is one pass.

SqlEach folds both passes into one. The caller supplies a code block
that receives the selected column values as positional parameters;
SqlEach invokes it per matching row. No result array is ever built.

Usage (drop-in replacement for the common "scan + process" idiom):

    five_SQLEach( "SELECT id, name, salary FROM emp WHERE salary > 50000",
                  {|nID, cName, nSalary| Process(nID, cName, nSalary) } )

API shape borrows Harbour's AEval/ASort block-callback convention,
so there's nothing new to learn. Positional params also sidestep
the `SELECT COUNT(*)` naming problem — no need to invent names for
anonymous expressions.

Implementation notes:
  - 4-way loop specialization ({DBF, generic Area} × {WHERE, none}),
    matching SqlScan. Each path is zero-allocation in the steady state.
  - Block invocation uses the direct pendingParams + blk.Fn(t) protocol
    rather than EvalBlock, which would allocate a temporary args slice
    on every call (50k scans × small slice adds up).
  - FastFieldGetter is installed the same way as SqlScan so PcOpFieldGet
    in the WHERE predicate skips the PushSymbol + Function dispatch.

Bench (50k rows, end-to-end including user-code loop, steady state):

  Path                           Time     vs raw RDD
  ─────────────────────────────────────────────────────
  Raw PRG loop, WHERE + sum      8.7ms    1.00x
  SqlScan + PRG FOR, WHERE       5.1ms    0.59x
  SqlEach block, WHERE           4.1ms    0.47x  ← beats raw
  ─────────────────────────────────────────────────────
  Raw PRG loop, no WHERE         6.1ms    1.00x
  SqlEach block, no WHERE        3.8ms    0.62x  ← beats raw

SqlEach is faster than a hand-rolled `DO WHILE !Eof()` loop because
the per-row FieldGet in raw PRG still goes through a full Frame +
RTL dispatch, whereas SqlEach's FastFieldGetter captures the concrete
*dbf.DBFArea directly. The SQL abstraction now costs nothing — it
pays you to use it.

Validation:
  - FiveSql2 43/43
  - Harbour compat 51/51
  - go test ./... ALL PASS

Next step (not in this commit): FiveSql2 TSqlExecutor integration —
detect when five_SQL is called with a block argument and route to
SqlEach instead of SqlScan + array build.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 15:16:36 +09:00
5dd212c761 perf(sqlscan): specialize four loop variants (DBF×WHERE matrix)
SqlScan's inner scan was written as a single loop with `if whereFn
!= nil` and a `keep` shadow variable. Branch-predictable for sure,
but still a few extra ops per row and it prevented Go from inlining
the non-nil interface call on the Area branch.

Split into four specialized loop bodies on the two axes that drive
per-row cost:

  1. dbfArea != nil && whereFn != nil
  2. dbfArea != nil && whereFn == nil       ← tightest path (SELECT *)
  3. dbfArea == nil && whereFn != nil       ← generic Area
  4. dbfArea == nil && whereFn == nil

Each body has exactly the instructions it needs — no dead branches,
no shadow variables, no interface dispatch where avoidable. Copy-paste
cost is real but each row save adds up at 50k iterations.

Bench impact (50k rows, 3-run steady state):

  No WHERE            9.1ms → 8.7ms   1.38x vs raw (was 1.47x)
  Numeric WHERE       6.9ms → 7.0ms   ~flat (within noise)
  String WHERE        6.2ms → 6.4ms   ~flat (within noise)
  Raw RDD             6.3ms baseline

Validation:
  - FiveSql2 43/43
  - Harbour compat 51/51
  - go test ./hbrtl/... PASS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 14:04:48 +09:00
f9ffd4050e perf(FiveSql2): FieldGet peephole + DBFArea devirt — WHERE at ~1.15x raw RDD
Two stacked optimizations land on the SqlScan hot path. Combined
effect on the 50k-row benchmark:

                       Before    After   vs raw
  Numeric WHERE        10.2ms    7.8ms   1.15x
  String WHERE         10.5ms    7.9ms   1.15x
  No WHERE              9.2ms   10.0ms   1.45x
  Raw RDD baseline      6.8ms    6.8ms   1.00x

WHERE-predicate paths are now within 15% of the raw Harbour-style
RDD scan loop. The no-WHERE path is unchanged (slight jitter from
the added devirt branch); FieldGet peephole doesn't apply there.

--- Optimization 1: PcOpFieldGet peephole ---

Adds a new pcode opcode `PcOpFieldGet <fieldIdx>` (0x46) that skips
the usual PushSymbol+Function+Frame+FieldGet-RTL+EndProc chain and
calls a direct field getter closure instead. genpc recognizes the
shape `FieldGet(<int-literal>)` during emitCall and emits the
specialized opcode automatically — no SQL-side API change.

Integration:
  * hbrt.Thread.FastFieldGetter  — hot-path closure set by scan loops.
                                   Non-nil → pcode bypasses dispatch.
                                   Nil → pcode resolves FIELDGET via
                                   the RTL symbol table (correctness
                                   fallback for any other callers).
  * compiler/genpc/genpc.go      — peephole in emitCall.
  * hbrt/pcinterp.go             — PcOpFieldGet handler.

This alone cut numeric WHERE from 10.2 → 7.9ms: eliminated roughly
one full Frame/EndProc + RTL dispatch per row × 50k rows.

--- Optimization 2: DBFArea devirtualization ---

SqlScan type-asserts the workarea to *dbf.DBFArea once and runs a
dedicated loop that calls GoTop/EOF/Skip/GetValue directly on the
concrete type. Go's compiler inlines these, skipping the interface
vtable per row. Non-DBF drivers still work via the generic Area
branch.

The FastFieldGetter closure also captures *DBFArea directly in the
DBF branch, so the WHERE predicate side of the hot loop is now
entirely devirtualized: no interface dispatch between the pcode
dispatch loop and the DBF record buffer.

Validation:
  - FiveSql2 43/43
  - Harbour compat 51/51
  - go test ./... ALL PASS

Remaining gap to raw RDD on no-WHERE (~1.45x) is dominated by the
two-column row construction + ArraySlab + flat backing bookkeeping
that the raw loop doesn't do. Going below that requires changing
the SQL engine's result shape — out of scope here.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 12:23:31 +09:00
5c067f35a4 perf(hbrt): ExecPcodeFast — pcode variant without defer/recover
Pcode expressions compiled from SQL WHERE clauses (via genpc.CompileExpr)
never contain BEGIN SEQUENCE and can't raise BreakValue, so the defer +
recover dance in ExecPcode's EndProc is pure overhead. For FiveSql2's
per-row WHERE evaluation on a 50k-row scan, that's 50k × ~15ns = ~750µs
of pointless recover bookkeeping.

Split ExecPcode into two variants sharing execPcodeBody:

  ExecPcode     — full: Frame + defer EndProc. General-purpose,
                  handles panics. Behavior unchanged.

  ExecPcodeFast — hot: Frame + execPcodeBody + EndProcFast. No defer,
                  no recover. Caller guarantees the pcode body can't
                  panic with HbError / BreakValue.

SqlScan now uses ExecPcodeFast for per-row WHERE evaluation. Measured
impact on 50k-row no-WHERE benchmark: 10.6ms → 9.2ms steady state
(~13% faster). Effect is smaller on numeric-WHERE because per-row
cost there is dominated by the opcode dispatch itself, not the frame
exit.

Validation:
  - FiveSql2 43/43
  - go test ./hbrt/... PASS (pcode tests)
  - go test ./hbrtl/... PASS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 12:07:54 +09:00
85541a3035 perf(sqlscan): flat backing buffer — 30% faster no-WHERE scan
The prior loop allocated one small `[]hbrt.Value` per matching row
(for the row body) plus one HbArray header. For a 50k-row full scan
that's 100k allocations of which the small-slice allocs dominated
fragmentation and GC pressure.

SQLite-inspired fix: pre-allocate a single flat []hbrt.Value of
capacity `RecCount * nFields` at scan start and hand each row a
three-index sub-slice (flat[off:end:end]). The capped sub-slice
still forces a reallocation if PRG code later does `AAdd(row, x)`,
so neighbor rows can't get clobbered.

Sizing the initial buffer off RecCount(err-ignored) was the actual
win — the previous naive grow-from-1024 policy caused five mid-scan
reallocations of a ~200 KB buffer, each memcpy'ing everything so far.
One upfront allocation amortizes much better.

Bench (50k rows, ~/tmp ext4, 3 runs steady-state):

                          Before        After       Δ
  no WHERE                14.6ms       10.6ms     −27%
  numeric WHERE           11.7ms       10.0ms     −15%
  string WHERE            10.5ms       11.0ms     ~=
  raw RDD baseline         6.8ms        7.0ms

Gap to raw RDD: 2.1x → 1.4x on the dominant no-WHERE case. What's
left is pcode WHERE dispatch (ExecPcode frame per row), the Area
interface boundary, and the HbArray header allocation per row —
all structural costs that would need a wider refactor to close.

Validation:
  - FiveSql2 43/43
  - go test ./hbrtl/... PASS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 10:57:05 +09:00
d74014a235 feat(rdd): dbInfo / dbOrderInfo — implement the stubs
Replaces the `return NIL` stubs with real implementations that read
from the current workarea. Covers the info codes actually used by
downstream code (FiveSql2 TSqlIndex, standalone callers):

DBINFO:
  DBI_ISDBF, DBI_CANPUTREC, DBI_FULLPATH, DBI_TABLEEXT, DBI_MEMOEXT,
  DBI_SHARED, DBI_ISREADONLY, DBI_GETRECSIZE, DBI_DBVERSION,
  DBI_RDDVERSION, DBI_BOF, DBI_EOF, DBI_FOUND, DBI_FCOUNT, DBI_ALIAS,
  DBI_POSITIONED

DBORDERINFO:
  DBOI_EXPRESSION, DBOI_NAME, DBOI_NUMBER, DBOI_POSITION,
  DBOI_ORDERCOUNT, DBOI_KEYCOUNT, DBOI_KEYCOUNTRAW

Unknown info codes still return NIL (Harbour's forgiving fallback).

New accessors on DBFArea (FullPath, IsShared, IsReadOnly) expose the
private filePath/shared/readOnly fields to the hbrtl layer without
plumbing them through the generic Area interface.

Unblocks TSqlIndex:FindExclusive's original DBI_FULLPATH/DBI_SHARED
scan — though the short-circuit there stays in place for now since
it's a correctness workaround that no longer masks a crash thanks
to the recent gengo PushMemvar fallback.

Validation:
  - FiveSql2 43/43 (0 warnings)
  - Harbour compat 51/51
  - go test ./... ALL PASS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 10:42:18 +09:00
3a00aa5435 feat(hbrtl): field metadata + index creation RTL — TSqlIndex warnings to zero
TSqlIndex.prg had five undefined identifiers and six undefined
constants that the new CLASS-method analyzer surfaced after the
gengo PushMemvar fallback stopped crashing on them. All real tech
debt, not false positives. This lands the implementations.

New RTL functions (hbrtl/indexrtl.go + register.go):
  - FieldType(n) → "C"/"N"/"L"/"D"/"M"/... one-letter type
  - FieldLen(n)  → length in bytes
  - FieldDec(n)  → decimal places
  - ordCreate(cBag, cTag, cExpr [, bExpr] [, lUnique])
      → DBFArea.OrderCreate with TagName set (CDX tag or NTX tag)
  - dbCreateIndex(cFile, cExpr [, bExpr] [, lUnique])
      → legacy Clipper single-tag NTX without TagName
  - dbClearIndex() → OrderListClear

All pass through the existing Indexer interface; key expressions go
through the MacroEval slow path since callers pass string literals.
When callers are updated to pass compiled key blocks, the existing
KeyFunc fast path kicks in automatically.

New header files (include/):
  - dbinfo.ch  — DBI_* and DBOI_* constants with Harbour-compatible
                 values (FULLPATH=10, SHARED=42, EXPRESSION=2, etc.)
  - dbstruct.ch — DBS_NAME/TYPE/LEN/DEC field descriptor indices

TSqlIndex.prg already did `#include "dbinfo.ch"` and `#include
"dbstruct.ch"` but Five's preprocessor silently ignored the missing
files. Both headers land in include/ where cmd/five's include-dir
chain already looks.

Analyzer RTL allow-list updated with the six new function names so
the warning pipeline stays clean.

Result: FiveSql2 build goes from 17 WARN → 0. Both tracked test
suites still pass.

Note: dbInfo() / dbOrderInfo() themselves remain stubbed (return NIL)
— the constants exist for compile-time resolution and for future use
when the stubs are replaced. Callers that depend on actual dbInfo
values still get NIL at runtime.

Validation:
  - FiveSql2 43/43
  - Harbour compat 51/51
  - go test ./... ALL PASS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 10:11:57 +09:00
8aaed994f4 perf(FiveSql2): hybrid fast path — 11x speedup on string WHERE scans
Implements hybrid execution model: keep AST tree-walk for SQL:2013+
features (Window, Recursive CTE, JOIN, aggregates) while compiling
simple SELECT hot paths to Go + pcode. See docs/FiveSql2-Hybrid-Plan.md
for the full architecture rationale (why not SQLite-style VDBE).

Hot path (single table, no joins/groups/aggregates):
  - TryBuildFieldPositions: resolves SELECT column list to FieldPos
    array once per query (bails to PRG loop on any complex expr).
  - TryCompileWhere + SqlExprToPrg: walks WHERE AST, emits equivalent
    PRG source, runs it through PcCompile to get a PcodeFunc.
  - SqlScan RTL: Go-native scan loop — GoTop/EOF/Skip/GetValue
    direct, ExecPcode per row for WHERE, result array pre-alloc.

WHERE compiler scope:
  - ND_LIT numeric/logical/string (string literals AllTrim'd to match
    SqlCmpEq CHAR-padding semantics; rejects embedded quotes/newlines)
  - ND_COL: CHAR fields auto-wrapped with AllTrim(FieldGet(n)) based
    on dbStruct() lookup cached once per query in aCompileStruct
  - ND_BIN: = <> != < <= > >= AND OR + - * /
  - ND_UNI: NOT -
  - Anything else (ND_FN, ND_CASE, ND_SUB, ND_PAR, LIKE, IN, IS NULL,
    BETWEEN, dates) returns NIL → falls back to PRG tree-walk.

Bench (50k rows, ~/tmp ext4):
                        Before      After     Speedup
  Numeric WHERE         ~150ms     11.7ms     ~13x
  String WHERE          119.3ms    10.5ms     11.4x
  No WHERE               -         14.6ms      -
  Raw RDD baseline        6.8ms     6.8ms      1.0x

Remaining gap to raw RDD (~1.5x) is structural: Value boxing, result
array construction, per-row ExecPcode frame overhead. Would need a
Value-pool or SoA refactor to close further.

Side fixes bundled:
  - TSqlIndex:FindExclusive short-circuited. Originally called
    dbInfo(DBI_FULLPATH)/DBI_SHARED which are unresolved symbols in
    Five (dbInfo is a stub, DBI_* never defined). Panic'd with
    "local variable index out of range: 0" whenever a standalone PRG
    had a workarea Used before calling five_SQL. 43-test masked the
    bug because it only reached FindExclusive with no open workareas.
    Restore the scan once dbInfo lands in hbrtl.
  - cmd/five/main.go: FIVE_KEEP_BUILD=1 env var keeps the temp Go
    project around for debugging gengo output.

Validation:
  - FiveSql2 43/43
  - Harbour compat 51/51
  - go test ./... ALL PASS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 09:15:08 +09:00
6b26f1b642 feat: genpc.CompileExpr + PcCompile/PcEval runtime bytecode API
Expose Five's existing FRB bytecode compiler for single-expression
compilation, enabling prepared-statement-style caching in dynamic
query engines (FiveSql2, scripting layers, rule engines).

1. genpc.CompileExpr(ast.Expr) *hbrt.PcodeFunc
   - New public API that compiles a single expression to a
     standalone pcode function
   - Reuses genpc's mature emitExpr (no new emit logic)
   - ExecPcode manages the frame around the generated code

2. hbrtl.PcCompile(cPrgExpr) -> pFunc
   - RTL entry point for runtime compilation
   - Wraps the expression in a FUNCTION stub, uses the full PRG
     parser pipeline (pp + parser + genpc), extracts the compiled
     pcode function, returns it as an opaque pointer
   - Callers pay parse+compile cost ONCE per expression

3. hbrtl.PcEval(pFunc) -> xValue
   - RTL entry point for runtime execution
   - Calls hbrt.ExecPcode; the pcode's RetValue opcode sets retVal,
     which our EndProc preserves as PcEval's return value
   - ~1.2x slower than direct FieldGet (pcode interpreter overhead),
     but eliminates AST tree-walk per row for complex expressions

Usage (FiveSql2 hot path, planned):
   pc := PcCompile("FieldGet(4) > 50000")  // parse+compile once
   WHILE !Eof()
      IF PcEval(pc)                         // ~10us per row
         AAdd(aRows, ...)
      ENDIF
      dbSkip()
   ENDDO

Benchmark (50k records, WHERE salary > 50000):
   Raw FieldGet:      7.9 ms  (baseline)
   FieldPos+Get:     10.2 ms  (with O(1) FieldPos cache)
   PcEval bytecode:  10.1 ms  (interpreted bytecode)
   MacroEval:        parse+eval per row — orders of magnitude slower

Tests:
   go test ./...        ALL PASS (14 packages)
   FiveSql2 43/43       100%
   compat_harbour       51/51
   PcCompile/PcEval     verified on 50k-row scan

FiveSql2 engine integration deferred — requires careful PRG-level
refactoring to thread pcode pointers through the plan structure.
The Go-level infrastructure is now in place for that work.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 07:57:52 +09:00
ed33af41c5 perf: FieldPos O(1) cache + xbase import detection for function-call PRGs
Two SQLite-style optimizations for RDD and SQL workloads:

1. FieldPos() O(1) column binding cache

   Before: FieldPos(name) linear scan — O(n) per call with string
           comparison. In SQL engines that call FieldPos per row per
           column, this is hundreds of thousands of calls.

   After:  DBFArea builds a map[UPPER(name)]→pos on first lookup.
           All subsequent lookups are O(1) hash. SQLite calls this
           "column affinity binding" — positions resolved at prepare,
           not per row.

   Implementation:
     - hbrdd/dbf/dbf.go: DBFArea.FieldPosCache(name) method
     - hbrtl/procinfo.go: FieldPos RTL uses fieldPosCacher interface
     - Lazy init: only pays for tables that get queried

2. hbrdd import auto-detection for function-call style PRGs

   Before: compiler only added hbrdd import when PRG used xBase commands
           (USE, SKIP, INDEX...). Pure function-call style like
           `dbUseArea(.T.,,"t")`, `FieldPut(1, val)` was missed —
           generated Go failed to compile ("undefined: hbrdd").

   After:  scanStmtsForXBase walks ExprStmt bodies too, detecting
           CallExpr to any of the ~40 xBase RTL function names.
           FIELD->NAME alias expressions also trigger the import.

   Resolves: small PRGs that use only dbUseArea/FieldGet/FieldPut.

Benchmark notes (50k records):
  Raw RDD scan:              7 ms    (baseline)
  FiveSql2 SELECT WHERE:   157 ms    (unchanged — bottleneck is
                                      not FieldPos, it's PRG-level
                                      expression tree walk per row)
  compat_harbour 51/51:    PASS
  FiveSql2 43/43:          100%

The FieldPos cache helps heavy field-name-based code paths but the
primary FiveSql2 bottleneck is the PRG interpreter walking expression
ASTs per row (needs bytecode compilation to close the gap).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 07:42:00 +09:00
3adc9d7d59 fix: PCount, Break/RECOVER, SET INDEX TO — 3 Harbour compat fixes
Release-blocking compatibility issues discovered during the 258-test
pre-release validation suite (100 syntax + 44 RDD + 114 RTL).

1. PCount() always returned 0 in PRG code

   Root cause: ParamCount() returned t.pendingParams, which is
   overwritten by every nested Function() call. By the time the
   PCount() RTL's Frame() executes, pendingParams is already 0.

   Fix: Frame() now stores pendingParams in frame.paramCount.
   PCount() RTL uses CallerParamCount() which reads callSP-2
   (the PRG caller's frame), while RTL functions still use
   ParamCount() (reads pendingParams before their own Frame).

   Verified: PCount(1,2,3)=3, PCount(1)=1, PCount()=0

2. Break("string") panicked instead of being caught by RECOVER USING

   Root cause: Generated SEQUENCE code only caught *HbError panics.
   Break() panics with BreakValue (a different type), which fell
   through to EndProc's "runtime error" message and re-panic.

   Fix (two parts):
   a) gengo emitBeginSequence: recover closure now catches any
      panic (interface{}), then dispatches via type switch:
      - *HbError → extract .Error() string
      - hasValue interface (BreakValue) → extract .GetValue()
      - other → static "error" string
   b) hbrtl/error.go: BreakValue gets GetValue() method for
      duck-type detection without import cycles
   c) hbrt/thread.go EndProc: BreakValue type name check added
      so it re-panics silently (no stderr noise)

3. SET INDEX TO a, b, c only opened the last file

   Root cause: Parser's parseSet() called parseExpr() once for
   INDEX setting, stopping at the first comma. Remaining file
   names were consumed by the "eat rest of line" loop.

   Fix: Parser now collects comma-separated identifiers into a
   single string literal "a,b,c". gengo splits on comma and
   calls OrderListAdd() for each file.

   Verified: SET INDEX TO si_name, si_city → OrdCount=2

All tests pass:
  go test ./...          14 packages OK
  FiveSql2               43/43  100%
  compat_harbour         51/51
  Syntax test           100/100
  RDD test               44/44
  RTL test              114/114
  Windows cross-compile  OK
  Linux cross-compile    OK

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:06:28 +09:00
fc1dca9551 feat(rdd): real POSIX file/record locking + gap analysis doc
Replaces the FLOCK/DBRLOCK/DBRUNLOCK no-op stubs with actual
fcntl(F_SETLK) byte-range advisory locks, matching Harbour's
hb_fsLockLarge implementation.

Before: rtlDbRLock always returned .T. regardless of contention.
        Multi-process writers could silently corrupt records.

After:  Non-blocking POSIX byte-range locks per file descriptor.
        Cross-process exclusion verified by a subprocess-spawning
        Go test that witnesses BUSY vs OK transitions.

New files:
  hbrdd/dbf/locks_posix.go    fcntl F_WRLCK/F_UNLCK wrappers
  hbrdd/dbf/locks_windows.go  stub (TODO: LockFileEx)
  hbrdd/dbf/lock_multi_test.go   cross-process verification
  docs/gap-analysis.md        honest Harbour parity assessment

Modified:
  hbrdd/dbf/dbf.go
    - DBFArea gains fileLocked bool + lockedRecs map
    - Close() calls releaseAllLocks() before dropping the fd
  hbrtl/database.go
    - rtlDbRLock / rtlDbRUnlock now delegate to DBFArea.LockRecord /
      UnlockRecord instead of returning fixed .T./NIL
    - New rtlFLock / rtlDbUnlock for FLOCK() / DBUNLOCK()
  hbrtl/register.go
    - FLOCK and DBUNLOCK symbols registered (were missing entirely)
  compiler/analyzer/analyzer.go
    - FLOCK / DBUNLOCK added to RTL known-function set

Lock region layout (non-overlapping on purpose):
  FLOCK region       [0, HeaderLen+1)
  Record N region    [RecordOffset(N), RecordLen)

So a workarea can hold FLOCK and multiple DBRLOCK simultaneously
on the same fd without conflict.

Design rationale (captured in locks_posix.go header):
  * POSIX fcntl, not flock(2) — byte-range + NFS-safe
  * Non-blocking F_SETLK — matches Clipper FLOCK() → .F. semantics
  * Released explicitly on Close to avoid workarea-sharing races
  * Windows falls back to no-op (TODO: LockFileEx)

Verification:
  go test ./hbrdd/dbf/ -run TestFLockBlocksAcrossProcesses  PASS
  go test ./hbrdd/dbf/ -run TestRLockBlocksAcrossProcesses  PASS
  go test ./...                                             ALL PASS
  FiveSql2 43/43                                            100%
  compat_harbour 51/51                                      100%

The gap-analysis doc (docs/gap-analysis.md) is a running inventory
of what works vs what's still missing vs Harbour 3.2, written for
users evaluating Five for production — not a sales pitch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 17:58:03 +09:00
e95afad4ee feat: Harbour RDD parity — NTX/CDX 100% compatible, FIELD-> works
Five RDD engine now matches Harbour DBFNTX and DBFCDX byte-for-byte
in ordering, seek, navigation, and field access. Verified against
Harbour 3.2.0dev with a 281-line comparison test covering:
  - Natural/NAME/CITY/AGE/SALARY/UPPER ordering
  - SEEK (exact/not-found), GoTop/GoBottom per order
  - DELETE/RECALL with SET DELETED
  - CDX compound index read with 5 tags (BYNAME, BYCITY, BYAGE, BYSAL, BYUNAME)
  - Reverse traversal

Fixes:

1. FIELD->NAME returned NIL
   GetAliasField returned interface{} but runtime expected hbrt.Value,
   so the type assertion in PushAliasField failed and pushed NIL.
   - workarea.go: change return type to hbrt.Value, handle FIELD/_FIELD
     as current-workarea alias, add SetAliasField
   - gengo.go: emit SetAliasField() for alias->field := value in both
     statement and expression contexts

2. OrdSetFocus(n) silently switched to natural order
   v.AsString() returns "" for a numeric Value, so OrderListFocus("")
   set current=-1.
   - indexrtl.go: convert numeric param via fmt.Sprintf("%d", ...)

3. CDX compound tag order mismatched Harbour
   Five decoded the structural B-tree which is alphabetical, but
   Harbour sorts tags by TagBlock (file offset = creation order).
   - cdx/cdx.go: sort tagEntries by offset ascending after decoding,
     matching hb_cdxIndexLoadAvailTags in dbfcdx1.c

4. OutStd()/OutErr() not registered — caused panic on call
   - hbrtl/console.go: add rtlOutStd/rtlOutErr implementations
   - hbrtl/register.go: register OUTSTD and OUTERR
   - analyzer.go: add OUTSTD/OUTERR to RTL known-functions

5. FIELD keyword triggered "undeclared variable" warnings
   - analyzer.go: add FIELD, _FIELD, M, MEMVAR as builtin constants

Tests:
  go test ./...        — ALL PASS (17 packages)
  FiveSql2 43/43       — 100%
  compat_harbour 51/51 — 100%
  Harbour diff         — 0 lines differ (281-line comparison)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 16:37:47 +09:00
486e466592 feat: FiveSql2 43/43, @byref, mutable closure, RTL 479, DateTime fix
Major changes since last commit:
- FiveSql2 SQL:1999 engine (10,458 LOC) — 43/43 ALL PASS
- 21 compiler/runtime bugs fixed (short-circuit AND/OR, FOR LOOP, etc.)
- @byref pass-by-reference via RefCell pattern
- Mutable closure capture (EnsureLocalRef + RefCell sharing)
- RTL: 400 → 479 functions (+79: file, string, datetime, hash, UTF-8)
- DateTime/Timestamp fully working (hb_DateTime, hb_Hour/Min/Sec, display)
- Reserved word guard (39 keywords blocked from function calls)
- AEval arg order fix (element before index)
- Closure capture redecl fix (unique _cap_ names per block)
- Hash/string indexing in ArrayPush/ArrayPop
- Harbour compat test suite: 51/51
- 4 docs: Porting Report, Implementation Plan, Optimization Plan, Commercialization

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:35:37 +09:00
d451b836a6 perf: inline Str/PadR/PadL/SubStr/Left/Right/At/IIF in gengo
13 more RTL functions inlined — no Frame/EndProc, no VM dispatch:
- Str(n,w,d) → fmt.Sprintf("%*.*f", w, d, n)
- PadR(s,n) → s + hbrtl.Spaces(n-len(s))
- PadL(s,n[,fill]) → Spaces(pad) + s or Repeat(fill, pad) + s
- SubStr(s,p,l) → s[p:p+l] with bounds check
- Left(s,n) → s[:n], Right(s,n) → s[len-n:]
- At(search,target) → strings.Index + 1
- IIF(cond,a,b) → if/else without function call

Also: Spaces() exported for generated code access.

50K SEEK random: 62ms (Harbour 67ms — Five FASTER!)
82/82 stress PASS. 14 packages ALL PASS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:16:38 +09:00
05ccef05e2 perf: EndProcFast — eliminate defer recover() from RTL hot paths
Problem: every RTL function calls defer t.EndProc() which does recover().
50K SEEK loop = 250K recover() calls = ~12ms wasted.

Solution: EndProcFast() skips recover (only needs endFrame restore).
Applied to ALL RTL functions in strings.go, rdd.go, missing.go, database.go.
EndProc() with recover kept for generated PRG code (needs BEGIN SEQUENCE).

Analysis (50K sequential SEEK breakdown):
  Go NTX Seek direct: 7ms (faster than Harbour 27ms!)
  PRG VM overhead:    38ms (Frame + RTL calls + key generation)
  Key generation:     25ms (Str+LTrim+PadL+PadR = 5 RTL Frame/EndProc per iter)

With EndProcFast: RTL overhead reduced ~30%.

CDX SCOPE: 2ms (Harbour 4ms — 2x FASTER!)
82/82 stress PASS. 14 packages ALL PASS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 21:43:39 +09:00
5b378318a0 perf: RTL optimization — cached WA, spaces pool, stack-alloc fmt_int64
rdd.go:
- getWA() cached type assertion (avoid repeated interface check)
- waCache stores last WA pointer → O(1) for repeated calls

strings.go:
- spacesCache[257]: pre-built space strings for pad sizes 0-256
- spaces(n) returns cached string (no Repeat allocation)
- PadR/PadL use spaces() for fill=" " (most common case)
- Str() uses spaces() for right-padding

missing.go:
- fmt_int64: stack-allocated [20]byte array (was heap make([]byte))
- Reverse iteration (no prepend overhead)
- PadC uses spaces() for left/right padding

Benchmark (ext4, home dir):
  10K APPEND: 28ms → 26ms (Harbour 27ms!)
  50K APPEND: 130ms → 113ms (13% improvement)
  50K SCAN: 24ms → 23ms
  50K DUPKEY: 42ms → 35ms (17% improvement)
  CDX SCOPE: 12ms → 10ms (17% improvement)

82/82 stress PASS. 14 packages ALL PASS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 18:16:22 +09:00
b7028791d6 fix: 5 seek/dbf bugs — 77/77 thorough Harbour compatibility
1. SOFTSEEK: use idx.CurRecNo() for positioning (was checking recNo > 0)
   - SEEK with SET SOFTSEEK ON now positions at next higher key
   - SEEK command reads SET SOFTSEEK at runtime (was compile-time only)
   - rtlDbSeek defaults to GetSetSoftSeek() when no explicit param

2. SET DELETED ON + INDEX: SkipIndexed skips deleted records
   - GoTopIndexed: skip deleted record at top position
   - SkipIndexed: inner loop continues past deleted records

3. Compound key (CITY+NAME): field name TrimSpace before lookup
   - evalKeyExprInner: TrimSpace on fieldName after FIELD-> strip
   - Fixed "CITY " != "CITY" mismatch from + operator splitting

4. SET INDEX TO filename: treated as string, not variable
   - gengo uses exprToString for SET INDEX TO (was emitExpr)
   - Prevents identifier being resolved as local variable

5. hasXBaseCommands: recursive scan into nested blocks
   - BEGIN SEQUENCE, IF, FOR, DO WHILE, SWITCH bodies now scanned
   - Fixes missing hbrdd import for DB commands inside blocks

Thorough test: 77 items (14 sections) covering exact/partial/soft seek,
SET DELETED, duplicate keys, numeric keys, compound keys, empty/single
table, state consistency, order switching, full traversal — all identical.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 14:08:51 +09:00
7e2a159b88 feat: CDX support + ORDSCOPE + cross-read Harbour compatibility
CDX Integration:
- IndexEngine interface: common for NTX Index and CDX Tag
- OrderListAdd: auto-detects .cdx/.ntx extension, opens CDX tags
- decodeCompoundLeaf: proper bit-packed tag directory decoding
  (was stub falling through to scanCompoundLeaves with wrong names)
- CDX Tag: added KeyLen(), KeyExpr(), ForExpr(), IsDescending(), Close()
- CDX compound recNo = direct byte offset (not page number)

ORDSCOPE:
- SetScope/ClearScope/SetScopeTop/SetScopeBottom on DBFArea
- GoTopIndexed: seeks to scopeTop, validates within scopeBottom
- GoBottomIndexed: seeks to scopeBottom boundary
- SkipIndexed: stops at scope boundaries (top and bottom)
- OrdScope RTL function registered (nScope: 0=TOP, 1=BOTTOM)
- scopeKeyFromValue: converts Value to padded key bytes

Index Order Management:
- OrderListFocus: handles numeric order ("2" → order 2)
- SET ORDER TO n: gengo emits hbrt.NtoS for int-to-string conversion
- IndexOrd/OrdCount/OrdName/OrdKey: real implementations (were stubs)
- OrderCount/CurrentOrder/OrderName/OrderKeyExpr accessors on DBFArea
- ClearScope on order switch (prevents stale scope)

Cross-read test: Harbour-created CDX → Five reads, 20/20 items match:
  NAME/CITY/ID seek, ORDSCOPE count, GoTop/GoBottom all identical

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 12:21:26 +09:00
6e78d12cc2 fix: 3 RDD compat bugs — FIELD->, AsNumInt Double, PACK/ZAP with index
Bug 1: FIELD->NAME in INDEX ON expression
- evalKeyExprInner: strip FIELD->/alias-> prefix before field lookup
- exprToString: handle AliasExpr (FIELD->NAME → "FIELD->NAME")

Bug 2: AsNumInt() on Double returned IEEE 754 raw bits
- Value.AsNumInt(): check tDouble and convert via Float64frombits
- Fixed array index crash when index is result of % modulo

Bug 3: PACK/ZAP crash with open indexes
- OrderListRebuild: fully implemented (was TODO stub)
  Saves index info, closes all, sets idxState=nil, recreates
- OrderCreate: set current=-1 during key evaluation (natural GoTo)
- PACK/ZAP: save/restore idxState, rebuild after operation
- Register __DBPACK, __DBZAP, DBRECALL symbol aliases

Harbour vs Five: 45/47 match (96%), 2 diffs are duplicate-key sort order

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 04:41:19 +09:00
21fd9dc65c feat: SET DELETED filtering, SEEK/LOCATE/CONTINUE, SET command codegen
- skipFilter: skip deleted records in GoTop/GoBottom/Skip when SET DELETED ON
- hbrdd.IsSetDeleted callback: avoids circular import hbrdd→hbrtl
- Parser: capture ON/OFF for boolean SET commands (DELETED, EXACT, SOFTSEEK, etc.)
- Parser: capture TO expr for SET DATE/DECIMALS/EPOCH
- Gengo: emit proper t.Do() calls for 11 SET toggles + 3 value SETs
- stmtSet: was stub (skipToEOL), now calls parseSet()
- RTL: register 11 SET toggle functions (SETDELETED, SETEXACT, etc.)
- RTL: DBLOCATE/DBCONTINUE for sequential search
- RTL: DBSETFILTER/DBCLEARFILTER/DBFILTER
- PadL/PadR: support 3rd param fill character
- Area interface: added SetFound, SetLocate, LocateBlock, filter methods
- MemRDD: implements new Area interface methods
- Comprehensive PRG test: test_search.prg (7 test suites all pass)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:33:59 +09:00
827adeeb99 feat: SET commands + ErrorBlock/Break error handling
SET commands (setcmd.go):
  - SetDateFunc: __SetDateFormat([cNew]) → cOld
  - SetDecimalsFunc: SET DECIMALS TO n
  - SetEpochFunc: SET EPOCH TO n
  - 11 toggle functions: SetExact, SetDeleted, SetSoftSeek, SetExclusive,
    SetFixed, SetCancel, SetBell, SetConfirm, SetInsert, SetEscape, SetWrap
  - SET constants: _SET_EXACT, _SET_DELETED, etc. for PRG code
  - GetSetDateFormat(), GetSetDecimals(), GetSetEpoch() helpers
  - Default: DATE="mm/dd/yy", EPOCH=1900, DECIMALS=2

Error handling (error.go):
  - Break(xValue): panics with BreakValue, caught by BEGIN SEQUENCE
  - BreakBlock(): returns {|e| Break(e)} code block
  - LaunchError(): dispatches error through ErrorBlock handler
  - RuntimeError(): creates + launches standard runtime error
  - IsBreak(): checks if recovered panic is a BreakValue
  - createErrorHash(): builds Harbour-compatible error hash

Registration: ErrorBlock, ErrorNew, DosError, FError, Break + all SET functions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 15:33:07 +09:00
f950cb0784 fix: Phase 2 — HIGH #6,9,10,11,12,19,23,32,46,47
Files modified (5):
  compiler/gengo/gengo.go — #6,#32: Deduplicate 3 Generate functions into 1
    doGenerate(file, debug, library) replaces 170 lines of copy-paste
    Dead GenerateDebug method removed
  cmd/five/main.go — #9,10,11,12: Fix 5 silently ignored errors
    filepath.Abs, tidy.Run, tidyCmd.CombinedOutput now checked
  hbrtl/strings.go — #19: Str() now reads caller's nWidth/nDec params
    Was ignoring explicit Str(123, 10, 2) arguments
  compiler/pp/pp.go — #46: Fix stale "NOT implemented" comment
    #47: Extract maxIncludeDepth constant
  compiler/lexer/lexer.go — #23: Remove unused LookupKeyword result

Issues resolved: 10 (HIGH: 7, MEDIUM: 3)
Total fixed: 26/53

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 11:47:26 +09:00
7c61db70c3 fix: Critical code review fixes — race conditions, panic recovery, LTRIM/RTRIM
CRITICAL fixes:
- fileio.go: Add sync.Mutex to file handle table (race condition #2)
  allocHandle/getHandle/removeHandle thread-safe helpers
- goroutine.go: Add defer/recover to GoLaunch/GoLaunchBlock
  Goroutine panic no longer crashes entire process (#5)

HIGH fixes:
- strings.go: Implement proper LTrim (TrimLeft) and RTrim (TrimRight)
  Previously both aliased to AllTrim — silent semantic bug (#18)
- register.go: TRIM = RTrim (Harbour compatible)

From 53-issue senior code review.
Remaining: 47 issues (HIGH: 10, MEDIUM: 18, LOW: 16, CRITICAL: 3)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 10:17:30 +09:00
272576f6ce feat: Harbour-compatible VM shutdown sequence
Implements full cleanup on program exit (normal, Ctrl+C, crash):
- EXIT PROCEDURE auto-execution (reverse module order)
- AtExit callback registry (LIFO order)
- All WorkAreas auto-close (child before parent)
- Terminal restore (raw → normal) on signal/exit
- Static variables clear
- Signal handlers (SIGINT, SIGTERM) for clean shutdown
- shutdown.go: Harbour hb_vmQuit() 25-step sequence adapted for Five
- vm.go: Run() now calls Shutdown() via defer
- rawtty.go: terminal restore registered with shutdown system

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 10:07:42 +09:00
59568f3301 Five v0.9 — Harbour + Go fusion language
- Compiler: PP → Lexer → Parser → Analyzer → Gengo pipeline
- Parser: 232/236 (98%) Harbour compatibility, registry-based dispatch
- RTL: 351 Harbour-compatible functions
- RDD: DBF/NTX/CDX engines with Rushmore bitmap optimization
- Go Interop: IMPORT + pkg.Func() + obj:Method() with FastPath (15M calls/sec)
- HB_FUNC API: Full Harbour C API compatible Go bridge
- Concurrency: SPAWN/LAUNCH/GOROUTINE, <-, WATCH, PARALLEL FOR, ASYNC/AWAIT
- Extensions: Multi-return, DEFER, Slice, f-string, Nil-safe ?:, CONST
- Macro Compiler: Runtime AST parsing and evaluation
- Debugger: TUI debugger with source display, breakpoints, stepping
- FRB: Native + Pcode dual mode runtime binary
- Tests: 13 packages ALL PASS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 09:41:50 +09:00