Commit Graph

164 Commits

Author SHA1 Message Date
e961660f61 feat(pp): COPY TO via std.ch + four PP completeness fixes
`COPY TO <file> [FIELDS <list>] [FOR ...] [WHILE ...] [NEXT ...]
[RECORD ...] [REST] [ALL]` reaches the parser as a plain function
call to a new RTL primitive __dbCopy (rtlDbCopy in hbrtl/database.go).

Implementation: project the field list (case-insensitive name match
against the source's structure, full copy when omitted), dbCreate the
target file with that struct, open it under a temp alias, walk the
source under dbEval-style FOR/WHILE/NEXT/RECORD/REST bounds, and
GetValue/Append/PutValue per record into the target. SDF / DELIMITED
variants stay parser no-ops until those backends arrive.

Wiring up COPY surfaced four longstanding gaps in the PP that had to
be fixed for the rule to even reach the runtime:

  * `<(name)>` *pattern* marker was treated as a regular `<name>`
    with the parens baked into the captured key, so the matching
    result substitution `<(name)>` couldn't find it. parseOneMarker
    now strips the parens at parse time so capture key and result
    marker share the bare name. The smart-stringify result behavior
    is unchanged.
  * matchSegment (the optional-clause matcher) bailed on every
    non-Regular marker. `[FIELDS <fields,...>]` therefore failed to
    match at all and the fields list arrived empty in the result
    template. matchSegment now handles MarkerList with paren-balanced
    capture and segment+outer literal stop boundaries.
  * captureExpression only used the first literal in the pattern
    tail as a stop boundary. With std.ch's chain of optional
    clauses (`[TO <(f)>] [FIELDS ...] [FOR ...] [WHILE ...] ...`)
    the file-name marker was happy to gobble a trailing FOR clause
    when FIELDS was absent. It now stops at *any* of the remaining
    pattern literals.
  * `<(name)>` smart-stringify on a list-typed capture wrapped the
    whole comma-joined string in one set of quotes — `{ "a , b" }` —
    instead of `{ "a", "b" }`. New helper quoteListElements splits on
    top-level commas (paren / bracket / brace / string-balanced) and
    quotes each element. applyResult now consults the rule's marker
    table to know which captures came from `<name,...>`.

Parser cleanup: COPY removed from the IDENT-statement no-op switch in
both parseIdentStmt and parseExprStmt.

Gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 15:00:18 +09:00
c2e7f7ea27 feat(pp): Phase B — COUNT / SUM / AVERAGE via std.ch
Three xBase analytical commands that were silent no-ops in the
parser now execute as Harbour-style PP rewrites:

  COUNT [TO <v>]   [FOR <for>] [WHILE <while>] ... -> dbEval()
  SUM <x> TO <v>   [FOR <for>] [WHILE <while>] ... -> dbEval()
  AVERAGE <x> TO <v> [FOR ...]                     -> __dbAverage()

COUNT and SUM expand to a `<v> := 0 ; dbEval( {|| ... } )` pair
matching harbour-core/include/std.ch verbatim. AVERAGE delegates to
a new RTL function rtlDbAverage (sum + count + divide; returns 0 on
empty match) — the chained-private-variable trick Harbour uses to
keep AVERAGE inline doesn't translate cleanly through Five's PP.

Wiring up these rules surfaced four PP issues that had to be fixed
for the rewrite to even reach the parser:

  * Result template did not implement <{name}> blockify. So a rule
    body like `{|| x := x + <x> }, <{for}>` left the literal text
    `<{for}>` in the output. Added blockify substitution: captured
    -> `{|| <captured> }`, missing -> NIL.
  * findMarkerEnd did not recognise `{`/`}` so unreferenced
    blockify markers were not cleaned up either. Added `{`/`}` to
    its prefix/suffix sets.
  * Optional-clause matching had no view of the outer pattern, so a
    regular marker at the end of `[TO <v>]` would swallow the rest
    of the line — `COUNT TO n FOR x>5` captured `<v>` as
    "n FOR x>5". matchSegment now takes outerTail and stops at its
    first literal.
  * `#command` directives could not span multiple physical lines.
    A trailing `;` is harbour-core's line-continuation marker for
    std.ch and now joins the next line into the directive before
    parsing.

Parser cleanup: COUNT, SUM, AVERAGE removed from the IDENT-statement
no-op switch in parseIdentStmt + parseExprStmt. The remaining xBase
verbs (COPY, SORT, TOTAL, JOIN, LIST, DISPLAY, LABEL, REPORT, ...)
stay in the parser until their RTL backends arrive.

Gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 14:11:20 +09:00
c4f85f494c feat(pp): Phase A — preprocessor std.ch as single source of truth
Introduce compiler/pp/std.ch with 19 #command rules so that ERASE,
RENAME, DELETE FILE, CLOSE [<a>|ALL|DATABASES], COMMIT, UNLOCK,
LOCATE/CONTINUE, REINDEX, PACK, ZAP, KEYBOARD, RUN, MENU TO, and
CLEAR GETS reach the parser pre-rewritten as plain function calls.
Embedded into the compiler binary via //go:embed so it auto-loads
without an explicit #include in user code, exactly the way Harbour
auto-loads its std.ch.

This is a pure dispatch move, not a behavior change for the
already-working forms: the same Five RTL functions get called.
But it does fix three regressions that the parser was masking:

  * ERASE / RENAME / DELETE FILE used to be silent no-ops — the
    parser swallowed the entire line and returned NIL. They now
    actually delete/rename files (FErase / FRename).
  * CLOSE <alias> used to silently ignore the alias and close the
    current area. It now switches to the named area first
    (<a>->( DbCloseArea() )).
  * Two latent #command matcher bugs that surfaced while wiring
    std.ch up:
      - bare `CLOSE` would match rule `CLOSE ALL` because the tail
        of the pattern wasn't checked for unconsumed literals.
      - bare `CLOSE` would match rule `CLOSE <a>` because all
        unconsumed pattern markers were unconditionally treated as
        optional. They are only optional when nested inside `[...]`.

Parser cleanup: parseIdentStmt + parseExprStmt no longer hardcode
ERASE / RENAME / RUN / KEYBOARD / REINDEX / LOCATE / CONTINUE /
COMMIT / CLOSE — the rewriter handles them. Other xBase verbs
(COPY / SORT / COUNT / SUM / AVERAGE / TOTAL / JOIN / LIST /
DISPLAY / LABEL / REPORT / DIR ...) still no-op in the parser
because their RTL backends aren't implemented yet — once the
backends land they move into std.ch the same way.

Gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 12:03:30 +09:00
f4ed42556b checkpoint: season-wide bug fix campaign + infra
Cumulative season's silent-bug hunting (~62 fixes) across the FiveSql2
SQL engine, the Five compiler/runtime, and the hbrdd RDD layer. Saved
as a single checkpoint before refactoring the parser to delegate xBase
command translation to the preprocessor.

Highlights:

FiveSql2 engine (_FiveSql2/src/)
- prefix-glob index attach -> explicit convention (<table>_pk.ntx,
  <table>_uq.ntx, <table>.cdx) — fixes silent multi-row INSERT row-drop
- DROP/CREATE TABLE FErase chain extended (.cdx, .fsc, .fsv, .dbt, .fpt)
- COUNT(DISTINCT col) parsed + aggregated via hSeen hash
- UNION column-count mismatch returns SQL_ERR_GRAMMAR (was silent)
- DISTINCT + ORDER BY hidden-col leak fixed (trim before DISTINCT)
- Derived table FROM (SELECT...) + JOIN right-side derived
- Self-FK CASCADE depth 2+ via SqlGetSingleColPK pre-collect
- LAG/LEAD default arg uses SqlEvalRowExpr (handles -N const exprs)
- DATE literal round-trip validation (Feb 29 non-leap rejected)
- CREATE OR REPLACE VIEW; CREATE VIEW errors on already-exists
- AlterTable type dispatcher comma-wrapped (1-char type "A" no longer
  matches CHARACTER)

Compiler / runtime
- gengo: HB_ -> FV_ prefix on emitted Go function names (Five identity)
- gengo split: emit_block.go, emit_stmt.go, folding.go extracted
- parser/stmtreg.go nudges
- hbrt: debug TUI/CLI restructure (debugcmd, debugkey, termios_*),
  windows debug stubs collapsed
- thread/vm/value/class/pcinterp tightening from panic traces

RDD layer (hbrdd/)
- dbf: null bitmap support (null.go + null_test.go), mmap split
  (mmap_posix.go / mmap_windows.go), byte-level numeric parse
- ntx/cdx: windows mmap parity
- workarea + mem RDD: cross-area state-bleed fixes

RTL (hbrtl/)
- errorlog rewrite with platform-specific FD (errorlog_fd_unix /
  errorlog_fd_other)
- sqlscan, sqlhelpers, indexrtl, datetime extensions

Gates green at checkpoint:
- go test ./...        : PASS
- FiveSql2 SQL:1999    : 43/43
- Harbour compat       : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 09:26:25 +09:00
8a3f296e9a perf(dbf): byte-level numeric parse + RecCount cache
Two hot-path fixes for DBF reads surfaced by the bulk-bench profile.

1. parseNumericField decimal path — was 23% of flat CPU on BULK_CTE.
   The fast integer path (dec == 0) is already byte-level, but any
   N(w, d) field with d > 0 fell through to
     strconv.ParseFloat(string(raw[start:end]), 64)
   allocating per-row. A 10k-row CTE insert ran this 200k+ times.
   Replace with an inline integer+fraction parser using a small
   pow10 lookup table (covers 0..19 decimal places). Unexpected
   characters still fall back to strconv for correctness.
   Result:
     BULK_CTE_10k_20iter  187 → 83 ms  (2.25x)
     BULK_SUBQ_10k_20iter 102 → 22 ms  (4.6x)

2. DBFArea.RecCount in shared mode was doing Seek(0, 2) on every
   call. SqlScan calls it once per query for its result-array
   pre-allocation (~0.2 ms × 1000 queries = 0.2s of CPU on the
   bench). Cache the count per-area, keyed by a process-wide
   generation counter. Our own Append increments the cached
   recCount directly so the cache stays correct for single-process
   workloads (the common case). Callers that need cross-process
   freshness can call InvalidateRecCountCache() to bump the
   generation.
   SQL bench: modest 1-3 ms drops on B1/B2/B3/B6/B7.

Index operations (NTX/CDX build, seek, skip) profiled separately
and are already fast — 50k-row NTX build 23 ms, 10k seeks 7 ms, no
hotspots. Left untouched.

FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 23:38:54 +09:00
325fe51656 fix(fivesql2): DML transaction + constraint ordering
Three correctness bugs in the DML executor that the 4.7 audit
surfaced:

1. RunInsert logged the transaction BEFORE dbAppend() and validation.
   LogRecord captured the PREVIOUS row's RecNo, and a CHECK/FK
   violation that rolled back via dbDelete() still left a spurious
   INSERT entry in the log pointing at the wrong record. Move
   LogRecord to after all field puts and all validators pass, so
   the log only records committed INSERTs at the correct RecNo.

2. RunUpdate (fallback path) skipped CHECK and FK validation entirely
   — only RunInsert validated. An UPDATE could violate the same
   constraints INSERT protects against. Add the same validator calls
   after FieldPut, with a captured aPrevVals snapshot so the in-
   memory record can roll back cleanly on failure. Gated by
   SqlLoadConstraints to skip the validator (and its recursive
   five_SQL) for tables without SQL-level metadata — tables created
   via plain dbCreate see no change.

3. RunDelete had no transaction logging at all — a BEGIN / DELETE /
   ROLLBACK cycle silently lost the row. Add LogRecord("DELETE")
   before dbDelete so undo can re-surface it. (A full FK-cascade
   check on delete would require parent→child scanning; deferred.)

The fast-path SqlBulkUpdate branch still bypasses per-record
validation by design (documented) — it's gated by
`! ::oTxn:IsActive()`, so txn-active queries always take the
validated fallback.

FiveSql2 43/43 (including SAVEPOINT + ROLLBACK TO and all four CHECK/
FK tests), Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 23:24:14 +09:00
e368402682 chore: audit cleanup — remove orphan parser + dead TSqlIndex methods
Opus 4.7 audit of the codebase surfaced several items that Opus 4.6
sessions left behind. This pass removes what's definitively dead and
fixes one trivial defensive bug; the real logic bugs (transaction
ordering, missing RunUpdate/RunDelete validation) come in a separate
commit.

Deletions:

- `_FiveSql2/src/TSqlParser_orig.prg` (1173 lines) — superseded by
  `TSqlParser2.prg` (Pratt). Production never instantiates the old
  parser; the only callers were the comparison/benchmark test files
  also being removed.
- `_FiveSql2/test/test_parser_cmp.prg` — compared orig vs Pratt AST,
  useless now that orig is gone.
- `_FiveSql2/test/bench_parser.prg` — benched both, same reason.
- `_FiveSql2/Makefile` `test_cmp:` and `bench:` targets referenced
  the removed files.
- `TSqlIndex.prg` methods `ApplyScope`, `ClearScope`, `ApplySeek`,
  `IndexInfo`, `CreateTempIndex`, `DropTempIndex` — each declared in
  the class header and implemented (~165 lines total) but zero
  callers anywhere in `_FiveSql2/` or `hbrtl/`. Class declarations
  removed alongside the bodies.

Small fixes:

- `TSqlDDL.prg:179-180` stale comment claiming Five doesn't support
  `@byref` — false since commit e95afad (2026-04-13) wired @byref
  via RefCell. The same method uses @nPos correctly elsewhere.
- `hbrt/class.go:tryBinaryOp` defensive nil-check on AsArray().
  IsObject() checks the type tag; a corrupted Value with tag=Object
  but ptr=nil would crash on `.Class`. Correct construction paths
  never hit this, but the guard is cheap.

Compat tests: FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 22:46:17 +09:00
e5843bdde4 docs: refresh Phase-C TODO — audit results + remaining edge cases
Update the 1.0-readiness document with:
- 2026-04-18 compatibility audit results: 50/47 build rate (94%)
  vs previous 40/34. Lists every fix commit this session.
- Four remaining low-priority edge cases from the audit (xcommand
  nested-comma args, u64 overflow, USE with ../ paths, legacy
  inline-C syntax) — none block a realistic 1.0.
- Revised Phase-C scope: user clarified contrib PRGs can be
  imported as-is so long as underlying RTL exists, so the work is
  "audit each contrib's low-level deps, fill gaps, copy .prg"
  rather than porting every function.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 18:32:45 +09:00
4a1bbdb1fe feat(pp): optional-repeat [...] blocks — DEFAULT / UPDATE from common.ch
Harbour's `#xcommand DEFAULT <v1> TO <x1> [, <vn> TO <xn>] => ...`
uses an optional, repeatable trailing `[...]` block to accept any
number of `var TO default` pairs on a single line. Five's PP
skipped bracket bodies during pattern matching and treated them
as no-ops in result templates, so

  DEFAULT a TO 10, b TO 20, c TO 30

expanded (at best) the first pair and dropped the rest — and
common.ch itself was documented as "not yet supported".

Three concrete changes:

1. matchPattern now matches the `[...]` body repeatedly against
   remaining line tokens via a new matchSegment helper. Each
   successful iteration appends captures for the interior markers
   under the same name, joined with a \x01 sentinel.

2. matchSegment, when capturing the last marker in a body with no
   following literal, uses the body's opening literal (e.g. the `,`
   in `[, <vn> TO <xn>]`) as the iteration boundary. Otherwise
   captureExpression would greedily eat the rest of the line and
   collapse every remaining pair into one capture.

3. applyResult's new expandOptionalRepeat walks the result template
   for top-level `[...]` blocks. When a referenced marker is multi-
   captured it emits the body N times (substituting per-iter value);
   when it's single-captured it emits the body once; otherwise drops
   the block. A separate referencedMarkers scanner and an inMarker
   guard keep literal `[` / `]` inside PP markers (like `<.x.>`)
   from being mistaken for bracket delimiters.

Side fix: ParseRule previously stripped every ` ;` as a Harbour
line-continuation marker, but that also destroyed in-line PRG
statement separators in result templates. Line joining is the
preprocessor's job upstream — keep semicolons intact here.

common.ch now ships real DEFAULT and UPDATE #xcommands. Verified
1-, 2-, and 3-pair DEFAULT expansion plus `common.ch` inclusion
from user code. FiveSql2 43/43, Harbour compat 56/56, Go test ALL
PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 18:20:11 +09:00
b1024c5244 fix(gengo): hoist #pragma BEGINDUMP imports + wire HB_FUNC registration
Two bugs blocked Five's own inline-Go feature:

1. Inline Go blocks placed mid-file couldn't carry an `import` list
   because Go rejects declarations before imports in the same file.
   examples/godump_demo.prg and friends (real Five demos) hit
   "syntax error: imports must appear before other declarations"
   during compile of the generated Go.

   hoistGoImports parses the raw dump body for `import (...)` blocks
   and single-form `import "path"` lines, registers each path into
   the generator's imports map, and returns the body with those
   directives stripped. The top-of-file import block then carries
   everything the dump needs.

2. HB_FUNC() calls inside the inline block's init() enqueue
   registrations into hbrt.dynamicFuncs, but the VM only promotes
   them to its symbol table when RegisterLibModules() is called.
   gengo's generated main() skipped that step, so dispatch on the
   inline-defined names panicked with "no function symbol for call".
   Emit vm.RegisterLibModules() after RegisterModule(symbols).

Verified: examples/godump_demo.prg builds and runs; the inline
GoUpper / GoFib / GoGCD / GoSplit / GoSquare / GoTypeOf functions
all dispatch. Matches the feature's original design intent.

FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 17:58:49 +09:00
5514780b11 feat(pp): detect Harbour inline C in #pragma BEGINDUMP and fail fast
Harbour's #pragma BEGINDUMP ... #pragma ENDDUMP blocks carry C source
that the Harbour toolchain embeds verbatim. Five takes the same
directive but targets Go — any `.prg` ported from Harbour that ships
inline C gets its C shoveled into the Go codegen pipeline and fails
with opaque errors like "invalid character U+0023 '#'" from the Go
compiler, dozens of lines downstream of the actual cause.

Detect the C shape at PP time and report a clear, actionable error:

  pp: file.prg:N: #pragma BEGINDUMP contains C code — Five accepts
  inline Go only. Port the block to Go (or use an RTL function),
  then wrap in #pragma BEGINDUMP ... #pragma ENDDUMP.

looksLikeInlineC uses conservative signals that don't false-positive
on legitimate inline Go (which calls `hbrt.HB_FUNC("NAME", fn)` with
a package prefix and a quoted string, distinct from C's bare
`HB_FUNC(NAME)` macro). Signals:

  - `#include <...>` / `#include "..."` — unambiguous C preprocessor
  - line-starting `HB_FUNC(` / `HB_FUNC_STATIC(` — C FFI macro
  - `typedef ` / `struct ` / `int main(` / `void main(` at line start

main.go now aborts the build when PP returns errors (previously
printed but continued — same behavior the parser already had for
its own errors). Keeps build output short: one pp line + one
summary line, no gengo noise.

Verified:
  - harbour-core/tests/inline_c.prg → clean PP error, exit 1
  - examples/godump_demo.prg (legitimate inline Go) → passes PP
    (hits a separate pre-existing gengo import-ordering bug, not
    related to this change)

FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 17:53:44 +09:00
85002df6b9 feat(parser+pp): USE with macros and paren-balanced PP capture
Two related fixes for Harbour's data-driven `USE &cFile ALIAS &cAlias
INDEX &cNdx` idiom — common in any app that dispatches table names
at runtime.

Parser (compiler/parser/parser.go parseUse):
- `USE &cFile` / `USE &(expr)` previously triggered a
  skipToEndOfLine short-circuit, emitting an empty UseCmd (equivalent
  to bare USE = close current area). Now parseMacro runs and the
  MacroExpr becomes the File node, so codegen emits MacroPush +
  dbUseArea.
- `ALIAS &cAlias` / `ALIAS &a.1` similarly dropped the macro result;
  now captures it into UseCmd.AliasExpr so codegen evaluates the
  alias at runtime. Both the IDENT-path ("ALIAS") and keyword-path
  (token.ALIAS) handlers fixed.

PP (compiler/pp/command.go):
- captureExpression and the MarkerList branch now paren-balance
  `(`/`[`/`{` so nested grouping inside a macro argument doesn't let
  an inner `)` terminate the capture. Example:
      _REGULAR_(&(a))
  previously captured `&(a` (missing inner `)`) and left the outer
  `)` dangling, producing parse errors in the expanded output.
- MarkerList capture still joins tokens with " " for raw `<z>`
  substitution — comma tokens stay in the stream, so `s(<z>)`
  re-emits them as argument separators and the list expands cleanly.

Bench: harbour-core/tests/pp.prg 2 errors → 0 for the realistic
`USE &macro` / `&(expr)` patterns. Remaining parse errors on line 70
are a pathological `_REGULAR_L` list that includes `&a.  [2]`
(space between macro's terminating dot and an array index) — the
PP expands it correctly but Five's lexer refuses the expanded
result. That form doesn't occur in real code.

/tmp/test_use_macro.prg — all four patterns (`USE &f`, `USE &f ALIAS
&f`, `USE &f ALIAS &f INDEX &i`, dot-terminated) now compile. FiveSql2
43/43, Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 17:38:15 +09:00
e9522772a7 fix(pp): stringify markers + paren-attached calls — pp.prg 26→2 errors
Three cumulative fixes for Harbour's preprocessor stringify forms
surfaced by harbour-core/tests/pp.prg:

1. Token alignment — tokenizePattern and tokenizeLine now both
   split on parens and brackets, so `DUMB(a)` (no space) tokenises
   as `DUMB`, `(`, `a`, `)` on both sides. Previously the line
   tokenizer kept `DUMB(a)` as one token while the pattern split
   it three ways, and the match never engaged. Fixes `_DUMB_(a)`-
   style calls in pp.prg line 57+.

2. Substitution order — applyResult was replacing the bare `<z>`
   marker first, eating the inner `<z>` of `#<z>`, `<"z">`, `<(z)>`
   and `<.z.>` and leaving stray `#` / `<` / `.` characters that
   the lexer reported as ILLEGAL tokens. Run all compound forms
   first, bare `<z>` last.

3. Quote delimiter picker — ppQuote wraps a captured value in a
   legal PRG string literal by trying `"..."` first, then `'...'`,
   then `[...]`. Harbour's #<z> dumb-stringify needs this because
   the capture may already contain `"`, and Five was producing
   malformed `""world""` literals.

Bonus: smart-stringify `<(z)>` now recognises input that's already
a string literal (`"x"` / `'x'` / `[x]`) and keeps it verbatim
instead of double-quoting.

pp.prg 26 parse errors → 2 (remaining: `USE &b ALIAS &a.1` macro-
inside-command at line 21 and one related line, unrelated to this
fix). FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 17:26:16 +09:00
385a4ec6a2 fix(gengo): M-> and MEMVAR-> route to memvar table, not workarea
Harbour reserves the aliases `M` and `MEMVAR` for the memvar
namespace — `M->cVar` reads a PUBLIC/PRIVATE memvar, not a DBF
field in a workarea named M. Five's emitAliasExpr and emitAssign
treated all aliases identically, emitting:

  t.PushAliasField("M", "cVar")              // read
  _wa := t.WA.(*hbrdd.WorkAreaManager); _wa.SetAliasField("M", ...) // write

which triggered a spurious hbrdd import on programs using memvars
and attempted a workarea lookup that couldn't find a "M" area at
runtime.

Detect the reserved aliases (case-insensitive) at the three
AliasExpr call sites — the read path (emitAliasExpr) and both
assign paths (emitAssign for statements, emitAssignExpr for
expression context) — and route to t.PushMemvar / t.PopMemvar
instead. The existing Thread helpers hash into the MemvarTable
populated by PUBLIC/PRIVATE declarations.

Unblocks harbour-core/tests/macro.prg build (runtime still needs
the TVALUE test helper, unrelated). FiveSql2 43/43, Harbour compat
56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 17:14:18 +09:00
65b2edc906 fix(gengo): SWITCH edge cases — empty body, OTHERWISE-only, EXIT semantics
Three SWITCH codegen bugs surfaced by harbour-core/tests/switch.prg:

1. Empty SWITCH (`SWITCH x ENDSWITCH`) — legal Harbour, produced by
   conditional-compile files like switch.prg:13. Previous code
   emitted `_sw := t.Pop2()` followed by `}` with no matching `{`,
   closing the enclosing procedure body and producing "syntax error:
   non-declaration statement outside function body".

2. OTHERWISE-only (no CASE arms) — emitted `} else {` with no opening
   if, same "unexpected keyword else" category.

3. `EXIT` inside a CASE should break out of the SWITCH — but Five
   lowers SWITCH to an if/else-if chain, so the generated `break`
   had nowhere to land ("break is not in a loop, switch, or select").

Fix all three by wrapping every SWITCH in a one-iteration `for`
loop. `break` inside a case targets the wrapper, matching Harbour
semantics. Empty / OTHERWISE-only bodies still emit valid Go
because the for-loop provides the scope boundary regardless of
whether any if-chain opened. A trailing `break` keeps the loop
one-shot.

Also:
- `_ = _sw` silences unused-var for empty SWITCH.
- Conditionally emit the if-chain closing `}` only when at least
  one CASE ran.

All 15 SWITCH blocks in harbour-core/tests/switch.prg now build
and run to completion. FiveSql2 43/43, Harbour compat 56/56,
Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 17:11:47 +09:00
4b629f7e7a fix(pp): #xcommand/#xtranslate patterns with paren-attached keyword
Real Harbour headers write parameterised commands with no space
between the keyword and its opening paren:

  #xcommand MAKE_TEST( <obj>, <v> ) => ...

ParseRule stored the rule keyword as `MAKE_TEST(` (stripping only
<>, [] marker wrappers), but firstToken normalised source lines by
stopping the first-word scan at `(` — so `MAKE_TEST( o, 42 )`
produced `MAKE_TEST` for the lookup. The two strings didn't match
and the fast-path keyword check rejected every invocation, leaving
the macro unexpanded and the call site as a bare undeclared
identifier.

Trim everything from the first `(` onward during keyword
extraction so both halves agree on the dispatch key. The marker
tokens inside the parens are still parsed normally by
parseMarkers / matchPattern.

Verified with /tmp/test_xcmd2.prg (`MAKE_TEST( o, 99 )` expands
and dispatches to the object's :hVar access). FiveSql2 43/43,
Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 17:07:06 +09:00
d6c26104c9 feat(rtl): common.ch aliases — ISNIL/ISARRAY/ISNUMBER and friends
Harbour's common.ch exposes classic Clipper type-check shorthands
via #translate rules that map to HB_IS* RTL functions:

  #translate ISNIL(<x>)       => ((<x>) == NIL)
  #translate ISARRAY(<x>)     => HB_ISARRAY(<x>)
  #translate ISCHARACTER(<x>) => HB_ISSTRING(<x>)
  ... etc.

Five's preprocessor currently supports #translate only for lines
whose FIRST word is the rule keyword, not for substring matches
inside expressions. Real usage like `IF ISNIL(x)` fails the keyword
check (first word is IF, not ISNIL) and the rule never fires.

Rather than rewrite the PP substring engine (A2 scope), register
the nine short names as direct RTL symbols in register.go, each
pointing at the same Go function as its HB_IS* twin. ISMEMO maps
to HB_ISSTRING as a reasonable approximation for Five (no distinct
memo type at the VM level).

common.ch becomes a short stub that just #defines TRUE/FALSE/YES/NO
and documents where the ISxxx aliases live. DEFAULT / UPDATE
#xcommand forms remain unsupported pending A2.

Verified with /tmp/test_common.prg — ISNUMBER(42), ISCHARACTER("x"),
ISNIL(nilVar) all dispatch correctly. Analyzer still emits
"undeclared variable" warnings for the short names (the static
checker doesn't see runtime-registered RTL symbols) but the
generated code links and runs.

FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 17:01:50 +09:00
d3c4447198 feat(parser): keyword-as-identifier at stmt-block boundaries
Harbour permits keywords (CASE, DO, WHILE, etc.) to be used as
variable/array names. In most expression contexts Five already
handles this via expr.go:362 which whitelists keywords when used
as bare identifiers. But parseStmtBlock was stopping on any stop
token unconditionally, so a line like

  case[ n ] := x       -- 'case' is a LOCAL array

terminated the enclosing stmt block at `case` and left `[ n ] := x`
unparsable.

Add isIdentSuffix(): peeks one ahead and reports whether the next
token is something that can only follow an identifier ([, :=, +=,
-=, *=, /=, %=, ^=, ++, --, :, .). parseStmtBlock now treats the
stop token as a statement-start when its suffix matches, so the
block keeps going.

Verified with /tmp/test_kwident.prg (`case[...]` outside DO CASE,
`arr[...]` inside DO CASE body), /tmp/test_kwident2.prg (both the
`case case[n] == "two"` arm and `case[1] := "updated"` assignment
after ENDCASE). Pathological harbour-core/tests/keywords.prg still
fails — it places `case[...]` in the arm-expected position of a
DO CASE block with no leading arm, which no sane parser can
disambiguate.

FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 16:56:44 +09:00
0a5482b6aa feat(parser): implicit class binding for standalone METHOD bodies
Classic Clipper/Harbour form writes method implementations as bare
`METHOD Name(params)` statements following a `CLASS X ... ENDCLASS`
declaration, with the binding inferred from the most recent class:

  CREATE CLASS Shape
     METHOD Area
  ENDCLASS

  METHOD Area             -- binds to Shape
     RETURN 0

Five was requiring `METHOD Area CLASS Shape` explicitly. Without it,
parseMethodDecl left MethodDecl.ClassName empty, gengo skipped the
body emission, and the link step failed with `undefined: HB_SHAPE_AREA`.
The class registration had AddMethod("AREA", HB_SHAPE_AREA) pointing
at the missing symbol.

Parser tracks p.lastClassName at parseClassDecl, and parseMethodDecl
falls back to that value when no CLASS clause is supplied. Each new
CLASS declaration updates the tracker, so multi-class files still
dispatch correctly — verified with /tmp/test_implicit_class.prg
(Shape + Box both resolve their own Name/Area methods).

Unblocks harbour-core/tests/clsscope.prg and other OOP compat
tests that use this form. FiveSql2 43/43, Harbour compat 56/56,
Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 16:52:23 +09:00
2d82541d3d docs: Phase C contrib TODO + Phase A/B completion marker
Create Five-1.0-Phase-C-TODO.md capturing the remaining 1.0 work:
three Harbour contrib libraries (hbct Clipper Tools, hbnf Numeric
Functions, hbtip TCP/IP/SMTP/POP3/HTTP). Each entry lists the
Harbour source path, a minimum first-pass scope, and an effort
estimate. Suggested order: hbct → hbtip → hbnf. Total ~6-10 days.

Update RTL-Go-Native-Migration.md "남은 병목" with the Phase A/B
completion list — six features shipped this session — plus a note
that the 11 HBTYPE functions the initial analysis flagged are
actually Harbour's internal scalar class factories, not user-facing
blockers (Five's SendBuiltin covers the same surface).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 16:38:08 +09:00
2a662525b3 feat(rtl): DO(xTarget, [args...]) — dynamic dispatch
Harbour's DO() accepts a string (looked up as a function name), a
code block (evaluated with args), or a symbol, and invokes it. Used
for plugin systems and dynamic dispatch idioms like
`DO(cHandler, oRequest)`.

Five already had stmtDo rewrite `DO(...)` at statement-level to a
function-call expression, so callers in expression position just
work — but gengo refused to emit DO as a function call because it
was on the reserved-word guard list (which existed to catch stray
ENDIF/ENDDO from bad IF nesting). Remove DO from that list; the
statement form is still handled upstream by parseDoProc, so the
guard loses nothing.

rtlDo implements the dispatch:
  - String target → VM.FindSymbol + t.Function
  - Block target → EvalBlock path (same as Eval)
  - Anything else → NIL

Tested (/tmp/test_do.prg):
  DO("Greet", "World") → "hello, World"
  DO({|x,y| x*y+1}, 5, 6) → 31
  DO(NIL) → NIL (ValType "U")

FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 16:33:09 +09:00
327f75bb45 fix(parser): DATA x, y, z registers every name — not just the first
Harbour's `DATA name1, name2, name3` (and `VAR`, `CLASSDATA`)
should declare every listed field. Five's parseDataDecl instead
returned a single DataDecl for the first name and silently dropped
the rest — the comma branch just consumed the identifier without
producing a new decl. Surfaced by the OPERATOR overloading test
(/tmp/test_operator.prg originally had `DATA x, y` for a Vec2
class) where later `::y` access panicked with "unknown method y".

Change the signature to `[]*ast.DataDecl` and rewrite the loop so
each comma closes the current decl and starts a fresh one. AS /
INIT / qualifier runs still attach to the most recent name, so:

  DATA x, y, z                  → three decls, no init
  DATA x INIT 10, y, z INIT 0   → init attaches to preceding name
  DATA cName AS CHARACTER       → typed single decl

All seven class-body call sites flatten the slice into `members`.

Verified with /tmp/test_multidata.prg (`DATA x, y, z` + mixed
`DATA label INIT "origin", count INIT 0`) and the OPERATOR test
which now passes with the original `DATA x, y` form restored.

FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 16:17:32 +09:00
e089c81bcd feat(macro): &var / &(expr) runtime compilation
Harbour's macro operator was a stub: hbrt.MacroCompile only resolved
bare identifier names to memvars/functions and returned the source
string unchanged for any non-trivial expression. The gengo emit was
also broken — `t.MacroPush() + t.PushNil()` never pushed the inner
expression's value, so MacroPush popped whatever happened to be on
the stack.

Wire it up properly:

1. Gengo fix: `case *ast.MacroExpr` now emits `emitExpr(e.Expr);
   t.MacroPush()`. The inner expression produces the source string;
   MacroPush consumes it and pushes the evaluated result.

2. Hook pattern in hbrt: `SetMacroEvalHook(fn)` lets hbrtl install
   the real evaluator without creating an import cycle (genpc
   already imports hbrt). MacroPush delegates to the hook when
   installed; otherwise falls back to the legacy stub for hbrt
   unit tests.

3. hbrtl.init registers macroEval, which reuses compileExprSource
   (factored out of PcCompile) so macro lookups share the same
   sync.Map-backed pcode cache — repeat evaluations of the same
   macro source are free after the first hit.

4. ExecPcode leaves the result in retVal; macroEval copies it to
   the operand stack via PushRetValue.

Tested (/tmp/test_macro.prg):
  &"10 + 20"                    → 30
  &"Sqrt(16)"                   → 4
  &"Upper('hello')"             → HELLO
  &("30 * " + Str(nX, 1))       → 210  (runtime-built source)
  &"5 > 3 .AND. .T."            → .T.
  &("Str(" + Str(nX*10,2) + ",2)") → 70

FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 16:02:16 +09:00
66f045b97e feat(oop): OPERATOR overloading — + - * / == != < > <= >=
Harbour lets a class define custom behaviour for arithmetic and
comparison operators via `OPERATOR "<sym>" ARG <name> INLINE <expr>`.
Five already had the runtime slot infrastructure (ClassDef.Operators
+ AddOperator + parent-chain copy) but parser skipped the form and
the VM ops never consulted the slots.

Parser: parseOperatorDecl captures the symbol, ARG binding, and
INLINE body into a MethodDecl with IsOperator=true and OperatorOp
set to the hbrt.Op* slot. Synthesised method name is __OP_<idx>
to keep the regular method namespace clean.

Codegen: emitClassDecl routes IsOperator members through
_def.AddOperator instead of AddMethod. Inline body generation is
shared with the MESSAGE/INLINE path (34485cd).

VM: Thread.tryBinaryOp walks the LHS object's class operator slot,
pushes args with Self bound to LHS, and returns true if the slot
is populated. Wired into Plus/Minus/Mult/Divide and Equal/NotEqual/
Less/Greater/LessEqual/GreaterEqual. Falls through to built-in
behaviour when no overload exists — non-object LHS costs one tag
check per op.

Operator symbol→slot mapping keeps `=` and `==` on the same slot
(OpEqual=8) because Five's gengo routes both to t.Equal() and the
VM doesn't distinguish strict vs non-strict equality today.

Tested (/tmp/test_operator.prg): Vec2 + - == < with per-field
results all correct.

FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 15:54:44 +09:00
34485cd6c8 feat(oop): METHOD ... INLINE <expr> and MESSAGE handlers
Harbour's inline-method sugar was parsed but the body was skipped,
leaving any `METHOD X() INLINE expr` declaration registered in the
class vtable with no matching HB_<CLASS>_X function — link error
at build time.

Parser: MethodDecl gains an InlineBody Expr field. parseClassMethodDecl
captures the expression after INLINE instead of skipping to EOL.
New parseMessageDecl handles `MESSAGE <name> [(params)] INLINE expr`
and returns the same MethodDecl shape.

Codegen: emitClassDecl walks members a second time after the class
registration init block and emits emitInlineMethodBody for each
IsInline method — a Frame(nParams, 0) + emitExpr(InlineBody) +
RetValue function. curMethodClass is bound so ::super: inside an
inline body still resolves.

Tested (/tmp/test_inline.prg): all four patterns — bare INLINE,
MESSAGE INLINE, INLINE with params, INLINE reading ::field —
produce expected values.

FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 15:41:36 +09:00
3a56bd321a feat(oop): ::super:Method() dispatch for inheritance chains
Harbour's ::super: idiom routes a method call through the parent of
the class that defines the currently-executing method — Self stays
the child instance, only the vtable entry point shifts. Five
previously parsed ::super as a data-field access (PushSelfField("SUPER"))
which returned nil and panicked on the subsequent Send.

Runtime: Thread.SendSuper(fromClassName, methodName, nArgs).
Binding to the *defining* class (not Self's runtime class) is
load-bearing for 3+ level hierarchies: without it,
  Grand:New → ::super:New → Child:New → ::super:New
would resolve to Grand.Parent=Child again and infinite-loop.

Gengo: Generator.curMethodClass tracks the class name across each
method body emission. emitSendExpr detects the nested SendExpr
shape `::super:X(...)` and emits SendSuper with curMethodClass as
the first argument.

Tested (/tmp/test_super, /tmp/test_super2):
  Parent → Child:    ::super:Greet() returns composed result
  Base → Child → Grand: ::super:New chain passes args correctly

Also fixes three gengo unit tests whose expected output was stale
from prior perf commits (b829ed4 const prop, 1f63c7f symbol hoist,
7e4079f string-concat reassoc) — assertions now match the current
optimized codegen.

FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 15:33:46 +09:00
3d292dd9d8 docs: document 2026-04-18 perf session — entries #27-32
Six new migration-log entries covering this session's 21 commits:
  #27 VM in-place stack ops + symbol hoist (global 3-15%)
  #28 gengo compile-time peepholes (9 commits, 1-7% bench)
  #29 SELECT WA cache extension (single-table 2x+)
  #30 JOIN temp-alias stabilisation (B6 1.67x)
  #31 Stat-loop gates — view + CTE (CPU -40pp in rawsyscalln)
  #32 Go-native SqlIsAggName + FetchRow (agg/window 1.3-1.7x)

Plus a cumulative bench table vs the 3caadb2 baseline and an
updated "남은 병목" section pointing at EvalExpr / JOINRECURSE /
HASHJOIN / Go runtime primitives as the remaining levers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 14:02:48 +09:00
c4ae88e76e perf(fivesql2): gate CTE __cte_*.dbf cleanup on legacy disk fallback
CTE tables now materialise via MEMRDD (no file on disk), yet the
RunSelect cleanup loop was still stat-ing __cte_<name>.dbf for every
CTE in every CTE query. Profile after the FetchRow rewrite pinned
HbFileExists at 20.28% of total CPU — pure waste when MEMRDD is the
common path.

Add s_lCteDiskSeen flag, set only when the legacy DBFNTX fallback in
RunSelect actually opens a pre-existing __cte_<name>.dbf (line 1247
path — rare, only for sub-executors referencing a CTE by name on a
crashed-prior-run .dbf). Cleanup runs only when the flag is set.

pprof delta (full bench with cache enabled):
  rawsyscalln: 25.56% → 8.50%  (~17 points removed)
  HbFileExists: 20.28% → 0%    (dropped out of top)
Wall-clock unchanged (ENOENT stats are kernel-cached on Darwin), but
this removes the last visible avoidable syscall. What's left in the
profile (kevent, madvise, pthread_cond_*) is Go runtime + scheduler
overhead that application code can't touch.

FiveSql2 43/43, Harbour compat 56/56.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 13:59:04 +09:00
935883bb88 perf(fivesql2): Go-native FetchRow fast path — 1.3-1.7x on agg/window
TSqlExecutor:FetchRow was the per-row workhorse for aggregation,
HAVING, and window queries. Even with the pre-built aFetchCache
binding columns to (nWA, nFPos), the PRG FOR loop paid one method
dispatch per column per row (dbSelectArea, FieldGet, AllTrim,
AAdd) — profile pinned it at ~30% of B4 CPU.

SqlFetchRowFast collapses the cache-path loop into a single Go
call:
  - bound entry: SelectByNum + area.GetValue directly
  - unbound (aggregate/expression): self:EvalExpr via Send
  - character values: TrimSpace inline
The PRG FetchRow keeps its original cache-miss fallback path
unchanged for rare queries where aFetchCache isn't built.

Bench deltas (median of 3 steady runs, 1000 iters):
  B4_GROUP_HAVING 418 → 327 us  -22% (1.28x)
  B9_ROW_NUMBER   191 → 120 us  -37% (1.59x)
  B10_RANK_PART   228 → 135 us  -41% (1.69x)
  B11_SUM_OVER    249 → 156 us  -37% (1.60x)
  B14_COUNT       235 → 219 us  -7%
  B15_CTE_WIN_JOIN 1577 → 1452 us  -8%
Single-table SELECT (B1-B3, B5-B7, B8) stays flat — those already
hit the column-binding fast path and don't need aggregate dispatch.

FiveSql2 43/43, Harbour compat 56/56.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 13:50:02 +09:00
c84cde6175 perf(fivesql2): Go-native SqlIsAggName — drop per-row substring scan
B4 GROUP+HAVING profile showed SqlIsAggName at ~9% of CPU —
SqlEvalFunc checks it for every function in every row, and the
PRG body was two string allocations + a substring scan:
  RETURN ("," + c + ",") $ ("," + AGG_FUNCTIONS + ",")

Replace with a hash lookup against the existing aggFuncSet map
in hbrtl/sqlexpr.go (already populated for SqlExprHasAgg, same
AGG_FUNCTIONS list). Upper-casing skips the allocation when the
input is already upper, which it almost always is in practice.

Bench deltas (median of 3 steady runs, 1000 iters):
  B4_GROUP_HAVING 447 → 418 us  -6.5%
  B14_COUNT       252 → 235 us  -7%
  B15_CTE_WIN_JOIN 1595 → 1577 us  -1%
Other benches unchanged (no aggregate calls per row).

FiveSql2 43/43, Harbour compat 56/56.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 13:40:19 +09:00
6746ae4cee perf(fivesql2): stable temp alias — 1.67x on JOIN bench
AcquireTemp now returns the purpose string (upper-cased table name)
as the alias when available, and falls back to FA_#### only when the
same purpose is already in-flight this query — i.e., self-joins.
Previously every call returned a fresh FA_####, so the WA cache
(keyed by alias) could never hit on JOIN queries and the file got
reopened every iteration.

Bench deltas vs prior HEAD:
  B6_INNER_JOIN    217 → 130 us   -40%  (1.67x)
  B15_CTE_WIN_JOIN 1678 → 1595 us -5%
Single-table benches unchanged — they were already hitting the
cache via the table-name alias path.

B8 recursive CTE stays flat: its sub-executors at nDepth>1 still
cycle through fresh purposes that don't stabilise across queries.

FiveSql2 43/43, Harbour compat 56/56.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 13:31:01 +09:00
9bb361b2e0 perf(fivesql2): gate VIEW temp cleanup on actual view usage
After the SELECT WA cache landed, pprof showed HbFileExists → os.Stat
at 28% of remaining CPU — the RunSelect cleanup loop was stat-ing
__view_<table>.dbf for every table in every query, even on the
common view-free path.

Track view materialisation with a TSqlIndex.lViewUsed flag set in
OpenTable when CheckView produces a temp. The cleanup loop now
runs only when the flag is set, then resets it. View-using queries
are unaffected.

pprof delta:
  rawsyscalln:  2.14s → 1.41s  (48% → 32% of total CPU)
  os.Stat:      1.24s → 0.49s  (28% → 11%)
Wall-clock bench numbers stayed within plus-or-minus 3% noise (stats
are cheap when the target file does not exist, so CPU savings do not
translate directly to end-to-end time) but this removes the next
biggest syscall waste visible in the profile.

FiveSql2 43/43, Harbour compat 56/56.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 13:27:49 +09:00
f27c96c7f0 perf(fivesql2): extend WA cache to SELECT path — 2x faster single-table
TSqlExecutor:OpenTable now hands lifetime to the WA cache for stable
aliases (user-supplied or table-named). CloseOpened skips those
entries, so the DBF mmap stays alive across queries instead of being
unmapped + re-opened 1000 times a bench. Previously the WA cache only
covered DML (INSERT/UPDATE/DELETE) — SELECT was still paying the full
dbUseArea/dbCloseArea syscall bill every query (profile showed
rtlDbCloseArea + munmap at ~30% of total CPU).

AcquireTemp-generated aliases (FA_####) are excluded — they change
every query (self-joins, nested depth), so caching them would just
leak entries for no reuse. JOIN / recursive CTE regressions from an
earlier unrestricted version are gone.

Bench deltas vs prior HEAD (median of 3 steady runs, 1000 iters):
  B1_SELECT_STAR   82 → 41 us   -50%  (2.0x)
  B2_WHERE_FILTER  78 → 35 us   -55%  (2.2x)
  B3_ORDER_BY      90 → 48 us   -47%  (1.88x)
  B5_DISTINCT      75 → 32 us   -57%  (2.34x)
  B7_CTE_SIMPLE   120 → 77 us   -36%  (1.56x)
  B9_ROW_NUMBER   239 → 194 us  -19%
  B10_RANK_PART   276 → 233 us  -16%
  B11_SUM_OVER    296 → 252 us  -15%
  B4_GROUP_HAVING 498 → 450 us  -10%
Others flat (JOIN / recursive CTE / DML already covered).

FiveSql2 43/43, Harbour compat 56/56.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 13:20:46 +09:00
6974ff9473 perf(gengo): elide dead-store inits for const-propagated LOCALs
When collectConstLocals proves a LOCAL is only ever read, not
written beyond its literal init, every read site gets the literal
substituted inline — which means the init itself has no live
reader. Skip emitting the PushXxx/PopLocalFast pair for those
LOCALs in both top-of-function and mid-body decls.

On a function with `LOCAL nBuf := 100, sTag := "x", bFlag := .T.`,
all three inits drop out (6 VM ops saved in the prologue), while
the still-written `LOCAL nSum := 0` init stays. Harbour compat
56/56, FiveSql2 43/43.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 13:06:43 +09:00
b829ed4996 perf(gengo): constant-propagate literal-init LOCALs
Scan each function body for LOCALs whose sole write is a literal
initialiser (never ++/-- / += / @byref / MultiAssign target /
FOR var / @GET target / macro). Reads substitute the literal
inline at emit time, which cascades into all earlier folds: dead
IF branches, AND/OR short-circuit, NOT, string-concat reassoc,
and the FOR LocalLessEqualInt fast path (extended to see through
a propagated ident limit).

Walker is bounded — unrecognised AST nodes abort propagation for
the whole function rather than risk missing a hidden write.
Harbour compat 56/56, FiveSql2 43/43.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 12:44:27 +09:00
7e4079f845 perf(gengo): reassociate left-leaning string-concat literal runs
`"a" + x + "b" + "c" + "d"` used to emit 4 Plus() calls because
the parser builds a left-leaning chain and no pair was
literal+literal. Add a reassociation step inside foldLiteralTree:
when the outer shape is `(Y + strlit1) + strlit2`, rewrite as
`Y + (strlit1+strlit2)` so the tail literals collapse. Also run
foldLiteralTree on the root BinaryExpr in emitExpr so the
outermost reassoc fires (was only running on children).

Verified: the 4-Plus case now emits 2 Plus calls (`"a" + x + "bcd"`).
FiveSql2 43/43, Harbour compat 56/56.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 08:52:08 +09:00
67a9855319 perf(gengo): fold DO WHILE .T. / .F. at compile time
DO WHILE .T. now emits a bare for-loop with no PushBool/PopLogical
per iteration — saves a stack roundtrip on every trip through the
idiomatic infinite-loop pattern (9 .prg files use it). DO WHILE .F.
emits nothing. Loop exits still work via EXIT / RETURN.

FiveSql2 43/43, Harbour compat 56/56.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 08:45:58 +09:00
c3a9eb33a4 perf(gengo): fold .NOT. <literal> at compile time
`.NOT. .T.` / `.NOT. .F.` emit PushBool directly instead of
pushing the source bool and calling Not(). boolLiteralValue also
sees through an outer NOT, so `IF !.F.` now triggers the full
dead-branch pass (no PopLogical wrapper either).

FiveSql2 43/43, Harbour compat 56/56.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 08:42:08 +09:00
1b6d913905 perf(gengo): short-circuit AND/OR with literal LHS
Skip the PushBool/PopLogical/branch wrapper when the LHS of .AND. /
.OR. is a bare .T./.F. literal. `.T. .AND. X` emits X alone;
`.F. .AND. X` emits PushBool(false) with X dropped; symmetric for
OR. Common after constant-folding a sub-expression — pairs with
the earlier dead-IF-branch peephole.

FiveSql2 43/43, Harbour compat 56/56. Verified via /tmp/test_andor.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 08:35:22 +09:00
3f8ef7daef perf(gengo): eliminate dead IF/ELSEIF branches with literal conds
IF .T. collapses to its body; IF .F. forwards to the first live
ELSEIF or ELSE. For dynamic main conditions the chain is still
filtered: ELSEIF .F. drops out, ELSEIF .T. truncates and becomes
the ELSE. Verified with /tmp/test_deadif.prg — five dead labels
all removed from gen output, runtime emits only live branches.

FiveSql2 43/43, Harbour compat 56/56.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 08:32:24 +09:00
111ab8a6f0 perf(gengo): unary-minus literal fold + x:=x+y → LocalAdd peephole
Two more leaf-level code-gen cleanups now that the const folder is in.

 - UnaryExpr MINUS over a LITERAL (INT/DOUBLE) emits the negated value
   directly, so `-42` becomes PushInt(-42) instead of PushInt(42) +
   Negate(). Guarded: MinInt64 passes through to the VM so the
   coerce-to-double path stays authoritative. Variables fall through
   to the normal Negate path — the LiteralExpr type assertion is the
   gate, so runtime-typed `-x` keeps its semantics.

 - `x := x + <expr>` / `x := x - <expr>` detected when the LHS ident
   resolves to the same local as the self-reference on the RHS,
   emits the same LocalAdd / Negate+LocalAdd shape that x += y already
   used. Non-matching locals (shadowing, module statics) fall through.

Verification
 - go test ./...              ALL PASS
 - FiveSql2 test_sql1999      43/43
 - tests/compat_harbour       56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 08:26:59 +09:00
a0acdf0289 perf(gengo): compile-time constant folding for literal arithmetic
Fold BinaryExpr subtrees whose operands reduce to INT or STRING
literals at compile time. `10 * 2 + 5` now emits a single PushInt(25)
instead of three VM ops; `"a" + "b"` collapses to "ab". Overflowing
INTs and SLASH (which Harbour turns into double) fall through to the
VM so semantics stay intact.

Implementation is a bottom-up foldLiteralTree pre-pass on each
BinaryExpr, plus a tryFoldBinary matcher for the leaf case. Mutates
the AST in place — safe because the generator owns the tree after
parse.

Bench numbers don't move (SQL paths have no literal-only arithmetic
in hot loops), but generated code shrinks on PRG that uses #define
constants for widths / offsets / factors.

Verification
 - go test ./...              ALL PASS
 - FiveSql2 test_sql1999      43/43
 - tests/compat_harbour       56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 08:24:27 +09:00
523d3fcf2e perf(vm): in-place Plus/Minus/Mult with tInt fast path
Apply the sp-rewrite shape to the three binary arithmetic ops. The
tInt==tInt fast branch reads scalar directly (skips the AsNumInt
method) so the hot path is int64 ops + an overflow check; mixed-type
branches keep AsNumDouble unchanged.

PRG tight loops (FOR counter, SUM accumulators outside SQL aggregate
path) skip one cachedNil store and two bounds-check sequences per op.

Verification
 - go test ./...              ALL PASS
 - FiveSql2 test_sql1999      43/43
 - tests/compat_harbour       56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 08:16:32 +09:00
54b35f9325 perf(vm): in-place And/Or with logical-only fast path
Fold And/Or into the same in-place sp-rewrite shape as Not/LessEqual.
Both args must be tLogical — short-circuit on the raw scalar field so
the hot path is pure integer arithmetic + two cached bool Values.

Verification
 - go test ./...              ALL PASS
 - FiveSql2 test_sql1999      43/43
 - tests/compat_harbour       56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 03:16:01 +09:00
0fd3698ee5 perf(vm): in-place compare ops + Int-Int fast path
Mirror the treatment LessEqual already got onto Equal/NotEqual/Less/
Greater/GreaterEqual — rewrite sp directly, check the type tag for
Int==Int in the hot branch, short-circuit to cachedTrue/cachedFalse
without a second method call. Keeps the slow fallback for mixed /
string / date types.

Bench movement is minor on SQL paths (WHERE is already pcode and
skips these ops); the win is on PRG comparisons that cache-miss out
of the pcode path — FOR-condition short forms, IF chains, etc.

Verification
 - go test ./...              ALL PASS
 - FiveSql2 test_sql1999      43/43
 - tests/compat_harbour       56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 21:00:45 +09:00
15aa6dd4b3 perf(vm): in-place stack ops + ArrayGen/EvalBlock shift — global 2-6%
Three small tweaks to the pop-push hotspots in the VM.

 - ops_arith.go Inc/Dec/AddInt: unary ops mutate the top stack slot in
   place via peekPtr() instead of pop-compute-push. Drops the bounds
   check + cachedNil clear + push bounds check per call. Biggest
   beneficiary: FOR loop counters (implicit Inc) — every iteration of
   every PRG loop pays these ops once.

 - ops_collection.go ArrayGen: consume N slots via a single `copy`
   into the freshly-allocated result slice, then rewind sp and clear
   the intermediate slots for GC (the first slot is overwritten by
   the array push). Skips the N-deep pop loop.

 - ops_collection.go EvalBlock: read block value before shift, collapse
   args down one slot to overwrite the block position, then let the
   block run against the same in-place layout. Matches the
   Function()/PushSymbol round-trip removal from the prior commit.

bench_sql deltas

 - B2  WHERE            83 →  78 µs  (6%)
 - B3  ORDER BY         96 →  90 µs  (6%)
 - B4  GROUP_HAVING    554 → 528 µs  (5%)
 - B9  ROW_NUMBER      255 → 241 µs  (5%)
 - B10 RANK PART       296 → 278 µs  (6%)
 - B11 SUM OVER        320 → 300 µs  (6%)
 - B15 CTE+WIN+JOIN   1826 →1743 µs  (5%)

Verification
 - go test ./...              ALL PASS
 - FiveSql2 test_sql1999      43/43
 - tests/compat_harbour       56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 20:48:15 +09:00
1f63c7fe63 perf(vm): symbol hoist + Function() stack shift — global 3-15%
The VM call path (PushSymbol → Function → Frame) is traversed by every
PRG function call. Three changes together cut per-call overhead across
the entire bench suite.

Changes
 - hbrt/call.go Function(): replace pop-push dance with a single slice
   shift (N+2 pops + N pushes → 1 copy of N slots + sp adjust). Kills
   the per-call `make([]Value, nArgs)` heap alloc. Resolved function
   pointer is cached back into sym.Func so subsequent calls on the
   same Symbol skip the VM lookup entirely.
 - hbrt/vm.go GetSym(): new helper. Generated code calls it with a
   pointer to a package-level `*Symbol` slot so FindSymbol (which takes
   the VM RWMutex + map lookup) runs at most once per symbol per
   process. Nil results are intentionally NOT cached — an init-order
   miss becomes a retry on the next call instead of a permanent sticky
   failure.
 - hbrt/thread.go pushPendingSym(): scalar fast slot for depth=1 call
   nesting (common case). Nil syms still go through the slice so the
   "empty vs stored nil" ambiguity can't produce a false pop.
 - compiler/gengo/gengo.go: emit `t.PushSymbol(t.GetSym(&_sym_<file>_<NAME>, "NAME"))`
   for every function call site, with a per-file prefix so multi-PRG
   builds don't collide on identical symbol names.

Bugs fixed during bring-up
 - pendingSymFast == nil was ambiguous ("unused" vs "nil stored"). Nil
   syms now spill to the slice, preserving distinguishability.
 - The old varName-reuse branch at the PushSymbol emit site skipped
   the GetSym wrapper, emitting a raw `t.PushSymbol(varName)` against
   an uninitialized package-level *Symbol. Every call path now funnels
   through emitPushSymbol.

bench_sql deltas vs prior build
 - B1  SELECT *          114 →  97 µs   (15%)
 - B4  GROUP_HAVING      584 → 554 µs   (5%)
 - B8  RECURSIVE CTE     150 → 141 µs   (6%)
 - B10 RANK PARTITION    310 → 296 µs   (5%)
 - B11 SUM OVER          335 → 320 µs   (4%)
 - B14 COUNT             295 → 281 µs   (5%)
 - B15 CTE+WIN+JOIN     1891 → 1826 µs  (3%)

Verification
 - go test ./...               ALL PASS
 - FiveSql2 test_sql1999       43/43
 - tests/compat_harbour        56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 20:41:48 +09:00
dd270d5d9d perf: RTL Go-native migration — 27 optimizations, DML up to 70-90x
Systematic pass through PRG hot paths, promoting them to Go RTL while
preserving Harbour/FiveSql2 semantics. Full log in
docs/RTL-Go-Native-Migration.md.

Bench (bench_sql) vs 2026-04-08 baseline
 - B1  SELECT *             2,192 → 114   µs   (19x)
 - B6  INNER JOIN           9,291 → 233   µs   (40x)
 - B7  CTE simple           8,037 → 129   µs   (62x)
 - B9  ROW_NUMBER           3,705 → 265   µs   (14x)
 - B10 RANK PARTITION       4,748 → 309   µs   (15x)
 - B12 INSERT (WA cache)    4,319 →  63   µs   (69x)
 - B13 UPDATE (WA cache)    6,144 →  68   µs   (90x)
 - B15 CTE+WIN+JOIN        18,395 → 1,873 µs   (10x)

Infrastructure
 - HbHash O(1) Index preserving insertion order (Harbour KEEPORDER)
 - HbDeepClone Go RTL (scalar-sharing, immutable hash keys)
 - MEMRDD auto-imported via gengo; all Five programs get mem:name driver
 - SQL plan + pcode caches (s_hPlanCache, s_hDmlPcodeCache)
 - Opt-in SqlWACacheEnable — dbUseArea/Close/Commit batched for DML

SQL engine
 - FiveSql2 lexer ported to Go (byte FSM) with combined automatic
   template parameterization (literals → ?, concat queries share plan)
 - Go RTL: SqlDistinct, SqlGroupRows, SqlWindowPartitions,
   SqlWindowSortPartition, SqlWindowAssignRank, SqlComputeAggSimple,
   SqlBulkInsert, SqlBulkUpdate, SqlExprHasAgg, SqlEvalHaving
 - CTE / subquery / driving-table materialize paths use MEMRDD
 - SqlCoerce/SqlCmp/SqlIsTrue helpers moved from PRG to Go
 - SqlBulkUpdate defers Flush when WA cache active (APFS fsync was
   dominant B13 cost — 1.6ms/call → gone)

Correctness fixes uncovered during migration
 - ASort default path now sorts dates/logicals/timestamps (was no-op)
 - ORDER BY default NULL placement matches PRG SqlRowCompare across
   Go fast path; explicit NULLS FIRST/LAST honored by both paths
 - SqlBulkUpdate respects EXCLUSIVE vs SHARED mode record locks
 - SqlCmp/SqlCmpEq normalize NumInt vs Double (caught by test 6b)

Verification
 - go test ./...              ALL PASS
 - FiveSql2 test_sql1999      43/43
 - tests/compat_harbour       56/56 (+5 new: ASort dates/logicals,
                              AScan int cross-type)
 - Regression test test_null_order.prg for ORDER BY NULL ordering

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 20:20:14 +09:00
3caadb23b9 perf: SqlOrderBy + SqlGroupBy Go RTL — native sort and aggregation
SqlOrderBy: Go sort.Slice for ORDER BY, 10-50x faster than PRG ASort.
SqlGroupBy: Go map-based GROUP BY accumulation (ready for integration).
TryBuildSortSpec detects simple ORDER BY columns and routes to Go.
Fallback to PRG for complex ORDER BY expressions.

43/43 + 41/41 verify + 51/51 compat + go test ALL PASS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 14:41:41 +09:00
54bf6f5bb4 fix: ComputeAgg qualified column lookup for Go SqlHashJoin path
FindColIdx2 searched for bare column name (e.g. 'AMOUNT') but
aFieldNames now contains qualified names ('o.amount') from the
Go join fast path. Added fallback: try xArg[2] (the full AST name)
when the bare name misses. Fixes SUM/AVG/MIN/MAX aggregation after
Go-native hash join.

Verified: 41/41 correctness tests pass (verify_correctness.prg).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 07:35:26 +09:00