Audit follow-up after Wave 1's pcode `+=` fix surfaced a parallel
class of silent miscompiles in the *gengo* (native-Go) emit path.
Four real bugs hiding behind happy-path test coverage:
* `arr[i] += x` was ASSIGN-only — the IndexExpr branch returned
after emitting `arr[i] := x`, dropping the original element.
Now: PushArray + Push index, ArrayPush to read, fold with RHS,
re-do PushArray + index, ArrayPop to store.
* `alias->field += x` (and the M-> / MEMVAR-> namespace variants)
was ASSIGN-only too. Same shape of bug: `x->v += 7` compiled
as `x->v := 7`. The compound branch now reads via PushAliasField
(or PushMemvar for M->), folds, and stores via SetAliasField (or
PopMemvar).
* PRIVATE / PUBLIC mid-function declarations were treated as
extra LOCAL slots. emitMidVarDecl extended `locals` past the
function's declared count and emitted `PopLocalFast(idx)` for
the init. The slot didn't exist at runtime, so the init either
silently scribbled past the frame (small N) or panicked with
"local variable index out of range" once exercised. New logic:
PRIVATE/PUBLIC declarations bypass the locals table and emit
`PopMemvar(name)` for the init expression. The runtime auto-
creates the memvar.
* Memvar assignment fallback. After the LOCAL/STATIC checks miss
in emitAssign, the bottom path used to be a one-line WARN that
emitted RHS + `Pop()` — silently discarding the value. PRIVATE
pSum stayed at its initial value forever. Now: ASSIGN goes
through PopMemvar; compound forms read via PushMemvar, fold,
write back via PopMemvar.
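The read-fold-store shape shared by these fixes can be sketched in plain Go. This is only an illustration of the semantics, not the gengo emitter: `foldOp` and `compoundIndexAssign` are hypothetical names, and the operator spellings are assumed for the example.

```go
package main

// foldOp applies the binary operator behind a compound assignment.
// Plain ":=" ignores the old value; the compound forms combine it with rhs.
func foldOp(op string, old, rhs int) int {
	switch op {
	case "+=":
		return old + rhs
	case "-=":
		return old - rhs
	case "*=":
		return old * rhs
	default:
		return rhs
	}
}

// compoundIndexAssign models `arr[i] op x`: read the current element,
// fold it with the right-hand side, then store into the same slot.
func compoundIndexAssign(arr []int, i int, op string, x int) {
	old := arr[i] // the read that the ASSIGN-only path never performed
	arr[i] = foldOp(op, old, x)
}
```

With `arr := []int{10, 20, 30}`, `compoundIndexAssign(arr, 1, "+=", 7)` leaves 27 in slot 1, whereas the buggy ASSIGN-only behavior corresponds to the `":="` branch and would leave 7.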
Test fixture (tests/std_ch/test_compound_lhs.prg) covers all four
shapes. The std.ch runner picks it up so the regression suite now
stands at 15/15.
Other gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
std.ch suite : 15/15
FRB suite : 5/5
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Compiling _FiveSql2/test/test_sql_extreme.prg + a sweep of the FRB
demos surfaced four real bugs in the dynamic-compilation pipeline.
All fixes shipped together because they were on the same critical
path; each is independently revertible.
* **pcode FOR loop ignored STEP and direction.** emitFor in
compiler/genpc emitted a fixed `<= to` comparison and a hardcoded
`+1` increment, then deleted the actual step expression with
slice arithmetic on the byte buffer. Result: `FOR 5 TO 1 STEP
-1` exited on the first iteration; `FOR 1 TO 10 STEP 2` summed
1..10 (55) instead of 1+3+5+7+9 (25). Rewritten to mirror
gengo's emitFor: detect negative step from a literal `-N` or
unary MINUS, pick `<=` vs `>=` accordingly, and emit a clean
`var := var + step` increment per iteration.
* **pcode compound `+=` operator stored only the RHS.** emitAssign
looked at AssignExpr.Op only for the := case; +=/-=/etc.
silently took the same path, so `n += i` compiled as `n := i`,
discarding the accumulator. Loop reductions were wrong: `Reverse`
returned "" and `n := 0; FOR i ... n += i; NEXT` returned only
the last increment. New compoundBinOp helper maps PLUSEQ /
MINUSEQ / STAREQ / SLASHEQ / PERCENTEQ / POWEREQ to their
matching binary opcode; emitAssign emits `local + rhs ; pop
local` for compound forms.
* **Pcode body stack leaks polluted the caller's frame.** A pcode
function whose body left intermediate values on the data stack
(FOR control values, etc.) returned with extra entries past
its declared retVal. FrbDoFunc / FrbExecFunc / FrbRunFunc then
pushed retVal on top of those leaks, so the caller saw the
leaked values where its own preceding arguments should have
been: `? "Fibonacci(10) =", FrbDo(...), "(expect 55)"` printed
`1 55 (expect 55)` because the FOR loop's `1` lived in arg-1's
slot. Two new Thread methods (`SP()` / `SetSP(int)`) let the
three FRB dispatchers snapshot stack depth before the inner
call and clamp it back afterward, so the leaks evaporate before
they reach the caller's frame.
* **FrbExec / FrbRun recursed into the host's Main forever.** Both
looked up "MAIN" via t.VM().FindSymbol, which always resolved
to the OUTER program's Main since FRB modules deliberately keep
Main local. Compile + run + unload became compile + recurse +
OOM. Both now look up Main via mod.FindFunc("MAIN") (module
scope) — Frbload's policy of leaving Main module-local now
actually has the intended effect.
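The STEP/direction rule from the first bullet can be illustrated with a small Go model. It is a sketch of the loop semantics only: the real emitFor decides the comparison at compile time from the step expression, while this hypothetical `forValues` helper checks the sign at run time. Rejecting a zero step is an assumption made to keep the example finite.

```go
package main

// forValues returns the values visited by `FOR v := from TO to STEP step`.
// A positive step loops while v <= to; a negative step loops while v >= to.
func forValues(from, to, step int) []int {
	var out []int
	if step == 0 {
		return out // assumption: treat a zero step as an empty loop
	}
	for v := from; (step > 0 && v <= to) || (step < 0 && v >= to); v += step {
		out = append(out, v)
	}
	return out
}

// sum adds up the visited values, mirroring the `n += i` loop body.
func sum(xs []int) int {
	total := 0
	for _, x := range xs {
		total += x
	}
	return total
}
```

Under this model `sum(forValues(1, 10, 2))` is 25 and `forValues(5, 1, -1)` visits five values, matching the expected results quoted above.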
Plus an architectural improvement: in-memory compilation no longer
depends on shelling out to an external `five` binary. New
hbrtl.frbCompileInProc parses + preprocesses + generates pcode in
process, building a FrbModule directly. FrbCompile and FrbExec use
this exclusively, which means dynamic compilation works from any
directory regardless of PATH and without a second process. The
plugin-mode path (with its runtime-version-mismatch fragility) is
left available via hbrt.FrbCompileSource for callers that want it,
but FrbCompile no longer reaches for it by default.
Test suite: tests/frb/ holds five fixtures + a runner. 5/5 pass:
test_frb_simple / test_frb_pcode_load / test_frb_compile /
test_frb_loop / test_frb_step.
Other gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
std.ch suite : 14/14
FRB suite : 5/5
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Until now applyRules looked at the *first* token of each physical
line. PRG legitimately packs multiple statements on a single line
with `;` as an intra-line separator (e.g. `dbCommit(); CLOSE ALL`),
and after Wave 1 removed the parser's xBase fallback for CLOSE/
COMMIT/etc., a `;`-separated `CLOSE ALL` on a line that started
with another statement would slip past std.ch entirely. The parser
then saw `CLOSE` / `ALL` as IDENTifiers, the runtime tried to
dispatch `CLOSE` as a function, and the user got a "no function
symbol for call" panic at execution time.
Fix: at applyRules entry, check for top-level `;` (paren / bracket
/ brace / string-literal balanced), split the line into statement
segments, recursively apply rules to each, rejoin with `;`. Two
new helpers (`hasTopLevelSemi` / `splitTopLevelSemi`) keep the
balancing logic small and self-contained.
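A minimal sketch of what such a balanced splitter might look like follows. The helper names come from this commit, but the bodies are an assumption written for illustration. In particular, treating `[` / `]` purely as nesting (rather than as PRG bracket-string delimiters) is a simplification.

```go
package main

// splitTopLevelSemi splits line on `;` characters that sit outside
// parentheses, brackets, braces, and quoted string literals.
func splitTopLevelSemi(line string) []string {
	var parts []string
	depth := 0     // nesting depth of (), [], {}
	var quote byte // active quote character, 0 when outside a string
	start := 0
	for i := 0; i < len(line); i++ {
		c := line[i]
		switch {
		case quote != 0:
			if c == quote {
				quote = 0
			}
		case c == '"' || c == '\'':
			quote = c
		case c == '(' || c == '[' || c == '{':
			depth++
		case c == ')' || c == ']' || c == '}':
			if depth > 0 {
				depth--
			}
		case c == ';' && depth == 0:
			parts = append(parts, line[start:i])
			start = i + 1
		}
	}
	return append(parts, line[start:])
}

// hasTopLevelSemi reports whether the line holds more than one statement.
func hasTopLevelSemi(line string) bool {
	return len(splitTopLevelSemi(line)) > 1
}
```

For `dbCommit(); CLOSE ALL` this yields two segments, so the `CLOSE ALL` rule gets a chance to match the second one. A `;` inside `"a;b"` or inside a code block's braces does not split.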
Found by compiling _FiveSql2/test/test_sql_extreme.prg, which packs
the typical xBase one-liner DBF setup `dbAppend(); FieldPut(...);
...; dbCommit(); CLOSE ALL` across many rows of test data. The
test was panicking at the first such line; with this fix it now
runs to completion: 15/15 PASS.
All FiveSql2 SQL tests green together for the first time:
test_sql1999 : 43/43
test_sql1999_hard : 10/10
test_sql_extreme : 15/15
test_sql_challenge : 15/15
--
83 / 83
Other gates green:
go test ./... : PASS
Harbour compat : 56/56
std.ch suite : 14/14
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wire up TO FILE for both LIST and DISPLAY: __dbList grows a 9th
parameter cFile, opens it (truncating any prior content) when non-
empty, and writes the formatted rows there via fmt.Fprintln. Default
behavior (no TO FILE) still goes to stdout.
std.ch gets two new rules placed *before* the regular LIST/DISPLAY
patterns so they win when TO FILE is present:
LIST [<v,...>] TO FILE <(f)> [OFF] [FOR] [WHILE] [NEXT] ...
DISPLAY [<v,...>] TO FILE <(f)> [OFF] [FOR] [WHILE] [NEXT] ...
Open failure raises a clear *HbError ("LIST/DISPLAY TO FILE: cannot
create <path> — <syscall reason>") so callers know exactly what went
wrong instead of getting partial-or-empty output.
TO PRINTER stays rejected via __dbNotImpl — Five doesn't drive a
printer port. Test coverage: tests/std_ch/test_list_to_file.prg
exercises four shapes (full LIST, single-row DISPLAY, OFF + FOR with
explicit fields, and confirms TO PRINTER still raises). Wired into
the std.ch runner so the regression suite now stands at 14/14.
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
std.ch suite : 14/14
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three audit findings around polish + a release-readiness commit:
* #UX1 LIST/DISPLAY output: dropped \r\n (unix terminals showed a
stray ^M), moved the newline to AFTER each row (no more leading
blank line), and added the `*` deleted-record marker after the
record number — matches xBase LIST/DISPLAY convention. With
SET DELETED ON the marker is unreachable since the row would
have been skipped at Area.Skip level; with SET DELETED OFF the
user now sees which rows are tombstoned.
* #26 temp aliases: `__copytmp` / `__sorttmp` / `__totaltmp` /
`__jointmp` were process-global string constants. A nested
invocation (e.g., COPY inside a FOR clause whose expression
runs another COPY) collided on the alias and the inner Open
failed with "alias already in use" — surfacing as `.F.` with
no clear cause. Each Open now goes through a new helper
`nextTmpAlias(prefix)` backed by an atomic counter, so every
call gets `__copytmp_1`, `__copytmp_2`, etc. — no collisions.
* #J test coverage gap: the 13 std.ch regression tests were all
sitting in `/tmp` — lost on tmpfs reboot, never in git, never
in CI. Move them into `tests/std_ch/` and add a simple
`run.sh` runner that builds + executes each one in a temp
scratch directory and grep-asserts on FAIL / NOT REJECTED /
expectation-mismatch markers. 13/13 pass against the current
head:
PASS test_pp_stdch PASS test_count
PASS test_sum_avg PASS test_sum_multi
PASS test_copy PASS test_sort
PASS test_list PASS test_total
PASS test_join PASS test_update
PASS test_set_deleted PASS test_unsupported
PASS test_block_comma
test_block_comma in particular guards the gengo SeqExpr fix
from Wave 1 — without it the comma-in-block miscompile would
silently come back.
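The temp-alias fix in #26 boils down to a process-wide atomic counter. A sketch of how `nextTmpAlias` could be written is below. The function name is from the commit, while the counter variable and the exact `prefix_N` formatting are assumptions based on the `__copytmp_1` example.

```go
package main

import (
	"strconv"
	"sync/atomic"
)

// tmpAliasSeq is a process-wide counter shared by all temp-alias users.
var tmpAliasSeq uint64

// nextTmpAlias returns a fresh alias such as "__copytmp_1". The atomic
// increment keeps aliases unique even when calls nest or run concurrently.
func nextTmpAlias(prefix string) string {
	n := atomic.AddUint64(&tmpAliasSeq, 1)
	return prefix + "_" + strconv.FormatUint(n, 10)
}
```

Two nested COPY invocations would then open under different aliases, so the inner Open no longer collides with the outer one.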
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
std.ch suite : 13/13
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drop the toy O(n²) insertion-sort that __dbSort had been using and
delegate to the stdlib's sort.SliceStable. Reasoning: SORT TO is an
operation a user reaches for *because* their dataset is too big to
just iterate manually — interactive DBFs routinely have 10k–1M rows,
which the old impl would chew on for minutes to hours. SliceStable
gives O(n log n) and preserves the original-input ordering for
equal keys, which is what the previous implementation also tried to
do.
The function signature is unchanged (`stableSort(rows, less)`), so
all the multi-key / /D / /C dispatch logic from earlier waves keeps
working unmodified.
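For reference, a wrapper with this signature over `sort.SliceStable` can be as small as the sketch below. The `row` type and its fields are hypothetical stand-ins, since the commit does not show the real row representation.

```go
package main

import "sort"

// row is a stand-in for a buffered record; Seq records input order so
// stability is observable in the example.
type row struct {
	Name string
	Seq  int
}

// stableSort orders rows by less in O(n log n) while keeping rows with
// equal keys in their original relative order.
func stableSort(rows []row, less func(a, b row) bool) {
	sort.SliceStable(rows, func(i, j int) bool {
		return less(rows[i], rows[j])
	})
}
```

Sorting `b, a, b, a` by `Name` gives `a, a, b, b` with each pair still in input order, which is the property the multi-key dispatch relies on.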
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four audit findings around correctness/consistency in std.ch and the
SORT/UPDATE/TOTAL handlers:
* #13: TOTAL/UPDATE key idiom inconsistency documented as inherent.
TOTAL evaluates `<key>` only in the source workarea so verbatim
`<{key}>` (alias-qualified or `_FIELD->`-prefixed by the user)
works. UPDATE evaluates the same block in BOTH master and detail
context, so it must wrap as `_FIELD-><key>` to dispatch to
whichever WA is selected at eval time. The two rules look alike
but their evaluation contexts differ — also documented in
std.ch alongside both rules so the asymmetry isn't a surprise.
Plus: TOTAL TO and ON are now mandatory (matching the COUNT/
UPDATE pattern from Wave 1) — bare TOTAL would have produced
broken syntax via the unconditional `<(f)>`/`<{key}>` template
references.
* #15/#16: SDF / DELIMITED variants of COPY and TO PRINTER /
TO FILE variants of LIST / DISPLAY are now matched by stub
rules (placed *before* the regular rules so they win) that
expand to a new `__dbNotImpl(reason)` RTL primitive raising a
clear `&hbrt.HbError`. BEGIN SEQUENCE / RECOVER catches the
panic, so callers get a real error instead of the previous
silent dispatch-to-regular-DBF-copy.
* #19: SORT /C (case-insensitive) now actually folds case before
the string compare, instead of being silently treated as
ascending. Suffix parser also rebuilt as a multi-letter scanner
so `name/CD`, `name/DC`, `name/C/D`, `name/D/C` all parse the
same way — combine /C and /D freely. Unknown suffix letters
(e.g., `name/X`) leave the suffix attached to the field name
so a stray slash in user input doesn't get silently mangled
into a broken field reference.
* #27 SET DELETED: verified with a regression test that
`SET DELETED ON` causes COUNT/COPY (and by extension
SORT/TOTAL/JOIN/UPDATE — all of which iterate via Area.Skip)
to skip rows marked deleted. The filtering is implemented at
the workarea level (skipFilter in dbf.go honors hbrdd.IsSetDeleted)
so no RTL changes were needed; this commit just adds the
coverage so the behavior doesn't silently regress.
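The multi-letter suffix scanner from #19 can be sketched as follows. `parseSortKey` is a hypothetical name, and accepting `A` as an explicit "ascending" letter is an assumption carried over from the earlier SORT commit. The part taken from this commit is the rule that an unknown letter leaves the whole text as the field name.

```go
package main

import "strings"

// parseSortKey splits a SORT key such as "name/CD" into the field name
// and its flags. /D requests descending order and /C case-insensitive
// comparison; the letters may be combined in any order.
func parseSortKey(key string) (name string, desc, foldCase bool) {
	slash := strings.IndexByte(key, '/')
	if slash < 0 {
		return key, false, false
	}
	for _, r := range strings.ToUpper(key[slash:]) {
		switch r {
		case '/', 'A':
			// separator, or explicit ascending (assumed): nothing to record
		case 'D':
			desc = true
		case 'C':
			foldCase = true
		default:
			// Unknown letter: keep the suffix attached to the field name.
			return key, false, false
		}
	}
	return key[:slash], desc, foldCase
}
```

With this shape `name/CD`, `name/DC`, `name/C/D`, and `name/D/C` all produce the same result, while `name/X` comes back unchanged.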
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Five concrete gaps the audit flagged in the new __dbCopy / __dbSort /
__dbTotal / __dbJoin / PP code:
* wam.Close() errors were dropped on the floor. Caller saw `.T.`
even when the just-written DBF wasn't durable, leading to the
classic "delete the source after the COPY succeeds" data-loss
pattern. All four functions now capture the close error and
return `.F.` if it fired.
* drv.Create succeeded → wam.Open failed → orphaned-on-disk DBF.
The user-named target file was left around with zero records,
and the next call's drv.Create silently truncated it instead of
surfacing the original error. Add `os.Remove(cFile)` on the
Open-failure cleanup path for COPY/SORT/TOTAL/JOIN.
* __dbTotal would write the DBF codec's overflow sentinel
(`*****`) into the destination's sum-fields when a group total
didn't fit in the source's declared field width, and still
return `.T.`. Now: precompute each sum-field's max representable
magnitude (10^(Len-Dec)) at start, mark the run as overflowed if
any flush sees an out-of-range or NaN value, and propagate
`.F.` to the caller so they don't trust the file.
* cleanUnreferencedMarkers walked byte-by-byte and stripped any
`<ident>` token in the result, INCLUDING ones that appear
inside `"..."` / `'...'` string literals. A user expression
like `LIST FOR url == "<a>x</a>"` got the `<a>` and `</a>`
eaten on output. Now: track string-literal state and skip the
cleanup pass while inside one. Bracket-strings `[…]` are
intentionally not treated as strings here — the result template
uses `[...]` as the optional-repeat marker, and disambiguating
needs context the cleanup pass doesn't have.
* (#8 SET SAFETY honoring) deferred. Harbour's default is SAFETY
OFF, so the current always-overwrite behavior matches default
Harbour. The divergence only matters when the user explicitly
issues `SET SAFETY ON`, which Five doesn't support yet, so the
lack of overwrite protection is consistent end-to-end. Tracked
as a separate follow-up.
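The __dbTotal overflow check described above might look roughly like this. `sumFits` is a hypothetical name, and treating 10^(Len-Dec) as an exclusive bound is an assumption. The commit only states that the limit is precomputed and that out-of-range or NaN values mark the run as overflowed.

```go
package main

import "math"

// sumFits reports whether v can be stored in a numeric field with the
// given length and decimals. The limit 10^(length-dec) follows the
// formula in the commit; NaN and infinities never fit.
func sumFits(v float64, length, dec int) bool {
	if math.IsNaN(v) || math.IsInf(v, 0) {
		return false
	}
	limit := math.Pow10(length - dec)
	return math.Abs(v) < limit
}
```

A caller would AND the result across every flush and return `.F.` if any group total failed the check, instead of leaving `*****` in the file and reporting success.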
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Seven audit-driven blockers landed together because they're tangled:
* MENU TO removed from std.ch. The rule expanded to a call to a
nonexistent __MenuTo() RTL symbol, so any user code with `MENU
TO choice` compiled cleanly and panicked at runtime. Before this
round the parser treated it as a silent no-op, which is at least
consistent. Restore that until @ PROMPT (the companion command)
actually lands.
* COUNT now requires `TO <var>`. The earlier `[TO <v>]` optional
bracket was a Harbour-pattern transcription error: the result
template references `<v>` unconditionally, so a bare `COUNT`
expanded to ungrammatical ` := 0 ; dbEval(...)` and the
PRG parser rejected it. Match Harbour's std.ch which makes TO
mandatory.
* UPDATE FROM ... REPLACE now requires all three of `FROM`, `ON`,
and `REPLACE`. Same root cause as COUNT: the result template uses
`<key>`, `<f1>`, `<x1>` unconditionally; missing any of them
produced broken syntax. Tightened to fail loudly rather than
silently mis-expand.
* CLOSE <unknown_alias> no longer closes the *current* workarea.
SelectByAlias was a silent no-op when the alias was missing,
leaving WASaveAndSelectAlias to evaluate the inner DbCloseArea()
against the originally-selected WA — a real data-loss footgun.
SelectByAlias now returns bool; WASaveAndSelectAlias switches to
the no-area sentinel (0) on miss so the inner expression's
Current() returns nil and short-circuits.
* SUM <x1>, <xN> TO <v1>, <vN> — multi-pair form supported.
Required two pieces:
1. matchSegment's regular-marker stop-boundary now combines
outerTail literals AND the segment's repeat boundary so
`[, <xN>]` doesn't let `<xN>` swallow past the next ','.
2. **Five parser miscompiled comma-separated expressions in
code blocks.** `{|| e1, e2, e3 }` kept only the last expr
and threw away earlier ones at *AST level*, so all their
side effects vanished. New SeqExpr AST node + emitter
(emit each, pop intermediate results) + folding/walk
updates fix the underlying bug, which also unbreaks any
other block that relied on comma sequencing.
* pp.go's `;` continuation joiner now strips exactly one trailing
`;` per iteration, preserving Harbour's `;;` convention (literal
`;` followed by a continuation marker). Without this the SUM
rule's chained `<v1> :=[ <vN> :=] 0 ; ; dbEval(...)` collapsed
to a missing statement separator.
* parseExprStmt's xBase fallback switch is back in sync with
parseIdentStmt — COPY/SORT/COUNT/SUM/AVERAGE/TOTAL/UPDATE/JOIN/
DISPLAY/LIST removed (std.ch handles all of them now). Leaving
them in the fallback masked typos as silent no-ops.
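The `;;` handling in the continuation joiner can be sketched like this. `joinContinuations` is a hypothetical name and the whitespace handling is an assumption. The behavior taken from the commit is that exactly one trailing `;` is consumed per join.

```go
package main

import "strings"

// joinContinuations merges each physical line ending in `;` with the
// line after it. Only one trailing `;` is removed per join, so a line
// ending in `;;` keeps a literal `;` as a statement separator.
func joinContinuations(lines []string) []string {
	var out []string
	for i := 0; i < len(lines); i++ {
		cur := strings.TrimRight(lines[i], " \t")
		for strings.HasSuffix(cur, ";") && i+1 < len(lines) {
			i++
			next := strings.TrimSpace(lines[i])
			cur = strings.TrimSuffix(cur, ";") + " " + next
		}
		out = append(out, cur)
	}
	return out
}
```

Joining `x := 0 ;;` with `dbEval(b)` gives `x := 0 ; dbEval(b)`, which keeps the separator the SUM rule depends on. Stripping every trailing `;` would instead produce `x := 0  dbEval(b)`.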
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Record the 9-commit Phase B run that landed Harbour-style #command
rewrites for ERASE/RENAME/CLOSE/COMMIT/UNLOCK/LOCATE/CONTINUE/
REINDEX/PACK/ZAP/KEYBOARD/RUN plus COUNT/SUM/AVERAGE/COPY/SORT/
LIST/DISPLAY/TOTAL/JOIN/UPDATE — 13 commands that were silent
no-ops in the parser before this round.
Also catalog the 14 PP completeness fixes the rules surfaced
(partial-pattern false-match, blockify substitution, list-aware
smart-stringify and blockify, MarkerList/MarkerWordList in optional
clauses, multi-delimiter capture, line-continuation in directives,
no-progress iteration leak, unreferenced logify/blockify cleanup,
nested `[...]`).
LABEL / REPORT explicitly deferred — niche xBase output-formatting
engines whose `.lbl` / `.frm` binary readers and pagination/group
machinery would be ~800–1500 LOC for near-zero modern users. Parser
keeps the silent no-op behavior for both keywords; entry points
documented in OPTIMIZATION_TODO.md if a real demand ever appears.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`UPDATE [FROM <alias>] [ON <key>] [RANDOM] REPLACE <f1> WITH <x1>
[, <fN> WITH <xN>]` becomes a preprocessor rewrite to a new RTL
primitive __dbUpdate. For each detail record, find the master
record with matching key (forward-walk if both sorted, full scan
when RANDOM) and apply the REPLACE clauses in master's context.
Same shape as harbour-core/src/rdd/dbupdat.prg. The REPLACE clauses
expand to comma-separated assignments inside one block —
`{|| _FIELD->total := del->amt, _FIELD->status := "OK" }` — using
the multi-pair `[, <fN> WITH <xN>]` optional-repeat that std.ch
already establishes for SUM and DEFAULT.
Five-specific tweak: ON <key> wraps as `{|| _FIELD-><key> }` rather
than Harbour's bare `<{key}>`. Five doesn't auto-resolve a bare
identifier in a code block to the current workarea's field, and the
UPDATE block must evaluate against both detail and master so an
explicit alias prefix won't do — _FIELD-> dispatches to whichever
area is selected at eval time, which is what's needed.
Wiring up UPDATE surfaced one further matchSegment gap that fell
out of the multi-pair `[REPLACE ... [, ...]]` shape:
* matchSegment didn't handle nested `[...]` inside its body.
`[REPLACE <f1> WITH <x1> [, <fN> WITH <xN>]]` gave the inner
`[` as a literal token to match against the line, so even the
single-pair `REPLACE total WITH del->amt` form failed and f1/x1
came back empty. Now matchSegment runs the same repeat-loop on
inner `[...]` blocks that the top-level matcher uses, with its
own outer-tail computed from the segment tail past the inner
`]`.
Parser cleanup: UPDATE removed from the IDENT-statement no-op switch.
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`JOIN WITH <alias> TO <file> [FIELDS <list>] [FOR <expr>]` becomes a
preprocessor rewrite to a new RTL primitive __dbJoin. Cartesian
product of the current ("master") workarea and the named "detail"
alias, filtered by the FOR expression.
Output structure:
* No FIELDS clause: master's fields followed by detail's, dropping
any detail-side name that clashes with master.
* FIELDS list: one column per name in declaration order, resolved
against master first then detail.
Same shape as harbour-core/src/rdd/dbjoin.prg. Five-specific
simplifications: alias->name in FIELDS not yet supported (bare
names with master-precedence lookup); RDD/codepage args dropped
since Five only has DBFNTX.
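The no-FIELDS output structure can be illustrated with a short Go sketch. `joinFieldNames` is a hypothetical helper, and comparing names case-insensitively is an assumption, since the commit does not say how clashes are detected.

```go
package main

import "strings"

// joinFieldNames builds the destination column list for JOIN without a
// FIELDS clause: all master fields first, then each detail field whose
// name does not clash with a master field.
func joinFieldNames(master, detail []string) []string {
	seen := make(map[string]bool, len(master))
	out := make([]string, 0, len(master)+len(detail))
	for _, n := range master {
		seen[strings.ToUpper(n)] = true
		out = append(out, n)
	}
	for _, n := range detail {
		if !seen[strings.ToUpper(n)] {
			out = append(out, n)
		}
	}
	return out
}
```

With master fields `ID, NAME` and detail fields `id, AMT`, the result is `ID, NAME, AMT`: the detail-side `id` is dropped because master already provides it.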
Note for callers: don't name a workarea `M` or `MEMVAR` — both are
Harbour-reserved memvar aliases, so `M->field` and `MEMVAR->field`
always go through the memory-variable namespace, not the workarea.
This is gengo behavior matching Harbour, not new in this commit.
Parser cleanup: JOIN removed from the IDENT-statement no-op switch.
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`TOTAL TO <file> ON <key> [FIELDS <list>] [FOR ...] [WHILE ...]
[NEXT ...] [RECORD ...] [REST] [ALL]` joins the family of std.ch
DML rewrites. New RTL primitive __dbTotal:
* Walk the source under dbEval-style FOR/WHILE/NEXT/RECORD/REST
bounds. The source must already be sorted/indexed on the key —
same precondition as Harbour's dbtotal.prg.
* Track the current group key. On each key change, flush the
accumulated row to the destination (writing the running totals
back into the most recently appended record's sum-fields,
preserving each field's declared length/decimals).
* On the *first* record of every group, append a fresh dst row
and copy all non-memo source fields into it; subsequent records
in the group only contribute to the sums. Net effect: non-summed
fields take the first record's value, summed fields hold the
group total. Same shape as harbour-core/src/rdd/dbtotal.prg.
* Memo fields are dropped from the destination structure (Harbour
does the same).
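The grouping rule above (first record supplies the non-summed fields, later records only add to the sums) can be modelled in a few lines of Go. This is a simplified sketch: `rec` and `totalByKey` are hypothetical, there is a single summed field, and the totals accumulate directly in the output row rather than being flushed on each key change.

```go
package main

// rec is a stand-in record with one key, one plain field, and one
// summed field.
type rec struct {
	Key    string
	Label  string  // non-summed: the first record of the group wins
	Amount float64 // summed across the group
}

// totalByKey collapses consecutive records with the same Key into one
// output row. The input must already be sorted on Key.
func totalByKey(recs []rec) []rec {
	var out []rec
	for _, r := range recs {
		if n := len(out); n > 0 && out[n-1].Key == r.Key {
			out[n-1].Amount += r.Amount // same group: contribute to the sum only
			continue
		}
		out = append(out, r) // new group: copy every field from its first record
	}
	return out
}
```

Because only adjacent keys are compared, unsorted input produces one output row per run of equal keys, which is why the sorted-source precondition matters.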
Parser cleanup: TOTAL removed from the IDENT-statement no-op switch.
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`LIST [<fields>] [OFF] [FOR ...] [WHILE ...] [NEXT ...] [RECORD ...]
[REST] [ALL]` and `DISPLAY [<fields>] [OFF] [FOR ...] ... [ALL]`
reach the parser as plain function calls to a new RTL primitive
__dbList (rtlDbList in hbrtl/database.go).
Implementation: walk the workarea under dbEval-style FOR/WHILE/NEXT/
RECORD/REST bounds. For each visible record, evaluate each column
block and emit the rendered values via valueToDisplay (the same
formatter QOut already uses). Empty fields list defaults to
"all fields". OFF suppresses the record-number prefix.
LIST always emits the full filtered range; DISPLAY without ALL emits
only the current record (encoded as nCount=1). TO PRINTER / TO FILE
clauses are not yet wired through — for now everything goes to
stdout.
Wiring up LIST/DISPLAY surfaced four further gaps in PP that were
silently masking bugs in any rule with multiple word-list / list /
optional clauses chained together:
* matchSegment refused MarkerWordList inside `[...]`. The LIST
rule's `[<off:OFF>]` clause therefore never set the off
capture, and `<.off.>` substituted to nothing instead of .T./.F.
matchSegment now matches WordList markers the same way the
top-level matcher does.
* `<v,...>` and `<(f)>` capture stop boundaries didn't include the
values of following MarkerWordList markers. For
`[<v,...>] [<off:OFF>] [<all:ALL>]` against `LIST id, name OFF`,
the v list would happily eat OFF. New addStopFrom helper
contributes both literal keywords and word-list values; both
matchSegment's MarkerList branch and captureExpression now use
it.
* Optional-repeat loop in matchPattern merged a no-progress
iteration's empty capture into the running multi-capture string
(with the `\x01` separator) before the no-progress break check
fired. So a successful first iteration's value got contaminated
and the substitution loop then skipped it as multi-capture
garbage. The merge now happens after the progress check.
* Unreferenced `<.name.>` markers (optional clauses that didn't
match in the input) were getting cleaned up to empty by the
generic marker scrubber instead of the .F. sentinel Harbour's
std.ch expects. New replaceUnreferencedLogify pass mirrors the
existing replaceUnreferencedBlockify and runs just before the
cleanup.
Parser cleanup: LIST and DISPLAY removed from the IDENT-statement
no-op switch in both parseIdentStmt and parseExprStmt.
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`<{name}>` previously wrapped a list-typed capture's whole
comma-joined string in one code block: `{|| id , name }`. Harbour's
std.ch expects per-element wrapping so `{ <{v}> }` against
`LIST id, name` yields `{ {|| id }, {|| name } }` — an array of
column blocks the call site can evaluate per row.
applyResult now consults the marker table for blockify the same way
it already does for smart-stringify, splits the captured list on
top-level commas, and emits one `{|| expr }` per element.
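Per-element blockify can be sketched as below. Both helper names are hypothetical and the comma splitter is a simplified assumption that balances parentheses, brackets, braces, and quoted strings. The real applyResult works on the rule's marker table rather than on a bare string.

```go
package main

import "strings"

// splitTopLevelCommas splits s on commas that are outside (), [], {}
// and quoted string literals.
func splitTopLevelCommas(s string) []string {
	var parts []string
	depth, start := 0, 0
	var quote byte
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case quote != 0:
			if c == quote {
				quote = 0
			}
		case c == '"' || c == '\'':
			quote = c
		case c == '(' || c == '[' || c == '{':
			depth++
		case c == ')' || c == ']' || c == '}':
			if depth > 0 {
				depth--
			}
		case c == ',' && depth == 0:
			parts = append(parts, s[start:i])
			start = i + 1
		}
	}
	return append(parts, s[start:])
}

// blockifyList wraps each element of a captured list in its own block.
func blockifyList(captured string) string {
	parts := splitTopLevelCommas(captured)
	blocks := make([]string, len(parts))
	for i, p := range parts {
		blocks[i] = "{|| " + strings.TrimSpace(p) + " }"
	}
	return strings.Join(blocks, ", ")
}
```

For the capture `id , name` this produces `{|| id }, {|| name }`, and a nested call such as `Max(a, b), c` keeps its inner comma intact.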
Prereq for the upcoming LIST / DISPLAY rules; no user-visible
behavior change for the rules already in std.ch (their `<{for}>` /
`<{while}>` markers are scalar).
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`SORT TO <file> [ON <key-list>] [FOR ...] [WHILE ...] [NEXT ...]
[RECORD ...] [REST] [ALL]` joins COPY in being a real preprocessor
rewrite to a function call. New RTL primitive __dbSort:
* Buffer visible source records (FOR/WHILE/NEXT/RECORD/REST same
as __dbCopy).
* Multi-key stable insertion sort. Each key may carry `/D` for
descending; ascending otherwise. /A and unknown suffixes fall
through as ascending. Comparison delegates to the existing
compareValues helper in sqlscan.go (numeric / string / NIL-aware).
* Create destination DBF with the source's struct, append rows in
sorted order, restore source selection.
Parser cleanup: SORT removed from the IDENT-statement no-op switch.
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`COPY TO <file> [FIELDS <list>] [FOR ...] [WHILE ...] [NEXT ...]
[RECORD ...] [REST] [ALL]` reaches the parser as a plain function
call to a new RTL primitive __dbCopy (rtlDbCopy in hbrtl/database.go).
Implementation: project the field list (case-insensitive name match
against the source's structure, full copy when omitted), dbCreate the
target file with that struct, open it under a temp alias, walk the
source under dbEval-style FOR/WHILE/NEXT/RECORD/REST bounds, and
GetValue/Append/PutValue per record into the target. SDF / DELIMITED
variants stay parser no-ops until those backends arrive.
Wiring up COPY surfaced four longstanding gaps in the PP that had to
be fixed for the rule to even reach the runtime:
* `<(name)>` *pattern* marker was treated as a regular `<name>`
with the parens baked into the captured key, so the matching
result substitution `<(name)>` couldn't find it. parseOneMarker
now strips the parens at parse time so capture key and result
marker share the bare name. The smart-stringify result behavior
is unchanged.
* matchSegment (the optional-clause matcher) bailed on every
non-Regular marker. `[FIELDS <fields,...>]` therefore failed to
match at all and the fields list arrived empty in the result
template. matchSegment now handles MarkerList with paren-balanced
capture and segment+outer literal stop boundaries.
* captureExpression only used the first literal in the pattern
tail as a stop boundary. With std.ch's chain of optional
clauses (`[TO <(f)>] [FIELDS ...] [FOR ...] [WHILE ...] ...`)
the file-name marker was happy to gobble a trailing FOR clause
when FIELDS was absent. It now stops at *any* of the remaining
pattern literals.
* `<(name)>` smart-stringify on a list-typed capture wrapped the
whole comma-joined string in one set of quotes — `{ "a , b" }` —
instead of `{ "a", "b" }`. New helper quoteListElements splits on
top-level commas (paren / bracket / brace / string-balanced) and
quotes each element. applyResult now consults the rule's marker
table to know which captures came from `<name,...>`.
Parser cleanup: COPY removed from the IDENT-statement no-op switch in
both parseIdentStmt and parseExprStmt.
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three xBase analytical commands that were silent no-ops in the
parser now execute as Harbour-style PP rewrites:
COUNT [TO <v>] [FOR <for>] [WHILE <while>] ... -> dbEval()
SUM <x> TO <v> [FOR <for>] [WHILE <while>] ... -> dbEval()
AVERAGE <x> TO <v> [FOR ...] -> __dbAverage()
COUNT and SUM expand to a `<v> := 0 ; dbEval( {|| ... } )` pair
matching harbour-core/include/std.ch verbatim. AVERAGE delegates to
a new RTL function rtlDbAverage (sum + count + divide; returns 0 on
empty match) — the chained-private-variable trick Harbour uses to
keep AVERAGE inline doesn't translate cleanly through Five's PP.
Wiring up these rules surfaced four PP issues that had to be fixed
for the rewrite to even reach the parser:
* Result template did not implement <{name}> blockify. So a rule
body like `{|| x := x + <x> }, <{for}>` left the literal text
`<{for}>` in the output. Added blockify substitution: captured
-> `{|| <captured> }`, missing -> NIL.
* findMarkerEnd did not recognise `{`/`}` so unreferenced
blockify markers were not cleaned up either. Added `{`/`}` to
its prefix/suffix sets.
* Optional-clause matching had no view of the outer pattern, so a
regular marker at the end of `[TO <v>]` would swallow the rest
of the line — `COUNT TO n FOR x>5` captured `<v>` as
"n FOR x>5". matchSegment now takes outerTail and stops at its
first literal.
* `#command` directives could not span multiple physical lines.
A trailing `;` is harbour-core's line-continuation marker for
std.ch and now joins the next line into the directive before
parsing.
Parser cleanup: COUNT, SUM, AVERAGE removed from the IDENT-statement
no-op switch in parseIdentStmt + parseExprStmt. The remaining xBase
verbs (COPY, SORT, TOTAL, JOIN, LIST, DISPLAY, LABEL, REPORT, ...)
stay in the parser until their RTL backends arrive.
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduce compiler/pp/std.ch with 19 #command rules so that ERASE,
RENAME, DELETE FILE, CLOSE [<a>|ALL|DATABASES], COMMIT, UNLOCK,
LOCATE/CONTINUE, REINDEX, PACK, ZAP, KEYBOARD, RUN, MENU TO, and
CLEAR GETS reach the parser pre-rewritten as plain function calls.
Embedded into the compiler binary via //go:embed so it auto-loads
without an explicit #include in user code, exactly the way Harbour
auto-loads its std.ch.
This is a pure dispatch move, not a behavior change for the
already-working forms: the same Five RTL functions get called.
But it does fix three regressions that the parser was masking:
* ERASE / RENAME / DELETE FILE used to be silent no-ops — the
parser swallowed the entire line and returned NIL. They now
actually delete/rename files (FErase / FRename).
* CLOSE <alias> used to silently ignore the alias and close the
current area. It now switches to the named area first
(<a>->( DbCloseArea() )).
* Two latent #command matcher bugs that surfaced while wiring
std.ch up:
- bare `CLOSE` would match rule `CLOSE ALL` because the tail
of the pattern wasn't checked for unconsumed literals.
- bare `CLOSE` would match rule `CLOSE <a>` because all
unconsumed pattern markers were unconditionally treated as
optional. They are only optional when nested inside `[...]`.
Parser cleanup: parseIdentStmt + parseExprStmt no longer hardcode
ERASE / RENAME / RUN / KEYBOARD / REINDEX / LOCATE / CONTINUE /
COMMIT / CLOSE — the rewriter handles them. Other xBase verbs
(COPY / SORT / COUNT / SUM / AVERAGE / TOTAL / JOIN / LIST /
DISPLAY / LABEL / REPORT / DIR ...) still no-op in the parser
because their RTL backends aren't implemented yet — once the
backends land they move into std.ch the same way.
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two hot-path fixes for DBF reads surfaced by the bulk-bench profile.
1. parseNumericField decimal path — was 23% of flat CPU on BULK_CTE.
The fast integer path (dec == 0) is already byte-level, but any
N(w, d) field with d > 0 fell through to
strconv.ParseFloat(string(raw[start:end]), 64)
allocating per-row. A 10k-row CTE insert ran this 200k+ times.
Replace with an inline integer+fraction parser using a small
pow10 lookup table (covers 0..19 decimal places). Unexpected
characters still fall back to strconv for correctness.
Result:
BULK_CTE_10k_20iter 187 → 83 ms (2.25x)
BULK_SUBQ_10k_20iter 102 → 22 ms (4.6x)
2. DBFArea.RecCount in shared mode was doing Seek(0, 2) on every
call. SqlScan calls it once per query for its result-array
pre-allocation (~0.2 ms × 1000 queries = 0.2s of CPU on the
bench). Cache the count per-area, keyed by a process-wide
generation counter. Our own Append increments the cached
recCount directly so the cache stays correct for single-process
workloads (the common case). Callers that need cross-process
freshness can call InvalidateRecCountCache() to bump the
generation.
SQL bench: modest 1-3 ms drops on B1/B2/B3/B6/B7.
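The decimal fast path from item 1 can be sketched roughly as below. This is an illustration, not the shipped parseNumericField: the name `parseDecimal`, treating a blank field as 0, and the 15-digit cutoff are all assumptions. The cutoff keeps the mantissa exactly representable in a float64, so the single division gives the same result strconv would.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// pow10 covers 0..19 decimal places, matching the table size in the text.
var pow10 = [...]float64{
	1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6, 1e7, 1e8, 1e9,
	1e10, 1e11, 1e12, 1e13, 1e14, 1e15, 1e16, 1e17, 1e18, 1e19,
}

// parseDecimal parses a space-padded "[-]digits[.digits]" field without
// allocating. Anything unexpected falls back to strconv.ParseFloat.
func parseDecimal(raw []byte) (float64, error) {
	i, n := 0, len(raw)
	for i < n && raw[i] == ' ' {
		i++
	}
	for n > i && raw[n-1] == ' ' {
		n--
	}
	if i == n {
		return 0, nil // assumption: a blank numeric field reads as 0
	}
	neg := false
	if raw[i] == '-' {
		neg = true
		i++
	}
	var mant uint64
	digits, frac := 0, 0
	seenDot := false
	for ; i < n; i++ {
		c := raw[i]
		switch {
		case c >= '0' && c <= '9':
			mant = mant*10 + uint64(c-'0')
			digits++
			if seenDot {
				frac++
			}
		case c == '.' && !seenDot:
			seenDot = true
		default: // exponent, stray character, second dot
			return strconv.ParseFloat(strings.TrimSpace(string(raw)), 64)
		}
	}
	// Up to 15 digits the mantissa is exact in float64; frac <= digits,
	// so the table index is always in range here.
	if digits == 0 || digits > 15 {
		return strconv.ParseFloat(strings.TrimSpace(string(raw)), 64)
	}
	v := float64(mant) / pow10[frac]
	if neg {
		v = -v
	}
	return v, nil
}

func main() {
	for _, s := range []string{"   12.50", "-0.75 ", "1.5e3"} {
		v, err := parseDecimal([]byte(s))
		fmt.Println(v, err)
	}
}
```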
Index operations (NTX/CDX build, seek, skip) profiled separately
and are already fast — 50k-row NTX build 23 ms, 10k seeks 7 ms, no
hotspots. Left untouched.
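The RecCount cache from item 2 above follows a generation-counter pattern that can be sketched as follows. The `area` type and its fields are stand-ins invented for this example; only `InvalidateRecCountCache` is a name taken from the text, and its real signature may differ.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// recCountGen is the process-wide generation; bumping it invalidates
// every per-area cached count at once.
var recCountGen uint64

func InvalidateRecCountCache() { atomic.AddUint64(&recCountGen, 1) }

// area is a hypothetical stand-in for a shared-mode work area.
type area struct {
	onDisk    uint32 // what a Seek-to-end on the file would report
	seekCalls int    // how many times the slow path ran
	cached    uint32
	cachedGen uint64
	valid     bool
}

func (a *area) RecCount() uint32 {
	gen := atomic.LoadUint64(&recCountGen)
	if a.valid && a.cachedGen == gen {
		return a.cached // fast path: no file seek
	}
	a.seekCalls++
	a.cached, a.cachedGen, a.valid = a.onDisk, gen, true
	return a.cached
}

// Append keeps the cache correct for our own writes.
func (a *area) Append() {
	a.onDisk++
	if a.valid {
		a.cached++
	}
}

func main() {
	a := &area{onDisk: 3}
	fmt.Println(a.RecCount(), a.RecCount(), a.seekCalls)
	a.Append()
	fmt.Println(a.RecCount(), a.seekCalls)
	a.onDisk = 10 // simulate another process appending
	InvalidateRecCountCache()
	fmt.Println(a.RecCount(), a.seekCalls)
}
```

The trade-off is visible in the last step: a write from another process stays invisible until the caller bumps the generation.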
FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three correctness bugs in the DML executor that the Opus 4.7 audit
surfaced:
1. RunInsert logged the transaction BEFORE dbAppend() and validation.
LogRecord captured the PREVIOUS row's RecNo, and a CHECK/FK
violation that rolled back via dbDelete() still left a spurious
INSERT entry in the log pointing at the wrong record. Move
LogRecord to after all field puts and all validators pass, so
the log only records committed INSERTs at the correct RecNo.
2. RunUpdate (fallback path) skipped CHECK and FK validation entirely
— only RunInsert validated. An UPDATE could violate the same
constraints INSERT protects against. Add the same validator calls
after FieldPut, with a captured aPrevVals snapshot so the in-
memory record can roll back cleanly on failure. Gated by
SqlLoadConstraints to skip the validator (and its recursive
five_SQL) for tables without SQL-level metadata — tables created
via plain dbCreate see no change.
3. RunDelete had no transaction logging at all — a BEGIN / DELETE /
ROLLBACK cycle silently lost the row. Add LogRecord("DELETE")
before dbDelete so undo can re-surface it. (A full FK-cascade
check on delete would require parent→child scanning; deferred.)
The fast-path SqlBulkUpdate branch still bypasses per-record
validation by design (documented) — it's gated by
`! ::oTxn:IsActive()`, so txn-active queries always take the
validated fallback.
FiveSql2 43/43 (including SAVEPOINT + ROLLBACK TO and all four CHECK/
FK tests), Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Opus 4.7 audit of the codebase surfaced several items that Opus 4.6
sessions left behind. This pass removes what's definitively dead and
fixes one trivial defensive bug; the real logic bugs (transaction
ordering, missing RunUpdate/RunDelete validation) come in a separate
commit.
Deletions:
- `_FiveSql2/src/TSqlParser_orig.prg` (1173 lines) — superseded by
`TSqlParser2.prg` (Pratt). Production never instantiates the old
parser; the only callers were the comparison/benchmark test files
also being removed.
- `_FiveSql2/test/test_parser_cmp.prg` — compared orig vs Pratt AST,
useless now that orig is gone.
- `_FiveSql2/test/bench_parser.prg` — benched both, same reason.
- `_FiveSql2/Makefile` `test_cmp:` and `bench:` targets referenced
the removed files.
- `TSqlIndex.prg` methods `ApplyScope`, `ClearScope`, `ApplySeek`,
`IndexInfo`, `CreateTempIndex`, `DropTempIndex` — each declared in
the class header and implemented (~165 lines total) but zero
callers anywhere in `_FiveSql2/` or `hbrtl/`. Class declarations
removed alongside the bodies.
Small fixes:
- `TSqlDDL.prg:179-180` stale comment claiming Five doesn't support
`@byref` — false since commit e95afad (2026-04-13) wired @byref
via RefCell. The same method uses @nPos correctly elsewhere.
- `hbrt/class.go:tryBinaryOp` defensive nil-check on AsArray().
IsObject() checks the type tag; a corrupted Value with tag=Object
but ptr=nil would crash on `.Class`. Correct construction paths
never hit this, but the guard is cheap.
Compat tests: FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Update the 1.0-readiness document with:
- 2026-04-18 compatibility audit results: 47/50 build rate (94%)
vs previous 34/40. Lists every fix commit this session.
- Four remaining low-priority edge cases from the audit (xcommand
nested-comma args, u64 overflow, USE with ../ paths, legacy
inline-C syntax) — none block a realistic 1.0.
- Revised Phase-C scope: user clarified contrib PRGs can be
imported as-is so long as underlying RTL exists, so the work is
"audit each contrib's low-level deps, fill gaps, copy .prg"
rather than porting every function.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Harbour's `#xcommand DEFAULT <v1> TO <x1> [, <vn> TO <xn>] => ...`
uses an optional, repeatable trailing `[...]` block to accept any
number of `var TO default` pairs on a single line. Five's PP
skipped bracket bodies during pattern matching and treated them
as no-ops in result templates, so
DEFAULT a TO 10, b TO 20, c TO 30
expanded (at best) the first pair and dropped the rest — and
common.ch itself was documented as "not yet supported".
Three concrete changes:
1. matchPattern now matches the `[...]` body repeatedly against
remaining line tokens via a new matchSegment helper. Each
successful iteration appends captures for the interior markers
under the same name, joined with a \x01 sentinel.
2. matchSegment, when capturing the last marker in a body with no
following literal, uses the body's opening literal (e.g. the `,`
in `[, <vn> TO <xn>]`) as the iteration boundary. Otherwise
captureExpression would greedily eat the rest of the line and
collapse every remaining pair into one capture.
3. applyResult's new expandOptionalRepeat walks the result template
for top-level `[...]` blocks. When a referenced marker is multi-
captured it emits the body N times (substituting per-iter value);
when it's single-captured it emits the body once; otherwise drops
the block. A separate referencedMarkers scanner and an inMarker
guard keep literal `[` / `]` inside PP markers (like `<.x.>`)
from being mistaken for bracket delimiters.
Side fix: ParseRule previously stripped every ` ;` as a Harbour
line-continuation marker, but that also destroyed in-line PRG
statement separators in result templates. Line joining is the
preprocessor's job upstream — keep semicolons intact here.
common.ch now ships real DEFAULT and UPDATE #xcommands. Verified
1-, 2-, and 3-pair DEFAULT expansion plus `common.ch` inclusion
from user code. FiveSql2 43/43, Harbour compat 56/56, Go test ALL
PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two bugs blocked Five's own inline-Go feature:
1. Inline Go blocks placed mid-file couldn't carry an `import` list
because Go rejects declarations before imports in the same file.
examples/godump_demo.prg and friends (real Five demos) hit
"syntax error: imports must appear before other declarations"
during compile of the generated Go.
hoistGoImports parses the raw dump body for `import (...)` blocks
and single-form `import "path"` lines, registers each path into
the generator's imports map, and returns the body with those
directives stripped. The top-of-file import block then carries
everything the dump needs.
2. HB_FUNC() calls inside the inline block's init() enqueue
registrations into hbrt.dynamicFuncs, but the VM only promotes
them to its symbol table when RegisterLibModules() is called.
gengo's generated main() skipped that step, so dispatch on the
inline-defined names panicked with "no function symbol for call".
Emit vm.RegisterLibModules() after RegisterModule(symbols).
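The import hoisting described in item 1 might look roughly like the sketch below. It is line-based and simplified: the signature, the `quotedPath` helper, and the choice to drop import aliases are assumptions made for the example, not the actual hoistGoImports.

```go
package main

import (
	"fmt"
	"strings"
)

// quotedPath returns the text between the first pair of double quotes,
// or "" when the line carries no quoted path.
func quotedPath(s string) string {
	i := strings.IndexByte(s, '"')
	if i < 0 {
		return ""
	}
	j := strings.IndexByte(s[i+1:], '"')
	if j < 0 {
		return ""
	}
	return s[i+1 : i+1+j]
}

// hoistGoImports records every imported path in imports and returns the
// body with the import directives stripped. Aliases are ignored here.
func hoistGoImports(body string, imports map[string]bool) string {
	var kept []string
	inBlock := false
	for _, ln := range strings.Split(body, "\n") {
		t := strings.TrimSpace(ln)
		switch {
		case inBlock:
			if t == ")" {
				inBlock = false
			} else if p := quotedPath(t); p != "" {
				imports[p] = true
			}
			continue
		case t == "import (":
			inBlock = true
			continue
		case strings.HasPrefix(t, "import "):
			if p := quotedPath(t); p != "" {
				imports[p] = true
				continue
			}
		}
		kept = append(kept, ln)
	}
	return strings.Join(kept, "\n")
}

func main() {
	imports := map[string]bool{}
	body := "import \"strings\"\nimport (\n\t\"fmt\"\n\t\"os\"\n)\nfunc F() {}"
	fmt.Println(hoistGoImports(body, imports))
	fmt.Println(len(imports))
}
```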
Verified: examples/godump_demo.prg builds and runs; the inline
GoUpper / GoFib / GoGCD / GoSplit / GoSquare / GoTypeOf functions
all dispatch. Matches the feature's original design intent.
FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Harbour's #pragma BEGINDUMP ... #pragma ENDDUMP blocks carry C source
that the Harbour toolchain embeds verbatim. Five takes the same
directive but targets Go — any `.prg` ported from Harbour that ships
inline C gets its C shoveled into the Go codegen pipeline and fails
with opaque errors like "invalid character U+0023 '#'" from the Go
compiler, dozens of lines downstream of the actual cause.
Detect the C shape at PP time and report a clear, actionable error:
pp: file.prg:N: #pragma BEGINDUMP contains C code — Five accepts
inline Go only. Port the block to Go (or use an RTL function),
then wrap in #pragma BEGINDUMP ... #pragma ENDDUMP.
looksLikeInlineC uses conservative signals that don't false-positive
on legitimate inline Go (which calls `hbrt.HB_FUNC("NAME", fn)` with
a package prefix and a quoted string, distinct from C's bare
`HB_FUNC(NAME)` macro). Signals:
- `#include <...>` / `#include "..."` — unambiguous C preprocessor
- line-starting `HB_FUNC(` / `HB_FUNC_STATIC(` — C FFI macro
- `typedef ` / `struct ` / `int main(` / `void main(` at line start
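A sketch of how these signals could be checked, assuming "line start" means after leading whitespace is trimmed. The function body is illustrative; the real looksLikeInlineC may weigh the signals differently.

```go
package main

import (
	"fmt"
	"strings"
)

// cLinePrefixes lists line-start shapes that only appear in C dumps.
var cLinePrefixes = []string{
	"#include <", "#include \"",
	"HB_FUNC(", "HB_FUNC_STATIC(",
	"typedef ", "struct ", "int main(", "void main(",
}

// looksLikeInlineC reports whether any line of the dump body starts with
// one of the C-only shapes. Inline Go calls hbrt.HB_FUNC("NAME", fn), which
// starts with the package prefix and therefore does not match.
func looksLikeInlineC(body string) bool {
	for _, ln := range strings.Split(body, "\n") {
		t := strings.TrimSpace(ln)
		for _, p := range cLinePrefixes {
			if strings.HasPrefix(t, p) {
				return true
			}
		}
	}
	return false
}

func main() {
	cDump := "#include <stdio.h>\nHB_FUNC( HELLO )\n{\n}"
	goDump := "func init() {\n\thbrt.HB_FUNC(\"HELLO\", hello)\n}\ntype P struct {\n}"
	fmt.Println(looksLikeInlineC(cDump), looksLikeInlineC(goDump))
}
```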
main.go now aborts the build when PP returns errors (previously
printed but continued — same behavior the parser already had for
its own errors). Keeps build output short: one pp line + one
summary line, no gengo noise.
Verified:
- harbour-core/tests/inline_c.prg → clean PP error, exit 1
- examples/godump_demo.prg (legitimate inline Go) → passes PP
(hits a separate pre-existing gengo import-ordering bug, not
related to this change)
FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two related fixes for Harbour's data-driven `USE &cFile ALIAS &cAlias
INDEX &cNdx` idiom — common in any app that dispatches table names
at runtime.
Parser (compiler/parser/parser.go parseUse):
- `USE &cFile` / `USE &(expr)` previously triggered a
skipToEndOfLine short-circuit, emitting an empty UseCmd (equivalent
to bare USE = close current area). Now parseMacro runs and the
MacroExpr becomes the File node, so codegen emits MacroPush +
dbUseArea.
- `ALIAS &cAlias` / `ALIAS &a.1` similarly dropped the macro result;
now captures it into UseCmd.AliasExpr so codegen evaluates the
alias at runtime. Both the IDENT-path ("ALIAS") and keyword-path
(token.ALIAS) handlers fixed.
PP (compiler/pp/command.go):
- captureExpression and the MarkerList branch now paren-balance
`(`/`[`/`{` so nested grouping inside a macro argument doesn't let
an inner `)` terminate the capture. Example:
_REGULAR_(&(a))
previously captured `&(a` (missing inner `)`) and left the outer
`)` dangling, producing parse errors in the expanded output.
- MarkerList capture still joins tokens with " " for raw `<z>`
substitution — comma tokens stay in the stream, so `s(<z>)`
re-emits them as argument separators and the list expands cleanly.
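The paren-balancing rule in the PP change can be illustrated with a small depth counter. The function name and the token-slice interface below are assumptions for the sketch; the real captureExpression works on the preprocessor's own token stream.

```go
package main

import "fmt"

// captureBalanced consumes tokens until it meets stop at nesting depth 0.
// Closers seen while depth > 0 belong to the capture, so an inner ')'
// can no longer end it early.
func captureBalanced(tokens []string, stop string) (captured, rest []string) {
	depth := 0
	for i, tok := range tokens {
		if depth == 0 && tok == stop {
			return tokens[:i], tokens[i:]
		}
		switch tok {
		case "(", "[", "{":
			depth++
		case ")", "]", "}":
			if depth > 0 {
				depth--
			}
		}
	}
	return tokens, nil // stop never found at depth 0
}

func main() {
	// Tokens after the opening paren of _REGULAR_(&(a))
	captured, rest := captureBalanced([]string{"&", "(", "a", ")", ")"}, ")")
	fmt.Println(captured, rest)
}
```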
Bench: harbour-core/tests/pp.prg 2 errors → 0 for the realistic
`USE &macro` / `&(expr)` patterns. Remaining parse errors on line 70
are a pathological `_REGULAR_L` list that includes `&a. [2]`
(space between macro's terminating dot and an array index) — the
PP expands it correctly but Five's lexer refuses the expanded
result. That form doesn't occur in real code.
/tmp/test_use_macro.prg — all four patterns (`USE &f`, `USE &f ALIAS
&f`, `USE &f ALIAS &f INDEX &i`, dot-terminated) now compile. FiveSql2
43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three cumulative fixes for Harbour's preprocessor stringify forms
surfaced by harbour-core/tests/pp.prg:
1. Token alignment — tokenizePattern and tokenizeLine now both
split on parens and brackets, so `DUMB(a)` (no space) tokenises
as `DUMB`, `(`, `a`, `)` on both sides. Previously the line
tokenizer kept `DUMB(a)` as one token while the pattern tokenizer
split it apart, so the match never engaged. Fixes `_DUMB_(a)`-
style calls in pp.prg line 57+.
2. Substitution order — applyResult was replacing the bare `<z>`
marker first, eating the inner `<z>` of `#<z>`, `<"z">`, `<(z)>`
and `<.z.>` and leaving stray `#` / `<` / `.` characters that
the lexer reported as ILLEGAL tokens. Run all compound forms
first, bare `<z>` last.
3. Quote delimiter picker — ppQuote wraps a captured value in a
legal PRG string literal by trying `"..."` first, then `'...'`,
then `[...]`. Harbour's #<z> dumb-stringify needs this because
the capture may already contain `"`, and Five was producing
malformed `""world""` literals.
Bonus: smart-stringify `<(z)>` now recognises input that's already
a string literal (`"x"` / `'x'` / `[x]`) and keeps it verbatim
instead of double-quoting.
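The substitution order and the delimiter picker can be sketched together. The signatures below are assumptions (the real ppQuote and applyResult are not shown in this log), and only two compound forms are handled, to keep the example short.

```go
package main

import (
	"fmt"
	"strings"
)

// ppQuote wraps s in the first delimiter pair that s does not itself
// contain: "..." then '...' then [...]. ok is false if none is usable.
func ppQuote(s string) (quoted string, ok bool) {
	switch {
	case !strings.Contains(s, `"`):
		return `"` + s + `"`, true
	case !strings.Contains(s, "'"):
		return "'" + s + "'", true
	case !strings.Contains(s, "]"):
		return "[" + s + "]", true
	}
	return "", false
}

// substitute replaces the compound stringify forms before the bare marker,
// so the bare pass cannot eat the <z> inside #<z> or <"z">.
func substitute(tmpl, name, val string) string {
	q, ok := ppQuote(val)
	if !ok {
		q = val // assumption for this sketch: emit raw text if unquotable
	}
	tmpl = strings.ReplaceAll(tmpl, "#<"+name+">", q)
	tmpl = strings.ReplaceAll(tmpl, `<"`+name+`">`, q)
	return strings.ReplaceAll(tmpl, "<"+name+">", val) // bare marker last
}

func main() {
	fmt.Println(substitute("? #<z>, <z>", "z", "x"))
	q, _ := ppQuote(`say "hi"`)
	fmt.Println(q)
}
```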
pp.prg 26 parse errors → 2 (remaining: `USE &b ALIAS &a.1` macro-
inside-command at line 21 and one related line, unrelated to this
fix). FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Harbour reserves the aliases `M` and `MEMVAR` for the memvar
namespace — `M->cVar` reads a PUBLIC/PRIVATE memvar, not a DBF
field in a workarea named M. Five's emitAliasExpr and emitAssign
treated all aliases identically, emitting:
t.PushAliasField("M", "cVar") // read
_wa := t.WA.(*hbrdd.WorkAreaManager); _wa.SetAliasField("M", ...) // write
which triggered a spurious hbrdd import on programs using memvars
and attempted a workarea lookup that couldn't find an "M" area at
runtime.
Detect the reserved aliases (case-insensitive) at the three
AliasExpr call sites — the read path (emitAliasExpr) and both
assign paths (emitAssign for statements, emitAssignExpr for
expression context) — and route to t.PushMemvar / t.PopMemvar
instead. The existing Thread helpers hash into the MemvarTable
populated by PUBLIC/PRIVATE declarations.
Unblocks harbour-core/tests/macro.prg build (runtime still needs
the TVALUE test helper, unrelated). FiveSql2 43/43, Harbour compat
56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three SWITCH codegen bugs surfaced by harbour-core/tests/switch.prg:
1. Empty SWITCH (`SWITCH x ENDSWITCH`) — legal Harbour, produced by
conditional-compile files like switch.prg:13. Previous code
emitted `_sw := t.Pop2()` followed by `}` with no matching `{`,
closing the enclosing procedure body and producing "syntax error:
non-declaration statement outside function body".
2. OTHERWISE-only (no CASE arms) — emitted `} else {` with no opening
if, same "unexpected keyword else" category.
3. `EXIT` inside a CASE should break out of the SWITCH — but Five
lowers SWITCH to an if/else-if chain, so the generated `break`
had nowhere to land ("break is not in a loop, switch, or select").
Fix all three by wrapping every SWITCH in a one-iteration `for`
loop. `break` inside a case targets the wrapper, matching Harbour
semantics. Empty / OTHERWISE-only bodies still emit valid Go
because the for-loop provides the scope boundary regardless of
whether any if-chain opened. A trailing `break` keeps the loop
one-shot.
Also:
- `_ = _sw` silences unused-var for empty SWITCH.
- Emit the if-chain's closing `}` only when at least one CASE arm
was emitted.
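For reference, the lowered shape looks roughly like the hand-written Go below. Plain ints stand in for the VM stack values, and the function names are invented for the example; it only shows why `break` now has a target and why an empty SWITCH still compiles.

```go
package main

import "fmt"

// lowered mimics the Go shape for a SWITCH with two CASE arms, an
// OTHERWISE arm, and a conditional EXIT inside the first arm.
func lowered(x int, early bool) string {
	out := ""
	for { // one-iteration wrapper around the whole SWITCH
		sw := x
		_ = sw // silences unused-variable when no arm reads it
		if sw == 1 {
			out = "one"
			if early {
				break // EXIT lands on the wrapper loop
			}
			out = "one-late"
		} else if sw == 2 {
			out = "two"
		} else { // OTHERWISE
			out = "other"
		}
		break // trailing break keeps the loop one-shot
	}
	return out
}

// emptySwitch mimics `SWITCH x ENDSWITCH`: no if-chain is opened at all.
func emptySwitch(x int) bool {
	for {
		sw := x
		_ = sw
		break
	}
	return true
}

func main() {
	fmt.Println(lowered(1, true), lowered(1, false), lowered(2, false), lowered(9, false))
	fmt.Println(emptySwitch(0))
}
```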
All 15 SWITCH blocks in harbour-core/tests/switch.prg now build
and run to completion. FiveSql2 43/43, Harbour compat 56/56,
Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Real Harbour headers write parameterised commands with no space
between the keyword and its opening paren:
#xcommand MAKE_TEST( <obj>, <v> ) => ...
ParseRule stored the rule keyword as `MAKE_TEST(` (stripping only
<>, [] marker wrappers), but firstToken normalised source lines by
stopping the first-word scan at `(` — so `MAKE_TEST( o, 42 )`
produced `MAKE_TEST` for the lookup. The two strings didn't match
and the fast-path keyword check rejected every invocation, leaving
the macro unexpanded and the call site as a bare undeclared
identifier.
Trim everything from the first `(` onward during keyword
extraction so both halves agree on the dispatch key. The marker
tokens inside the parens are still parsed normally by
parseMarkers / matchPattern.
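A sketch of the shared keyword extraction, assuming both the rule side and the source-line side can call one helper. The name `dispatchKey` is hypothetical, and upper-casing the key is an assumption based on Harbour commands being case-insensitive.

```go
package main

import (
	"fmt"
	"strings"
)

// dispatchKey returns the first word of s, cut at the first '(' so that
// "MAKE_TEST( <obj>, <v> )" and "MAKE_TEST( o, 42 )" share one key.
func dispatchKey(s string) string {
	fields := strings.Fields(s)
	if len(fields) == 0 {
		return ""
	}
	kw := fields[0]
	if i := strings.IndexByte(kw, '('); i >= 0 {
		kw = kw[:i]
	}
	return strings.ToUpper(kw)
}

func main() {
	rule := dispatchKey("MAKE_TEST( <obj>, <v> )")
	line := dispatchKey("make_test( o, 42 )")
	fmt.Println(rule, line, rule == line)
}
```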
Verified with /tmp/test_xcmd2.prg (`MAKE_TEST( o, 99 )` expands
and dispatches to the object's :hVar access). FiveSql2 43/43,
Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Harbour's common.ch exposes classic Clipper type-check shorthands
via #translate rules that map to HB_IS* RTL functions:
#translate ISNIL(<x>) => ((<x>) == NIL)
#translate ISARRAY(<x>) => HB_ISARRAY(<x>)
#translate ISCHARACTER(<x>) => HB_ISSTRING(<x>)
... etc.
Five's preprocessor currently supports #translate only for lines
whose FIRST word is the rule keyword, not for substring matches
inside expressions. Real usage like `IF ISNIL(x)` fails the keyword
check (first word is IF, not ISNIL) and the rule never fires.
Rather than rewrite the PP substring engine (A2 scope), register
the nine short names as direct RTL symbols in register.go, each
pointing at the same Go function as its HB_IS* twin. ISMEMO maps
to HB_ISSTRING as a reasonable approximation for Five (no distinct
memo type at the VM level).
common.ch becomes a short stub that just #defines TRUE/FALSE/YES/NO
and documents where the ISxxx aliases live. DEFAULT / UPDATE
#xcommand forms remain unsupported pending A2.
Verified with /tmp/test_common.prg — ISNUMBER(42), ISCHARACTER("x"),
ISNIL(nilVar) all dispatch correctly. Analyzer still emits
"undeclared variable" warnings for the short names (the static
checker doesn't see runtime-registered RTL symbols) but the
generated code links and runs.
FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Harbour permits keywords (CASE, DO, WHILE, etc.) to be used as
variable/array names. In most expression contexts Five already
handles this via expr.go:362 which whitelists keywords when used
as bare identifiers. But parseStmtBlock was stopping on any stop
token unconditionally, so a line like
case[ n ] := x -- 'case' is a LOCAL array
terminated the enclosing stmt block at `case` and left `[ n ] := x`
unparsable.
Add isIdentSuffix(): peeks one ahead and reports whether the next
token is something that can only follow an identifier ([, :=, +=,
-=, *=, /=, %=, ^=, ++, --, :, .). parseStmtBlock now treats the
stop token as a statement-start when its suffix matches, so the
block keeps going.
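The lookahead can be sketched over plain token strings. The stop-token set below is an assumed subset, and `endsBlock` is a hypothetical wrapper showing how parseStmtBlock might consult isIdentSuffix.

```go
package main

import (
	"fmt"
	"strings"
)

// identSuffix holds tokens that can only follow an identifier.
var identSuffix = map[string]bool{
	"[": true, ":=": true, "+=": true, "-=": true, "*=": true, "/=": true,
	"%=": true, "^=": true, "++": true, "--": true, ":": true, ".": true,
}

// stopTokens is an assumed subset of the keywords that end a block.
var stopTokens = map[string]bool{
	"CASE": true, "OTHERWISE": true, "ENDCASE": true, "ENDIF": true, "ENDDO": true,
}

func isIdentSuffix(next string) bool { return identSuffix[next] }

// endsBlock reports whether tok should terminate the statement block,
// given the token that follows it.
func endsBlock(tok, next string) bool {
	if !stopTokens[strings.ToUpper(tok)] {
		return false
	}
	return !isIdentSuffix(next) // keyword used as a variable keeps the block open
}

func main() {
	fmt.Println(endsBlock("case", "["))  // case[ n ] := x
	fmt.Println(endsBlock("case", "x"))  // CASE x == 1
	fmt.Println(endsBlock("ENDIF", "")) // plain terminator
}
```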
Verified with /tmp/test_kwident.prg (`case[...]` outside DO CASE,
`arr[...]` inside DO CASE body), /tmp/test_kwident2.prg (both the
`case case[n] == "two"` arm and `case[1] := "updated"` assignment
after ENDCASE). Pathological harbour-core/tests/keywords.prg still
fails — it places `case[...]` in the arm-expected position of a
DO CASE block with no leading arm, which no sane parser can
disambiguate.
FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Classic Clipper/Harbour form writes method implementations as bare
`METHOD Name(params)` statements following a `CLASS X ... ENDCLASS`
declaration, with the binding inferred from the most recent class:
CREATE CLASS Shape
METHOD Area
ENDCLASS
METHOD Area -- binds to Shape
RETURN 0
Five was requiring `METHOD Area CLASS Shape` explicitly. Without it,
parseMethodDecl left MethodDecl.ClassName empty, gengo skipped the
body emission, and the link step failed with `undefined: HB_SHAPE_AREA`.
The class registration had AddMethod("AREA", HB_SHAPE_AREA) pointing
at the missing symbol.
Parser tracks p.lastClassName at parseClassDecl, and parseMethodDecl
falls back to that value when no CLASS clause is supplied. Each new
CLASS declaration updates the tracker, so multi-class files still
dispatch correctly — verified with /tmp/test_implicit_class.prg
(Shape + Box both resolve their own Name/Area methods).
Unblocks harbour-core/tests/clsscope.prg and other OOP compat
tests that use this form. FiveSql2 43/43, Harbour compat 56/56,
Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Create Five-1.0-Phase-C-TODO.md capturing the remaining 1.0 work:
three Harbour contrib libraries (hbct Clipper Tools, hbnf Numeric
Functions, hbtip TCP/IP/SMTP/POP3/HTTP). Each entry lists the
Harbour source path, a minimum first-pass scope, and an effort
estimate. Suggested order: hbct → hbtip → hbnf. Total ~6-10 days.
Update RTL-Go-Native-Migration.md "남은 병목" (remaining bottlenecks) with the Phase A/B
completion list — six features shipped this session — plus a note
that the 11 HBTYPE functions the initial analysis flagged are
actually Harbour's internal scalar class factories, not user-facing
blockers (Five's SendBuiltin covers the same surface).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Harbour's DO() accepts a string (looked up as a function name), a
code block (evaluated with args), or a symbol, and invokes it. Used
for plugin systems and dynamic dispatch idioms like
`DO(cHandler, oRequest)`.
Five already had stmtDo rewrite `DO(...)` at statement-level to a
function-call expression, so callers in expression position just
work — but gengo refused to emit DO as a function call because it
was on the reserved-word guard list (which existed to catch stray
ENDIF/ENDDO from bad IF nesting). Remove DO from that list; the
statement form is still handled upstream by parseDoProc, so the
guard loses nothing.
rtlDo implements the dispatch:
- String target → VM.FindSymbol + t.Function
- Block target → EvalBlock path (same as Eval)
- Anything else → NIL
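The three-way dispatch can be sketched with a type switch. Everything here is a simplified stand-in: `any` replaces the VM's value type, a map replaces VM.FindSymbol, and `Block` replaces the code-block representation.

```go
package main

import (
	"fmt"
	"strings"
)

// Block stands in for a compiled code block.
type Block func(args ...any) any

// funcs stands in for the VM symbol table (names stored upper-cased).
var funcs = map[string]func(args ...any) any{
	"GREET": func(args ...any) any { return "hello, " + args[0].(string) },
}

// Do dispatches on the target's type: string targets are looked up by
// name, blocks are evaluated, and anything else yields nil (NIL).
func Do(target any, args ...any) any {
	switch t := target.(type) {
	case string:
		if fn, ok := funcs[strings.ToUpper(t)]; ok {
			return fn(args...)
		}
		return nil
	case Block:
		return t(args...)
	}
	return nil
}

func main() {
	mul := Block(func(args ...any) any { return args[0].(int)*args[1].(int) + 1 })
	fmt.Println(Do("Greet", "World"))
	fmt.Println(Do(mul, 5, 6))
	fmt.Println(Do(nil))
}
```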
Tested (/tmp/test_do.prg):
DO("Greet", "World") → "hello, World"
DO({|x,y| x*y+1}, 5, 6) → 31
DO(NIL) → NIL (ValType "U")
FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Harbour's `DATA name1, name2, name3` (and `VAR`, `CLASSDATA`)
should declare every listed field. Five's parseDataDecl instead
returned a single DataDecl for the first name and silently dropped
the rest — the comma branch just consumed the identifier without
producing a new decl. Surfaced by the OPERATOR overloading test
(/tmp/test_operator.prg originally had `DATA x, y` for a Vec2
class) where later `::y` access panicked with "unknown method y".
Change the signature to `[]*ast.DataDecl` and rewrite the loop so
each comma closes the current decl and starts a fresh one. AS /
INIT / qualifier runs still attach to the most recent name, so:
DATA x, y, z → three decls, no init
DATA x INIT 10, y, z INIT 0 → init attaches to preceding name
DATA cName AS CHARACTER → typed single decl
All seven class-body call sites flatten the slice into `members`.
Verified with /tmp/test_multidata.prg (`DATA x, y, z` + mixed
`DATA label INIT "origin", count INIT 0`) and the OPERATOR test
which now passes with the original `DATA x, y` form restored.
FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Harbour's macro operator was a stub: hbrt.MacroCompile only resolved
bare identifier names to memvars/functions and returned the source
string unchanged for any non-trivial expression. The gengo emit was
also broken — `t.MacroPush() + t.PushNil()` never pushed the inner
expression's value, so MacroPush popped whatever happened to be on
the stack.
Wire it up properly:
1. Gengo fix: `case *ast.MacroExpr` now emits `emitExpr(e.Expr);
t.MacroPush()`. The inner expression produces the source string;
MacroPush consumes it and pushes the evaluated result.
2. Hook pattern in hbrt: `SetMacroEvalHook(fn)` lets hbrtl install
the real evaluator without creating an import cycle (genpc
already imports hbrt). MacroPush delegates to the hook when
installed; otherwise falls back to the legacy stub for hbrt
unit tests.
3. hbrtl.init registers macroEval, which reuses compileExprSource
(factored out of PcCompile) so macro lookups share the same
sync.Map-backed pcode cache — repeat evaluations of the same
macro source are free after the first hit.
4. ExecPcode leaves the result in retVal; macroEval copies it to
the operand stack via PushRetValue.
Tested (/tmp/test_macro.prg):
&"10 + 20" → 30
&"Sqrt(16)" → 4
&"Upper('hello')" → HELLO
&("30 * " + Str(nX, 1)) → 210 (runtime-built source)
&"5 > 3 .AND. .T." → .T.
&("Str(" + Str(nX*10,2) + ",2)") → 70
FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Harbour lets a class define custom behaviour for arithmetic and
comparison operators via `OPERATOR "<sym>" ARG <name> INLINE <expr>`.
Five already had the runtime slot infrastructure (ClassDef.Operators
+ AddOperator + parent-chain copy) but parser skipped the form and
the VM ops never consulted the slots.
Parser: parseOperatorDecl captures the symbol, ARG binding, and
INLINE body into a MethodDecl with IsOperator=true and OperatorOp
set to the hbrt.Op* slot. Synthesised method name is __OP_<idx>
to keep the regular method namespace clean.
Codegen: emitClassDecl routes IsOperator members through
_def.AddOperator instead of AddMethod. Inline body generation is
shared with the MESSAGE/INLINE path (34485cd).
VM: Thread.tryBinaryOp walks the LHS object's class operator slot,
pushes args with Self bound to LHS, and returns true if the slot
is populated. Wired into Plus/Minus/Mult/Divide and Equal/NotEqual/
Less/Greater/LessEqual/GreaterEqual. Falls through to built-in
behaviour when no overload exists — non-object LHS costs one tag
check per op.
Operator symbol→slot mapping keeps `=` and `==` on the same slot
(OpEqual=8) because Five's gengo routes both to t.Equal() and the
VM doesn't distinguish strict vs non-strict equality today.
Tested (/tmp/test_operator.prg): Vec2 + - == < with per-field
results all correct.
FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Harbour's inline-method sugar was parsed but the body was skipped,
leaving any `METHOD X() INLINE expr` declaration registered in the
class vtable with no matching HB_<CLASS>_X function — link error
at build time.
Parser: MethodDecl gains an InlineBody Expr field. parseClassMethodDecl
captures the expression after INLINE instead of skipping to EOL.
New parseMessageDecl handles `MESSAGE <name> [(params)] INLINE expr`
and returns the same MethodDecl shape.
Codegen: emitClassDecl walks members a second time after the class
registration init block and emits emitInlineMethodBody for each
IsInline method — a Frame(nParams, 0) + emitExpr(InlineBody) +
RetValue function. curMethodClass is bound so ::super: inside an
inline body still resolves.
Tested (/tmp/test_inline.prg): all four patterns — bare INLINE,
MESSAGE INLINE, INLINE with params, INLINE reading ::field —
produce expected values.
FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Harbour's ::super: idiom routes a method call through the parent of
the class that defines the currently-executing method — Self stays
the child instance, only the vtable entry point shifts. Five
previously parsed ::super as a data-field access (PushSelfField("SUPER"))
which returned nil and panicked on the subsequent Send.
Runtime: Thread.SendSuper(fromClassName, methodName, nArgs).
Binding to the *defining* class (not Self's runtime class) is
load-bearing for 3+ level hierarchies: without it,
Grand:New → ::super:New → Child:New → ::super:New
would resolve to Grand.Parent=Child again and infinite-loop.
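A toy model of the defining-class binding, assuming a simple name-to-class registry. The types and the `define` helper are invented for the example; only the idea that SendSuper starts its lookup at the parent of the named class, not at Self's runtime class, comes from the text.

```go
package main

import (
	"fmt"
	"strings"
)

type Object struct {
	Class *Class
	Trace []string // records which class's New ran, in order
}

type Method func(self *Object)

type Class struct {
	Name    string
	Parent  *Class
	Methods map[string]Method
}

var classes = map[string]*Class{}

func lookup(c *Class, name string) Method {
	for ; c != nil; c = c.Parent {
		if m, ok := c.Methods[name]; ok {
			return m
		}
	}
	return nil
}

// Send starts at the object's runtime class.
func Send(self *Object, name string) {
	if m := lookup(self.Class, name); m != nil {
		m(self)
	}
}

// SendSuper starts at the parent of the class that DEFINES the running
// method. Self is unchanged; only the lookup start point moves up.
func SendSuper(self *Object, fromClass, name string) {
	c, ok := classes[fromClass]
	if !ok {
		return
	}
	if m := lookup(c.Parent, name); m != nil {
		m(self)
	}
}

// define registers a class whose NEW logs its own name, then calls super.
func define(name string, parent *Class) *Class {
	c := &Class{Name: name, Parent: parent, Methods: map[string]Method{}}
	classes[name] = c
	c.Methods["NEW"] = func(self *Object) {
		self.Trace = append(self.Trace, name)
		SendSuper(self, name, "NEW")
	}
	return c
}

func main() {
	base := define("Base", nil)
	child := define("Child", base)
	grand := define("Grand", child)
	o := &Object{Class: grand}
	Send(o, "NEW")
	fmt.Println(strings.Join(o.Trace, " > "))
}
```

If SendSuper used `self.Class.Parent` instead, Child's NEW would resolve to Child again on a Grand instance and never terminate.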
Gengo: Generator.curMethodClass tracks the class name across each
method body emission. emitSendExpr detects the nested SendExpr
shape `::super:X(...)` and emits SendSuper with curMethodClass as
the first argument.
Tested (/tmp/test_super, /tmp/test_super2):
Parent → Child: ::super:Greet() returns composed result
Base → Child → Grand: ::super:New chain passes args correctly
Also fixes three gengo unit tests whose expected output was stale
from prior perf commits (b829ed4 const prop, 1f63c7f symbol hoist,
7e4079f string-concat reassoc) — assertions now match the current
optimized codegen.
FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Six new migration-log entries covering this session's 21 commits:
#27 VM in-place stack ops + symbol hoist (global 3-15%)
#28 gengo compile-time peepholes (9 commits, 1-7% bench)
#29 SELECT WA cache extension (single-table 2x+)
#30 JOIN temp-alias stabilisation (B6 1.67x)
#31 Stat-loop gates — view + CTE (CPU -40pp in rawsyscalln)
#32 Go-native SqlIsAggName + FetchRow (agg/window 1.3-1.7x)
Plus a cumulative bench table vs the 3caadb2 baseline and an
updated "남은 병목" (remaining bottlenecks) section pointing at EvalExpr / JOINRECURSE /
HASHJOIN / Go runtime primitives as the remaining levers.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CTE tables now materialise via MEMRDD (no file on disk), yet the
RunSelect cleanup loop was still stat-ing __cte_<name>.dbf for every
CTE in every CTE query. Profile after the FetchRow rewrite pinned
HbFileExists at 20.28% of total CPU — pure waste when MEMRDD is the
common path.
Add s_lCteDiskSeen flag, set only when the legacy DBFNTX fallback in
RunSelect actually opens a pre-existing __cte_<name>.dbf (line 1247
path — rare, only for sub-executors referencing a CTE by name on a
crashed-prior-run .dbf). Cleanup runs only when the flag is set.
pprof delta (full bench with cache enabled):
rawsyscalln: 25.56% → 8.50% (~17 points removed)
HbFileExists: 20.28% → 0% (dropped out of top)
Wall-clock unchanged (ENOENT stats are kernel-cached on Darwin), but
this removes the last visible avoidable syscall. What's left in the
profile (kevent, madvise, pthread_cond_*) is Go runtime + scheduler
overhead that application code can't touch.
FiveSql2 43/43, Harbour compat 56/56.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TSqlExecutor:FetchRow was the per-row workhorse for aggregation,
HAVING, and window queries. Even with the pre-built aFetchCache
binding columns to (nWA, nFPos), the PRG FOR loop paid one method
dispatch per column per row (dbSelectArea, FieldGet, AllTrim,
AAdd) — profile pinned it at ~30% of B4 CPU.
SqlFetchRowFast collapses the cache-path loop into a single Go
call:
- bound entry: SelectByNum + area.GetValue directly
- unbound (aggregate/expression): self:EvalExpr via Send
- character values: TrimSpace inline
The PRG FetchRow keeps its original cache-miss fallback path
unchanged for rare queries where aFetchCache isn't built.
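A minimal Go sketch of the cache-path loop described above, for illustration only. The names fetchEntry, workAreas and fetchRowFast are hypothetical, the work areas are modelled as plain slices, and the eval callback stands in for the self:EvalExpr call; none of this is the actual SqlFetchRowFast code.

```go
package main

import (
	"fmt"
	"strings"
)

// fetchEntry is a hypothetical cache entry. A bound entry carries a
// work-area number and field position; an unbound one carries an
// expression that must be evaluated per row.
type fetchEntry struct {
	bound bool
	wa    int    // work-area number (nWA in the commit text)
	fpos  int    // field position (nFPos), 0-based in this sketch
	expr  string // used only when bound is false
}

// workAreas models the open work areas: number -> current row values.
type workAreas map[int][]any

// fetchRowFast builds one result row in a single pass over the cache.
func fetchRowFast(cache []fetchEntry, areas workAreas, eval func(string) any) []any {
	row := make([]any, 0, len(cache))
	for _, e := range cache {
		var v any
		if e.bound {
			v = areas[e.wa][e.fpos] // direct read, no per-column dispatch
		} else {
			v = eval(e.expr) // aggregate/expression path
		}
		if s, ok := v.(string); ok {
			v = strings.TrimSpace(s) // character values are trimmed inline
		}
		row = append(row, v)
	}
	return row
}

func main() {
	areas := workAreas{1: {"ALICE   ", 30}}
	cache := []fetchEntry{
		{bound: true, wa: 1, fpos: 0},
		{bound: true, wa: 1, fpos: 1},
		{expr: "COUNT(*)"},
	}
	row := fetchRowFast(cache, areas, func(string) any { return 3 })
	fmt.Println(row) // [ALICE 30 3]
}
```

The point of the shape is that the branch on bound/unbound and the trim happen inside one native loop, so the per-column cost is a slice index rather than a chain of dispatched calls.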
Bench deltas (median of 3 steady runs, 1000 iters):
  B4_GROUP_HAVING    418 →  327 us   -22%  (1.28x)
  B9_ROW_NUMBER      191 →  120 us   -37%  (1.59x)
  B10_RANK_PART      228 →  135 us   -41%  (1.69x)
  B11_SUM_OVER       249 →  156 us   -37%  (1.60x)
  B14_COUNT          235 →  219 us    -7%
  B15_CTE_WIN_JOIN  1577 → 1452 us    -8%
The non-aggregate benches (B1-B3, B5-B8) stay flat: they already
hit the column-binding fast path and don't need aggregate dispatch.
FiveSql2 43/43, Harbour compat 56/56.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
B4 GROUP+HAVING profile showed SqlIsAggName at ~9% of CPU —
SqlEvalFunc checks it for every function in every row, and the
PRG body was two string allocations + a substring scan:
RETURN ("," + c + ",") $ ("," + AGG_FUNCTIONS + ",")
Replace with a hash lookup against the existing aggFuncSet map
in hbrtl/sqlexpr.go (already populated for SqlExprHasAgg, same
AGG_FUNCTIONS list). Upper-casing skips the allocation when the
input is already upper-case, which it almost always is in practice.
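A rough Go sketch of the set lookup, assuming a map-backed set. The entries below are an illustrative subset, not the real AGG_FUNCTIONS list, and isAggName / hasLowerASCII are hypothetical names rather than the functions in hbrtl/sqlexpr.go.

```go
package main

import (
	"fmt"
	"strings"
)

// aggFuncSet borrows the name used in the commit text; the contents
// here are an assumed subset chosen for the example.
var aggFuncSet = map[string]struct{}{
	"COUNT": {}, "SUM": {}, "AVG": {}, "MIN": {}, "MAX": {},
}

// hasLowerASCII reports whether s contains an ASCII lower-case letter.
func hasLowerASCII(s string) bool {
	for i := 0; i < len(s); i++ {
		if c := s[i]; c >= 'a' && c <= 'z' {
			return true
		}
	}
	return false
}

// isAggName is a hypothetical stand-in for the Go-native SqlIsAggName.
func isAggName(name string) bool {
	if hasLowerASCII(name) {
		name = strings.ToUpper(name) // only the rare mixed-case input pays for a new string
	}
	_, ok := aggFuncSet[name]
	return ok
}

func main() {
	fmt.Println(isAggName("SUM"), isAggName("sum"), isAggName("UPPER")) // true true false
}
```

Compared with the comma-delimited substring test, the common already-upper-case call does one scan of the name and one map probe, with no string concatenation.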
Bench deltas (median of 3 steady runs, 1000 iters):
  B4_GROUP_HAVING    447 →  418 us   -6.5%
  B14_COUNT          252 →  235 us   -7%
  B15_CTE_WIN_JOIN  1595 → 1577 us   -1%
Other benches unchanged (no aggregate calls per row).
FiveSql2 43/43, Harbour compat 56/56.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
AcquireTemp now returns the purpose string (upper-cased table name)
as the alias when available, and falls back to FA_#### only when the
same purpose is already in flight within the current query, i.e. for
self-joins.
Previously every call returned a fresh FA_####, so the WA cache
(keyed by alias) could never hit on JOIN queries and the file got
reopened every iteration.
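The allocation rule can be sketched in Go roughly as follows. aliasPool, acquire and endQuery are hypothetical names, and the four-digit counter format is an assumption based on the FA_#### pattern; the real AcquireTemp lives in PRG code and may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// aliasPool is a hypothetical model of per-query temp-alias allocation.
type aliasPool struct {
	inFlight map[string]bool // aliases handed out during the current query
	seq      int             // counter behind the FA_#### fallback
}

func newAliasPool() *aliasPool {
	return &aliasPool{inFlight: make(map[string]bool)}
}

// acquire returns the upper-cased purpose when it is free, and a fresh
// FA_#### alias when the purpose is empty or already in flight.
func (p *aliasPool) acquire(purpose string) string {
	alias := strings.ToUpper(purpose)
	if alias == "" || p.inFlight[alias] {
		p.seq++
		alias = fmt.Sprintf("FA_%04d", p.seq)
	}
	p.inFlight[alias] = true
	return alias
}

// endQuery forgets the in-flight set so the next query gets the same
// stable aliases again.
func (p *aliasPool) endQuery() {
	p.inFlight = make(map[string]bool)
}

func main() {
	p := newAliasPool()
	fmt.Println(p.acquire("orders")) // ORDERS
	fmt.Println(p.acquire("orders")) // FA_0001 (self-join)
	p.endQuery()
	fmt.Println(p.acquire("orders")) // ORDERS again, so an alias-keyed cache can hit
}
```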
Bench deltas vs prior HEAD:
  B6_INNER_JOIN      217 →  130 us   -40%  (1.67x)
  B15_CTE_WIN_JOIN  1678 → 1595 us    -5%
Single-table benches unchanged — they were already hitting the
cache via the table-name alias path.
B8 recursive CTE stays flat: its sub-executors at nDepth>1 still
cycle through fresh purposes that don't stabilise across queries.
FiveSql2 43/43, Harbour compat 56/56.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After the SELECT WA cache landed, pprof showed HbFileExists → os.Stat
at 28% of remaining CPU — the RunSelect cleanup loop was stat-ing
__view_<table>.dbf for every table in every query, even on the
common view-free path.
Track view materialisation with a TSqlIndex.lViewUsed flag set in
OpenTable when CheckView produces a temp. The cleanup loop now
runs only when the flag is set, then resets it. View-using queries
are unaffected.
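As a sketch of the gating idea (not the actual TSqlIndex code), the flag turns the unconditional per-table probe into a no-op on the common path. viewCleaner and its methods are hypothetical, and the exists/remove callbacks are injected so the example runs without touching the disk.

```go
package main

import "fmt"

// viewCleaner is a hypothetical model of the lViewUsed gate.
type viewCleaner struct {
	viewUsed bool
	exists   func(path string) bool // stands in for the stat-backed file check
	remove   func(path string)      // stands in for deleting the temp file
}

// noteViewTemp is what an OpenTable-like step would call when a view
// is materialised into a temp file.
func (c *viewCleaner) noteViewTemp() { c.viewUsed = true }

// cleanup probes for view temps only if one was created this query,
// then resets the flag.
func (c *viewCleaner) cleanup(tables []string) {
	if !c.viewUsed {
		return // common view-free path: zero stat calls
	}
	for _, t := range tables {
		path := "__view_" + t + ".dbf"
		if c.exists(path) {
			c.remove(path)
		}
	}
	c.viewUsed = false
}

func main() {
	stats := 0
	c := &viewCleaner{
		exists: func(string) bool { stats++; return false },
		remove: func(string) {},
	}
	tables := []string{"customers", "orders"}

	c.cleanup(tables)
	fmt.Println("stat calls, no view:", stats) // 0

	c.noteViewTemp()
	c.cleanup(tables)
	fmt.Println("stat calls, view used:", stats) // 2
}
```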
pprof delta:
rawsyscalln: 2.14s → 1.41s (48% → 32% of total CPU)
os.Stat: 1.24s → 0.49s (28% → 11%)
Wall-clock bench numbers stayed within ±3% noise: stats are cheap
when the target file does not exist, so the CPU savings do not
translate directly into end-to-end time. The change still removes the
next-biggest syscall waste visible in the profile.
FiveSql2 43/43, Harbour compat 56/56.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TSqlExecutor:OpenTable now hands lifetime to the WA cache for stable
aliases (user-supplied or table-named). CloseOpened skips those
entries, so the DBF mmap stays alive across queries instead of being
unmapped + re-opened 1000 times a bench. Previously the WA cache only
covered DML (INSERT/UPDATE/DELETE) — SELECT was still paying the full
dbUseArea/dbCloseArea syscall bill every query (profile showed
rtlDbCloseArea + munmap at ~30% of total CPU).
AcquireTemp-generated aliases (FA_####) are excluded — they change
every query (self-joins, nested depth), so caching them would just
leak entries for no reuse. JOIN / recursive CTE regressions from an
earlier unrestricted version are gone.
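A small Go model of the ownership rule, under stated assumptions: executor, workArea, openTable and closeOpened are hypothetical stand-ins, and temp aliases are recognised purely by an "FA_" prefix. It only illustrates that stable aliases survive closeOpened while temp aliases do not.

```go
package main

import (
	"fmt"
	"strings"
)

// workArea models one open table handle.
type workArea struct {
	alias string
	open  bool
}

// executor is a hypothetical model of the open/close bookkeeping.
type executor struct {
	cache  map[string]*workArea // survives across queries
	opened []*workArea          // everything touched by the current query
	opens  int                  // counts simulated dbUseArea calls
}

func newExecutor() *executor {
	return &executor{cache: make(map[string]*workArea)}
}

// isTempAlias assumes generated aliases always carry the FA_ prefix.
func isTempAlias(alias string) bool { return strings.HasPrefix(alias, "FA_") }

// openTable reuses a cached work area for stable aliases and opens a
// fresh one otherwise.
func (e *executor) openTable(alias string) *workArea {
	wa, ok := e.cache[alias]
	if !ok {
		e.opens++
		wa = &workArea{alias: alias, open: true}
		if !isTempAlias(alias) {
			e.cache[alias] = wa // the cache now owns the lifetime
		}
	}
	e.opened = append(e.opened, wa)
	return wa
}

// closeOpened closes only the work areas the cache does not own.
func (e *executor) closeOpened() {
	for _, wa := range e.opened {
		if e.cache[wa.alias] == wa {
			continue // stays open for the next query
		}
		wa.open = false
	}
	e.opened = e.opened[:0]
}

func main() {
	e := newExecutor()
	for _, tmp := range []string{"FA_0001", "FA_0002"} {
		e.openTable("CUSTOMERS")
		e.openTable(tmp)
		e.closeOpened()
	}
	fmt.Println("opens:", e.opens) // 3: CUSTOMERS once, each temp once
}
```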
Bench deltas vs prior HEAD (median of 3 steady runs, 1000 iters):
  B1_SELECT_STAR      82 →   41 us   -50%  (2.0x)
  B2_WHERE_FILTER     78 →   35 us   -55%  (2.2x)
  B3_ORDER_BY         90 →   48 us   -47%  (1.88x)
  B5_DISTINCT         75 →   32 us   -57%  (2.34x)
  B7_CTE_SIMPLE      120 →   77 us   -36%  (1.56x)
  B9_ROW_NUMBER      239 →  194 us   -19%
  B10_RANK_PART      276 →  233 us   -16%
  B11_SUM_OVER       296 →  252 us   -15%
  B4_GROUP_HAVING    498 →  450 us   -10%
Others flat (JOIN / recursive CTE / DML already covered).
FiveSql2 43/43, Harbour compat 56/56.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When collectConstLocals proves that a LOCAL is never written after
its literal init, every read site gets the literal substituted
inline, which leaves the init itself with no live reader. Skip
emitting the PushXxx/PopLocalFast pair for those LOCALs in both
top-of-function and mid-body declarations.
On a function with `LOCAL nBuf := 100, sTag := "x", bFlag := .T.`,
all three inits drop out (6 VM ops saved in the prologue), while
the still-written `LOCAL nSum := 0` init stays. Harbour compat
56/56, FiveSql2 43/43.
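To make the prologue saving concrete, here is a toy Go emitter, assuming each kept init costs one push plus one PopLocalFast. localDecl and emitLocalInits are hypothetical, the op strings are illustrative rather than real gengo output, and the 1-based slot numbering is an assumption.

```go
package main

import "fmt"

// localDecl is a hypothetical record for one LOCAL declaration.
type localDecl struct {
	name    string
	literal string // source text of the literal init
	isConst bool   // true when analysis proved no write after the init
}

// emitLocalInits returns the init ops for a function prologue,
// skipping locals whose reads were all replaced by the literal.
func emitLocalInits(decls []localDecl) []string {
	var ops []string
	for slot, d := range decls {
		if d.isConst {
			continue // nothing ever loads this slot, so the init is dead
		}
		ops = append(ops,
			"Push "+d.literal,
			fmt.Sprintf("PopLocalFast %d", slot+1),
		)
	}
	return ops
}

func main() {
	decls := []localDecl{
		{"nBuf", "100", true},
		{"sTag", `"x"`, true},
		{"bFlag", ".T.", true},
		{"nSum", "0", false}, // still written later, so its init stays
	}
	for _, op := range emitLocalInits(decls) {
		fmt.Println(op)
	}
	// Prints:
	// Push 0
	// PopLocalFast 4
}
```

Only the nSum pair survives: two ops instead of eight, which matches the six ops saved in the example above.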
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>