stripBlockComments scanned each line for a /* block-comment opener
while tracking string literals but had no notion of // or && line
comments. A line like `// see app/api/*.prg` would open a block
comment from /*.prg that ran until EOF or the next */, silently
dropping every FUNCTION declaration in between. The compiled file
ended up with an empty symbols slice, and callers in other files
panicked at runtime with "no function symbol for call".
Hit while writing app/lib/text.prg in solmade — its `// build's
\`app/api/*.prg\` glob doesn't pick it up` line dropped all three
of QueryParamRaw / UrlDecodeBytes / IsAllDigits.
Fix: detect // and && line-comment markers before the /* check.
When one is seen, copy the rest of the line through verbatim (the
lexer and #command machinery still need it) and stop scanning so
the embedded /* can't open a block comment.
Two regression tests cover both markers. Full mandatory test suite
(go test ./..., FiveSql2 43/43, compat 56/56, std.ch 17/17, FRB 7/7,
pgserver 11/11) still passes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three silent-miscompile fixes in the preprocessor that were
masking real bugs in Harbour-style PRG.
1. Brace tokenizer (compiler/pp/command.go)
`{` and `}` now tokenize as standalone separator tokens. The
matcher previously only split on `,()[]"'` etc., so a codeblock
literal `{|| ... }` in a macro argument became the tokens `{||`,
`""`, `}`. The capture-depth tracker only matched exact `{`/`}`,
so `{||` was invisible as an opener while the standalone `}`
wrongly decremented depth — `TEST_LINE( o:VarPut({|| "" }) )`
truncated mid-argument and the parser later choked at the inner
`}` with `expected ), got } "}"`.
Fix: add `{` and `}` to tokenizeLine's separator set. Now
`{|| ... }` lexes as `{`, `||`, `""`, `}` and balances cleanly.
2. ;-continuation join for non-`#` lines (compiler/pp/pp.go)
The existing line-joiner only collapsed trailing `;` continuations
on `#`-prefixed directives. Plain source code using the same
convention — e.g. Harbour's TEST macro:
TEST t004 STATIC s_once := NIL, S_C ;
INIT hb_threadOnce( @s_once, {|| ... } ) ;
CODE x := S_C
was processed one physical line at a time, so the TEST pattern
never matched the full logical statement. The first row passed
through unrewritten, fell through to the parser as an expression,
and gengo silently absorbed it as part of the *previous*
function's body. Six TEST macros' STATIC declarations all ended
up tagged with t003's function name, producing duplicate
`static_T003_S_ONCE` decls and a Go compile failure.
Fix: add the same trailing-`;` join logic to user code, with
blank-line fillers inserted post-join so source line numbers in
parser errors still align with the original file.
3. Block-comment-aware continuation join
Inline `/* ... */` at the end of a continuation row hid the
trailing `;` from the joiner's HasSuffix check. The fix calls
stripBlockComments on the next-line peek before testing for `;`,
so chains like
AAdd( aResult, { cChildBase, ;
aRefs[ "fk" ][ j ][ 1 ], ; /* child col */
aRefs[ "fk" ][ j ][ 3 ], ; /* parent col */
...
keep folding instead of stopping after one row and leaving a
dangling `,` at end of line.
Results
-------
Harbour-core compat sweep: 25/30 → 28/30 (remaining lnlenli1 +
keywords are //NOTEST stress files, intentionally unbalanced).
All 6 release gates green: go test ./..., FiveSql2 43/43,
Harbour compat 56/56, std.ch 17/17, FRB 7/7, examples 65/71.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three medium-priority audit items in one commit, each independently
revertible.
* **#18 JOIN hash-join fast path.** New std.ch shape:
JOIN WITH <alias> TO <file> [FIELDS ...] ON <mfield> = <dfield>
expands to a 6-arg __dbJoin call with the master/detail key
field names. Runtime detects the extra args, builds an O(M)
hash over the detail's key column, then probes per master row
for O(N+M) total — vs the FOR form's O(N*M). For 1k×1k that's
2k vs 1M operations; the gap widens with N. The original FOR
form is unchanged and stays the fallback for arbitrary
predicates. New helper dbHashKey type-tags the key string so
`1` (numeric), `"1"` (string), and `.T.` (logical) don't
collide in the bucket map.
* **#38 PP rule result-marker validation.** ParseRule now walks
the result template after parseMarkers and warns about every
`<name>` (or `<(name)>` / `<.name.>` / `<{name}>` / `#<name>`
/ `<"name">`) that doesn't match a pattern marker. Warnings
flow into pp.errors via handleDirective with the directive's
filename:line, so a typo'd `<NaMe>` in an `#xcommand`
case-sensitive rule fails the build with a clear diagnostic
instead of silently producing broken expansions.
* **#44 looksLikeInlineC heuristic strengthened.** Catches more
of the common Harbour-PRG-with-C-inline-block shapes that
used to fall through and produce cryptic Go-side errors:
function-like #define, `extern "C"` linkage blocks, C return-
type declarations (`int foo(`, `static char* bar(`), and the
hb_ret*() helper family used by Harbour's C FFI return
setters. Two small predicate helpers (allLetters,
allIdentChars) keep the C-vs-Go disambiguation tight enough
that legit Go code (`func name() int { ... }`) doesn't trip.
* **#28 LIST/DISPLAY pagination** — explicitly deferred. Proper
pagination requires interactive terminal handling (Inkey(0)
for the keypress) which would hang in CI / batch mode. Will
revisit when an interactive terminal layer needs it for
other reasons.
Test fixtures: tests/std_ch/test_join_hash.prg verifies the new
ON-form path produces the same output as the FOR form would.
std.ch runner now stands at 16/16.
Other gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
std.ch suite : 16/16
FRB suite : 7/7
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Until now applyRules looked at the *first* token of each physical
line. PRG legitimately packs multiple statements on a single line
with `;` as an intra-line separator (e.g. `dbCommit(); CLOSE ALL`),
and after Wave 1 removed the parser's xBase fallback for CLOSE/
COMMIT/etc., a `;`-separated `CLOSE ALL` on a line that started
with another statement would slip past std.ch entirely. The parser
then saw `CLOSE` / `ALL` as IDENTifiers, the runtime tried to
dispatch `CLOSE` as a function, and the user got a "no function
symbol for call" panic at execution time.
Fix: at applyRules entry, check for top-level `;` (paren / bracket
/ brace / string-literal balanced), split the line into statement
segments, recursively apply rules to each, rejoin with `;`. Two
new helpers (`hasTopLevelSemi` / `splitTopLevelSemi`) keep the
balancing logic small and self-contained.
Found by compiling _FiveSql2/test/test_sql_extreme.prg, which packs
the typical xBase one-liner DBF setup `dbAppend(); FieldPut(...);
...; dbCommit(); CLOSE ALL` across many rows of test data. The
test was panicking at the first such line; with this fix it now
runs to completion: 15/15 PASS.
All FiveSql2 SQL tests green together for the first time:
test_sql1999 : 43/43
test_sql1999_hard : 10/10
test_sql_extreme : 15/15
test_sql_challenge : 15/15
--
83 / 83
Other gates green:
go test ./... : PASS
Harbour compat : 56/56
std.ch suite : 14/14
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Six audit-driven blockers landed together because they're tangled:
* MENU TO removed from std.ch — the rule expanded to a call to a
nonexistent __MenuTo() RTL symbol, so any user code with `MENU
TO choice` compiled clean and panicked at runtime. Behavior
pre-this-round was a parser silent no-op, which is at least
consistent. Restore that until @ PROMPT (the companion command)
actually lands.
* COUNT now requires `TO <var>`. The earlier `[TO <v>]` optional
bracket was a Harbour-pattern transcription error: the result
template references `<v>` unconditionally, so a bare `COUNT`
expanded to ungrammatical ` := 0 ; dbEval(...)` and the
PRG parser rejected it. Match Harbour's std.ch which makes TO
mandatory.
* UPDATE FROM ... REPLACE now requires `FROM`/`ON`/`REPLACE` all
three. Same root cause as COUNT: the result template uses
`<key>`, `<f1>`, `<x1>` unconditionally; missing any of them
produced broken syntax. Tightened to fail loudly rather than
silently mis-expand.
* CLOSE <unknown_alias> no longer closes the *current* workarea.
SelectByAlias was a silent no-op when the alias was missing,
leaving WASaveAndSelectAlias to evaluate the inner DbCloseArea()
against the originally-selected WA — a real data-loss footgun.
SelectByAlias now returns bool; WASaveAndSelectAlias switches to
the no-area sentinel (0) on miss so the inner expression's
Current() returns nil and short-circuits.
* SUM <x1>, <xN> TO <v1>, <vN> — multi-pair form supported.
Required two pieces:
1. matchSegment's regular-marker stop-boundary now combines
outerTail literals AND the segment's repeat boundary so
`[, <xN>]` doesn't let `<xN>` swallow past the next ','.
2. **Five parser miscompiled comma-separated expressions in
code blocks.** `{|| e1, e2, e3 }` kept only the last expr
and threw away earlier ones at *AST level*, so all their
side effects vanished. New SeqExpr AST node + emitter
(emit each, pop intermediate results) + folding/walk
updates fix the underlying bug, which also unbreaks any
other block that relied on comma sequencing.
* pp.go's `;` continuation joiner now strips exactly one trailing
`;` per iteration, preserving Harbour's `;;` convention (literal
`;` followed by a continuation marker). Without this the SUM
rule's chained `<v1> :=[ <vN> :=] 0 ; ; dbEval(...)` collapsed
to a missing statement separator.
* parseExprStmt's xBase fallback switch is back in sync with
parseIdentStmt — COPY/SORT/COUNT/SUM/AVERAGE/TOTAL/UPDATE/JOIN/
DISPLAY/LIST removed (std.ch handles all of them now). Leaving
them in the fallback masked typos as silent no-ops.
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three xBase analytical commands that were silent no-ops in the
parser now execute as Harbour-style PP rewrites:
COUNT [TO <v>] [FOR <for>] [WHILE <while>] ... -> dbEval()
SUM <x> TO <v> [FOR <for>] [WHILE <while>] ... -> dbEval()
AVERAGE <x> TO <v> [FOR ...] -> __dbAverage()
COUNT and SUM expand to a `<v> := 0 ; dbEval( {|| ... } )` pair
matching harbour-core/include/std.ch verbatim. AVERAGE delegates to
a new RTL function rtlDbAverage (sum + count + divide; returns 0 on
empty match) — the chained-private-variable trick Harbour uses to
keep AVERAGE inline doesn't translate cleanly through Five's PP.
Wiring up these rules surfaced four PP issues that had to be fixed
for the rewrite to even reach the parser:
* Result template did not implement <{name}> blockify. So a rule
body like `{|| x := x + <x> }, <{for}>` left the literal text
`<{for}>` in the output. Added blockify substitution: captured
-> `{|| <captured> }`, missing -> NIL.
* findMarkerEnd did not recognise `{`/`}` so unreferenced
blockify markers were not cleaned up either. Added `{`/`}` to
its prefix/suffix sets.
* Optional-clause matching had no view of the outer pattern, so a
regular marker at the end of `[TO <v>]` would swallow the rest
of the line — `COUNT TO n FOR x>5` captured `<v>` as
"n FOR x>5". matchSegment now takes outerTail and stops at its
first literal.
* `#command` directives could not span multiple physical lines.
A trailing `;` is harbour-core's line-continuation marker for
std.ch and now joins the next line into the directive before
parsing.
Parser cleanup: COUNT, SUM, AVERAGE removed from the IDENT-statement
no-op switch in parseIdentStmt + parseExprStmt. The remaining xBase
verbs (COPY, SORT, TOTAL, JOIN, LIST, DISPLAY, LABEL, REPORT, ...)
stay in the parser until their RTL backends arrive.
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduce compiler/pp/std.ch with 19 #command rules so that ERASE,
RENAME, DELETE FILE, CLOSE [<a>|ALL|DATABASES], COMMIT, UNLOCK,
LOCATE/CONTINUE, REINDEX, PACK, ZAP, KEYBOARD, RUN, MENU TO, and
CLEAR GETS reach the parser pre-rewritten as plain function calls.
Embedded into the compiler binary via //go:embed so it auto-loads
without an explicit #include in user code, exactly the way Harbour
auto-loads its std.ch.
This is a pure dispatch move, not a behavior change for the
already-working forms: the same Five RTL functions get called.
But it does fix three regressions that the parser was masking:
* ERASE / RENAME / DELETE FILE used to be silent no-ops — the
parser swallowed the entire line and returned NIL. They now
actually delete/rename files (FErase / FRename).
* CLOSE <alias> used to silently ignore the alias and close the
current area. It now switches to the named area first
(<a>->( DbCloseArea() )).
* Two latent #command matcher bugs that surfaced while wiring
std.ch up:
- bare `CLOSE` would match rule `CLOSE ALL` because the tail
of the pattern wasn't checked for unconsumed literals.
- bare `CLOSE` would match rule `CLOSE <a>` because all
unconsumed pattern markers were unconditionally treated as
optional. They are only optional when nested inside `[...]`.
Parser cleanup: parseIdentStmt + parseExprStmt no longer hardcode
ERASE / RENAME / RUN / KEYBOARD / REINDEX / LOCATE / CONTINUE /
COMMIT / CLOSE — the rewriter handles them. Other xBase verbs
(COPY / SORT / COUNT / SUM / AVERAGE / TOTAL / JOIN / LIST /
DISPLAY / LABEL / REPORT / DIR ...) still no-op in the parser
because their RTL backends aren't implemented yet — once the
backends land they move into std.ch the same way.
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Harbour's #pragma BEGINDUMP ... #pragma ENDDUMP blocks carry C source
that the Harbour toolchain embeds verbatim. Five takes the same
directive but targets Go — any `.prg` ported from Harbour that ships
inline C gets its C shoveled into the Go codegen pipeline and fails
with opaque errors like "invalid character U+0023 '#'" from the Go
compiler, dozens of lines downstream of the actual cause.
Detect the C shape at PP time and report a clear, actionable error:
pp: file.prg:N: #pragma BEGINDUMP contains C code — Five accepts
inline Go only. Port the block to Go (or use an RTL function),
then wrap in #pragma BEGINDUMP ... #pragma ENDDUMP.
looksLikeInlineC uses conservative signals that don't false-positive
on legitimate inline Go (which calls `hbrt.HB_FUNC("NAME", fn)` with
a package prefix and a quoted string, distinct from C's bare
`HB_FUNC(NAME)` macro). Signals:
- `#include <...>` / `#include "..."` — unambiguous C preprocessor
- line-starting `HB_FUNC(` / `HB_FUNC_STATIC(` — C FFI macro
- `typedef ` / `struct ` / `int main(` / `void main(` at line start
main.go now aborts the build when PP returns errors (previously
printed but continued — same behavior the parser already had for
its own errors). Keeps build output short: one pp line + one
summary line, no gengo noise.
Verified:
- harbour-core/tests/inline_c.prg → clean PP error, exit 1
- examples/godump_demo.prg (legitimate inline Go) → passes PP
(hits a separate pre-existing gengo import-ordering bug, not
related to this change)
FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>