Harbour's #pragma BEGINDUMP ... #pragma ENDDUMP blocks carry C source
that the Harbour toolchain embeds verbatim. Five takes the same
directive but targets Go — any `.prg` ported from Harbour that ships
inline C gets its C shoveled into the Go codegen pipeline and fails
with opaque errors like "invalid character U+0023 '#'" from the Go
compiler, dozens of lines downstream of the actual cause.
Detect the C shape at PP time and report a clear, actionable error:
pp: file.prg:N: #pragma BEGINDUMP contains C code — Five accepts
inline Go only. Port the block to Go (or use an RTL function),
then wrap in #pragma BEGINDUMP ... #pragma ENDDUMP.
looksLikeInlineC uses conservative signals that don't false-positive
on legitimate inline Go (which calls `hbrt.HB_FUNC("NAME", fn)` with
a package prefix and a quoted string, distinct from C's bare
`HB_FUNC(NAME)` macro). Signals:
- `#include <...>` / `#include "..."` — unambiguous C preprocessor
- line-starting `HB_FUNC(` / `HB_FUNC_STATIC(` — C FFI macro
- `typedef ` / `struct ` / `int main(` / `void main(` at line start
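A minimal sketch of how these signals could be checked, assuming
"line start" means after leading whitespace is trimmed. The body is
illustrative only and may differ from the real looksLikeInlineC:

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikeInlineC reports whether a BEGINDUMP body carries one of the
// conservative C signals listed above. Hypothetical body: the prefix
// table and the whitespace trimming are assumptions of this sketch.
func looksLikeInlineC(body string) bool {
	prefixes := []string{
		"#include <", "#include \"",
		"HB_FUNC(", "HB_FUNC_STATIC(",
		"typedef ", "struct ", "int main(", "void main(",
	}
	for _, line := range strings.Split(body, "\n") {
		trimmed := strings.TrimSpace(line)
		for _, p := range prefixes {
			if strings.HasPrefix(trimmed, p) {
				return true
			}
		}
	}
	return false
}

func main() {
	cBlock := "#include <hbapi.h>\nHB_FUNC( HELLO )\n{\n}\n"
	goBlock := "func init() {\n\thbrt.HB_FUNC(\"HELLO\", hello)\n}\n"
	fmt.Println(looksLikeInlineC(cBlock), looksLikeInlineC(goBlock))
}
```

The package-prefixed `hbrt.HB_FUNC(` never matches because the check
is a prefix test on the trimmed line, not a substring search.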
main.go now aborts the build when PP returns errors (previously
printed but continued — same behavior the parser already had for
its own errors). Keeps build output short: one pp line + one
summary line, no gengo noise.
Verified:
- harbour-core/tests/inline_c.prg → clean PP error, exit 1
- examples/godump_demo.prg (legitimate inline Go) → passes PP
(hits a separate pre-existing gengo import-ordering bug, not
related to this change)
FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two related fixes for Harbour's data-driven `USE &cFile ALIAS &cAlias
INDEX &cNdx` idiom — common in any app that dispatches table names
at runtime.
Parser (compiler/parser/parser.go parseUse):
- `USE &cFile` / `USE &(expr)` previously triggered a
skipToEndOfLine short-circuit, emitting an empty UseCmd (equivalent
to bare USE = close current area). Now parseMacro runs and the
MacroExpr becomes the File node, so codegen emits MacroPush +
dbUseArea.
- `ALIAS &cAlias` / `ALIAS &a.1` similarly dropped the macro result;
now captures it into UseCmd.AliasExpr so codegen evaluates the
alias at runtime. Both the IDENT-path ("ALIAS") and keyword-path
(token.ALIAS) handlers fixed.
PP (compiler/pp/command.go):
- captureExpression and the MarkerList branch now paren-balance
`(`/`[`/`{` so nested grouping inside a macro argument doesn't let
an inner `)` terminate the capture. Example:
_REGULAR_(&(a))
previously captured `&(a` (missing inner `)`) and left the outer
`)` dangling, producing parse errors in the expanded output.
- MarkerList capture still joins tokens with " " for raw `<z>`
substitution — comma tokens stay in the stream, so `s(<z>)`
re-emits them as argument separators and the list expands cleanly.
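A sketch of the balancing idea on a pre-split token slice. The helper
name and the slice-of-strings token model are assumptions made for
illustration, not Five's captureExpression:

```go
package main

import (
	"fmt"
	"strings"
)

// captureBalanced collects tokens from start until it meets stop at
// nesting depth zero, so an inner ")" cannot end the capture early.
// Hypothetical helper; it returns the capture and the stop index.
func captureBalanced(tokens []string, start int, stop string) (string, int) {
	depth := 0
	var out []string
	i := start
	for ; i < len(tokens); i++ {
		tok := tokens[i]
		if depth == 0 && tok == stop {
			break
		}
		switch tok {
		case "(", "[", "{":
			depth++
		case ")", "]", "}":
			if depth > 0 {
				depth--
			}
		}
		out = append(out, tok)
	}
	return strings.Join(out, ""), i
}

func main() {
	// Tokens of `_REGULAR_(&(a))`; the capture starts after the first "(".
	tokens := []string{"_REGULAR_", "(", "&", "(", "a", ")", ")"}
	captured, end := captureBalanced(tokens, 2, ")")
	fmt.Println(captured, end) // &(a) 6
}
```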
Bench: harbour-core/tests/pp.prg 2 errors → 0 for the realistic
`USE &macro` / `&(expr)` patterns. Remaining parse errors on line 70
are a pathological `_REGULAR_L` list that includes `&a. [2]`
(space between macro's terminating dot and an array index) — the
PP expands it correctly but Five's lexer refuses the expanded
result. That form doesn't occur in real code.
/tmp/test_use_macro.prg — all four patterns (`USE &f`, `USE &f ALIAS
&f`, `USE &f ALIAS &f INDEX &i`, dot-terminated) now compile. FiveSql2
43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three cumulative fixes for Harbour's preprocessor stringify forms
surfaced by harbour-core/tests/pp.prg:
1. Token alignment — tokenizePattern and tokenizeLine now both
split on parens and brackets, so `DUMB(a)` (no space) tokenises
as `DUMB`, `(`, `a`, `)` on both sides. Previously the line
tokenizer kept `DUMB(a)` as one token while the pattern split
it three ways, and the match never engaged. Fixes `_DUMB_(a)`-
style calls in pp.prg line 57+.
2. Substitution order — applyResult was replacing the bare `<z>`
marker first, eating the inner `<z>` of `#<z>`, `<"z">`, `<(z)>`
and `<.z.>` and leaving stray `#` / `<` / `.` characters that
the lexer reported as ILLEGAL tokens. Run all compound forms
first, bare `<z>` last.
3. Quote delimiter picker — ppQuote wraps a captured value in a
legal PRG string literal by trying `"..."` first, then `'...'`,
then `[...]`. Harbour's #<z> dumb-stringify needs this because
the capture may already contain `"`, and Five was producing
malformed `""world""` literals.
Bonus: smart-stringify `<(z)>` now recognises input that's already
a string literal (`"x"` / `'x'` / `[x]`) and keeps it verbatim
instead of double-quoting.
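The delimiter picker from item 3 can be sketched as below. The name
ppQuoteSketch and the "no delimiter fits" return value are
assumptions; this message does not say what Five does when all three
delimiters collide with the captured text:

```go
package main

import (
	"fmt"
	"strings"
)

// ppQuoteSketch wraps s in the first delimiter pair that s does not
// already contain: "..." first, then '...', then [...].
// Returning false when nothing fits is an assumption of this sketch.
func ppQuoteSketch(s string) (string, bool) {
	switch {
	case !strings.Contains(s, `"`):
		return `"` + s + `"`, true
	case !strings.Contains(s, "'"):
		return "'" + s + "'", true
	case !strings.Contains(s, "]"):
		return "[" + s + "]", true
	}
	return "", false
}

func main() {
	for _, in := range []string{`world`, `"world"`, `say "it's"`} {
		out, ok := ppQuoteSketch(in)
		fmt.Println(out, ok)
	}
}
```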
pp.prg 26 parse errors → 2 (remaining: `USE &b ALIAS &a.1` macro-
inside-command at line 21 and one related line, unrelated to this
fix). FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Harbour reserves the aliases `M` and `MEMVAR` for the memvar
namespace — `M->cVar` reads a PUBLIC/PRIVATE memvar, not a DBF
field in a workarea named M. Five's emitAliasExpr and emitAssign
treated all aliases identically, emitting:
t.PushAliasField("M", "cVar") // read
_wa := t.WA.(*hbrdd.WorkAreaManager); _wa.SetAliasField("M", ...) // write
which triggered a spurious hbrdd import on programs using memvars
and attempted a workarea lookup that couldn't find an "M" area at
runtime.
Detect the reserved aliases (case-insensitive) at the three
AliasExpr call sites — the read path (emitAliasExpr) and both
assign paths (emitAssign for statements, emitAssignExpr for
expression context) — and route to t.PushMemvar / t.PopMemvar
instead. The existing Thread helpers hash into the MemvarTable
populated by PUBLIC/PRIVATE declarations.
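The reserved-alias test itself is small. A sketch, with hypothetical
helper names (isMemvarAlias and readHelper are not taken from the
Five source):

```go
package main

import (
	"fmt"
	"strings"
)

// isMemvarAlias reports whether alias is one of the two aliases
// Harbour reserves for the memvar namespace, ignoring case.
func isMemvarAlias(alias string) bool {
	return strings.EqualFold(alias, "M") || strings.EqualFold(alias, "MEMVAR")
}

// readHelper names the Thread helper the read path would route to.
// It only returns the name; argument shapes are not shown here.
func readHelper(alias string) string {
	if isMemvarAlias(alias) {
		return "PushMemvar"
	}
	return "PushAliasField"
}

func main() {
	for _, a := range []string{"M", "memvar", "CUSTOMER"} {
		fmt.Println(a, readHelper(a))
	}
}
```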
Unblocks harbour-core/tests/macro.prg build (runtime still needs
the TVALUE test helper, unrelated). FiveSql2 43/43, Harbour compat
56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three SWITCH codegen bugs surfaced by harbour-core/tests/switch.prg:
1. Empty SWITCH (`SWITCH x ENDSWITCH`) — legal Harbour, produced by
conditional-compile files like switch.prg:13. Previous code
emitted `_sw := t.Pop2()` followed by `}` with no matching `{`,
closing the enclosing procedure body and producing "syntax error:
non-declaration statement outside function body".
2. OTHERWISE-only (no CASE arms): emitted `} else {` with no opening
`if`, a syntax error of the same kind ("unexpected keyword else").
3. `EXIT` inside a CASE should break out of the SWITCH — but Five
lowers SWITCH to an if/else-if chain, so the generated `break`
had nowhere to land ("break is not in a loop, switch, or select").
Fix all three by wrapping every SWITCH in a one-iteration `for`
loop. `break` inside a case targets the wrapper, matching Harbour
semantics. Empty / OTHERWISE-only bodies still emit valid Go
because the for-loop provides the scope boundary regardless of
whether any if-chain opened. A trailing `break` keeps the loop
one-shot.
Also:
- `_ = _sw` silences unused-var for empty SWITCH.
- Conditionally emit the if-chain closing `}` only when at least
one CASE arm was emitted.
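Hand-written Go that mirrors the lowered shape described above. Plain
ints stand in for VM values, and the function names are illustrative;
this is not actual gengo output:

```go
package main

import "fmt"

// classify mirrors a SWITCH with two CASE arms (the first one ending
// in EXIT) and an OTHERWISE, lowered to an if/else-if chain inside a
// one-iteration for loop.
func classify(n int) string {
	result := ""
	for {
		_sw := n
		_ = _sw
		if _sw == 1 {
			result = "one"
			break // EXIT: leaves the wrapper loop
		} else if _sw == 2 {
			result = "two"
		} else {
			result = "other"
		}
		break // trailing break keeps the loop one-shot
	}
	return result
}

// emptySwitch mirrors `SWITCH x ENDSWITCH`: no if-chain is opened,
// yet the wrapper still forms a valid Go statement.
func emptySwitch(n int) {
	for {
		_sw := n
		_ = _sw
		break
	}
}

func main() {
	emptySwitch(0)
	fmt.Println(classify(1), classify(2), classify(9))
}
```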
All 15 SWITCH blocks in harbour-core/tests/switch.prg now build
and run to completion. FiveSql2 43/43, Harbour compat 56/56,
Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Real Harbour headers write parameterised commands with no space
between the keyword and its opening paren:
#xcommand MAKE_TEST( <obj>, <v> ) => ...
ParseRule stored the rule keyword as `MAKE_TEST(` (stripping only
<>, [] marker wrappers), but firstToken normalised source lines by
stopping the first-word scan at `(` — so `MAKE_TEST( o, 42 )`
produced `MAKE_TEST` for the lookup. The two strings didn't match
and the fast-path keyword check rejected every invocation, leaving
the macro unexpanded and the call site as a bare undeclared
identifier.
Trim everything from the first `(` onward during keyword
extraction so both halves agree on the dispatch key. The marker
tokens inside the parens are still parsed normally by
parseMarkers / matchPattern.
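A sketch of a shared key extraction that both the rule side and the
source side could use. dispatchKey is a hypothetical name, and
whitespace splitting is an assumption of the sketch:

```go
package main

import (
	"fmt"
	"strings"
)

// dispatchKey returns the first word of text with everything from the
// first "(" onward removed, so rule patterns and source lines agree
// on the lookup key.
func dispatchKey(text string) string {
	fields := strings.Fields(text)
	if len(fields) == 0 {
		return ""
	}
	word := fields[0]
	if i := strings.IndexByte(word, '('); i >= 0 {
		word = word[:i]
	}
	return word
}

func main() {
	fmt.Println(dispatchKey("MAKE_TEST( <obj>, <v> ) => ..."))
	fmt.Println(dispatchKey("MAKE_TEST( o, 42 )"))
}
```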
Verified with /tmp/test_xcmd2.prg (`MAKE_TEST( o, 99 )` expands
and dispatches to the object's :hVar access). FiveSql2 43/43,
Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Harbour's common.ch exposes classic Clipper type-check shorthands
via #translate rules that map to HB_IS* RTL functions:
#translate ISNIL(<x>) => ((<x>) == NIL)
#translate ISARRAY(<x>) => HB_ISARRAY(<x>)
#translate ISCHARACTER(<x>) => HB_ISSTRING(<x>)
... etc.
Five's preprocessor currently supports #translate only for lines
whose FIRST word is the rule keyword, not for substring matches
inside expressions. Real usage like `IF ISNIL(x)` fails the keyword
check (first word is IF, not ISNIL) and the rule never fires.
Rather than rewrite the PP substring engine (A2 scope), register
the nine short names as direct RTL symbols in register.go, each
pointing at the same Go function as its HB_IS* twin. ISMEMO maps
to HB_ISSTRING as a reasonable approximation for Five (no distinct
memo type at the VM level).
common.ch becomes a short stub that just #defines TRUE/FALSE/YES/NO
and documents where the ISxxx aliases live. DEFAULT / UPDATE
#xcommand forms remain unsupported pending A2.
Verified with /tmp/test_common.prg — ISNUMBER(42), ISCHARACTER("x"),
ISNIL(nilVar) all dispatch correctly. Analyzer still emits
"undeclared variable" warnings for the short names (the static
checker doesn't see runtime-registered RTL symbols) but the
generated code links and runs.
FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Harbour permits keywords (CASE, DO, WHILE, etc.) to be used as
variable/array names. In most expression contexts Five already
handles this via expr.go:362 which whitelists keywords when used
as bare identifiers. But parseStmtBlock was stopping on any stop
token unconditionally, so a line like
case[ n ] := x -- 'case' is a LOCAL array
terminated the enclosing stmt block at `case` and left `[ n ] := x`
unparsable.
Add isIdentSuffix(): peeks one ahead and reports whether the next
token is something that can only follow an identifier ([, :=, +=,
-=, *=, /=, %=, ^=, ++, --, :, .). parseStmtBlock now treats the
stop token as a statement-start when its suffix matches, so the
block keeps going.
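The lookahead can be sketched over plain string tokens. The real
isIdentSuffix works on parser state; the string-based signatures and
the endsBlock helper here are assumptions for illustration:

```go
package main

import (
	"fmt"
	"strings"
)

// identSuffix lists tokens that can only follow an identifier.
var identSuffix = map[string]bool{
	"[": true, ":=": true, "+=": true, "-=": true, "*=": true,
	"/=": true, "%=": true, "^=": true, "++": true, "--": true,
	":": true, ".": true,
}

// isIdentSuffix reports whether next marks the previous token as an
// identifier rather than a keyword.
func isIdentSuffix(next string) bool { return identSuffix[next] }

// endsBlock reports whether tok terminates the statement block: it
// must be a stop keyword that is not being used as a variable name.
func endsBlock(tok, next string, stops map[string]bool) bool {
	return stops[strings.ToUpper(tok)] && !isIdentSuffix(next)
}

func main() {
	stops := map[string]bool{"CASE": true, "OTHERWISE": true, "ENDCASE": true}
	fmt.Println(endsBlock("case", "[", stops)) // `case[ n ] := x` keeps going
	fmt.Println(endsBlock("CASE", "n", stops)) // `CASE n == 1` ends the block
}
```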
Verified with /tmp/test_kwident.prg (`case[...]` outside DO CASE,
`arr[...]` inside DO CASE body), /tmp/test_kwident2.prg (both the
`case case[n] == "two"` arm and `case[1] := "updated"` assignment
after ENDCASE). Pathological harbour-core/tests/keywords.prg still
fails — it places `case[...]` in the arm-expected position of a
DO CASE block with no leading arm, which no sane parser can
disambiguate.
FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Classic Clipper/Harbour form writes method implementations as bare
`METHOD Name(params)` statements following a `CLASS X ... ENDCLASS`
declaration, with the binding inferred from the most recent class:
CREATE CLASS Shape
METHOD Area
ENDCLASS
METHOD Area -- binds to Shape
RETURN 0
Five was requiring `METHOD Area CLASS Shape` explicitly. Without it,
parseMethodDecl left MethodDecl.ClassName empty, gengo skipped the
body emission, and the link step failed with `undefined: HB_SHAPE_AREA`.
The class registration had AddMethod("AREA", HB_SHAPE_AREA) pointing
at the missing symbol.
Parser tracks p.lastClassName at parseClassDecl, and parseMethodDecl
falls back to that value when no CLASS clause is supplied. Each new
CLASS declaration updates the tracker, so multi-class files still
dispatch correctly — verified with /tmp/test_implicit_class.prg
(Shape + Box both resolve their own Name/Area methods).
Unblocks harbour-core/tests/clsscope.prg and other OOP compat
tests that use this form. FiveSql2 43/43, Harbour compat 56/56,
Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Create Five-1.0-Phase-C-TODO.md capturing the remaining 1.0 work:
three Harbour contrib libraries (hbct Clipper Tools, hbnf Numeric
Functions, hbtip TCP/IP/SMTP/POP3/HTTP). Each entry lists the
Harbour source path, a minimum first-pass scope, and an effort
estimate. Suggested order: hbct → hbtip → hbnf. Total ~6-10 days.
Update RTL-Go-Native-Migration.md "남은 병목" ("Remaining bottlenecks") with the Phase A/B
completion list — six features shipped this session — plus a note
that the 11 HBTYPE functions the initial analysis flagged are
actually Harbour's internal scalar class factories, not user-facing
blockers (Five's SendBuiltin covers the same surface).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Harbour's DO() accepts a string (looked up as a function name), a
code block (evaluated with args), or a symbol, and invokes it. Used
for plugin systems and dynamic dispatch idioms like
`DO(cHandler, oRequest)`.
Five already had stmtDo rewrite `DO(...)` at statement-level to a
function-call expression, so callers in expression position just
work — but gengo refused to emit DO as a function call because it
was on the reserved-word guard list (which existed to catch stray
ENDIF/ENDDO from bad IF nesting). Remove DO from that list; the
statement form is still handled upstream by parseDoProc, so the
guard loses nothing.
rtlDo implements the dispatch:
- String target → VM.FindSymbol + t.Function
- Block target → EvalBlock path (same as Eval)
- Anything else → NIL
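The three-way dispatch, sketched with Go values standing in for VM
items. The symbol map, the block type and the NIL result for an
unknown name are assumptions; the real rtlDo goes through
VM.FindSymbol and the EvalBlock path:

```go
package main

import (
	"fmt"
	"strings"
)

// block stands in for a Harbour code block or function body.
type block func(args ...any) any

// do dispatches on the dynamic type of target: a string is looked up
// by upper-cased name, a block is called directly, anything else
// yields nil (NIL).
func do(symbols map[string]block, target any, args ...any) any {
	switch t := target.(type) {
	case string:
		if fn, ok := symbols[strings.ToUpper(t)]; ok {
			return fn(args...)
		}
		return nil // unknown name: behaviour assumed for this sketch
	case block:
		return t(args...)
	default:
		return nil
	}
}

func main() {
	symbols := map[string]block{
		"GREET": func(args ...any) any { return "hello, " + args[0].(string) },
	}
	mulAdd := block(func(args ...any) any { return args[0].(int)*args[1].(int) + 1 })
	fmt.Println(do(symbols, "Greet", "World"), do(symbols, mulAdd, 5, 6), do(symbols, nil))
}
```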
Tested (/tmp/test_do.prg):
DO("Greet", "World") → "hello, World"
DO({|x,y| x*y+1}, 5, 6) → 31
DO(NIL) → NIL (ValType "U")
FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Harbour's `DATA name1, name2, name3` (and `VAR`, `CLASSDATA`)
should declare every listed field. Five's parseDataDecl instead
returned a single DataDecl for the first name and silently dropped
the rest — the comma branch just consumed the identifier without
producing a new decl. Surfaced by the OPERATOR overloading test
(/tmp/test_operator.prg originally had `DATA x, y` for a Vec2
class) where later `::y` access panicked with "unknown method y".
Change the signature to `[]*ast.DataDecl` and rewrite the loop so
each comma closes the current decl and starts a fresh one. AS /
INIT / qualifier runs still attach to the most recent name, so:
  DATA x, y, z                 → three decls, no init
  DATA x INIT 10, y, z INIT 0  → init attaches to preceding name
  DATA cName AS CHARACTER      → typed single decl
All seven class-body call sites flatten the slice into `members`.
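A sketch of the comma-splitting loop over pre-split tokens. The
DataDecl fields and the token model are simplified assumptions, not
the ast.DataDecl definition:

```go
package main

import (
	"fmt"
	"strings"
)

// DataDecl is a simplified stand-in for the AST node.
type DataDecl struct {
	Name, Type, Init string
}

// parseDataList returns one decl per name. A comma closes the current
// decl; AS and INIT attach to the most recent name.
func parseDataList(tokens []string) []*DataDecl {
	var decls []*DataDecl
	var cur *DataDecl
	for i := 0; i < len(tokens); i++ {
		tok := tokens[i]
		switch {
		case tok == ",":
			cur = nil
		case cur == nil:
			cur = &DataDecl{Name: tok}
			decls = append(decls, cur)
		case strings.EqualFold(tok, "INIT") && i+1 < len(tokens):
			i++
			cur.Init = tokens[i]
		case strings.EqualFold(tok, "AS") && i+1 < len(tokens):
			i++
			cur.Type = tokens[i]
		}
	}
	return decls
}

func main() {
	// Tokens of `DATA x INIT 10, y, z INIT 0` after the DATA keyword.
	for _, d := range parseDataList([]string{"x", "INIT", "10", ",", "y", ",", "z", "INIT", "0"}) {
		fmt.Printf("%s init=%q\n", d.Name, d.Init)
	}
}
```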
Verified with /tmp/test_multidata.prg (`DATA x, y, z` + mixed
`DATA label INIT "origin", count INIT 0`) and the OPERATOR test
which now passes with the original `DATA x, y` form restored.
FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Harbour's macro operator was a stub: hbrt.MacroCompile only resolved
bare identifier names to memvars/functions and returned the source
string unchanged for any non-trivial expression. The gengo emit was
also broken — `t.MacroPush() + t.PushNil()` never pushed the inner
expression's value, so MacroPush popped whatever happened to be on
the stack.
Wire it up properly:
1. Gengo fix: `case *ast.MacroExpr` now emits `emitExpr(e.Expr);
t.MacroPush()`. The inner expression produces the source string;
MacroPush consumes it and pushes the evaluated result.
2. Hook pattern in hbrt: `SetMacroEvalHook(fn)` lets hbrtl install
the real evaluator without creating an import cycle (genpc
already imports hbrt). MacroPush delegates to the hook when
installed; otherwise falls back to the legacy stub for hbrt
unit tests.
3. hbrtl.init registers macroEval, which reuses compileExprSource
(factored out of PcCompile) so macro lookups share the same
sync.Map-backed pcode cache — repeat evaluations of the same
macro source are free after the first hit.
4. ExecPcode leaves the result in retVal; macroEval copies it to
the operand stack via PushRetValue.
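The hook pattern and the shared cache can be sketched in one file,
with comments marking which package each half stands in for. The toy
compiler only understands `INT + INT`; it, the lower-case names and
the slice-based stack are assumptions made to keep the example
runnable:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"sync"
)

// --- stand-in for package hbrt ---

var macroEvalHook func(src string) any

// SetMacroEvalHook lets a higher-level package install the evaluator.
func SetMacroEvalHook(fn func(src string) any) { macroEvalHook = fn }

// macroPush pops the macro source and pushes the evaluated result.
func macroPush(stack []any) []any {
	src, _ := stack[len(stack)-1].(string)
	stack = stack[:len(stack)-1]
	if macroEvalHook != nil {
		return append(stack, macroEvalHook(src))
	}
	return append(stack, src) // legacy stub: source comes back unchanged
}

// --- stand-in for package hbrtl ---

var (
	exprCache sync.Map // source string -> compiled func() any
	compiles  int      // counts cache misses
)

// compileToy compiles only INT + INT; other sources evaluate to nil.
func compileToy(src string) func() any {
	if fn, ok := exprCache.Load(src); ok {
		return fn.(func() any)
	}
	compiles++
	fn := func() any { return nil }
	if f := strings.Fields(src); len(f) == 3 && f[1] == "+" {
		a, errA := strconv.Atoi(f[0])
		b, errB := strconv.Atoi(f[2])
		if errA == nil && errB == nil {
			fn = func() any { return a + b }
		}
	}
	exprCache.Store(src, fn)
	return fn
}

func macroEval(src string) any { return compileToy(src)() }

func main() {
	fmt.Println(macroPush([]any{"10 + 20"})) // no hook installed yet
	SetMacroEvalHook(macroEval)
	fmt.Println(macroPush([]any{"10 + 20"}), macroPush([]any{"10 + 20"}), compiles)
}
```

The second line shows both evaluations returning 30 with a single
compile, which is the cache behaviour item 3 describes.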
Tested (/tmp/test_macro.prg):
  &"10 + 20"                        → 30
  &"Sqrt(16)"                       → 4
  &"Upper('hello')"                 → HELLO
  &("30 * " + Str(nX, 1))           → 210 (runtime-built source)
  &"5 > 3 .AND. .T."                → .T.
  &("Str(" + Str(nX*10,2) + ",2)")  → 70
FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Harbour lets a class define custom behaviour for arithmetic and
comparison operators via `OPERATOR "<sym>" ARG <name> INLINE <expr>`.
Five already had the runtime slot infrastructure (ClassDef.Operators
+ AddOperator + parent-chain copy) but the parser skipped the form and
the VM ops never consulted the slots.
Parser: parseOperatorDecl captures the symbol, ARG binding, and
INLINE body into a MethodDecl with IsOperator=true and OperatorOp
set to the hbrt.Op* slot. Synthesised method name is __OP_<idx>
to keep the regular method namespace clean.
Codegen: emitClassDecl routes IsOperator members through
_def.AddOperator instead of AddMethod. Inline body generation is
shared with the MESSAGE/INLINE path (34485cd).
VM: Thread.tryBinaryOp walks the LHS object's class operator slot,
pushes args with Self bound to LHS, and returns true if the slot
is populated. Wired into Plus/Minus/Mult/Divide and Equal/NotEqual/
Less/Greater/LessEqual/GreaterEqual. Falls through to built-in
behaviour when no overload exists — non-object LHS costs one tag
check per op.
Operator symbol→slot mapping keeps `=` and `==` on the same slot
(OpEqual=8) because Five's gengo routes both to t.Equal() and the
VM doesn't distinguish strict vs non-strict equality today.
Tested (/tmp/test_operator.prg): Vec2 + - == < with per-field
results all correct.
FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Harbour's inline-method sugar was parsed but the body was skipped,
leaving any `METHOD X() INLINE expr` declaration registered in the
class vtable with no matching HB_<CLASS>_X function — link error
at build time.
Parser: MethodDecl gains an InlineBody Expr field. parseClassMethodDecl
captures the expression after INLINE instead of skipping to EOL.
New parseMessageDecl handles `MESSAGE <name> [(params)] INLINE expr`
and returns the same MethodDecl shape.
Codegen: emitClassDecl walks members a second time after the class
registration init block and emits emitInlineMethodBody for each
IsInline method — a Frame(nParams, 0) + emitExpr(InlineBody) +
RetValue function. curMethodClass is bound so ::super: inside an
inline body still resolves.
Tested (/tmp/test_inline.prg): all four patterns — bare INLINE,
MESSAGE INLINE, INLINE with params, INLINE reading ::field —
produce expected values.
FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Harbour's ::super: idiom routes a method call through the parent of
the class that defines the currently-executing method — Self stays
the child instance, only the vtable entry point shifts. Five
previously parsed ::super as a data-field access (PushSelfField("SUPER"))
which returned nil and panicked on the subsequent Send.
Runtime: Thread.SendSuper(fromClassName, methodName, nArgs).
Binding to the *defining* class (not Self's runtime class) is
load-bearing for 3+ level hierarchies: without it,
Grand:New → ::super:New → Child:New → ::super:New
would resolve to Grand.Parent=Child again and infinite-loop.
Gengo: Generator.curMethodClass tracks the class name across each
method body emission. emitSendExpr detects the nested SendExpr
shape `::super:X(...)` and emits SendSuper with curMethodClass as
the first argument.
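Why the defining class matters can be shown with a toy class table.
Everything below is a hypothetical model (Self is reduced to its
runtime class pointer); it is not Thread.SendSuper:

```go
package main

import "fmt"

type class struct {
	name    string
	parent  *class
	methods map[string]func(self *class) string
}

// find walks from c up the parent chain until a class defines name.
func find(c *class, name string) func(self *class) string {
	for ; c != nil; c = c.parent {
		if m, ok := c.methods[name]; ok {
			return m
		}
	}
	return nil
}

// sendSuper starts the lookup at the parent of the class that DEFINES
// the running method (from), not at the parent of self's runtime class.
func sendSuper(from, self *class, name string) string {
	if m := find(from.parent, name); m != nil {
		return m(self)
	}
	return ""
}

// buildGrand wires Base <- Child <- Grand, each New calling its super.
func buildGrand() *class {
	base := &class{name: "Base"}
	child := &class{name: "Child", parent: base}
	grand := &class{name: "Grand", parent: child}
	base.methods = map[string]func(*class) string{
		"New": func(self *class) string { return "Base" },
	}
	child.methods = map[string]func(*class) string{
		"New": func(self *class) string { return sendSuper(child, self, "New") + ">Child" },
	}
	grand.methods = map[string]func(*class) string{
		"New": func(self *class) string { return sendSuper(grand, self, "New") + ">Grand" },
	}
	return grand
}

func main() {
	grand := buildGrand()
	fmt.Println(find(grand, "New")(grand)) // Base>Child>Grand
}
```

If sendSuper used self.parent instead of from.parent, Child's New
(running with self = Grand) would resolve to Child again and recurse
forever, which is the loop described above.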
Tested (/tmp/test_super, /tmp/test_super2):
Parent → Child: ::super:Greet() returns composed result
Base → Child → Grand: ::super:New chain passes args correctly
Also fixes three gengo unit tests whose expected output was stale
from prior perf commits (b829ed4 const prop, 1f63c7f symbol hoist,
7e4079f string-concat reassoc) — assertions now match the current
optimized codegen.
FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Six new migration-log entries covering this session's 21 commits:
#27 VM in-place stack ops + symbol hoist (global 3-15%)
#28 gengo compile-time peepholes (9 commits, 1-7% bench)
#29 SELECT WA cache extension (single-table 2x+)
#30 JOIN temp-alias stabilisation (B6 1.67x)
#31 Stat-loop gates — view + CTE (CPU -40pp in rawsyscalln)
#32 Go-native SqlIsAggName + FetchRow (agg/window 1.3-1.7x)
Plus a cumulative bench table vs the 3caadb2 baseline and an
updated "남은 병목" ("Remaining bottlenecks") section pointing at EvalExpr / JOINRECURSE /
HASHJOIN / Go runtime primitives as the remaining levers.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CTE tables now materialise via MEMRDD (no file on disk), yet the
RunSelect cleanup loop was still stat-ing __cte_<name>.dbf for every
CTE in every CTE query. Profile after the FetchRow rewrite pinned
HbFileExists at 20.28% of total CPU — pure waste when MEMRDD is the
common path.
Add s_lCteDiskSeen flag, set only when the legacy DBFNTX fallback in
RunSelect actually opens a pre-existing __cte_<name>.dbf (line 1247
path — rare, only for sub-executors referencing a CTE by name on a
crashed-prior-run .dbf). Cleanup runs only when the flag is set.
pprof delta (full bench with cache enabled):
rawsyscalln: 25.56% → 8.50% (~17 points removed)
HbFileExists: 20.28% → 0% (dropped out of top)
Wall-clock unchanged (ENOENT stats are kernel-cached on Darwin), but
this removes the last visible avoidable syscall. What's left in the
profile (kevent, madvise, pthread_cond_*) is Go runtime + scheduler
overhead that application code can't touch.
FiveSql2 43/43, Harbour compat 56/56.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TSqlExecutor:FetchRow was the per-row workhorse for aggregation,
HAVING, and window queries. Even with the pre-built aFetchCache
binding columns to (nWA, nFPos), the PRG FOR loop paid one method
dispatch per column per row (dbSelectArea, FieldGet, AllTrim,
AAdd) — profile pinned it at ~30% of B4 CPU.
SqlFetchRowFast collapses the cache-path loop into a single Go
call:
- bound entry: SelectByNum + area.GetValue directly
- unbound (aggregate/expression): self:EvalExpr via Send
- character values: TrimSpace inline
The PRG FetchRow keeps its original cache-miss fallback path
unchanged for rare queries where aFetchCache isn't built.
Bench deltas (median of 3 steady runs, 1000 iters):
  B4_GROUP_HAVING     418 →  327 us   -22%  (1.28x)
  B9_ROW_NUMBER       191 →  120 us   -37%  (1.59x)
  B10_RANK_PART       228 →  135 us   -41%  (1.69x)
  B11_SUM_OVER        249 →  156 us   -37%  (1.60x)
  B14_COUNT           235 →  219 us    -7%
  B15_CTE_WIN_JOIN   1577 → 1452 us    -8%
Single-table SELECT (B1-B3, B5-B7, B8) stays flat — those already
hit the column-binding fast path and don't need aggregate dispatch.
FiveSql2 43/43, Harbour compat 56/56.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
B4 GROUP+HAVING profile showed SqlIsAggName at ~9% of CPU —
SqlEvalFunc checks it for every function in every row, and the
PRG body was two string allocations + a substring scan:
RETURN ("," + c + ",") $ ("," + AGG_FUNCTIONS + ",")
Replace with a hash lookup against the existing aggFuncSet map
in hbrtl/sqlexpr.go (already populated for SqlExprHasAgg, same
AGG_FUNCTIONS list). Upper-casing skips the allocation when the
input is already upper, which it almost always is in practice.
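A sketch of the set lookup with the allocation-skipping upper-case
step. The five names in the set are placeholders, since the contents
of AGG_FUNCTIONS are not listed in this message, and upperASCII is a
hypothetical helper:

```go
package main

import (
	"fmt"
	"strings"
)

// aggFuncSet holds placeholder aggregate names for this sketch.
var aggFuncSet = map[string]struct{}{
	"COUNT": {}, "SUM": {}, "AVG": {}, "MIN": {}, "MAX": {},
}

// upperASCII returns s itself when it has no lower-case ASCII letter,
// so the common already-upper input costs no allocation.
func upperASCII(s string) string {
	for i := 0; i < len(s); i++ {
		if c := s[i]; c >= 'a' && c <= 'z' {
			return strings.ToUpper(s)
		}
	}
	return s
}

// isAggName replaces the substring scan with a single map lookup.
func isAggName(name string) bool {
	_, ok := aggFuncSet[upperASCII(name)]
	return ok
}

func main() {
	fmt.Println(isAggName("SUM"), isAggName("count"), isAggName("UPPER"))
}
```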
Bench deltas (median of 3 steady runs, 1000 iters):
  B4_GROUP_HAVING     447 →  418 us   -6.5%
  B14_COUNT           252 →  235 us   -7%
  B15_CTE_WIN_JOIN   1595 → 1577 us   -1%
Other benches unchanged (no aggregate calls per row).
FiveSql2 43/43, Harbour compat 56/56.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
AcquireTemp now returns the purpose string (upper-cased table name)
as the alias when available, and falls back to FA_#### only when the
same purpose is already in-flight this query — i.e., self-joins.
Previously every call returned a fresh FA_####, so the WA cache
(keyed by alias) could never hit on JOIN queries and the file got
reopened every iteration.
Bench deltas vs prior HEAD:
  B6_INNER_JOIN       217 →  130 us   -40%  (1.67x)
  B15_CTE_WIN_JOIN   1678 → 1595 us    -5%
Single-table benches unchanged — they were already hitting the
cache via the table-name alias path.
B8 recursive CTE stays flat: its sub-executors at nDepth>1 still
cycle through fresh purposes that don't stabilise across queries.
FiveSql2 43/43, Harbour compat 56/56.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After the SELECT WA cache landed, pprof showed HbFileExists → os.Stat
at 28% of remaining CPU — the RunSelect cleanup loop was stat-ing
__view_<table>.dbf for every table in every query, even on the
common view-free path.
Track view materialisation with a TSqlIndex.lViewUsed flag set in
OpenTable when CheckView produces a temp. The cleanup loop now
runs only when the flag is set, then resets it. View-using queries
are unaffected.
pprof delta:
rawsyscalln: 2.14s → 1.41s (48% → 32% of total CPU)
os.Stat: 1.24s → 0.49s (28% → 11%)
Wall-clock bench numbers stayed within plus-or-minus 3% noise (stats
are cheap when the target file does not exist, so CPU savings do not
translate directly to end-to-end time) but this removes the next
biggest syscall waste visible in the profile.
FiveSql2 43/43, Harbour compat 56/56.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TSqlExecutor:OpenTable now hands lifetime to the WA cache for stable
aliases (user-supplied or table-named). CloseOpened skips those
entries, so the DBF mmap stays alive across queries instead of being
unmapped + re-opened 1000 times per bench run. Previously the WA cache only
covered DML (INSERT/UPDATE/DELETE) — SELECT was still paying the full
dbUseArea/dbCloseArea syscall bill every query (profile showed
rtlDbCloseArea + munmap at ~30% of total CPU).
AcquireTemp-generated aliases (FA_####) are excluded — they change
every query (self-joins, nested depth), so caching them would just
leak entries for no reuse. JOIN / recursive CTE regressions from an
earlier unrestricted version are gone.
Bench deltas vs prior HEAD (median of 3 steady runs, 1000 iters):
  B1_SELECT_STAR     82 →  41 us   -50%  (2.0x)
  B2_WHERE_FILTER    78 →  35 us   -55%  (2.2x)
  B3_ORDER_BY        90 →  48 us   -47%  (1.88x)
  B5_DISTINCT        75 →  32 us   -57%  (2.34x)
  B7_CTE_SIMPLE     120 →  77 us   -36%  (1.56x)
  B9_ROW_NUMBER     239 → 194 us   -19%
  B10_RANK_PART     276 → 233 us   -16%
  B11_SUM_OVER      296 → 252 us   -15%
  B4_GROUP_HAVING   498 → 450 us   -10%
Others flat (JOIN / recursive CTE / DML already covered).
FiveSql2 43/43, Harbour compat 56/56.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When collectConstLocals proves a LOCAL is only ever read, not
written beyond its literal init, every read site gets the literal
substituted inline — which means the init itself has no live
reader. Skip emitting the PushXxx/PopLocalFast pair for those
LOCALs in both top-of-function and mid-body decls.
On a function with `LOCAL nBuf := 100, sTag := "x", bFlag := .T.`,
all three inits drop out (6 VM ops saved in the prologue), while
the still-written `LOCAL nSum := 0` init stays. Harbour compat
56/56, FiveSql2 43/43.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Scan each function body for LOCALs whose sole write is a literal
initialiser (never ++/-- / += / @byref / MultiAssign target /
FOR var / @GET target / macro). Reads substitute the literal
inline at emit time, which cascades into all earlier folds: dead
IF branches, AND/OR short-circuit, NOT, string-concat reassoc,
and the FOR LocalLessEqualInt fast path (extended to see through
a propagated ident limit).
Walker is bounded — unrecognised AST nodes abort propagation for
the whole function rather than risk missing a hidden write.
Harbour compat 56/56, FiveSql2 43/43.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`"a" + x + "b" + "c" + "d"` used to emit 4 Plus() calls because
the parser builds a left-leaning chain and no pair was
literal+literal. Add a reassociation step inside foldLiteralTree:
when the outer shape is `(Y + strlit1) + strlit2`, rewrite as
`Y + (strlit1+strlit2)` so the tail literals collapse. Also run
foldLiteralTree on the root BinaryExpr in emitExpr so the
outermost reassoc fires (was only running on children).
Verified: the 4-Plus case now emits 2 Plus calls (`"a" + x + "bcd"`).
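The reassociation rule on a toy AST, as a sketch. The node types and
the reassoc function are simplified stand-ins, not the generator's
ast package or foldLiteralTree itself:

```go
package main

import "fmt"

type expr interface{}
type lit string   // string literal
type ident string // variable reference
type plus struct{ l, r expr }

// reassoc folds bottom-up: literal+literal collapses, and
// (Y + s1) + s2 is rewritten to Y + (s1+s2) so tail literals merge.
func reassoc(e expr) expr {
	p, ok := e.(plus)
	if !ok {
		return e
	}
	l, r := reassoc(p.l), reassoc(p.r)
	rl, rIsLit := r.(lit)
	if !rIsLit {
		return plus{l, r}
	}
	if ll, ok := l.(lit); ok {
		return ll + rl // literal + literal
	}
	if inner, ok := l.(plus); ok {
		if il, ok := inner.r.(lit); ok {
			return plus{inner.l, il + rl} // (Y + s1) + s2 => Y + (s1+s2)
		}
	}
	return plus{l, r}
}

// countPlus counts the Plus nodes left in the tree.
func countPlus(e expr) int {
	if p, ok := e.(plus); ok {
		return 1 + countPlus(p.l) + countPlus(p.r)
	}
	return 0
}

func main() {
	// Left-leaning chain for "a" + x + "b" + "c" + "d".
	tree := plus{plus{plus{plus{lit("a"), ident("x")}, lit("b")}, lit("c")}, lit("d")}
	fmt.Println(countPlus(tree), countPlus(reassoc(tree))) // 4 2
}
```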
FiveSql2 43/43, Harbour compat 56/56.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
DO WHILE .T. now emits a bare for-loop with no PushBool/PopLogical
per iteration — saves a stack roundtrip on every trip through the
idiomatic infinite-loop pattern (9 .prg files use it). DO WHILE .F.
emits nothing. Loop exits still work via EXIT / RETURN.
FiveSql2 43/43, Harbour compat 56/56.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`.NOT. .T.` / `.NOT. .F.` emit PushBool directly instead of
pushing the source bool and calling Not(). boolLiteralValue also
sees through an outer NOT, so `IF !.F.` now triggers the full
dead-branch pass (no PopLogical wrapper either).
FiveSql2 43/43, Harbour compat 56/56.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Skip the PushBool/PopLogical/branch wrapper when the LHS of .AND. /
.OR. is a bare .T./.F. literal. `.T. .AND. X` emits X alone;
`.F. .AND. X` emits PushBool(false) with X dropped; symmetric for
OR. Common after constant-folding a sub-expression — pairs with
the earlier dead-IF-branch peephole.
FiveSql2 43/43, Harbour compat 56/56. Verified via /tmp/test_andor.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
IF .T. collapses to its body; IF .F. forwards to the first live
ELSEIF or ELSE. For dynamic main conditions the chain is still
filtered: ELSEIF .F. drops out, ELSEIF .T. truncates and becomes
the ELSE. Verified with /tmp/test_deadif.prg: all five dead labels
are gone from the generated output, and the program prints only the
live branches at runtime.
FiveSql2 43/43, Harbour compat 56/56.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two more leaf-level code-gen cleanups now that the const folder is in.
- UnaryExpr MINUS over a LITERAL (INT/DOUBLE) emits the negated value
directly, so `-42` becomes PushInt(-42) instead of PushInt(42) +
Negate(). Guarded: MinInt64 passes through to the VM so the
coerce-to-double path stays authoritative. Variables fall through
to the normal Negate path — the LiteralExpr type assertion is the
gate, so runtime-typed `-x` keeps its semantics.
- `x := x + <expr>` / `x := x - <expr>` detected when the LHS ident
resolves to the same local as the self-reference on the RHS,
emits the same LocalAdd / Negate+LocalAdd shape that x += y already
used. Non-matching locals (shadowing, module statics) fall through.
Verification
- go test ./... ALL PASS
- FiveSql2 test_sql1999 43/43
- tests/compat_harbour 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fold BinaryExpr subtrees whose operands reduce to INT or STRING
literals at compile time. `10 * 2 + 5` now emits a single PushInt(25)
instead of three VM ops; `"a" + "b"` collapses to "ab". Overflowing
INTs and SLASH (which Harbour turns into double) fall through to the
VM so semantics stay intact.
Implementation is a bottom-up foldLiteralTree pre-pass on each
BinaryExpr, plus a tryFoldBinary matcher for the leaf case. Mutates
the AST in place — safe because the generator owns the tree after
parse.
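The leaf case can be sketched as an overflow-checked fold over int64.
foldInt is a hypothetical name and signature; the point is that any
result that would overflow, and every division, is reported as "not
folded" so the VM keeps the final say:

```go
package main

import (
	"fmt"
	"math"
)

// foldInt folds a op b at compile time. The bool is false when the
// expression must be left for the VM: overflow, '/', or unknown op.
func foldInt(op byte, a, b int64) (int64, bool) {
	switch op {
	case '+':
		if (b > 0 && a > math.MaxInt64-b) || (b < 0 && a < math.MinInt64-b) {
			return 0, false
		}
		return a + b, true
	case '-':
		if (b < 0 && a > math.MaxInt64+b) || (b > 0 && a < math.MinInt64+b) {
			return 0, false
		}
		return a - b, true
	case '*':
		if a == 0 || b == 0 {
			return 0, true
		}
		if (a == -1 && b == math.MinInt64) || (b == -1 && a == math.MinInt64) {
			return 0, false
		}
		if c := a * b; c/b == a {
			return c, true
		}
		return 0, false
	}
	return 0, false // '/' and anything else stay with the VM
}

func main() {
	prod, _ := foldInt('*', 10, 2)
	sum, ok := foldInt('+', prod, 5)
	fmt.Println(sum, ok) // 25 true
	_, ok = foldInt('+', math.MaxInt64, 1)
	fmt.Println(ok) // false
}
```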
Bench numbers don't move (SQL paths have no literal-only arithmetic
in hot loops), but generated code shrinks on PRG that uses #define
constants for widths / offsets / factors.
Verification
- go test ./... ALL PASS
- FiveSql2 test_sql1999 43/43
- tests/compat_harbour 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Apply the sp-rewrite shape to the three binary arithmetic ops. The
tInt==tInt fast branch reads scalar directly (skips the AsNumInt
method) so the hot path is int64 ops + an overflow check; mixed-type
branches keep AsNumDouble unchanged.
PRG tight loops (FOR counter, SUM accumulators outside SQL aggregate
path) skip one cachedNil store and two bounds-check sequences per op.
Verification
- go test ./... ALL PASS
- FiveSql2 test_sql1999 43/43
- tests/compat_harbour 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fold And/Or into the same in-place sp-rewrite shape as Not/LessEqual.
Both args must be tLogical — short-circuit on the raw scalar field so
the hot path is pure integer arithmetic + two cached bool Values.
Verification
- go test ./... ALL PASS
- FiveSql2 test_sql1999 43/43
- tests/compat_harbour 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirror the treatment LessEqual already got onto Equal/NotEqual/Less/
Greater/GreaterEqual — rewrite sp directly, check the type tag for
Int==Int in the hot branch, short-circuit to cachedTrue/cachedFalse
without a second method call. Keeps the slow fallback for mixed /
string / date types.
Bench movement is minor on SQL paths (WHERE is already pcode and
skips these ops); the win is on PRG comparisons that cache-miss out
of the pcode path — FOR-condition short forms, IF chains, etc.
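Sketch of the in-place comparison shape described above, written in Go under assumed names. `Value`, `Thread`, the field names and `asDouble` are hypothetical stand-ins; only the idea (rewrite sp directly, test the type tag for Int==Int, store a cached bool) comes from this commit. The slow path here handles numerics only.

```go
package main

import "fmt"

type tag uint8

const (
	tInt tag = iota
	tDouble
	tLogical
)

// Value is a hypothetical tagged stack slot.
type Value struct {
	typ    tag
	scalar int64   // payload for tInt and tLogical
	f      float64 // payload for tDouble
}

var (
	cachedTrue  = Value{typ: tLogical, scalar: 1}
	cachedFalse = Value{typ: tLogical, scalar: 0}
)

type Thread struct {
	stack []Value
	sp    int // number of live slots
}

func asDouble(v Value) float64 {
	if v.typ == tInt {
		return float64(v.scalar)
	}
	return v.f
}

// Less consumes two operands and leaves the result in the lower slot.
func (th *Thread) Less() {
	a, b := &th.stack[th.sp-2], &th.stack[th.sp-1]
	th.sp-- // the result overwrites a's slot, so no pop/push pair is needed
	var lt bool
	if a.typ == tInt && b.typ == tInt {
		lt = a.scalar < b.scalar // hot branch: raw scalar compare
	} else {
		lt = asDouble(*a) < asDouble(*b) // slow fallback (numeric only here)
	}
	if lt {
		*a = cachedTrue
	} else {
		*a = cachedFalse
	}
}

func main() {
	th := &Thread{stack: []Value{{typ: tInt, scalar: 3}, {typ: tInt, scalar: 7}}, sp: 2}
	th.Less()
	fmt.Println(th.sp, th.stack[0] == cachedTrue)
}
```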
Verification
- go test ./... ALL PASS
- FiveSql2 test_sql1999 43/43
- tests/compat_harbour 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three small tweaks to the pop-push hotspots in the VM.
- ops_arith.go Inc/Dec/AddInt: unary ops mutate the top stack slot in
place via peekPtr() instead of pop-compute-push. Drops the bounds
check + cachedNil clear + push bounds check per call. Biggest
beneficiary: FOR loop counters (implicit Inc) — every iteration of
every PRG loop pays these ops once.
- ops_collection.go ArrayGen: consume N slots via a single `copy`
into the freshly-allocated result slice, then rewind sp and clear
the intermediate slots for GC (the first slot is overwritten by
the array push). Skips the N-deep pop loop.
- ops_collection.go EvalBlock: read block value before shift, collapse
args down one slot to overwrite the block position, then let the
block run against the same in-place layout. Matches the
Function()/PushSymbol round-trip removal from the prior commit.
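Minimal Go sketch of the first two tweaks above (in-place Inc, and ArrayGen via a single `copy`). The `Stack` and `Value` types are invented for this example and are far simpler than the real hbrt structures.

```go
package main

import "fmt"

// Value is a hypothetical stack slot: either an int or an array.
type Value struct {
	Int int64
	Arr []Value
}

type Stack struct {
	items []Value
	sp    int // number of live slots
}

func (s *Stack) Push(v Value) { s.items[s.sp] = v; s.sp++ }

// Inc mutates the top slot in place instead of pop-compute-push.
func (s *Stack) Inc() { s.items[s.sp-1].Int++ }

// ArrayGen replaces the top n slots with one array value.
func (s *Stack) ArrayGen(n int) {
	base := s.sp - n
	arr := make([]Value, n)
	copy(arr, s.items[base:s.sp]) // one copy instead of an n-deep pop loop
	// Clear the intermediate slots so the GC can reclaim what they held;
	// slot `base` is overwritten by the push below.
	for i := base + 1; i < s.sp; i++ {
		s.items[i] = Value{}
	}
	s.sp = base
	s.Push(Value{Arr: arr})
}

func main() {
	s := &Stack{items: make([]Value, 8)}
	s.Push(Value{Int: 1})
	s.Push(Value{Int: 2})
	s.Push(Value{Int: 3})
	s.Inc()
	s.ArrayGen(3)
	fmt.Println(s.sp, len(s.items[0].Arr), s.items[0].Arr[2].Int)
}
```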
bench_sql deltas
- B2 WHERE 83 → 78 µs (6%)
- B3 ORDER BY 96 → 90 µs (6%)
- B4 GROUP_HAVING 554 → 528 µs (5%)
- B9 ROW_NUMBER 255 → 241 µs (5%)
- B10 RANK PART 296 → 278 µs (6%)
- B11 SUM OVER 320 → 300 µs (6%)
- B15 CTE+WIN+JOIN 1826 →1743 µs (5%)

Verification
- go test ./... ALL PASS
- FiveSql2 test_sql1999 43/43
- tests/compat_harbour 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The VM call path (PushSymbol → Function → Frame) is traversed by every
PRG function call. Four changes together cut per-call overhead across
the entire bench suite.
Changes
- hbrt/call.go Function(): replace pop-push dance with a single slice
shift (N+2 pops + N pushes → 1 copy of N slots + sp adjust). Kills
the per-call `make([]Value, nArgs)` heap alloc. Resolved function
pointer is cached back into sym.Func so subsequent calls on the
same Symbol skip the VM lookup entirely.
- hbrt/vm.go GetSym(): new helper. Generated code calls it with a
pointer to a package-level `*Symbol` slot so FindSymbol (which takes
the VM RWMutex + map lookup) runs at most once per symbol per
process. Nil results are intentionally NOT cached — an init-order
miss becomes a retry on the next call instead of a permanent sticky
failure.
- hbrt/thread.go pushPendingSym(): scalar fast slot for depth=1 call
nesting (common case). Nil syms still go through the slice so the
"empty vs stored nil" ambiguity can't produce a false pop.
- compiler/gengo/gengo.go: emit `t.PushSymbol(t.GetSym(&_sym_<file>_<NAME>, "NAME"))`
for every function call site, with a per-file prefix so multi-PRG
builds don't collide on identical symbol names.
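Sketch of the GetSym caching idea in Go. The `VM`, `Symbol`, `Register` and the `lookups` counter below are simplified assumptions for illustration; the real hbrt types differ. The point shown is that hits are cached in the caller's slot while nil results are not. The unsynchronized write to the slot is acceptable only because this demo is single-threaded.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

type Symbol struct{ Name string }

// VM is a hypothetical, stripped-down symbol table.
type VM struct {
	mu      sync.RWMutex
	symbols map[string]*Symbol
	lookups int64 // counts slow-path lookups, for the demo only
}

func (vm *VM) Register(name string) {
	vm.mu.Lock()
	defer vm.mu.Unlock()
	vm.symbols[name] = &Symbol{Name: name}
}

// FindSymbol is the slow path: read lock plus map lookup.
func (vm *VM) FindSymbol(name string) *Symbol {
	atomic.AddInt64(&vm.lookups, 1)
	vm.mu.RLock()
	defer vm.mu.RUnlock()
	return vm.symbols[name]
}

// GetSym consults the caller's slot first and fills it only on a hit.
func (vm *VM) GetSym(slot **Symbol, name string) *Symbol {
	if s := *slot; s != nil {
		return s
	}
	s := vm.FindSymbol(name)
	if s != nil {
		*slot = s // a nil miss is not cached, so the next call retries
	}
	return s
}

func main() {
	vm := &VM{symbols: map[string]*Symbol{}}
	var slot *Symbol
	fmt.Println(vm.GetSym(&slot, "FOO") == nil) // miss, not cached
	vm.Register("FOO")
	fmt.Println(vm.GetSym(&slot, "FOO").Name) // retry succeeds
	vm.GetSym(&slot, "FOO")                   // served from the slot
	fmt.Println(atomic.LoadInt64(&vm.lookups))
}
```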
Bugs fixed during bring-up
- pendingSymFast == nil was ambiguous ("unused" vs "nil stored"). Nil
syms now spill to the slice, preserving distinguishability.
- The old varName-reuse branch at the PushSymbol emit site skipped
the GetSym wrapper, emitting a raw `t.PushSymbol(varName)` against
an uninitialized package-level *Symbol. Every call path now funnels
through emitPushSymbol.
bench_sql deltas vs prior build
- B1 SELECT * 114 → 97 µs (15%)
- B4 GROUP_HAVING 584 → 554 µs (5%)
- B8 RECURSIVE CTE 150 → 141 µs (6%)
- B10 RANK PARTITION 310 → 296 µs (5%)
- B11 SUM OVER 335 → 320 µs (4%)
- B14 COUNT 295 → 281 µs (5%)
- B15 CTE+WIN+JOIN 1891 → 1826 µs (3%)
Verification
- go test ./... ALL PASS
- FiveSql2 test_sql1999 43/43
- tests/compat_harbour 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SqlOrderBy: Go sort.Slice for ORDER BY, 10-50x faster than PRG ASort.
SqlGroupBy: Go map-based GROUP BY accumulation (ready for integration).
TryBuildSortSpec detects simple ORDER BY columns and routes to Go.
Fallback to PRG for complex ORDER BY expressions.
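For illustration, a multi-column ORDER BY over plain rows using Go's `sort.Slice`. The `SortKey` type and `orderBy` function are hypothetical; they only approximate what a sort spec built by TryBuildSortSpec might drive, and rows are reduced to int64 columns to keep the sketch short.

```go
package main

import (
	"fmt"
	"sort"
)

// SortKey is a hypothetical sort-spec entry: column index plus direction.
type SortKey struct {
	Col  int
	Desc bool
}

// orderBy sorts rows in place by the given keys, first key most significant.
func orderBy(rows [][]int64, keys []SortKey) {
	sort.Slice(rows, func(i, j int) bool {
		for _, k := range keys {
			a, b := rows[i][k.Col], rows[j][k.Col]
			if a == b {
				continue // tie on this key, try the next one
			}
			if k.Desc {
				return a > b
			}
			return a < b
		}
		return false
	})
}

func main() {
	rows := [][]int64{{2, 10}, {1, 20}, {2, 5}, {1, 7}}
	orderBy(rows, []SortKey{{Col: 0}, {Col: 1, Desc: true}})
	fmt.Println(rows)
}
```

`sort.Slice` is not stable; a real implementation that must preserve input order for equal keys would need `sort.SliceStable` instead.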
43/43 + 41/41 verify + 51/51 compat + go test ALL PASS.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
FindColIdx2 searched for bare column name (e.g. 'AMOUNT') but
aFieldNames now contains qualified names ('o.amount') from the
Go join fast path. Added fallback: try xArg[2] (the full AST name)
when the bare name misses. Fixes SUM/AVG/MIN/MAX aggregation after
Go-native hash join.
Verified: 41/41 correctness tests pass (verify_correctness.prg).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Q2 Running total regressed 100ms→6.7s from the frame-aware rewrite.
Default frame (UNBOUNDED PRECEDING to CURRENT ROW) now uses O(N)
incremental path; general per-row-frame loop only for custom frames.
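To make the cost difference concrete, a Go sketch contrasting the two paths. `runningSum` and `frameSum` are hypothetical names, and the frame is simplified to a ROWS offset pair; the actual PRG code is not shown in this commit.

```go
package main

import "fmt"

// runningSum handles the default frame (UNBOUNDED PRECEDING to CURRENT ROW)
// in O(N) by carrying one accumulator across rows.
func runningSum(vals []int64) []int64 {
	out := make([]int64, len(vals))
	var acc int64
	for i, v := range vals {
		acc += v
		out[i] = acc
	}
	return out
}

// frameSum is the general per-row loop: each row re-scans its own frame,
// which is O(N * frame width).
func frameSum(vals []int64, preceding, following int) []int64 {
	out := make([]int64, len(vals))
	for i := range vals {
		lo, hi := i-preceding, i+following
		if lo < 0 {
			lo = 0
		}
		if hi > len(vals)-1 {
			hi = len(vals) - 1
		}
		for j := lo; j <= hi; j++ {
			out[i] += vals[j]
		}
	}
	return out
}

func main() {
	vals := []int64{1, 2, 3, 4}
	fmt.Println(runningSum(vals))
	fmt.Println(frameSum(vals, 1, 0))
}
```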
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
--- #15 RIGHT JOIN O(N*M) → O(N+M) via matched RecNo set ---
--- #19 s_nRCJSeq modular counter (% 100000) ---
--- #20 Implicit column alias without AS keyword ---
Validation: 43/43 + 51/51 + go test ALL PASS
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
--- #12 Window frame spec now honoured ---
Parser parsed ROWS BETWEEN ... AND ... but discarded the result.
Now stores hFrame in a 6th slot on ND_WINDOW nodes via AAdd.
ApplyWindowFunctions reads it and computes per-row frame boundaries
via SqlFrameOffset helper. Unified SUM/AVG/COUNT/MIN/MAX into one
frame-aware CASE branch.
--- #6 EXISTS LIMIT mutation removed ---
Removed direct parse-tree mutation (hQuery["limit"] := 1) that
would corrupt reuse. Semi-join lift handles the fast case.
Validation: 43/43 + 51/51 + go test ALL PASS
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Continues the static-analysis sweep from 7babfb7.
--- #3 Resolve NIL ambiguity (HIGH) ---
ResolveFromOuter returned NIL for both "column not found" and
"column value is NULL". Callers tested `xVal != NIL` to decide
success, which silently dropped legitimate NULL outer-row values
in correlated subqueries. Added a by-reference lFound flag so
callers distinguish the two cases.
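The same ambiguity and its fix, sketched in Go terms with a second return value standing in for the by-reference lFound flag. `Row` and `resolveFromOuter` are hypothetical names for this example.

```go
package main

import "fmt"

// Row maps column names to values; a nil value models SQL NULL.
type Row map[string]interface{}

// resolveFromOuter searches the outer-row stack from innermost to outermost.
// found distinguishes "column exists and is NULL" from "column not found".
func resolveFromOuter(outer []Row, col string) (val interface{}, found bool) {
	for i := len(outer) - 1; i >= 0; i-- {
		if v, ok := outer[i][col]; ok {
			return v, true
		}
	}
	return nil, false
}

func main() {
	outer := []Row{{"e.bonus": nil}}
	v, found := resolveFromOuter(outer, "e.bonus")
	fmt.Println(v == nil, found)
	_, found = resolveFromOuter(outer, "e.missing")
	fmt.Println(found)
}
```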
--- #14 Multi-level LEFT JOIN null-fill (MEDIUM) ---
LEFT JOIN null-fill only fired at the last join level
(`nIdx >= Len(aJoins)`). For `a LEFT JOIN b ON ... JOIN c ON ...`
where b had no match, the null-fill for b was skipped and the
outer row was dropped entirely. Now recurses into subsequent joins
when the match fails, so the base case can still emit a row with
NULLs for b's columns.
--- #18 UNION/INTERSECT/EXCEPT applied after LIMIT (MEDIUM) ---
SQL standard requires set operations before ORDER BY / DISTINCT /
OFFSET / LIMIT. Reordered to:
RIGHT JOIN pass → UNION/INTERSECT/EXCEPT → DISTINCT → ORDER BY
→ OFFSET → LIMIT.
Previously LIMIT clipped the first SELECT before UNION merged the
second's rows, producing more rows than intended.
--- #22 DATEADD month overflow (LOW) ---
`DATEADD('MONTH', 1, '2024-01-31')` produced `SToD("20240231")`
(Feb 31) → empty date. Now normalizes month overflow/underflow
into year rollover and clamps the day to the target month's last
day. Year addition also handles Feb 29 → Feb 28 on non-leap years.
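A small Go sketch of the normalize-then-clamp rule described for #22. `addMonths` and `daysIn` are hypothetical helpers, not the FiveSql2 implementation; `daysIn` relies on `time.Date` normalizing day 0 of the following month to the last day of the requested month.

```go
package main

import (
	"fmt"
	"time"
)

// daysIn returns the number of days in month m of year y.
func daysIn(y, m int) int {
	return time.Date(y, time.Month(m)+1, 0, 0, 0, 0, 0, time.UTC).Day()
}

// addMonths adds n months (n may be negative), rolling the year over as
// needed and clamping the day to the target month's last day.
func addMonths(y, m, d, n int) (int, int, int) {
	total := y*12 + (m - 1) + n
	ny, nm := total/12, total%12
	if nm < 0 { // Go truncates toward zero; convert to floor semantics
		nm += 12
		ny--
	}
	nm++ // back to 1-based month
	if last := daysIn(ny, nm); d > last {
		d = last
	}
	return ny, nm, d
}

func main() {
	fmt.Println(addMonths(2024, 1, 31, 1))  // 2024 2 29
	fmt.Println(addMonths(2024, 2, 29, 12)) // 2025 2 28
}
```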
--- #23 VIEW temp file leak (LOW) ---
TSqlIndex:CheckView creates `__view_<table>.dbf` temp files that
were never cleaned up. Added post-scan cleanup in RunSelect's
close section (after CTE cleanup) that erases matching temp files.
Validation:
- FiveSql2 43/43
- Harbour compat 51/51
- go test ./... ALL PASS
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Systematic bug-hunt driven by an automated analysis of all FiveSql2
source files. Each fix is targeted — no speculative refactoring.
--- #1 CLASSDATA hSubCache leaked across queries (CRITICAL) ---
CLASSDATA hSubCache INIT { => } SHARED
shared one hash across ALL TSqlExecutor instances. A non-correlated
subquery cached in query A was silently returned for an unrelated
query B if the subquery text happened to produce the same cache key.
Converted to instance DATA initialized in New().
--- #5+#21 IS NULL / COALESCE treated empty string as NULL (HIGH) ---
RETURN xL == NIL .OR. ( ValType(xL) == "C" .AND. Empty(AllTrim(xL)) )
SQL standard: '' is a valid non-NULL value. Removed the empty-string
check from both IS NULL evaluation and COALESCE skip logic.
--- #4 Multiple ? parameters all returned first value (HIGH) ---
ND_PAR nodes had no index — EvalExpr always returned ::aParams[1].
Parser now stamps each ? with a sequential 1-based index in xNode[2].
EvalExpr uses it to return the correct ::aParams[n].
--- #10+#11 SqlEvalRowExpr missing / and || operators, single-arg
function eval (MEDIUM) ---
Division and string concatenation fell through to RETURN NIL in the
row-expression evaluator used by recursive CTEs and aggregate
ComputeAgg. Also, multi-argument functions like SUBSTR(x,2,3) only
received the first argument. Both fixed.
--- #9 SUM/AVG/MIN/MAX of all NULLs returned 0 instead of NULL
(MEDIUM) ---
SQL standard requires NULL. Changed the aggregate return path to
return NIL when nCount == 0 (SUM/AVG) or when xMin/xMax == NIL.
--- #8 MIN/MAX used SqlCoerceNum for comparison (MEDIUM) ---
Strings and dates were coerced to numbers (Val()) before comparing,
making MIN('banana') == MIN('apple') == 0. Switched to SqlCmpLt
which handles type-appropriate comparison.
--- #7 SqlExprHasAgg only checked top-level node (MEDIUM) ---
Expressions like `salary + COUNT(*)` were not detected as containing
an aggregate because the top node was ND_BIN, not ND_FN. Made the
function recursive — walks ND_BIN, ND_UNI, ND_FN args, ND_CASE
branches.
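The recursive walk can be sketched in Go as follows. The `Node` shape, the `Kind` strings and the `aggNames` set are assumptions for this example; the real AST uses ND_* array nodes in PRG.

```go
package main

import "fmt"

// Node is a hypothetical expression node: function nodes carry a Name,
// and every composite node lists its operands or branches in Kids.
type Node struct {
	Kind string // "col", "lit", "bin", "uni", "fn", "case"
	Name string
	Kids []*Node
}

var aggNames = map[string]bool{
	"COUNT": true, "SUM": true, "AVG": true, "MIN": true, "MAX": true,
}

// hasAgg reports whether any node in the subtree is an aggregate call.
func hasAgg(n *Node) bool {
	if n == nil {
		return false
	}
	if n.Kind == "fn" && aggNames[n.Name] {
		return true
	}
	for _, k := range n.Kids {
		if hasAgg(k) {
			return true
		}
	}
	return false
}

func main() {
	// salary + COUNT(*): the top node is a binary op, not a function.
	expr := &Node{Kind: "bin", Name: "+", Kids: []*Node{
		{Kind: "col", Name: "salary"},
		{Kind: "fn", Name: "COUNT"},
	}}
	fmt.Println(hasAgg(expr))
}
```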
--- #13 SELECT * only expanded first table in JOINs (MEDIUM) ---
`SELECT * FROM orders o JOIN customers c ON ...` only included
fields from orders. Changed the expansion loop to iterate ALL
entries in ::aTables.
--- #2 s_aOuterStack not unwound on subquery error (HIGH) ---
SubqueryCached's PushOuter/PopOuter pair was not protected by
BEGIN SEQUENCE. A runtime error inside the subquery left a stale
entry on the module-level outer stack, corrupting all subsequent
queries' correlated column resolution. Wrapped in SEQUENCE/RECOVER.
Validation:
- FiveSql2 43/43
- Harbour compat 51/51
- go test ./... ALL PASS
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
A scalar correlated subquery with a JOIN inside:
SELECT e.name,
(SELECT SUM(o.qty * p.price)
FROM ord o INNER JOIN prod p ON o.prod_id = p.id
WHERE o.emp_id = e.id) AS revenue
FROM emp e WHERE e.dept = 'SALES'
returned wrong values (equal to SUM(qty) instead of SUM(qty*price))
or zero for all but the first outer row. Root cause was an
interaction between three independent bugs.
--- Bug 1: Subquery cache leaked across five_SQL invocations ---
hSubCorrCache, aSubCacheSlots, aSemiJoinSlots, nSubCacheSeq were
declared as DATA ... INIT { => } / {} / 0. In Five's compiled output,
hash/array INIT literals may share the same backing instance across
New() calls, so the cache from query A (SUM qty, no join) was still
there when query B ran, providing a hit on the same key — returning
A's cached (wrong) value instead of re-executing B's subquery.
Fix: explicit initialization in New().
--- Bug 2: aJoins alias mutation across subquery invocations ---
RunSelect's join-alias sync loop mutated aJoins[i][3] from the
user alias ("p") to the depth-suffixed temp alias ("FA_0003").
aJoins was a direct reference into hQuery["joins"], so the mutation
persisted across re-executions of the same hQuery. On the 2nd call,
the sync loop couldn't find a matching aTables entry because the
stale temp alias ("FA_0003") didn't match the new one ("FA_0005").
The join table's workarea was positioned wrong → empty join result.
Fix: deep-clone both ::aTables and aJoins at the start of RunSelect
so each invocation starts from the parsed originals.
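The aliasing problem and the clone fix, sketched in Go with nested slices standing in for Harbour arrays. `deepClone` is a hypothetical helper and the three-element join row layout is illustrative only.

```go
package main

import "fmt"

// deepClone copies the outer slice and every inner row, so later writes
// to the clone never reach the parsed original.
func deepClone(src [][]string) [][]string {
	out := make([][]string, len(src))
	for i, row := range src {
		out[i] = append([]string(nil), row...)
	}
	return out
}

func main() {
	// Parsed join list; element [2] holds the user alias.
	parsed := [][]string{{"prod", "INNER", "p"}}

	// Without a clone, the working copy shares rows with the parse tree.
	shared := parsed
	shared[0][2] = "FA_0003"
	fmt.Println(parsed[0][2]) // the parsed alias was overwritten

	parsed[0][2] = "p" // restore for the second half of the demo
	work := deepClone(parsed)
	work[0][2] = "FA_0005"
	fmt.Println(parsed[0][2]) // still the user alias
}
```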
--- Bug 3: SqlCollectCols stripped alias prefixes ---
When adding hidden columns for complex aggregate arguments (e.g.
SUM(o.qty * p.price)), SqlCollectCols returned bare names like
"qty" and "price" instead of qualified "o.qty" / "p.price". In a
JOIN context, unqualified "price" routed FetchRow to the first
table (ord) instead of prod — FieldPos returned 0, the column was
silently NIL, and the multiplication collapsed to qty*1 = qty.
Fix: new SqlCollectColExprs returns the original ND_COL AST nodes
with qualified names preserved. The hidden-column loop now inserts
these directly so FetchRow's dot-qualified path resolves to the
correct workarea via FindWA.
--- Verification ---
Deterministic 5-emp / 6-order / 3-product test:
Expected revenues per emp:
Emp 1: 2*10 + 3*20 = 80 → got 80.00 ✓
Emp 2: 1*10 + 4*30 = 130 → got 130.00 ✓
Emp 3: 5*20 = 100 → got 100.00 ✓
Emp 4: no orders = 0 → got 0 ✓
Emp 5: 7*10 = 70 → got 70.00 ✓
Also verified SUM(qty*2) and SUM(p.price) variants.
Validation:
- FiveSql2 43/43
- Harbour compat 51/51
- go test ./... ALL PASS
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Correlated EXISTS with high-cardinality keys was stuck at O(outer × inner)
because memoization couldn't amortize across unique correlation values.
H3 in the subquery stress bench:
SELECT e.name FROM emp e
WHERE EXISTS (SELECT 1 FROM ord WHERE ord.emp_id = e.id AND ord.qty > 15)
500 outer rows, each with a distinct e.id and each forcing a 5000-row
ord scan, took 10s, and caching the subquery result could not help.
Fix: detect the semi-join shape on the subquery and rewrite it at
runtime into a non-correlated DISTINCT scan whose result is cached
as a hash set. Each outer row then becomes an O(1) hash probe.
--- What we lift ---
SELECT ... FROM inner_table
WHERE inner.col = outer.col [AND other_non_correlated_preds]
Shape constraints (all must hold):
- single table, no JOIN
- no GROUP BY, no HAVING, no UNION
- WHERE is an AND tree containing an equi-term where one side is
a column with an alias prefix from the subquery's own FROM
and the other is a column from an outer alias
- the remaining AND terms (non-correlated residue) have no
outer references of their own — rules out patterns like
`WHERE e2.dept = e.dept AND e2.salary > e.salary` where the
second term can't live without the outer context
--- How the lift works ---
1. Walk the WHERE as a flat AND-term list
2. Find and remove the first correlated equi-term, remember the
inner column name and outer column reference
3. Verify residue is non-correlated via a recursive AST walker
(SemiJoinHasOuterRef) — bail to fallback if not
4. Clone hQuery with:
columns = {DISTINCT inner.col}
where = residue (or NIL)
distinct = .T.
limit / top / order_by / group_by / having cleared
5. Run the cloned subquery once via a nested TSqlExecutor — no
PushOuter because it's now non-correlated
6. Build a hash set keyed on SqlValToStr(each distinct inner value)
7. Per EXISTS probe: Resolve the outer column reference, look up
in the hash set
Cached in ::aSemiJoinSlots indexed by xSubNode identity so the
analysis + lifted scan runs exactly once per subquery expression.
Subqueries that don't match the shape store the sentinel "NO" so
subsequent probes skip re-analysis and fall through to the existing
SubqueryCached + LIMIT 1 path.
NOT EXISTS works through the same path — lNegate flag just flips
the final hash-lookup result.
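Steps 5 to 7 above, reduced to a Go sketch: one scan of the inner table builds a set of distinct keys, then each EXISTS / NOT EXISTS probe is a map lookup. `Ord`, `liftSemiJoin` and `existsProbe` are invented names, and the residue predicate is passed as a plain function rather than an AST.

```go
package main

import "fmt"

// Ord is a hypothetical inner-table row.
type Ord struct{ EmpID, Qty int }

// liftSemiJoin runs the non-correlated scan once and collects the distinct
// inner keys that satisfy the residue predicate (nil means no residue).
func liftSemiJoin(ord []Ord, residue func(Ord) bool) map[int]struct{} {
	set := make(map[int]struct{})
	for _, o := range ord {
		if residue == nil || residue(o) {
			set[o.EmpID] = struct{}{}
		}
	}
	return set
}

// existsProbe answers EXISTS (negate=false) or NOT EXISTS (negate=true)
// for one outer row with a single hash lookup.
func existsProbe(set map[int]struct{}, outerKey int, negate bool) bool {
	_, hit := set[outerKey]
	return hit != negate
}

func main() {
	ord := []Ord{{1, 20}, {1, 3}, {2, 10}, {3, 16}}
	set := liftSemiJoin(ord, func(o Ord) bool { return o.Qty > 15 })
	for _, empID := range []int{1, 2, 3, 4} {
		fmt.Println(empID, existsProbe(set, empID, false))
	}
}
```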
--- Bench (emp=500, prod=100, ord=5k) ---
Pattern Before After Speedup
────────────────────────────────────────────────────────────
H3 EXISTS correlated 10.0s 4.5ms ~2200x
H8 NOT EXISTS self-join 900ms 890ms same (can't lift:
remainder
`e2.salary > e.salary`
is correlated)
H11 Scalar + EXISTS + derived 3.2s 1.0s 3.2x
H8 correctly falls through to the non-lifted path because the
remainder outer-reference check (SemiJoinHasOuterRef) rejects the
`e2.salary > e.salary` term. The 5-row answer is still correct.
Validation:
- FiveSql2 43/43
- Harbour compat 51/51
- go test ./... ALL PASS
- H3 returns 125 rows (matches pre-change correct result)
- H8 returns 5 rows (matches pre-change correct result)
Known pre-existing bug, unrelated: H7 (scalar correlated subquery
with inner INNER JOIN) returns zero for rows 2..N — workarea state
leaks between consecutive subquery invocations. Not touched here,
filed for follow-up.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extreme subquery stress bench (12 patterns spanning scalar-in-SELECT,
nested correlation, EXISTS, NOT IN, derived tables, self-joins, and
mixed combinations) exposed three weaknesses in the post-ROLLUP state:
1. EXISTS / NOT EXISTS evaluated the full subquery result per outer
row, even though it only needs to know whether any row matches.
2. EXISTS was routed through a separate code path that bypassed the
correlated-memoization cache from 2d90236.
3. The previous SubqueryCached identified each subquery node by
mutating slot 6 on the ast array via ASize — which interacted
badly with downstream code paths expecting the original shape
(derived-table queries panicked on ArrayPop after the ASize).
Fixes:
* EXISTS / NOT EXISTS now route through SubqueryCached the same way
ND_SUB in WHERE does, so correlated EXISTS predicates memoize on
outer free-variable values when the cardinality is low.
* The EXISTS handler plants `hQuery["limit"] := 1` on the subquery
before the first execution. EXISTS doesn't care about the rest
of the result rows, so capping the scan at one row saves
full-scan cost in the common case.
* A new early-termination branch in RunSelect's scan loop exits
the `WHILE !Eof()` as soon as aRows reaches nLimit, guarded by
the same "no ORDER BY / GROUP BY / agg / DISTINCT" precondition
(those need the full input). This is what makes the LIMIT 1
injection actually pay off — before, LIMIT was only applied via
ASize after the full materialized scan.
* SubqueryCached no longer mutates the parse tree. Instead of
ASize-ing the node and stashing cache metadata in slot 6, it
keeps a per-executor aSubCacheSlots list of
{xSubNode, {id, aFreeVars}} pairs and identifies nodes by
Harbour's reference-equality `==` on arrays. Lookup is O(n), where
n is the number of distinct subqueries in the query; n is ≤ 4 or so
for all realistic queries, so the linear scan is effectively free.
Fixes the derived-table ArrayPop panic.
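The side-table lookup from the last bullet, sketched in Go with pointer equality standing in for Harbour's reference-equality `==` on arrays. `SubNode`, `cacheSlot`, `Executor` and `slotID` are hypothetical names; the free-variable list is omitted.

```go
package main

import "fmt"

// SubNode is a hypothetical subquery AST node.
type SubNode struct{ SQL string }

type cacheSlot struct {
	node *SubNode
	id   int
}

// Executor keeps cache metadata beside the parse tree instead of inside it.
type Executor struct{ slots []cacheSlot }

// slotID returns a stable id for the node, assigning one on first sight.
// Nodes are matched by identity, not by content.
func (e *Executor) slotID(n *SubNode) int {
	for _, s := range e.slots {
		if s.node == n {
			return s.id
		}
	}
	id := len(e.slots) + 1
	e.slots = append(e.slots, cacheSlot{node: n, id: id})
	return id
}

func main() {
	a := &SubNode{SQL: "SELECT 1 FROM ord"}
	b := &SubNode{SQL: "SELECT 1 FROM ord"} // same text, different node
	e := &Executor{}
	fmt.Println(e.slotID(a), e.slotID(b), e.slotID(a))
}
```

Because the node itself is never resized or written to, code that expects the original node shape keeps working.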
Bench impact (emp=500, prod=100, ord=5k — subquery hell):
Pattern Before After Δ
───────────────────────────────────────────────────────
H3 Correlated EXISTS 13.3s 10.0s 1.3x
H7 Scalar-in-SELECT + JOIN 362ms 2ms 181x
H8 NOT EXISTS self-join 1.8s 900ms 2.0x
H11 Scalar + EXISTS + derived 13.7s 3.2s 4.3x
(H1, H2, H5, H6, H9, H10, H12 unchanged at 3–72ms)
H7's 181x is the scalar-in-SELECT-list memoization payoff — each
dept's revenue subquery used to run 100 times (once per SALES emp),
now runs once per distinct dept.
H3's 1.3x is the best we can do without semi-join lift: 500 outer
rows with 500 unique correlation keys means 500 cache misses, and the
375 rows whose correlation finds no match must scan the full ord table
to confirm emptiness. Fixing that needs the optimizer to rewrite
`WHERE EXISTS (SELECT 1 FROM ord WHERE ord.emp_id = e.id AND ...)`
into `WHERE e.id IN (SELECT DISTINCT emp_id FROM ord WHERE ...)`,
which is a real query-rewrite feature left for a follow-up.
Validation:
- FiveSql2 43/43
- Harbour compat 51/51
- go test ./... ALL PASS
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>