fivedev/five - five - fivego gitea

Author	SHA1	Message	Date
CharlesKWON	2d9023622c	feat(FiveSql2): ROLLUP/CUBE/GROUPING SETS + correlated subquery memoization

Two SQL:2013 features that were stubs or bugs. Both ship together because they share testing infrastructure (the SQL:2013 analytics bench).

--- 1. ROLLUP / CUBE / GROUPING SETS (TSqlAgg) ---

The parser has recognized these for a while, storing them as `ND_FN "ROLLUP"` / "CUBE" / "GROUPING SETS" nodes inside the GROUP BY list. GroupBy never actually expanded them: it treated the ND_FN as an opaque group term, so every row hashed into the empty bucket and the query returned a single row.

New TSqlAgg:ExpandGroupingSets walks the aGroupBy array and expands each ROLLUP / CUBE / GSETS modifier into a list of flat grouping sets by cross-product with the surrounding plain terms:

  GROUP BY ROLLUP(a, b, c)          → {(a,b,c), (a,b), (a), ()}
  GROUP BY CUBE(a, b)               → {(a,b), (a), (b), ()}
  GROUP BY GROUPING SETS((a,b),())  → as-is
  GROUP BY x, ROLLUP(a, b)          → {(x,a,b), (x,a), (x)}

When the expansion produces more than one set, GroupBy recurses once per set (passing the plain flat set) and NILs out SELECT columns that aren't in the current set, which is the standard subtotal placeholder. Fast path (no ROLLUP/CUBE/GSETS node) short-circuits to the original single-pass logic.

Correctness check: `SELECT region, SUM(amount) FROM sales GROUP BY ROLLUP(region)` on a 5-region dataset now returns 6 rows (5 per-region subtotals + 1 grand total row with region=NIL). Was 1.

--- 2. Correlated subquery memoization (TSqlExecutor) ---

Commit `9e0f82c` fixed a silent caching bug that made correlated subqueries return the first outer row's result for every subsequent row, at the cost of dropping caching entirely: every outer row re-executed the subquery. For Q8 in the SQL:2013 bench (1000 emps, correlated on 3 distinct depts) that was 4.9 seconds. The right answer is to memoize per outer key, not globally.

This commit adds:

- TSqlExecutor:CollectFreeVars(hQ): walks a subquery's WHERE, columns, and HAVING for ND_COL references whose alias prefix isn't one of the subquery's own FROM tables. Those are the outer columns the subquery actually depends on.
- TSqlExecutor:SubqueryCached(xSubNode): runs the free-var analysis once per distinct AST node (memoized onto a 6th slot on the node), builds a cache key from the current values of those free vars via ::Resolve(), looks up in ::hSubCorrCache, executes on miss. Non-correlated subqueries end up with an empty free-var list → single cache entry → same behavior as the old CacheSubquery fast path.
- ND_SUB and ND_SUB-in-IN handlers route through SubqueryCached instead of the split cache/push-outer logic.

Plus a correctness fix that SubqueryCached surfaced: when a subquery runs at nDepth > 1, TSqlExecutor rewrites each FROM table's alias to a depth-suffixed temp (so concurrent opens of the same file don't collide). Previously the original user-written alias was only preserved in aTables[i][3] for single-char aliases. Multi-char aliases like `emp e2` lost their original after the rename, so FindWA("E2") failed, Resolve("e2.dept") returned NIL, and `WHERE e2.dept = e1.dept` evaluated NIL=NIL → every row was filtered out → subquery AVG returned 0 → outer `salary > 0` was trivially true for everyone. Now we always stash the original alias in [3] before the rename.

--- Bench (SQL:2013 analytics, 10 queries, emp=1k, sales=20k) ---

  Query                     Before        After                     Δ
  ────────────────────────────────────────────────────────────────────────
  Q6 RECURSIVE hierarchy    (prev fix)    30ms
  Q7 ROLLUP subtotals       86ms, 1 row   106ms, 6 rows (correct)
  Q8 Correlated subquery    4933ms        20ms                      ~245x

  (all other queries unchanged at 4–230ms)

Q8 30-row sanity regression test (emp.dept in {A,B,C}, deterministic salaries so hand-computed averages are 155/810/1765):

  SELECT name, dept, salary FROM emp e1
  WHERE salary > (SELECT AVG(salary) FROM emp e2 WHERE e2.dept = e1.dept)

  Before: 30 rows (wrong, returns all)
  After:  15 rows (correct, 5 above each dept's average)

Validation:
- FiveSql2 43/43
- Harbour compat 51/51
- go test ./... ALL PASS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 10:13:31 +09:00
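The grouping-set expansion described in commit 2d9023622c can be sketched in Go roughly as follows. This is an illustration of the cross-product idea only, not the PRG code of TSqlAgg:ExpandGroupingSets; the `Term` type and the `expandTerm`, `ExpandGroupingSets` (Go version), and `format` functions are hypothetical names introduced here.

```go
package main

import (
	"fmt"
	"strings"
)

// Term is one GROUP BY entry (hypothetical type, not the FiveSql2 AST).
// Kind is "" for a plain column, or "ROLLUP", "CUBE", "GSETS".
type Term struct {
	Kind string
	Cols []string   // the plain column, or the ROLLUP/CUBE arguments
	Sets [][]string // explicit sets, used only when Kind == "GSETS"
}

// expandTerm returns the grouping sets contributed by a single term.
func expandTerm(t Term) [][]string {
	switch t.Kind {
	case "ROLLUP":
		// Every prefix of the argument list, longest first, ending with ().
		out := make([][]string, 0, len(t.Cols)+1)
		for n := len(t.Cols); n >= 0; n-- {
			out = append(out, append([]string{}, t.Cols[:n]...))
		}
		return out
	case "CUBE":
		// Every subset; the first column maps to the highest bit so the
		// output order is (a,b), (a), (b), ().
		n := len(t.Cols)
		out := make([][]string, 0, 1<<n)
		for mask := (1 << n) - 1; mask >= 0; mask-- {
			set := []string{}
			for i, c := range t.Cols {
				if mask&(1<<(n-1-i)) != 0 {
					set = append(set, c)
				}
			}
			out = append(out, set)
		}
		return out
	case "GSETS":
		return t.Sets
	default:
		return [][]string{t.Cols}
	}
}

// ExpandGroupingSets cross-multiplies the sets of every term, left to right.
func ExpandGroupingSets(terms []Term) [][]string {
	result := [][]string{{}}
	for _, t := range terms {
		var next [][]string
		for _, prefix := range result {
			for _, set := range expandTerm(t) {
				combined := append(append([]string{}, prefix...), set...)
				next = append(next, combined)
			}
		}
		result = next
	}
	return result
}

// format renders grouping sets in the notation used by the commit message.
func format(sets [][]string) string {
	parts := make([]string, len(sets))
	for i, s := range sets {
		parts[i] = "(" + strings.Join(s, ",") + ")"
	}
	return "{" + strings.Join(parts, ", ") + "}"
}

func main() {
	fmt.Println(format(ExpandGroupingSets([]Term{
		{Cols: []string{"x"}},
		{Kind: "ROLLUP", Cols: []string{"a", "b"}},
	})))
}
```

A plain GROUP BY list expands to exactly one set, which corresponds to the single-pass fast path the commit mentions.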
CharlesKWON	9e0f82c5a8	perf+fix(FiveSql2): recursive-CTE hash join + correct correlated subqueries

Two fixes uncovered by a SQL:2013 analytics benchmark covering the query patterns people actually run on DBF data (OLAP, BI, hierarchy traversal).

--- Fix 1: correlated subquery was silently wrong ---

EvalExpr's ND_SUB handler only pushed the outer context when `s_aOuterStack` was already non-empty; otherwise it routed the subquery through CacheSubquery, which stores the first result under a key derived from the subquery's syntax tokens. For a correlated subquery in a top-level WHERE:

  SELECT name, dept, salary FROM emp e1
  WHERE salary > (SELECT AVG(salary) FROM emp e2 WHERE e2.dept = e1.dept)

the first outer row saw an empty stack, cached the result, and every subsequent outer row got the same cached value regardless of e1.dept. The query returned all 1000 employees instead of the 505 who actually beat their department's average.

Fix: always PushOuter + Run, no cache. Correctness over caching.

Trade-off: non-correlated scalar subqueries now re-execute per outer row. A proper per-outer-key memoization is deferred because it requires walking the subquery AST to collect free variables.

--- Fix 2: WITH RECURSIVE hierarchy join was O(m*n) ---

RecCteJoin (the in-memory join used when a recursive CTE's step references both a real table and the CTE frontier) ran a flat nested loop: for each DBF row × each prev-iteration row, build a combined row buffer and run SqlEvalRowExpr on the ON condition. For a 4-level 1000-employee hierarchy that's ~1M ON evaluations, ~4.6 seconds.

Fix: detect the shape `dbfAlias.col = cteAlias.col` at join-setup time, build a PRG hash on the CTE frontier keyed by its join column (aPrevRows is always small: at most the last iteration's emitted rows), then scan the DBF side once and probe the hash. Complex ON predicates fall through to the original nested loop.

--- Bench (SQL:2013 analytics, emp=1k, sales=20k, evt=30k) ---

  Query                            Before     After       Speedup
  ──────────────────────────────────────────────────────────────
  RECURSIVE hierarchy 4-level      4603ms     30ms        ~150x
  Correlated subquery (all emp)    10ms ❌    4933ms ✓    (correct)

Other SQL:2013 queries (ROW_NUMBER top-N, running total, moving average, DENSE_RANK, LAG, NTILE, gaps-and-islands) are all in the expected 10–230ms range for these dataset sizes, unchanged by this commit.

Validation:
- FiveSql2 43/43
- Harbour compat 51/51
- go test ./... ALL PASS

Known follow-ups (not in this commit):
- Q7 ROLLUP(col) parses but isn't expanded in GroupBy, so it returns a single grand-total row instead of per-value + total. Grouping sets implementation is a separate feature.
- Correlated subquery memoization by outer free-variable key would bring Q8 from 4.9s back to ~50ms for small cardinality correlations; it requires AST free-var analysis.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 23:25:58 +09:00
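The follow-up named in commit 9e0f82c5a8, memoizing a correlated subquery by the values of its outer free variables, can be sketched in Go as below. It is a minimal model under stated assumptions, not FiveSql2 code: `Emp`, `SubqueryCache`, `Eval`, and `AboveDeptAverage` are hypothetical names, the outer row is modeled as a `map[string]any`, and the free-variable list is supplied by hand rather than derived from an AST.

```go
package main

import (
	"fmt"
	"strings"
)

// Emp is a toy employee row (hypothetical).
type Emp struct {
	Name, Dept string
	Salary     float64
}

// SubqueryCache memoizes a scalar subquery per distinct combination of
// outer (free-variable) values.
type SubqueryCache struct {
	freeVars []string // outer columns the subquery reads
	run      func(outer map[string]any) float64
	cache    map[string]float64
	Runs     int // real executions, kept only to show the effect
}

// Eval returns the cached result for this outer row's free-variable values,
// executing the subquery only on a cache miss.
func (c *SubqueryCache) Eval(outer map[string]any) float64 {
	parts := make([]string, len(c.freeVars))
	for i, name := range c.freeVars {
		// Include the dynamic type so 1 and "1" get different keys.
		parts[i] = fmt.Sprintf("%T:%v", outer[name], outer[name])
	}
	key := strings.Join(parts, "\x00")
	if v, ok := c.cache[key]; ok {
		return v
	}
	c.Runs++
	v := c.run(outer)
	c.cache[key] = v
	return v
}

// AboveDeptAverage models
//   SELECT name FROM emp e1
//   WHERE salary > (SELECT AVG(salary) FROM emp e2 WHERE e2.dept = e1.dept)
// and reports how many times the subquery really ran.
func AboveDeptAverage(emps []Emp) (names []string, runs int) {
	sub := &SubqueryCache{
		freeVars: []string{"e1.dept"},
		cache:    map[string]float64{},
		run: func(outer map[string]any) float64 {
			dept, _ := outer["e1.dept"].(string)
			sum, n := 0.0, 0
			for _, e := range emps {
				if e.Dept == dept {
					sum += e.Salary
					n++
				}
			}
			if n == 0 {
				return 0
			}
			return sum / float64(n)
		},
	}
	for _, e := range emps {
		if e.Salary > sub.Eval(map[string]any{"e1.dept": e.Dept}) {
			names = append(names, e.Name)
		}
	}
	return names, sub.Runs
}

func main() {
	names, runs := AboveDeptAverage([]Emp{
		{"a", "A", 100}, {"b", "A", 200}, {"c", "A", 300},
		{"d", "B", 10}, {"e", "B", 50},
	})
	fmt.Println(names, "subquery executions:", runs)
}
```

In this sketch a subquery with an empty free-variable list always produces the same key, so it is executed once and then served from the cache for every outer row.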
CharlesKWON	64b7cf6676	perf(FiveSql2): compound-AND equi-join picks up hash path, CTE+JOIN 22x

FiveSql2's HashJoin only recognized bare equi-terms (xOnCond[1]=ND_BIN, xOnCond[2]="="), so a compound ON predicate like

  ON e.dept_id = t.dept_id AND e.salary = t.max_sal

fell through to the nested-loop ELSE branch:

  dbSelectArea(nInnerWA)
  dbGoTop()
  WHILE !Eof()
     IF SqlIsTrue(EvalExpr(xOnCond))
        JoinRecurse(...)
     ENDIF
     dbSkip()
  ENDDO

That's a full inner scan per outer row, O(outer × inner) overall, re-evaluating the full AND tree every probe. Query Q7 in the complex benchmark (CTE top_emp joined back to emp on compound key) ran at 4.6 seconds for 100 inner × 10k outer.

Fix has two pieces:

1. Probe-term extraction in JoinRecurse: when xOnCond is an AND, walk the left-associative chain looking for the first equi-term (`a.x = b.x`). Use that as the hash-probe key, drive the normal hash-join code path through it.
2. Post-filter in HashJoin: after a hash match, if the original xOnCond was compound, re-evaluate the full predicate with EvalExpr to drop matches that satisfied the hash key but not the rest of the AND (e.g. same dept but different salary). Bare equi-joins still skip the re-eval because the hash match is conclusive.

Bench (10k × 100 × compound ON predicate):

  Query                         Before    After    Speedup
  ─────────────────────────────────────────────────────────
  Q7 CTE + JOIN compound ON     4573ms    209ms    21.9x

Still works for the existing bare equi case (43-test unchanged) and the 3-way JOIN case (no regression). Falls back to the generic nested loop only when no probe-term can be extracted at all.

Validation:
- FiveSql2 43/43
- Harbour compat 51/51
- go test ./... ALL PASS
- Q7 result: 100 rows (correct)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 20:31:27 +09:00
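The two-piece fix in commit 64b7cf6676 (probe on one equi-term, then post-filter with the rest of the predicate) can be illustrated with a small Go sketch. The types `Emp` and `Top` and the function `HashJoin` below are hypothetical stand-ins chosen to mirror the `e.dept_id = t.dept_id AND e.salary = t.max_sal` example; the real implementation works on workareas and AST nodes rather than slices and closures.

```go
package main

import "fmt"

// Emp and Top are toy rows standing in for emp and the top_emp CTE.
type Emp struct {
	DeptID, Salary int
	Name           string
}

type Top struct {
	DeptID, MaxSal int
}

// HashJoin probes on the equi-term e.DeptID = t.DeptID. If residual is
// non-nil (compound ON), each hash match is re-checked against it; a nil
// residual models a bare equi-join, where the hash match is conclusive.
func HashJoin(outer []Emp, inner []Top, residual func(Emp, Top) bool) []string {
	// Build phase: inner row positions grouped by the probe key.
	build := make(map[int][]int, len(inner))
	for i, t := range inner {
		build[t.DeptID] = append(build[t.DeptID], i)
	}

	// Probe phase: one hash lookup per outer row instead of a full scan.
	var out []string
	for _, e := range outer {
		for _, i := range build[e.DeptID] {
			if residual != nil && !residual(e, inner[i]) {
				continue // same dept, but the rest of the AND failed
			}
			out = append(out, e.Name)
		}
	}
	return out
}

func main() {
	emps := []Emp{{1, 100, "ann"}, {1, 200, "bob"}, {2, 150, "cy"}, {2, 150, "di"}}
	tops := []Top{{1, 200}, {2, 150}}

	sameSalary := func(e Emp, t Top) bool { return e.Salary == t.MaxSal }
	fmt.Println(HashJoin(emps, tops, sameSalary)) // compound ON
	fmt.Println(HashJoin(emps, tops, nil))        // bare equi-join
}
```

Passing the residual as an optional closure keeps the bare equi-join path free of any per-match re-evaluation, which is the property the commit message calls out.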
CharlesKWON	c6799a599e	fix(FiveSql2): GROUP BY with aliased SELECT collapses all rows into one

Surfaced by complex-query benchmarking. Query like:

  SELECT d.name AS dept, COUNT(*) AS n, SUM(o.amount) AS total
  FROM dept d INNER JOIN emp e ON ... INNER JOIN ord o ON ...
  GROUP BY d.name

returned exactly 1 row instead of 100. Removing the AS aliases made it work correctly. Semantic bug, not a performance issue.

Root cause: TSqlAgg:GroupBy resolved each GROUP BY column by calling FindColIdx against aFN, the output alias list. For GROUP BY d.name with d.name AS dept, the group expression's column name was looked up in {"dept","n","total"} and missed. FindColIdx returned 0, every row got an empty group key, and the hash collapsed everything into one bucket.

Fix: new FindGroupIdx walks aCols (SELECT list expressions) instead, matching the GROUP BY column against each SELECT item's source expression ND_COL name. Handles qualified refs (d.name -> NAME) and falls back to FindColIdx for cases where GROUP BY uses a column not in the SELECT list.

Also hoisted the resolution out of the per-row loop: GROUP BY columns resolve once into aGroupIdx[] so each row just indexes.

Validation:
- FiveSql2 43/43
- Harbour compat 51/51
- Complex bench Q4: 1 row -> 100 rows (correct)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 20:25:02 +09:00
CharlesKWON	bfc6ded8cb	perf(FiveSql2): SqlHashBuild + FetchRow column binding, 3-way JOIN 3x

Complex-query benchmarking turned up two hot paths that the earlier SqlScan/SqlEach work didn't touch: multi-table JOIN and nested-scan row fetching. This commit hits both.

--- Part 1: SqlHashBuild, Go-native hash-join build ---

FiveSql2's HashJoin previously built the inner-side hash in PRG:

  WHILE !Eof()
     xVal := FieldGet(nFPos)
     cKey := SqlValToStr(xVal)
     IF !hb_HHasKey(hHash, cKey) ; hHash[cKey] := {} ; ENDIF
     AAdd(hHash[cKey], RecNo())
     dbSkip()
  ENDDO

That loop runs at ~40μs per row from class dispatch + hb_HHasKey lookups + AAdd growth + SqlValToStr formatting. On a 50k-row inner table that's ~2 seconds wasted on what should be a sub-50ms housekeeping op.

New hbrtl.SqlHashBuild does the same thing in one Go-native pass:

- Direct *dbf.DBFArea loop (no interface dispatch, same devirt as SqlScan)
- Go `map[string][]int64` accumulates RecNos by key, one allocation per distinct key
- Inline ASCII-only digit formatter for numeric keys (strconv.Itoa is allocation-heavy for small ints)
- CHAR keys are right-trimmed to match SqlCmpEq semantics so the hash probe matches what EvalExpr would compute
- Final Five hash is built once from Keys/Values/Order slices directly, skipping the per-key hb_HSet path

HashJoin now calls `SqlHashBuild(nFPos)` instead of running the PRG loop.

--- Part 2: TSqlExecutor:BuildFetchCache ---

The JOIN fallback loop calls FetchRow per row. FetchRow was already column-ref-aware but did the string parse (`At + SubStr + Upper`) and `::FindWA` linear scan every single invocation. For a 50k-row join emitting 50k result rows, that's ~200k redundant resolutions.

New BuildFetchCache walks the SELECT list once before the scan and pre-binds each plain-column expression to `{nWA, nFPos}`. FetchRow's new fast path checks ::aFetchCache and jumps straight to `dbSelectArea + FieldGet` when bound. Complex exprs (functions, CASE, subqueries) still fall through to EvalExpr.

::aFetchCache is set right before the join WHILE loop and cleared after, so there is no cross-query bleed.

--- Bench (50k ord × 10k emp × 100 dept, 3-run steady state) ---

  Query                          Before    After    Speedup
  ────────────────────────────────────────────────────────────
  2-way INNER JOIN, 10k rows     91ms      68ms     1.34x
  2-way JOIN + GROUP BY          110ms     94ms     1.17x
  3-way INNER JOIN COUNT         2610ms    610ms    4.28x
  3-way JOIN + GROUP BY          2860ms    830ms    3.45x

The 3-way speedup is almost entirely SqlHashBuild. The 2-way case benefits from the fetch cache because its per-row cost is dominated by FetchRow (no second hash build to amortize).

--- Limits still standing ---

CTE + JOIN queries (Q7 in bench_complex: ~4.5s) aren't affected by either optimization, since CTE materialization goes through a different path that writes/reads a temp DBF. Follow-up target.

Validation:
- FiveSql2 43/43
- Harbour compat 51/51
- go test ./... ALL PASS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 18:47:20 +09:00
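The hash-build step described in commit bfc6ded8cb can be modeled in a few lines of Go. This is a simplified sketch, not hbrtl.SqlHashBuild itself: `BuildHash` is a hypothetical function that takes the key column as an in-memory slice instead of scanning a *dbf.DBFArea, and it uses `strconv.Itoa` for numeric keys where the commit describes a custom inline digit formatter.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// BuildHash groups 1-based record numbers by the key column's value.
// keys[i] is the key field of record i+1 (assumed layout for this sketch).
func BuildHash(keys []any) map[string][]int64 {
	h := make(map[string][]int64)
	for i, k := range keys {
		var s string
		switch v := k.(type) {
		case string:
			// CHAR fields are space-padded; trim on the right so "NY   "
			// and "NY" land in the same bucket.
			s = strings.TrimRight(v, " ")
		case int:
			s = strconv.Itoa(v)
		default:
			s = fmt.Sprint(v)
		}
		h[s] = append(h[s], int64(i+1))
	}
	return h
}

func main() {
	h := BuildHash([]any{"NY   ", "LA   ", "NY"})
	fmt.Println(h["NY"], h["LA"])
}
```

Each distinct key costs one map entry, and appending to the existing slice handles duplicates, which matches the `map[string][]int64` shape named in the commit.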
CharlesKWON	e75167c2e9	feat(FiveSql2): five_SQL block-callback integration, SQL beats raw PRG

Wires the new SqlEach RTL into FiveSql2's front-end so users write the SQL they know and opt into streaming with a familiar Harbour code block, with no manual RTL plumbing.

API:

  /* Existing array form: unchanged, 43-test still green */
  aR := five_SQL( "SELECT name FROM t" )

  /* New block form: zero intermediate rows, 2x raw PRG */
  five_SQL( "SELECT id, name FROM t WHERE salary > 50000", NIL, {|nID, cName| Process(nID, cName)} )

Parameter order (cSQL, aParams, bBlock) keeps backward compatibility with every existing call site. Passing NIL for aParams when only a block is needed is standard Harbour idiom.

Routing:

* TFiveSQL:Execute now takes an optional bBlock parameter and stores it on TSqlExecutor as ::bRowBlock.
* TSqlExecutor:RunSelect's existing Go fast path (same guards as before: single table, no JOIN/GROUP/aggregate, plain column projections, WHERE compilable via SqlExprToPrg) branches on ::bRowBlock:
  - block present → SqlEach streams rows through the block
  - block absent → SqlScan materializes into aRows (current path)
* Post-processing (GROUP BY / ORDER BY / window / DISTINCT / LIMIT) runs on empty aRows when block mode fires. All are no-ops on empty input, so the sequence stays harmless.
* RunSelect returns NIL (not {fields, rows}) when ::bRowBlock was used, which signals "streaming semantics, all work done in the block".

Complex queries (JOIN, GROUP BY, subquery, window, ORDER BY not matchable by an index, LIMIT/OFFSET, etc.) still fall back to the array path even when a block is supplied, because those genuinely require materialization. Block mode is a fast-path opt-in, not a semantic change.

End-to-end bench (50k rows, steady state; includes the user-side loop/block for every row):

  Path                                    Time     Speedup vs raw
  ──────────────────────────────────────────────────────────────
  Raw PRG DO WHILE !Eof() + WHERE sum     7.6ms    1.00x
  five_SQL array + FOR                    7.7ms    ~same
  five_SQL + block (new)                  3.7ms    2.05x  ← beats raw
  ──────────────────────────────────────────────────────────────
  Raw PRG no WHERE                        6.1ms    1.00x
  five_SQL + block, no WHERE              2.9ms    2.10x  ← beats raw

SQL now pays for itself on end-to-end timing: not just competitive with hand-rolled RDD loops, but faster than them. The layered cost of FieldGet's Frame+RTL-dispatch that hand-written loops incur per call is gone; the block-callback path captures *dbf.DBFArea directly via FastFieldGetter and uses PcOpFieldGet to bypass dispatch in the compiled WHERE predicate.

Validation:
- FiveSql2 43/43 (array API unchanged)
- Harbour compat 51/51
- go test ./... ALL PASS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 17:00:46 +09:00
CharlesKWON	ad69221136	revert(FiveSql2): restore TSqlIndex:FindExclusive scan

Previous short-circuit (return 0 unconditionally) was a workaround for two bugs that are both fixed now:

1. gengo PushLocal(0) panic on unresolved identifiers → fixed by `08ad6f4` (PushMemvar fallback).
2. dbInfo(DBI_FULLPATH / DBI_SHARED) returning NIL → fixed by `d74014a` (real implementations).

Restoring the original scan: walk workareas 1..250, check if any holds an exclusive lock on the target DBF. With dbInfo now functional and the DBI_* constants defined in include/dbinfo.ch (commit `3a00aa5`), this gives FiveSql2 real pre-flight conflict detection for concurrent table access rather than silently proceeding into a lock failure.

Validation:
- FiveSql2 43/43
- standalone PRG with dbUseArea + five_SQL works (was the original repro that triggered the workaround)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 12:07:40 +09:00
CharlesKWON	8aaed994f4	perf(FiveSql2): hybrid fast path, 11x speedup on string WHERE scans

Implements hybrid execution model: keep AST tree-walk for SQL:2013+ features (Window, Recursive CTE, JOIN, aggregates) while compiling simple SELECT hot paths to Go + pcode. See docs/FiveSql2-Hybrid-Plan.md for the full architecture rationale (why not SQLite-style VDBE).

Hot path (single table, no joins/groups/aggregates):
- TryBuildFieldPositions: resolves SELECT column list to FieldPos array once per query (bails to PRG loop on any complex expr).
- TryCompileWhere + SqlExprToPrg: walks WHERE AST, emits equivalent PRG source, runs it through PcCompile to get a PcodeFunc.
- SqlScan RTL: Go-native scan loop, GoTop/EOF/Skip/GetValue direct, ExecPcode per row for WHERE, result array pre-alloc.

WHERE compiler scope:
- ND_LIT numeric/logical/string (string literals AllTrim'd to match SqlCmpEq CHAR-padding semantics; rejects embedded quotes/newlines)
- ND_COL: CHAR fields auto-wrapped with AllTrim(FieldGet(n)) based on dbStruct() lookup cached once per query in aCompileStruct
- ND_BIN: = <> != < <= > >= AND OR + - * /
- ND_UNI: NOT -
- Anything else (ND_FN, ND_CASE, ND_SUB, ND_PAR, LIKE, IN, IS NULL, BETWEEN, dates) returns NIL → falls back to PRG tree-walk.

Bench (50k rows, ~/tmp ext4):

                       Before     After     Speedup
  Numeric WHERE        ~150ms     11.7ms    ~13x
  String WHERE         119.3ms    10.5ms    11.4x
  No WHERE             -          14.6ms    -
  Raw RDD baseline     6.8ms      6.8ms     1.0x

Remaining gap to raw RDD (~1.5x) is structural: Value boxing, result array construction, per-row ExecPcode frame overhead. Would need a Value-pool or SoA refactor to close further.

Side fixes bundled:
- TSqlIndex:FindExclusive short-circuited. Originally called dbInfo(DBI_FULLPATH)/DBI_SHARED which are unresolved symbols in Five (dbInfo is a stub, DBI_* never defined). Panic'd with "local variable index out of range: 0" whenever a standalone PRG had a workarea Used before calling five_SQL. 43-test masked the bug because it only reached FindExclusive with no open workareas. Restore the scan once dbInfo lands in hbrtl.
- cmd/five/main.go: FIVE_KEEP_BUILD=1 env var keeps the temp Go project around for debugging gengo output.

Validation:
- FiveSql2 43/43
- Harbour compat 51/51
- go test ./... ALL PASS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 09:15:08 +09:00
Charles KWON OhJun	486e466592	feat: FiveSql2 43/43, @byref, mutable closure, RTL 479, DateTime fix

Major changes since last commit:
- FiveSql2 SQL:1999 engine (10,458 LOC), 43/43 ALL PASS
- 21 compiler/runtime bugs fixed (short-circuit AND/OR, FOR LOOP, etc.)
- @byref pass-by-reference via RefCell pattern
- Mutable closure capture (EnsureLocalRef + RefCell sharing)
- RTL: 400 → 479 functions (+79: file, string, datetime, hash, UTF-8)
- DateTime/Timestamp fully working (hb_DateTime, hb_Hour/Min/Sec, display)
- Reserved word guard (39 keywords blocked from function calls)
- AEval arg order fix (element before index)
- Closure capture redecl fix (unique _cap_ names per block)
- Hash/string indexing in ArrayPush/ArrayPop
- Harbour compat test suite: 51/51
- 4 docs: Porting Report, Implementation Plan, Optimization Plan, Commercialization

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 11:35:37 +09:00

9 Commits