fivedev/five - five - fivego gitea

Author	SHA1	Message	Date
CharlesKWON	f4ed42556b	checkpoint: season-wide bug fix campaign + infra	2026-04-30 09:26:25 +09:00

    Cumulative season's silent-bug hunting (~62 fixes) across the FiveSql2 SQL engine, the Five compiler/runtime, and the hbrdd RDD layer. Saved as a single checkpoint before refactoring the parser to delegate xBase command translation to the preprocessor.

    Highlights:

    FiveSql2 engine (_FiveSql2/src/)
    - prefix-glob index attach -> explicit convention (<table>_pk.ntx, <table>_uq.ntx, <table>.cdx); fixes silent multi-row INSERT row-drop
    - DROP/CREATE TABLE FErase chain extended (.cdx, .fsc, .fsv, .dbt, .fpt)
    - COUNT(DISTINCT col) parsed + aggregated via hSeen hash
    - UNION column-count mismatch returns SQL_ERR_GRAMMAR (was silent)
    - DISTINCT + ORDER BY hidden-col leak fixed (trim before DISTINCT)
    - Derived table FROM (SELECT...) + JOIN right-side derived
    - Self-FK CASCADE depth 2+ via SqlGetSingleColPK pre-collect
    - LAG/LEAD default arg uses SqlEvalRowExpr (handles -N const exprs)
    - DATE literal round-trip validation (Feb 29 non-leap rejected)
    - CREATE OR REPLACE VIEW; CREATE VIEW errors on already-exists
    - AlterTable type dispatcher comma-wrapped (1-char type "A" no longer matches CHARACTER)

    Compiler / runtime
    - gengo: HB_ -> FV_ prefix on emitted Go function names (Five identity)
    - gengo split: emit_block.go, emit_stmt.go, folding.go extracted
    - parser/stmtreg.go nudges
    - hbrt: debug TUI/CLI restructure (debugcmd, debugkey, termios_*), windows debug stubs collapsed
    - thread/vm/value/class/pcinterp tightening from panic traces

    RDD layer (hbrdd/)
    - dbf: null bitmap support (null.go + null_test.go), mmap split (mmap_posix.go / mmap_windows.go), byte-level numeric parse
    - ntx/cdx: windows mmap parity
    - workarea + mem RDD: cross-area state-bleed fixes

    RTL (hbrtl/)
    - errorlog rewrite with platform-specific FD (errorlog_fd_unix / errorlog_fd_other)
    - sqlscan, sqlhelpers, indexrtl, datetime extensions

    Gates green at checkpoint:
    - go test ./...      : PASS
    - FiveSql2 SQL:1999  : 43/43
    - Harbour compat     : 56/56

    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
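The "DATE literal round-trip validation" item in the checkpoint above can be illustrated with a minimal Go sketch. This is not the FiveSql2 code: `validDate` is a hypothetical helper, and the approach (build the date, then check that the fields come back unchanged) is only an assumption about what "round-trip" means here. It relies on the documented behaviour of `time.Date`, which normalizes out-of-range values instead of rejecting them.

```go
package main

import (
	"fmt"
	"time"
)

// validDate is a hypothetical helper, not part of FiveSql2.
// time.Date normalizes out-of-range fields (Feb 29 2023 becomes Mar 1 2023),
// so a date is accepted only if it survives the round trip unchanged.
func validDate(year, month, day int) bool {
	t := time.Date(year, time.Month(month), day, 0, 0, 0, 0, time.UTC)
	return t.Year() == year && int(t.Month()) == month && t.Day() == day
}

func main() {
	fmt.Println(validDate(2024, 2, 29)) // leap year
	fmt.Println(validDate(2023, 2, 29)) // not a leap year
}
```

The same check also rejects month 13 or day 0, because normalization moves those into a neighbouring month or year.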
CharlesKWON	dd270d5d9d	perf: RTL Go-native migration (27 optimizations, DML up to 70-90x)	2026-04-17 20:20:14 +09:00

    Systematic pass through PRG hot paths, promoting them to Go RTL while preserving Harbour/FiveSql2 semantics. Full log in docs/RTL-Go-Native-Migration.md.

    Bench (bench_sql) vs 2026-04-08 baseline
    - B1  SELECT *              2,192 →   114 µs (19x)
    - B6  INNER JOIN            9,291 →   233 µs (40x)
    - B7  CTE simple            8,037 →   129 µs (62x)
    - B9  ROW_NUMBER            3,705 →   265 µs (14x)
    - B10 RANK PARTITION        4,748 →   309 µs (15x)
    - B12 INSERT (WA cache)     4,319 →    63 µs (69x)
    - B13 UPDATE (WA cache)     6,144 →    68 µs (90x)
    - B15 CTE+WIN+JOIN         18,395 → 1,873 µs (10x)

    Infrastructure
    - HbHash O(1) Index preserving insertion order (Harbour KEEPORDER)
    - HbDeepClone Go RTL (scalar-sharing, immutable hash keys)
    - MEMRDD auto-imported via gengo; all Five programs get mem:name driver
    - SQL plan + pcode caches (s_hPlanCache, s_hDmlPcodeCache)
    - Opt-in SqlWACacheEnable: dbUseArea/Close/Commit batched for DML

    SQL engine
    - FiveSql2 lexer ported to Go (byte FSM) with combined automatic template parameterization (literals → ?, concat queries share plan)
    - Go RTL: SqlDistinct, SqlGroupRows, SqlWindowPartitions, SqlWindowSortPartition, SqlWindowAssignRank, SqlComputeAggSimple, SqlBulkInsert, SqlBulkUpdate, SqlExprHasAgg, SqlEvalHaving
    - CTE / subquery / driving-table materialize paths use MEMRDD
    - SqlCoerce/SqlCmp/SqlIsTrue helpers moved from PRG to Go
    - SqlBulkUpdate defers Flush when WA cache active (APFS fsync was dominant B13 cost: 1.6ms/call → gone)

    Correctness fixes uncovered during migration
    - ASort default path now sorts dates/logicals/timestamps (was no-op)
    - ORDER BY default NULL placement matches PRG SqlRowCompare across Go fast path; explicit NULLS FIRST/LAST honored by both paths
    - SqlBulkUpdate respects EXCLUSIVE vs SHARED mode record locks
    - SqlCmp/SqlCmpEq normalize NumInt vs Double (caught by test 6b)

    Verification
    - go test ./... ALL PASS
    - FiveSql2 test_sql1999 43/43
    - tests/compat_harbour 56/56 (+5 new: ASort dates/logicals, AScan int cross-type)
    - Regression test test_null_order.prg for ORDER BY NULL ordering

    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
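The "O(1) index preserving insertion order" item above describes a common structure: a slice that records order plus a map from key to slice position. The Go sketch below shows that general idea only. `orderedMap` and its methods are hypothetical names, not the HbHash implementation, and deletion (which needs index fix-ups or tombstones) is left out.

```go
package main

import "fmt"

// orderedMap is a hypothetical stand-in for an insertion-ordered hash.
type orderedMap struct {
	keys  []string       // keys in first-insertion order
	vals  []any          // vals[i] belongs to keys[i]
	index map[string]int // key -> position in keys/vals, for O(1) lookup
}

func newOrderedMap() *orderedMap {
	return &orderedMap{index: map[string]int{}}
}

// Set overwrites an existing key in place (its position is kept)
// or appends a new key at the end.
func (m *orderedMap) Set(k string, v any) {
	if i, ok := m.index[k]; ok {
		m.vals[i] = v
		return
	}
	m.index[k] = len(m.keys)
	m.keys = append(m.keys, k)
	m.vals = append(m.vals, v)
}

// Get looks the key up through the index map, without scanning.
func (m *orderedMap) Get(k string) (any, bool) {
	i, ok := m.index[k]
	if !ok {
		return nil, false
	}
	return m.vals[i], true
}

// Keys returns the keys in insertion order.
func (m *orderedMap) Keys() []string { return m.keys }

func main() {
	m := newOrderedMap()
	m.Set("c", 1)
	m.Set("a", 2)
	m.Set("b", 3)
	m.Set("a", 9) // overwrite keeps the original position of "a"
	fmt.Println(m.Keys())
	v, _ := m.Get("a")
	fmt.Println(v)
}
```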
CharlesKWON	54bf6f5bb4	fix: ComputeAgg qualified column lookup for Go SqlHashJoin path	2026-04-17 07:35:26 +09:00

    FindColIdx2 searched for bare column name (e.g. 'AMOUNT') but aFieldNames now contains qualified names ('o.amount') from the Go join fast path. Added fallback: try xArg[2] (the full AST name) when the bare name misses.

    Fixes SUM/AVG/MIN/MAX aggregation after Go-native hash join.

    Verified: 41/41 correctness tests pass (verify_correctness.prg).

    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
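The fallback described in this fix (try the bare column name first, then the qualified AST name) can be sketched in Go as follows. `findColIdx` is a hypothetical function written for illustration; the 1-based result and the case-insensitive comparison are assumptions modelled on the examples in the message, not a copy of FindColIdx2.

```go
package main

import (
	"fmt"
	"strings"
)

// findColIdx is a hypothetical lookup: it returns the 1-based position of
// the column in fieldNames, or 0 when neither name matches. The bare name
// is tried first; the qualified name is the fallback.
func findColIdx(fieldNames []string, bare, qualified string) int {
	for _, name := range []string{bare, qualified} {
		for i, f := range fieldNames {
			if strings.EqualFold(f, name) {
				return i + 1
			}
		}
	}
	return 0
}

func main() {
	// Field names as a join fast path might produce them (qualified).
	joined := []string{"o.id", "o.amount", "c.name"}
	fmt.Println(findColIdx(joined, "AMOUNT", "o.amount"))

	// Field names from a single-table scan (bare).
	single := []string{"ID", "AMOUNT"}
	fmt.Println(findColIdx(single, "AMOUNT", "o.amount"))
}
```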
CharlesKWON	7babfb7281	fix(FiveSql2): 9 latent bugs from static analysis sweep	2026-04-16 17:26:05 +09:00

    Systematic bug-hunt driven by an automated analysis of all FiveSql2 source files. Each fix is targeted; no speculative refactoring.

    --- #1 CLASSDATA hSubCache leaked across queries (CRITICAL) ---
    CLASSDATA hSubCache INIT { => } SHARED shared one hash across ALL TSqlExecutor instances. A non-correlated subquery cached in query A was silently returned for an unrelated query B if the subquery text happened to produce the same cache key. Converted to instance DATA initialized in New().

    --- #5+#21 IS NULL / COALESCE treated empty string as NULL (HIGH) ---
        RETURN xL == NIL .OR. ( ValType(xL) == "C" .AND. Empty(AllTrim(xL)) )
    SQL standard: '' is a valid non-NULL value. Removed the empty-string check from both IS NULL evaluation and COALESCE skip logic.

    --- #4 Multiple ? parameters all returned first value (HIGH) ---
    ND_PAR nodes had no index, so EvalExpr always returned ::aParams[1]. Parser now stamps each ? with a sequential 1-based index in xNode[2]. EvalExpr uses it to return the correct ::aParams[n].

    --- #10+#11 SqlEvalRowExpr missing / and || operators, single-arg function eval (MEDIUM) ---
    Division and string concatenation fell through to RETURN NIL in the row-expression evaluator used by recursive CTEs and aggregate ComputeAgg. Also, multi-argument functions like SUBSTR(x,2,3) only received the first argument. Both fixed.

    --- #9 SUM/AVG/MIN/MAX of all NULLs returned 0 instead of NULL (MEDIUM) ---
    SQL standard requires NULL. Changed the aggregate return path to return NIL when nCount == 0 (SUM/AVG) or when xMin/xMax == NIL.

    --- #8 MIN/MAX used SqlCoerceNum for comparison (MEDIUM) ---
    Strings and dates were coerced to numbers (Val()) before comparing, making MIN('banana') == MIN('apple') == 0. Switched to SqlCmpLt which handles type-appropriate comparison.

    --- #7 SqlExprHasAgg only checked top-level node (MEDIUM) ---
    Expressions like `salary + COUNT(*)` were not detected as containing an aggregate because the top node was ND_BIN, not ND_FN. Made the function recursive: it walks ND_BIN, ND_UNI, ND_FN args, ND_CASE branches.

    --- #13 SELECT * only expanded first table in JOINs (MEDIUM) ---
    `SELECT * FROM orders o JOIN customers c ON ...` only included fields from orders. Changed the expansion loop to iterate ALL entries in ::aTables.

    --- #2 s_aOuterStack not unwound on subquery error (HIGH) ---
    SubqueryCached's PushOuter/PopOuter pair was not protected by BEGIN SEQUENCE. A runtime error inside the subquery left a stale entry on the module-level outer stack, corrupting all subsequent queries' correlated column resolution. Wrapped in SEQUENCE/RECOVER.

    Validation:
    - FiveSql2 43/43
    - Harbour compat 51/51
    - go test ./... ALL PASS

    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
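Fix #7 above turns a top-level check into a recursive walk over the expression tree. The Go sketch below shows that shape under simplifying assumptions: the `node` type, its `kind` strings, and the `aggNames` set are hypothetical and much simpler than the real ND_* nodes, where CASE branches and function arguments may live in separate slots.

```go
package main

import (
	"fmt"
	"strings"
)

// node is a hypothetical, simplified AST node. All children (operands,
// function arguments, CASE branches) are kept in kids.
type node struct {
	kind string // "col", "lit", "bin", "uni", "fn", "case"
	name string // column, operator, or function name
	kids []*node
}

// aggNames lists the function names treated as aggregates in this sketch.
var aggNames = map[string]bool{
	"COUNT": true, "SUM": true, "AVG": true, "MIN": true, "MAX": true,
}

// hasAgg reports whether n or any descendant is an aggregate call.
// A check of only the top node would miss `salary + COUNT(*)`, whose
// top node is a binary operator.
func hasAgg(n *node) bool {
	if n == nil {
		return false
	}
	if n.kind == "fn" && aggNames[strings.ToUpper(n.name)] {
		return true
	}
	for _, k := range n.kids {
		if hasAgg(k) {
			return true
		}
	}
	return false
}

func main() {
	salary := &node{kind: "col", name: "salary"}
	count := &node{kind: "fn", name: "count", kids: []*node{{kind: "lit", name: "*"}}}
	sum := &node{kind: "bin", name: "+", kids: []*node{salary, count}}
	fmt.Println(hasAgg(sum))    // aggregate nested under a binary operator
	fmt.Println(hasAgg(salary)) // plain column
}
```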
CharlesKWON	2d9023622c	feat(FiveSql2): ROLLUP/CUBE/GROUPING SETS + correlated subquery memoization	2026-04-15 10:13:31 +09:00

    Two SQL:2013 features that were stubs or bugs. Both ship together because they share testing infrastructure (the SQL:2013 analytics bench).

    --- 1. ROLLUP / CUBE / GROUPING SETS (TSqlAgg) ---

    The parser has recognized these for a while, storing them as `ND_FN "ROLLUP"` / "CUBE" / "GROUPING SETS" nodes inside the GROUP BY list. GroupBy never actually expanded them: it treated the ND_FN as an opaque group term, which meant every row hashed into the empty bucket and the query returned a single row.

    New TSqlAgg:ExpandGroupingSets walks the aGroupBy array and expands each ROLLUP / CUBE / GSETS modifier into a list of flat grouping sets by cross-product with the surrounding plain terms:

        GROUP BY ROLLUP(a, b, c)          → {(a,b,c), (a,b), (a), ()}
        GROUP BY CUBE(a, b)               → {(a,b), (a), (b), ()}
        GROUP BY GROUPING SETS((a,b),())  → as-is
        GROUP BY x, ROLLUP(a, b)          → {(x,a,b), (x,a), (x)}

    When the expansion produces more than one set, GroupBy recurses once per set (passing the plain flat set) and NILs out SELECT columns that aren't in the current set (the standard subtotal placeholder). Fast path (no ROLLUP/CUBE/GSETS node) short-circuits to the original single-pass logic.

    Correctness check: `SELECT region, SUM(amount) FROM sales GROUP BY ROLLUP(region)` on a 5-region dataset now returns 6 rows (5 per-region subtotals + 1 grand total row with region=NIL). Was 1.

    --- 2. Correlated subquery memoization (TSqlExecutor) ---

    Commit `9e0f82c` fixed a silent caching bug that made correlated subqueries return the first outer-row's result for every subsequent row, at the cost of dropping caching entirely: every outer row re-executed the subquery. For Q8 in the SQL:2013 bench (1000 emps, correlated on 3 distinct depts) that was 4.9 seconds.

    The right answer is to memoize per outer-key, not globally. This commit adds:

    - TSqlExecutor:CollectFreeVars(hQ): walks a subquery's WHERE, columns, and HAVING for ND_COL references whose alias prefix isn't one of the subquery's own FROM tables. Those are the outer columns the subquery actually depends on.
    - TSqlExecutor:SubqueryCached(xSubNode): runs the free-var analysis once per distinct AST node (memoized onto a 6th slot on the node), builds a cache key from the current values of those free vars via ::Resolve(), looks up in ::hSubCorrCache, executes on miss. Non-correlated subqueries end up with an empty free-var list → single cache entry → same behavior as the old CacheSubquery fast path.
    - ND_SUB and ND_SUB-in-IN handlers route through SubqueryCached instead of the split cache/push-outer logic.

    Plus a correctness fix that SubqueryCached surfaced: when a subquery runs at nDepth > 1, TSqlExecutor rewrites each FROM table's alias to a depth-suffixed temp (so concurrent opens of the same file don't collide). Previously the original user-written alias was only preserved in aTables[i][3] for single-char aliases. Multi-char aliases like `emp e2` lost their original after the rename, so FindWA("E2") failed, Resolve("e2.dept") returned NIL, and `WHERE e2.dept = e1.dept` evaluated NIL=NIL → every row was filtered out → subquery AVG returned 0 → outer `salary > 0` was trivially true for everyone. Now we always stash the original alias in [3] before the rename.

    --- Bench (SQL:2013 analytics, 10 queries, emp=1k, sales=20k) ---

        Query                      Before        After                     Δ
        ────────────────────────────────────────────────────────────────────────
        Q6 RECURSIVE hierarchy     (prev fix)    30ms
        Q7 ROLLUP subtotals        86ms, 1 row   106ms, 6 rows (correct)
        Q8 Correlated subquery     4933ms        20ms                      ~245x
        (all other queries unchanged at 4–230ms)

    Q8 30-row sanity regression test (emp.dept in {A,B,C}, deterministic salaries so hand-computed averages are 155/810/1765):

        SELECT name, dept, salary FROM emp e1
        WHERE salary > (SELECT AVG(salary) FROM emp e2 WHERE e2.dept = e1.dept)

        Before: 30 rows (wrong: returns all)
        After:  15 rows (correct: 5 above each dept's average)

    Validation:
    - FiveSql2 43/43
    - Harbour compat 51/51
    - go test ./... ALL PASS

    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
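The ROLLUP and CUBE expansions listed in this commit can be reproduced with a short Go sketch. The functions `rollup`, `cube`, `withPrefix`, and `format` are hypothetical names for illustration and operate on plain column-name slices, not on the ND_FN nodes that TSqlAgg:ExpandGroupingSets walks. ROLLUP yields every prefix of the column list; CUBE yields every subset.

```go
package main

import (
	"fmt"
	"strings"
)

// rollup returns the prefixes of cols, longest first, ending with the
// empty set (the grand total).
func rollup(cols []string) [][]string {
	sets := make([][]string, 0, len(cols)+1)
	for n := len(cols); n >= 0; n-- {
		sets = append(sets, append([]string{}, cols[:n]...))
	}
	return sets
}

// cube returns every subset of cols. Bit (n-1-i) of mask selects cols[i],
// so counting mask down from 2^n-1 lists the full set first.
func cube(cols []string) [][]string {
	n := len(cols)
	sets := make([][]string, 0, 1<<n)
	for mask := (1 << n) - 1; mask >= 0; mask-- {
		set := []string{}
		for i, c := range cols {
			if mask&(1<<(n-1-i)) != 0 {
				set = append(set, c)
			}
		}
		sets = append(sets, set)
	}
	return sets
}

// withPrefix prepends the plain GROUP BY terms to every expanded set.
func withPrefix(prefix []string, sets [][]string) [][]string {
	out := make([][]string, len(sets))
	for i, s := range sets {
		out[i] = append(append([]string{}, prefix...), s...)
	}
	return out
}

// format renders grouping sets in the notation used in the commit message.
func format(sets [][]string) string {
	parts := make([]string, len(sets))
	for i, s := range sets {
		parts[i] = "(" + strings.Join(s, ",") + ")"
	}
	return "{" + strings.Join(parts, ", ") + "}"
}

func main() {
	fmt.Println(format(rollup([]string{"a", "b", "c"})))
	fmt.Println(format(cube([]string{"a", "b"})))
	fmt.Println(format(withPrefix([]string{"x"}, rollup([]string{"a", "b"}))))
}
```

Each resulting set would then drive one ordinary grouping pass, with columns outside the set reported as NULL in that pass.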
CharlesKWON	c6799a599e	fix(FiveSql2): GROUP BY with aliased SELECT collapses all rows into one	2026-04-14 20:25:02 +09:00

    Surfaced by complex-query benchmarking. Query like:

        SELECT d.name AS dept, COUNT(*) AS n, SUM(o.amount) AS total
        FROM dept d INNER JOIN emp e ON ... INNER JOIN ord o ON ...
        GROUP BY d.name

    returned exactly 1 row instead of 100. Removing the AS aliases made it work correctly. Semantic bug, not a performance issue.

    Root cause: TSqlAgg:GroupBy resolved each GROUP BY column by calling FindColIdx against aFN, the output alias list. For GROUP BY d.name with d.name AS dept, the group expression's column name was looked up in {"dept","n","total"} and missed. FindColIdx returned 0, every row got an empty group key, and the hash collapsed everything into one bucket.

    Fix: new FindGroupIdx walks aCols (SELECT list expressions) instead, matching the GROUP BY column against each SELECT item's source expression ND_COL name. Handles qualified refs (d.name -> NAME) and falls back to FindColIdx for cases where GROUP BY uses a column not in the SELECT list.

    Also hoisted the resolution out of the per-row loop: GROUP BY columns resolve once into aGroupIdx[] so each row just indexes.

    Validation:
    - FiveSql2 43/43
    - Harbour compat 51/51
    - Complex bench Q4: 1 row -> 100 rows (correct)

    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
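The root cause above is a lookup against output aliases where the source expressions should have been consulted. A rough Go sketch of the corrected resolution order follows. `selectItem`, `bareName`, and `findGroupIdx` are hypothetical; the alias fallback only approximates the FindColIdx fallback the message mentions, and matching on the bare name alone, as the message's `d.name -> NAME` example suggests, would be ambiguous if two joined tables shared a column name.

```go
package main

import (
	"fmt"
	"strings"
)

// selectItem is a hypothetical SELECT-list entry.
type selectItem struct {
	source string // source column such as "d.name"; empty for other expressions
	alias  string // output column name
}

// bareName strips a table qualifier and upper-cases: "d.name" -> "NAME".
func bareName(ref string) string {
	if i := strings.LastIndex(ref, "."); i >= 0 {
		ref = ref[i+1:]
	}
	return strings.ToUpper(ref)
}

// findGroupIdx returns the 1-based SELECT position for a GROUP BY column.
// It matches source columns first, then output aliases, and returns 0
// when nothing matches.
func findGroupIdx(items []selectItem, groupCol string) int {
	want := bareName(groupCol)
	for i, it := range items {
		if it.source != "" && bareName(it.source) == want {
			return i + 1
		}
	}
	for i, it := range items {
		if strings.EqualFold(it.alias, groupCol) {
			return i + 1
		}
	}
	return 0
}

func main() {
	items := []selectItem{
		{source: "d.name", alias: "dept"},
		{alias: "n"},     // COUNT(*)
		{alias: "total"}, // SUM(o.amount)
	}
	fmt.Println(findGroupIdx(items, "d.name")) // resolved via the source column
	fmt.Println(findGroupIdx(items, "o.region"))
}
```

Resolving the indexes once before the row loop, as the commit does with aGroupIdx[], keeps the per-row work to plain slice indexing.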
Charles KWON OhJun	486e466592	feat: FiveSql2 43/43, @byref, mutable closure, RTL 479, DateTime fix	2026-04-11 11:35:37 +09:00

    Major changes since last commit:
    - FiveSql2 SQL:1999 engine (10,458 LOC): 43/43 ALL PASS
    - 21 compiler/runtime bugs fixed (short-circuit AND/OR, FOR LOOP, etc.)
    - @byref pass-by-reference via RefCell pattern
    - Mutable closure capture (EnsureLocalRef + RefCell sharing)
    - RTL: 400 → 479 functions (+79: file, string, datetime, hash, UTF-8)
    - DateTime/Timestamp fully working (hb_DateTime, hb_Hour/Min/Sec, display)
    - Reserved word guard (39 keywords blocked from function calls)
    - AEval arg order fix (element before index)
    - Closure capture redecl fix (unique _cap_ names per block)
    - Hash/string indexing in ArrayPush/ArrayPop
    - Harbour compat test suite: 51/51
    - 4 docs: Porting Report, Implementation Plan, Optimization Plan, Commercialization

    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

7 Commits