6 Commits

Author SHA1 Message Date
f4ed42556b checkpoint: season-wide bug fix campaign + infra
A season's worth of silent-bug hunting (~62 fixes) across the FiveSql2
SQL engine, the Five compiler/runtime, and the hbrdd RDD layer, saved
as a single checkpoint before refactoring the parser to delegate xBase
command translation to the preprocessor.

Highlights:

FiveSql2 engine (_FiveSql2/src/)
- prefix-glob index attach -> explicit convention (<table>_pk.ntx,
  <table>_uq.ntx, <table>.cdx) — fixes silent multi-row INSERT row-drop
- DROP/CREATE TABLE FErase chain extended (.cdx, .fsc, .fsv, .dbt, .fpt)
- COUNT(DISTINCT col) parsed + aggregated via hSeen hash
- UNION column-count mismatch returns SQL_ERR_GRAMMAR (was silent)
- DISTINCT + ORDER BY hidden-col leak fixed (trim before DISTINCT)
- Derived table FROM (SELECT...) + JOIN right-side derived
- Self-FK CASCADE depth 2+ via SqlGetSingleColPK pre-collect
- LAG/LEAD default arg uses SqlEvalRowExpr (handles -N const exprs)
- DATE literal round-trip validation (Feb 29 non-leap rejected)
- CREATE OR REPLACE VIEW; CREATE VIEW errors on already-exists
- AlterTable type dispatcher comma-wrapped (1-char type "A" no longer
  matches CHARACTER)
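
The DATE literal round-trip check listed above can be illustrated with a minimal Go sketch. It is not the engine's code: the function name validDateLiteral is hypothetical, and the sketch assumes the literal has already been split into year, month, and day.

```go
package main

import (
	"fmt"
	"time"
)

// validDateLiteral (hypothetical name) reports whether y-m-d names a real
// calendar day. time.Date normalizes out-of-range values (Feb 29 in a
// non-leap year becomes Mar 1), so any mismatch after the round trip
// means the literal was invalid.
func validDateLiteral(y, m, d int) bool {
	t := time.Date(y, time.Month(m), d, 0, 0, 0, 0, time.UTC)
	return t.Year() == y && int(t.Month()) == m && t.Day() == d
}

func main() {
	fmt.Println(validDateLiteral(2024, 2, 29)) // true: 2024 is a leap year
	fmt.Println(validDateLiteral(2023, 2, 29)) // false: normalized to Mar 1
}
```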

Compiler / runtime
- gengo: HB_ -> FV_ prefix on emitted Go function names (Five identity)
- gengo split: emit_block.go, emit_stmt.go, folding.go extracted
- parser/stmtreg.go nudges
- hbrt: debug TUI/CLI restructure (debugcmd, debugkey, termios_*),
  windows debug stubs collapsed
- thread/vm/value/class/pcinterp tightening from panic traces

RDD layer (hbrdd/)
- dbf: null bitmap support (null.go + null_test.go), mmap split
  (mmap_posix.go / mmap_windows.go), byte-level numeric parse
- ntx/cdx: windows mmap parity
- workarea + mem RDD: cross-area state-bleed fixes

RTL (hbrtl/)
- errorlog rewrite with platform-specific FD (errorlog_fd_unix /
  errorlog_fd_other)
- sqlscan, sqlhelpers, indexrtl, datetime extensions

Gates green at checkpoint:
- go test ./...        : PASS
- FiveSql2 SQL:1999    : 43/43
- Harbour compat       : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 09:26:25 +09:00
c84cde6175 perf(fivesql2): Go-native SqlIsAggName — drop per-row substring scan
B4 GROUP+HAVING profile showed SqlIsAggName at ~9% of CPU —
SqlEvalFunc checks it for every function in every row, and the
PRG body was two string allocations + a substring scan:
  RETURN ("," + c + ",") $ ("," + AGG_FUNCTIONS + ",")

Replace with a hash lookup against the existing aggFuncSet map
in hbrtl/sqlexpr.go (already populated for SqlExprHasAgg, same
AGG_FUNCTIONS list). Upper-casing skips the allocation when the
input is already upper, which it almost always is in practice.
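
A rough sketch of the lookup described above, under stated assumptions: the set contents below are a guess (the real AGG_FUNCTIONS list is not shown here), and isAggName / isUpperASCII are illustrative names rather than the actual hbrtl functions.

```go
package main

import (
	"fmt"
	"strings"
)

// aggFuncSet is an assumed stand-in for the map in hbrtl/sqlexpr.go.
var aggFuncSet = map[string]struct{}{
	"COUNT": {}, "SUM": {}, "AVG": {}, "MIN": {}, "MAX": {},
}

// isUpperASCII reports whether s contains no lowercase ASCII letters.
func isUpperASCII(s string) bool {
	for i := 0; i < len(s); i++ {
		if c := s[i]; c >= 'a' && c <= 'z' {
			return false
		}
	}
	return true
}

// isAggName does one map lookup instead of building two strings and
// scanning for a substring. Upper-casing only happens when needed.
func isAggName(name string) bool {
	if !isUpperASCII(name) {
		name = strings.ToUpper(name)
	}
	_, ok := aggFuncSet[name]
	return ok
}

func main() {
	fmt.Println(isAggName("SUM"), isAggName("count"), isAggName("SUBSTR"))
	// prints: true true false
}
```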

Bench deltas (median of 3 steady runs, 1000 iters):
  B4_GROUP_HAVING 447 → 418 us  -6.5%
  B14_COUNT       252 → 235 us  -7%
  B15_CTE_WIN_JOIN 1595 → 1577 us  -1%
Other benches unchanged (no aggregate calls per row).

FiveSql2 43/43, Harbour compat 56/56.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 13:40:19 +09:00
dd270d5d9d perf: RTL Go-native migration — 27 optimizations, DML up to 70-90x
Systematic pass through PRG hot paths, promoting them to Go RTL while
preserving Harbour/FiveSql2 semantics. Full log in
docs/RTL-Go-Native-Migration.md.

Bench (bench_sql) vs 2026-04-08 baseline
 - B1  SELECT *             2,192 → 114   µs   (19x)
 - B6  INNER JOIN           9,291 → 233   µs   (40x)
 - B7  CTE simple           8,037 → 129   µs   (62x)
 - B9  ROW_NUMBER           3,705 → 265   µs   (14x)
 - B10 RANK PARTITION       4,748 → 309   µs   (15x)
 - B12 INSERT (WA cache)    4,319 →  63   µs   (69x)
 - B13 UPDATE (WA cache)    6,144 →  68   µs   (90x)
 - B15 CTE+WIN+JOIN        18,395 → 1,873 µs   (10x)

Infrastructure
 - HbHash O(1) Index preserving insertion order (Harbour KEEPORDER)
 - HbDeepClone Go RTL (scalar-sharing, immutable hash keys)
 - MEMRDD auto-imported via gengo; all Five programs get mem:name driver
 - SQL plan + pcode caches (s_hPlanCache, s_hDmlPcodeCache)
 - Opt-in SqlWACacheEnable — dbUseArea/Close/Commit batched for DML
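
For the HbHash item above, one common way to get O(1) lookup while preserving insertion order is a slice of entries plus a key-to-position map. The sketch below shows that general technique only; OrderedHash and its methods are hypothetical names, string keys are an assumption, and deletion is left out.

```go
package main

import "fmt"

// OrderedHash keeps keys in insertion order and finds them in O(1)
// through a side index from key to slice position.
type OrderedHash struct {
	keys  []string
	vals  []any
	index map[string]int
}

func NewOrderedHash() *OrderedHash {
	return &OrderedHash{index: make(map[string]int)}
}

// Set overwrites in place when the key exists, so the original
// position is kept; new keys are appended at the end.
func (h *OrderedHash) Set(k string, v any) {
	if i, ok := h.index[k]; ok {
		h.vals[i] = v
		return
	}
	h.index[k] = len(h.keys)
	h.keys = append(h.keys, k)
	h.vals = append(h.vals, v)
}

// Get returns the value for k and whether it was present.
func (h *OrderedHash) Get(k string) (any, bool) {
	i, ok := h.index[k]
	if !ok {
		return nil, false
	}
	return h.vals[i], true
}

// Keys returns the keys in insertion order.
func (h *OrderedHash) Keys() []string { return h.keys }

func main() {
	h := NewOrderedHash()
	h.Set("b", 1)
	h.Set("a", 2)
	h.Set("b", 3) // overwrite keeps the original position
	fmt.Println(h.Keys()) // [b a]
}
```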

SQL engine
 - FiveSql2 lexer ported to Go (byte FSM), combined with automatic
   template parameterization (literals → ?, concat queries share plan)
 - Go RTL: SqlDistinct, SqlGroupRows, SqlWindowPartitions,
   SqlWindowSortPartition, SqlWindowAssignRank, SqlComputeAggSimple,
   SqlBulkInsert, SqlBulkUpdate, SqlExprHasAgg, SqlEvalHaving
 - CTE / subquery / driving-table materialize paths use MEMRDD
 - SqlCoerce/SqlCmp/SqlIsTrue helpers moved from PRG to Go
 - SqlBulkUpdate defers Flush when WA cache active (APFS fsync was
   dominant B13 cost — 1.6ms/call → gone)

Correctness fixes uncovered during migration
 - ASort default path now sorts dates/logicals/timestamps (was no-op)
 - ORDER BY default NULL placement matches PRG SqlRowCompare across
   Go fast path; explicit NULLS FIRST/LAST honored by both paths
 - SqlBulkUpdate respects EXCLUSIVE vs SHARED mode record locks
 - SqlCmp/SqlCmpEq normalize NumInt vs Double (caught by test 6b)
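
The integer-versus-double normalization in the last item above might look roughly like the following. This is a sketch with assumed names (toFloat, cmpNum) and assumed Go representations for NumInt and Double; converting to float64 loses precision for integers beyond 2^53, which a real implementation may need to handle separately.

```go
package main

import "fmt"

// toFloat widens the numeric types this sketch knows about to float64.
func toFloat(v any) (float64, bool) {
	switch x := v.(type) {
	case int:
		return float64(x), true
	case int64:
		return float64(x), true
	case float64:
		return x, true
	}
	return 0, false
}

// cmpNum returns -1, 0, or 1 after normalizing both sides to float64.
// The second result is false when either side is not numeric.
func cmpNum(a, b any) (int, bool) {
	fa, okA := toFloat(a)
	fb, okB := toFloat(b)
	if !okA || !okB {
		return 0, false
	}
	switch {
	case fa < fb:
		return -1, true
	case fa > fb:
		return 1, true
	}
	return 0, true
}

func main() {
	c, ok := cmpNum(1, 1.0)
	fmt.Println(c, ok) // 0 true
}
```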

Verification
 - go test ./...              ALL PASS
 - FiveSql2 test_sql1999      43/43
 - tests/compat_harbour       56/56 (+5 new: ASort dates/logicals,
                              AScan int cross-type)
 - Regression test test_null_order.prg for ORDER BY NULL ordering

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 20:20:14 +09:00
7babfb7281 fix(FiveSql2): 9 latent bugs from static analysis sweep
Systematic bug-hunt driven by an automated analysis of all FiveSql2
source files. Each fix is targeted — no speculative refactoring.

--- #1 CLASSDATA hSubCache leaked across queries (CRITICAL) ---

  CLASSDATA hSubCache INIT { => } SHARED

shared one hash across ALL TSqlExecutor instances. A non-correlated
subquery cached in query A was silently returned for an unrelated
query B if the subquery text happened to produce the same cache key.
Converted to instance DATA initialized in New().
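
The same idea expressed in Go, as an analogy rather than the Five/PRG fix itself: the cache lives on the instance and is created in the constructor, so two executors can never observe each other's entries. Executor, NewExecutor, and Cached are hypothetical names.

```go
package main

import "fmt"

// Executor owns its subquery cache; nothing is shared at package level.
type Executor struct {
	subCache map[string]any
}

// NewExecutor allocates a fresh cache for every instance.
func NewExecutor() *Executor {
	return &Executor{subCache: make(map[string]any)}
}

// Cached returns the cached result for key, running the subquery
// only on the first request made through this executor.
func (e *Executor) Cached(key string, run func() any) any {
	if v, ok := e.subCache[key]; ok {
		return v
	}
	v := run()
	e.subCache[key] = v
	return v
}

func main() {
	a, b := NewExecutor(), NewExecutor()
	fmt.Println(a.Cached("k", func() any { return "from A" }))
	fmt.Println(b.Cached("k", func() any { return "from B" }))
	// prints "from A" then "from B": same key, no leak between instances
}
```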

--- #5+#21 IS NULL / COALESCE treated empty string as NULL (HIGH) ---

  RETURN xL == NIL .OR. ( ValType(xL) == "C" .AND. Empty(AllTrim(xL)) )

SQL standard: '' is a valid non-NULL value. Removed the empty-string
check from both IS NULL evaluation and COALESCE skip logic.
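
A minimal Go illustration of the corrected semantics, assuming NULL is modeled as a nil interface value (the engine itself uses NIL in PRG). isNull and coalesce are illustrative names.

```go
package main

import "fmt"

// isNull treats only nil as NULL. An empty string is a real value.
func isNull(v any) bool { return v == nil }

// coalesce returns the first non-NULL argument, or nil if all are NULL.
func coalesce(args ...any) any {
	for _, a := range args {
		if !isNull(a) {
			return a
		}
	}
	return nil
}

func main() {
	fmt.Println(isNull(""), isNull(nil))        // false true
	fmt.Printf("%q\n", coalesce(nil, "", "x")) // ""
}
```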

--- #4 Multiple ? parameters all returned first value (HIGH) ---

ND_PAR nodes had no index — EvalExpr always returned ::aParams[1].
Parser now stamps each ? with a sequential 1-based index in xNode[2].
EvalExpr uses it to return the correct ::aParams[n].
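
The indexing scheme can be sketched in Go as follows. The types and names (Param, Parser, evalParam) are hypothetical; only the idea of stamping a sequential 1-based index at parse time and using it at evaluation time comes from the description above.

```go
package main

import "fmt"

// Param is a placeholder node carrying its 1-based position.
type Param struct{ Index int }

// Parser hands out sequential indexes as it meets each "?".
type Parser struct{ nextParam int }

func (p *Parser) newParam() Param {
	p.nextParam++
	return Param{Index: p.nextParam}
}

// evalParam looks up the bound value for this specific placeholder.
func evalParam(n Param, params []any) (any, error) {
	if n.Index < 1 || n.Index > len(params) {
		return nil, fmt.Errorf("parameter %d not bound", n.Index)
	}
	return params[n.Index-1], nil
}

func main() {
	p := &Parser{}
	first, second := p.newParam(), p.newParam()
	params := []any{"x", "y"}
	a, _ := evalParam(first, params)
	b, _ := evalParam(second, params)
	fmt.Println(a, b) // x y
}
```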

--- #10+#11 SqlEvalRowExpr missing / and || operators, single-arg
    function eval (MEDIUM) ---

Division and string concatenation fell through to RETURN NIL in the
row-expression evaluator used by recursive CTEs and aggregate
ComputeAgg. Also, multi-argument functions like SUBSTR(x,2,3) only
received the first argument. Both fixed.

--- #9 SUM/AVG/MIN/MAX of all NULLs returned 0 instead of NULL
    (MEDIUM) ---

SQL standard requires NULL. Changed the aggregate return path to
return NIL when nCount == 0 (SUM/AVG) or when xMin/xMax == NIL.
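
As an illustration of the rule, here is a Go sketch of SUM over nullable inputs, with NULL modeled as a nil pointer (an assumption of this sketch). sumNullable is a hypothetical name.

```go
package main

import "fmt"

// sumNullable skips NULL inputs and returns NULL (nil) when no
// non-NULL value was seen, instead of returning 0.
func sumNullable(vals []*float64) *float64 {
	var sum float64
	count := 0
	for _, v := range vals {
		if v != nil {
			sum += *v
			count++
		}
	}
	if count == 0 {
		return nil
	}
	return &sum
}

func main() {
	a, b := 2.0, 3.0
	fmt.Println(*sumNullable([]*float64{&a, nil, &b}))  // 5
	fmt.Println(sumNullable([]*float64{nil, nil}) == nil) // true
}
```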

--- #8 MIN/MAX used SqlCoerceNum for comparison (MEDIUM) ---

Strings and dates were coerced to numbers (Val()) before comparing,
making MIN('banana') == MIN('apple') == 0. Switched to SqlCmpLt
which handles type-appropriate comparison.
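
A rough Go sketch of a type-aware MIN, to show why numeric coercion was the wrong tool. The helper names less and minOf are hypothetical and the set of supported types is an assumption; SqlCmpLt itself is not reproduced here.

```go
package main

import (
	"fmt"
	"time"
)

// less compares two values of the same kind. The second result is
// false when the pair is not comparable in this sketch.
func less(a, b any) (bool, bool) {
	switch x := a.(type) {
	case string:
		y, ok := b.(string)
		return ok && x < y, ok
	case float64:
		y, ok := b.(float64)
		return ok && x < y, ok
	case time.Time:
		y, ok := b.(time.Time)
		return ok && x.Before(y), ok
	}
	return false, false
}

// minOf returns the smallest non-NULL value, or nil if there is none.
func minOf(vals []any) any {
	var best any
	for _, v := range vals {
		if v == nil {
			continue
		}
		if best == nil {
			best = v
			continue
		}
		if lt, ok := less(v, best); ok && lt {
			best = v
		}
	}
	return best
}

func main() {
	fmt.Println(minOf([]any{"banana", "apple"})) // apple
}
```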

--- #7 SqlExprHasAgg only checked top-level node (MEDIUM) ---

Expressions like `salary + COUNT(*)` were not detected as containing
an aggregate because the top node was ND_BIN, not ND_FN. Made the
function recursive — walks ND_BIN, ND_UNI, ND_FN args, ND_CASE
branches.
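
The recursive walk can be sketched in Go like this. The Node shape, the kind strings, and the aggregate name set are assumptions made for the example; the real AST uses ND_* node arrays.

```go
package main

import "fmt"

// Node is a simplified expression node: Kind is one of
// "col", "lit", "bin", "uni", "fn", "case".
type Node struct {
	Kind     string
	Name     string
	Children []*Node
}

// aggNames is an assumed aggregate list for illustration.
var aggNames = map[string]bool{
	"COUNT": true, "SUM": true, "AVG": true, "MIN": true, "MAX": true,
}

// hasAgg reports whether an aggregate call appears anywhere in the tree,
// not just at the top-level node.
func hasAgg(n *Node) bool {
	if n == nil {
		return false
	}
	if n.Kind == "fn" && aggNames[n.Name] {
		return true
	}
	for _, c := range n.Children {
		if hasAgg(c) {
			return true
		}
	}
	return false
}

func main() {
	// salary + COUNT(*)
	expr := &Node{Kind: "bin", Name: "+", Children: []*Node{
		{Kind: "col", Name: "salary"},
		{Kind: "fn", Name: "COUNT"},
	}}
	fmt.Println(hasAgg(expr)) // true
}
```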

--- #13 SELECT * only expanded first table in JOINs (MEDIUM) ---

`SELECT * FROM orders o JOIN customers c ON ...` only included
fields from orders. Changed the expansion loop to iterate ALL
entries in ::aTables.

--- #2 s_aOuterStack not unwound on subquery error (HIGH) ---

SubqueryCached's PushOuter/PopOuter pair was not protected by
BEGIN SEQUENCE. A runtime error inside the subquery left a stale
entry on the module-level outer stack, corrupting all subsequent
queries' correlated column resolution. Wrapped in SEQUENCE/RECOVER.
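
In Go the equivalent protection is usually written with defer and recover. The sketch below is an analogy to the SEQUENCE/RECOVER fix, not the PRG code: outerStack and withOuter are hypothetical names, and rows are modeled as plain strings.

```go
package main

import "fmt"

// outerStack stands in for the module-level stack of outer rows.
var outerStack []string

// withOuter pushes row, runs the subquery, and always pops again,
// even if run panics. A panic is converted into an error.
func withOuter(row string, run func() error) (err error) {
	outerStack = append(outerStack, row)
	defer func() {
		outerStack = outerStack[:len(outerStack)-1]
		if r := recover(); r != nil {
			err = fmt.Errorf("subquery failed: %v", r)
		}
	}()
	return run()
}

func main() {
	err := withOuter("row1", func() error { panic("boom") })
	fmt.Println(err)             // subquery failed: boom
	fmt.Println(len(outerStack)) // 0
}
```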

Validation:
  - FiveSql2 43/43
  - Harbour compat 51/51
  - go test ./... ALL PASS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 17:26:05 +09:00
6c8d5f8b3b fix(FiveSql2): correlated scalar subquery with JOIN — 3 interacting bugs
A scalar correlated subquery with a JOIN inside:

    SELECT e.name,
      (SELECT SUM(o.qty * p.price)
       FROM ord o INNER JOIN prod p ON o.prod_id = p.id
       WHERE o.emp_id = e.id) AS revenue
    FROM emp e WHERE e.dept = 'SALES'

returned wrong values (equal to SUM(qty) instead of SUM(qty*price))
or zero for all but the first outer row. The root cause was an
interaction between three independent bugs.

--- Bug 1: Subquery cache leaked across five_SQL invocations ---

hSubCorrCache, aSubCacheSlots, aSemiJoinSlots, nSubCacheSeq were
declared as DATA ... INIT { => } / {} / 0. In Five's compiled output,
hash/array INIT literals may share the same backing instance across
New() calls, so the cache from query A (SUM qty, no join) was still
there when query B ran, providing a hit on the same key — returning
A's cached (wrong) value instead of re-executing B's subquery.

Fix: explicit initialization in New().

--- Bug 2: aJoins alias mutation across subquery invocations ---

RunSelect's join-alias sync loop mutated aJoins[i][3] from the
user alias ("p") to the depth-suffixed temp alias ("FA_0003").
aJoins was a direct reference into hQuery["joins"], so the mutation
persisted across re-executions of the same hQuery. On the 2nd call,
the sync loop couldn't find a matching aTables entry because the
stale temp alias ("FA_0003") didn't match the new one ("FA_0005").
The join table's workarea was positioned wrong → empty join result.

Fix: deep-clone both ::aTables and aJoins at the start of RunSelect
so each invocation starts from the parsed originals.
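
The copy-before-mutate pattern, sketched in Go with hypothetical types (Join, Query, runSelect). The temp-alias format mirrors the "FA_0003" example above, but how the real engine derives the number is not shown here, so the depth argument is an assumption.

```go
package main

import "fmt"

// Join holds only value fields, so copying the slice elements is a
// full deep copy in this sketch.
type Join struct {
	Table string
	Alias string
}

// Query is the parsed form that must survive repeated executions.
type Query struct {
	Joins []Join
}

// runSelect works on a private copy of the joins, so rewriting the
// alias to a temp alias never leaks back into the parsed query.
func runSelect(q *Query, depth int) []Join {
	joins := append([]Join(nil), q.Joins...)
	for i := range joins {
		joins[i].Alias = fmt.Sprintf("FA_%04d", depth+i)
	}
	return joins
}

func main() {
	q := &Query{Joins: []Join{{Table: "prod", Alias: "p"}}}
	fmt.Println(runSelect(q, 3)[0].Alias) // FA_0003
	fmt.Println(runSelect(q, 5)[0].Alias) // FA_0005
	fmt.Println(q.Joins[0].Alias)         // p
}
```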

--- Bug 3: SqlCollectCols stripped alias prefixes ---

When adding hidden columns for complex aggregate arguments (e.g.
SUM(o.qty * p.price)), SqlCollectCols returned bare names like
"qty" and "price" instead of qualified "o.qty" / "p.price". In a
JOIN context, unqualified "price" routed FetchRow to the first
table (ord) instead of prod — FieldPos returned 0, the column was
silently NIL, and the multiplication collapsed to qty*1 = qty.

Fix: new SqlCollectColExprs returns the original ND_COL AST nodes
with qualified names preserved. The hidden-column loop now inserts
these directly so FetchRow's dot-qualified path resolves to the
correct workarea via FindWA.

--- Verification ---

Deterministic 5-emp / 6-order / 3-product test:

    Expected revenues per emp:
      Emp 1: 2*10 + 3*20 = 80    → got 80.00 ✓
      Emp 2: 1*10 + 4*30 = 130   → got 130.00 ✓
      Emp 3: 5*20 = 100          → got 100.00 ✓
      Emp 4: no orders = 0       → got 0 ✓
      Emp 5: 7*10 = 70           → got 70.00 ✓

Also verified SUM(qty*2) and SUM(p.price) variants.

Validation:
  - FiveSql2 43/43
  - Harbour compat 51/51
  - go test ./... ALL PASS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 11:33:35 +09:00
486e466592 feat: FiveSql2 43/43, @byref, mutable closure, RTL 479, DateTime fix
Major changes since last commit:
- FiveSql2 SQL:1999 engine (10,458 LOC) — 43/43 ALL PASS
- 21 compiler/runtime bugs fixed (short-circuit AND/OR, FOR LOOP, etc.)
- @byref pass-by-reference via RefCell pattern
- Mutable closure capture (EnsureLocalRef + RefCell sharing)
- RTL: 400 → 479 functions (+79: file, string, datetime, hash, UTF-8)
- DateTime/Timestamp fully working (hb_DateTime, hb_Hour/Min/Sec, display)
- Reserved word guard (39 keywords blocked from function calls)
- AEval arg order fix (element before index)
- Closure capture redecl fix (unique _cap_ names per block)
- Hash/string indexing in ArrayPush/ArrayPop
- Harbour compat test suite: 51/51
- 4 docs: Porting Report, Implementation Plan, Optimization Plan, Commercialization

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:35:37 +09:00