CharlesKWON bfc6ded8cb perf(FiveSql2): SqlHashBuild + FetchRow column binding — 3-way JOIN 3x
Complex-query benchmarking turned up two hot paths that the earlier
SqlScan/SqlEach work didn't touch: multi-table JOIN and nested-scan
row fetching. This commit hits both.

--- Part 1: SqlHashBuild — Go-native hash-join build ---

FiveSql2's HashJoin previously built the inner-side hash in PRG:

    WHILE !Eof()
      xVal := FieldGet(nFPos)
      cKey := SqlValToStr(xVal)
      IF !hb_HHasKey(hHash, cKey) ; hHash[cKey] := {} ; ENDIF
      AAdd(hHash[cKey], RecNo())
      dbSkip()
    ENDDO

That loop costs ~40μs per row, spent on class dispatch, hb_HHasKey
lookups, AAdd growth, and SqlValToStr formatting. On a 50k-row inner
table that adds up to ~2 seconds for what should be a sub-50ms
housekeeping op.

New hbrtl.SqlHashBuild does the same thing in one Go-native pass:

  - Direct *dbf.DBFArea loop (no interface dispatch, same devirt as
    SqlScan)
  - Go `map[string][]int64` accumulates RecNos by key — one
    allocation per distinct key
  - Inline ASCII-only digit formatter for numeric keys (strconv.Itoa
    allocates a new string per call outside its small-int fast path)
  - CHAR keys are right-trimmed to match SqlCmpEq semantics so the
    hash probe matches what EvalExpr would compute
  - Final Five hash is built once from Keys/Values/Order slices
    directly, skipping the per-key hb_HSet path

HashJoin now calls `SqlHashBuild(nFPos)` instead of running the
PRG loop.
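
As a rough illustration of the build pass described above, here is a minimal Go sketch. It is not the hbrtl code: `Row`, `keyOf`, `formatInt`, and `buildHash` are hypothetical names, and a plain slice stands in for the `*dbf.DBFArea` loop, whose API is not shown in this commit message. It only demonstrates the key normalization (right-trimmed CHAR keys, inline digit formatting) and the `map[string][]int64` accumulation.

```go
package main

import (
	"fmt"
	"strings"
)

// Row is a hypothetical stand-in for one inner-table record.
type Row struct {
	RecNo int64
	Key   any // string for CHAR fields, int64 for numeric fields (assumption)
}

// formatInt renders n in decimal using a fixed stack buffer.
func formatInt(n int64) string {
	if n == 0 {
		return "0"
	}
	var buf [20]byte // 19 digits plus sign covers the full int64 range
	i := len(buf)
	neg := n < 0
	u := uint64(n)
	if neg {
		u = -u // unsigned negation wraps, so the minimum int64 is handled
	}
	for u > 0 {
		i--
		buf[i] = byte('0' + u%10)
		u /= 10
	}
	if neg {
		i--
		buf[i] = '-'
	}
	return string(buf[i:])
}

// keyOf normalizes a field value into a hash key. CHAR values are
// right-trimmed so padded and unpadded values land in the same bucket.
// Unsupported types all map to "" in this sketch.
func keyOf(v any) string {
	switch x := v.(type) {
	case string:
		return strings.TrimRight(x, " ")
	case int64:
		return formatInt(x)
	default:
		return ""
	}
}

// buildHash groups record numbers by normalized key in a single pass.
func buildHash(rows []Row) map[string][]int64 {
	h := make(map[string][]int64)
	for _, r := range rows {
		k := keyOf(r.Key)
		h[k] = append(h[k], r.RecNo)
	}
	return h
}

func main() {
	rows := []Row{{1, "SALES   "}, {2, int64(42)}, {3, "SALES"}, {4, int64(42)}}
	h := buildHash(rows)
	fmt.Println(h["SALES"], h["42"]) // prints: [1 3] [2 4]
}
```

The probe side has to normalize its keys the same way, otherwise a padded CHAR value would miss its bucket.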

--- Part 2: TSqlExecutor:BuildFetchCache ---

The JOIN fallback loop calls FetchRow per row. FetchRow was already
column-ref-aware, but it repeated the string parse (`At + SubStr + Upper`)
and the `::FindWA` linear scan on every invocation. For a 50k-row
join emitting 50k result rows, that's ~200k redundant resolutions.

New BuildFetchCache walks the SELECT list once before the scan and
pre-binds each plain-column expression to `{nWA, nFPos}`. FetchRow's
new fast path checks ::aFetchCache and jumps straight to
`dbSelectArea + FieldGet` when bound. Complex exprs (functions,
CASE, subqueries) still fall through to EvalExpr.

::aFetchCache is set right before the join WHILE loop and cleared
after — no cross-query bleed.
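
The bind-once, fetch-many idea can be sketched in Go as follows. The real BuildFetchCache and FetchRow are PRG methods on TSqlExecutor, so this is only a model under stated assumptions: `Table`, `binding`, `resolve`, `buildFetchCache`, `fetchRow`, and the `evalExpr` stub are hypothetical names, and work areas are modeled as a slice of in-memory tables.

```go
package main

import (
	"fmt"
	"strings"
)

// Table is a hypothetical stand-in for an open work area.
type Table struct {
	Alias  string
	Fields []string
	Row    []any // field values of the current record
}

// binding is the pre-resolved location of a plain column: {nWA, nFPos}.
type binding struct {
	wa, fpos int
	ok       bool // false means "not a plain column, use evalExpr"
}

// resolve does the expensive work: parse "alias.field", upper-case it,
// and scan linearly for the work area and the field position.
func resolve(tables []Table, expr string) binding {
	alias, field, found := strings.Cut(expr, ".")
	if !found {
		return binding{}
	}
	alias, field = strings.ToUpper(alias), strings.ToUpper(field)
	for wa, t := range tables {
		if t.Alias != alias {
			continue
		}
		for fpos, f := range t.Fields {
			if f == field {
				return binding{wa: wa, fpos: fpos, ok: true}
			}
		}
	}
	return binding{}
}

// buildFetchCache resolves every SELECT-list entry once, before the scan.
func buildFetchCache(tables []Table, selectList []string) []binding {
	cache := make([]binding, len(selectList))
	for i, e := range selectList {
		cache[i] = resolve(tables, e)
	}
	return cache
}

// evalExpr is a stub for the general expression evaluator.
func evalExpr(expr string) any { return "<eval " + expr + ">" }

// fetchRow takes the fast path for bound columns and falls back otherwise.
func fetchRow(tables []Table, selectList []string, cache []binding) []any {
	out := make([]any, len(selectList))
	for i, e := range selectList {
		if b := cache[i]; b.ok {
			out[i] = tables[b.wa].Row[b.fpos]
			continue
		}
		out[i] = evalExpr(e)
	}
	return out
}

func main() {
	tables := []Table{
		{Alias: "O", Fields: []string{"ID", "EMP_ID"}, Row: []any{int64(7), int64(3)}},
		{Alias: "E", Fields: []string{"ID", "NAME"}, Row: []any{int64(3), "KIM"}},
	}
	sel := []string{"o.id", "e.name", "UPPER(e.name)"}
	cache := buildFetchCache(tables, sel) // built once, before the join loop
	fmt.Println(fetchRow(tables, sel, cache)) // prints: [7 KIM <eval UPPER(e.name)>]
}
```

Scoping the cache to a single join loop, as the commit does with ::aFetchCache, keeps bindings from one query from being reused against another query's work areas.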

--- Bench (50k ord × 10k emp × 100 dept, 3-run steady state) ---

  Query                        Before      After     Speedup
  ────────────────────────────────────────────────────────────
  2-way INNER JOIN, 10k rows   91ms        68ms      1.34x
  2-way JOIN + GROUP BY        110ms       94ms      1.17x
  3-way INNER JOIN COUNT       2610ms      610ms     4.28x
  3-way JOIN + GROUP BY        2860ms      830ms     3.45x

The 3-way speedup is almost entirely SqlHashBuild. The 2-way case
benefits from the fetch cache because its per-row cost is dominated
by FetchRow (no second hash build to amortize).

--- Limits still standing ---

CTE + JOIN queries (Q7 in bench_complex: ~4.5s) aren't affected by
either optimization — CTE materialization goes through a different
path that writes/reads a temp DBF. Follow-up target.

Validation:
  - FiveSql2 43/43
  - Harbour compat 51/51
  - go test ./... ALL PASS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 18:47:20 +09:00