TSqlExecutor:FetchRow was the per-row workhorse for aggregation,
HAVING, and window queries. Even with the pre-built aFetchCache
binding columns to (nWA, nFPos), the PRG FOR loop paid several
dispatched calls per column per row (dbSelectArea, FieldGet,
AllTrim, AAdd); profiling put it at ~30% of B4 CPU.
SqlFetchRowFast collapses the cache-path loop into a single Go
call:
- bound entry: SelectByNum + area.GetValue directly
- unbound (aggregate/expression): self:EvalExpr via Send
- character values: TrimSpace inline
The PRG FetchRow keeps its original cache-miss fallback path
unchanged for rare queries where aFetchCache isn't built.
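A minimal Go sketch of the three cases above, for illustration only.
It is not the actual SqlFetchRowFast: fetchEntry, workArea, getValue
and the eval callback are hypothetical stand-ins for the aFetchCache
slot, the work area, area.GetValue and self:EvalExpr, and trimming
every string result (bound or unbound) is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// fetchEntry is a hypothetical stand-in for one aFetchCache slot.
type fetchEntry struct {
	bound bool       // true when the column is bound to (nWA, nFPos)
	nWA   int        // work-area number, bound entries only
	nFPos int        // 1-based field position, bound entries only
	eval  func() any // evaluator for unbound (aggregate/expression) entries
}

// workArea is a hypothetical, simplified work area holding the
// current record's field values.
type workArea struct {
	fields []any
}

// getValue returns the field at 1-based position pos.
func (a *workArea) getValue(pos int) any { return a.fields[pos-1] }

// fetchRowFast builds one result row in a single call instead of
// dispatching several calls per column.
func fetchRowFast(areas map[int]*workArea, cache []fetchEntry) []any {
	row := make([]any, 0, len(cache))
	for _, e := range cache {
		var v any
		if e.bound {
			// bound entry: select the area by number, read the field directly
			v = areas[e.nWA].getValue(e.nFPos)
		} else {
			// unbound entry: fall back to expression evaluation
			v = e.eval()
		}
		// character values are trimmed inline (assumed for all strings)
		if s, ok := v.(string); ok {
			v = strings.TrimSpace(s)
		}
		row = append(row, v)
	}
	return row
}

func main() {
	areas := map[int]*workArea{
		1: {fields: []any{"ACME   ", 42}},
	}
	cache := []fetchEntry{
		{bound: true, nWA: 1, nFPos: 1},
		{bound: true, nWA: 1, nFPos: 2},
		{eval: func() any { return 3.5 }},
	}
	fmt.Println(fetchRowFast(areas, cache)) // prints [ACME 42 3.5]
}
```

The point of the shape is that the per-column branch and the trim both
run inside one call, so the caller crosses the PRG/Go boundary once per
row rather than once per column operation.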
Bench deltas (median of 3 steady runs, 1000 iters):
  B4_GROUP_HAVING    418 →  327 us  -22% (1.28x)
  B9_ROW_NUMBER      191 →  120 us  -37% (1.59x)
  B10_RANK_PART      228 →  135 us  -41% (1.69x)
  B11_SUM_OVER       249 →  156 us  -37% (1.60x)
  B14_COUNT          235 →  219 us   -7%
  B15_CTE_WIN_JOIN  1577 → 1452 us   -8%
Single-table SELECT (B1-B3, B5-B8) stays flat; those queries already
hit the column-binding fast path and don't need aggregate dispatch.
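For reference, the percent and speedup columns in the bench deltas
follow from the before/after medians. A small sketch of that
arithmetic (the helper name delta is hypothetical, not part of the
bench harness):

```go
package main

import "fmt"

// delta returns the percent change (negative means faster) and the
// speedup factor for a before/after timing pair in the same unit.
func delta(before, after float64) (pct, speedup float64) {
	return (after - before) / before * 100, before / after
}

func main() {
	pct, x := delta(418, 327) // B4_GROUP_HAVING medians in us
	fmt.Printf("%.0f%% (%.2fx)\n", pct, x) // prints -22% (1.28x)
}
```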
Tests: FiveSql2 43/43, Harbour compat 56/56 passing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>