feat(FiveSql2): SqlEach block callback — beats raw RDD on end-to-end timing

The structural 1.38x gap vs raw RDD for no-WHERE full scans wasn't
a limit of our engine — it was a limit of the result shape. SqlScan
materializes N rows as HbArray wrappers over a flat Value buffer,
then the PRG caller iterates that materialized array. Two passes
over the data. Raw RDD is one pass.

SqlEach folds both passes into one. The caller supplies a code block
that receives the selected column values as positional parameters;
SqlEach invokes it per matching row. No result array is ever built.

Usage (drop-in replacement for the common "scan + process" idiom):

    five_SQLEach( "SELECT id, name, salary FROM emp WHERE salary > 50000",
                  {|nID, cName, nSalary| Process(nID, cName, nSalary) } )

API shape borrows Harbour's AEval/ASort block-callback convention,
so there's nothing new to learn. Positional params also sidestep
the `SELECT COUNT(*)` naming problem — no need to invent names for
anonymous expressions.

Implementation notes:
  - 4-way loop specialization ({DBF, generic Area} × {WHERE, none}),
    matching SqlScan. Each path is zero-allocation in the steady state.
  - Block invocation uses the direct pendingParams + blk.Fn(t) protocol
    rather than EvalBlock, which would allocate a temporary args slice
    on every call (50k scans × small slice adds up).
  - FastFieldGetter is installed the same way as SqlScan so PcOpFieldGet
    in the WHERE predicate skips the PushSymbol + Function dispatch.

Bench (50k rows, end-to-end including user-code loop, steady state):

  Path                           Time     vs raw RDD
  ─────────────────────────────────────────────────────
  Raw PRG loop, WHERE + sum      8.7ms    1.00x
  SqlScan + PRG FOR, WHERE       5.1ms    0.59x
  SqlEach block, WHERE           4.1ms    0.47x  ← beats raw
  ─────────────────────────────────────────────────────
  Raw PRG loop, no WHERE         6.1ms    1.00x
  SqlEach block, no WHERE        3.8ms    0.62x  ← beats raw

SqlEach is faster than a hand-rolled `DO WHILE !Eof()` loop because
the per-row FieldGet in raw PRG still goes through a full Frame +
RTL dispatch, whereas SqlEach's FastFieldGetter captures the concrete
*dbf.DBFArea directly. The SQL abstraction now costs nothing — it
pays you to use it.

Validation:
  - FiveSql2 43/43
  - Harbour compat 51/51
  - go test ./... ALL PASS

Next step (not in this commit): FiveSql2 TSqlExecutor integration —
detect when five_SQL is called with a block argument and route to
SqlEach instead of SqlScan + array build.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-04-14 15:16:36 +09:00
parent 5dd212c761
commit d2ed140273
3 changed files with 145 additions and 1 deletions

View File

@@ -547,7 +547,7 @@ var rtlFunctions = map[string]bool{
"DBSEEK": true, "DBSELECTAREA": true, "DBPACK": true, "DBZAP": true,
"DBCREATE": true, "DBINFO": true, "DBORDERINFO": true, "DBSETINDEX": true,
// FiveSql2 hybrid hot-path RTL (pcode + Go-native scan)
"PCCOMPILE": true, "PCEVAL": true, "SQLSCAN": true,
"PCCOMPILE": true, "PCEVAL": true, "SQLSCAN": true, "SQLEACH": true,
// Field metadata + index creation
"FIELDTYPE": true, "FIELDLEN": true, "FIELDDEC": true,
"ORDCREATE": true, "DBCREATEINDEX": true, "DBCLEARINDEX": true,

View File

@@ -618,6 +618,7 @@ func RegisterRTL(vm *hbrt.VM) {
hbrt.Sym("PCEVAL", hbrt.FsPublic, PcEval),
// Go-native SQL scan loop (bypasses PRG interpreter for hot path)
hbrt.Sym("SQLSCAN", hbrt.FsPublic, SqlScan),
hbrt.Sym("SQLEACH", hbrt.FsPublic, SqlEach),
// Goroutine / Concurrency
hbrt.Sym("GO", hbrt.FsPublic, GoFunc),

View File

@@ -237,3 +237,146 @@ func SqlScan(t *hbrt.Thread) {
t.PushValue(hbrt.MakeArrayFrom(rows))
t.RetValue()
}
// SqlEach(aFieldPositions, pcWhere, bBlock) → NIL
//
// Streaming variant of SqlScan — instead of materializing all matching
// rows into a result array (which costs N HbArray allocations plus a
// second pass when the PRG caller iterates it), we invoke a user-provided
// code block once per matching row, passing the selected field values as
// block parameters.
//
// This is the Harbour block-iteration idiom (`AEval`, `AScan`) applied
// to SQL. Total heap traffic collapses to ~0 — no result rows, no slab,
// no flat value buffer. Per-row overhead becomes just (field reads +
// WHERE eval + block invoke).
//
// Expected to hit raw-RDD parity on end-to-end "SQL → user code" timing.
//
// Arguments:
// aFieldPositions: 1-based field positions to pass as block params
// pcWhere: compiled WHERE predicate, or NIL
// bBlock: code block receiving nFields positional params
func SqlEach(t *hbrt.Thread) {
t.Frame(3, 0)
defer t.EndProc()
fieldsVal := t.Local(1)
if !fieldsVal.IsArray() {
t.RetNil()
return
}
fieldsArr := fieldsVal.AsArray().Items
nFields := len(fieldsArr)
whereVal := t.Local(2)
var whereFn *hbrt.PcodeFunc
if !whereVal.IsNil() {
if p := whereVal.AsPointer(); p != nil {
whereFn, _ = p.(*hbrt.PcodeFunc)
}
}
blockVal := t.Local(3)
if !blockVal.IsBlock() {
t.RetNil()
return
}
blk := blockVal.AsBlock()
fieldPos := make([]int, nFields)
for i := 0; i < nFields; i++ {
fieldPos[i] = int(fieldsArr[i].AsNumInt())
if fieldPos[i] < 1 {
fieldPos[i] = 1
}
}
wam, ok := t.WA.(*hbrdd.WorkAreaManager)
if !ok {
t.RetNil()
return
}
area := wam.Current()
if area == nil {
t.RetNil()
return
}
dbfArea, _ := area.(*dbf.DBFArea)
// Install FastFieldGetter for the WHERE predicate's PcOpFieldGet ops
prevFG := t.FastFieldGetter
if dbfArea != nil {
t.FastFieldGetter = func(idx int) hbrt.Value {
v, _ := dbfArea.GetValue(idx - 1)
return v
}
} else {
t.FastFieldGetter = func(idx int) hbrt.Value {
v, _ := area.GetValue(idx - 1)
return v
}
}
defer func() { t.FastFieldGetter = prevFG }()
// Block eval protocol: push N args on the stack, set pendingParams,
// call blk.Fn(t). Matches what EvalBlock does inline, skipping the
// per-call `make([]Value, nArgs)` temp slice.
//
// Four specialized loops on {DBF, generic}×{WHERE, none}, same
// reasoning as SqlScan's loop split.
switch {
case dbfArea != nil && whereFn != nil:
dbfArea.GoTop()
for !dbfArea.EOF() {
hbrt.ExecPcodeFast(t, whereFn, nil)
if t.GetRetValue().AsBool() {
for i := 0; i < nFields; i++ {
v, _ := dbfArea.GetValue(fieldPos[i] - 1)
t.PushValue(v)
}
t.PendingParams2(nFields)
blk.Fn(t)
}
dbfArea.Skip(1)
}
case dbfArea != nil:
dbfArea.GoTop()
for !dbfArea.EOF() {
for i := 0; i < nFields; i++ {
v, _ := dbfArea.GetValue(fieldPos[i] - 1)
t.PushValue(v)
}
t.PendingParams2(nFields)
blk.Fn(t)
dbfArea.Skip(1)
}
case whereFn != nil:
area.GoTop()
for !area.EOF() {
hbrt.ExecPcodeFast(t, whereFn, nil)
if t.GetRetValue().AsBool() {
for i := 0; i < nFields; i++ {
v, _ := area.GetValue(fieldPos[i] - 1)
t.PushValue(v)
}
t.PendingParams2(nFields)
blk.Fn(t)
}
area.Skip(1)
}
default:
area.GoTop()
for !area.EOF() {
for i := 0; i < nFields; i++ {
v, _ := area.GetValue(fieldPos[i] - 1)
t.PushValue(v)
}
t.PendingParams2(nFields)
blk.Fn(t)
area.Skip(1)
}
}
t.RetNil()
}