five/hbrt/call.go
CharlesKWON 1f63c7fe63 perf(vm): symbol hoist + Function() stack shift — global 3-15%
The VM call path (PushSymbol → Function → Frame) is traversed by every
PRG function call. Three runtime changes, plus the code generator change
that wires them in, together cut per-call overhead across the entire
bench suite.

Changes
 - hbrt/call.go Function(): replace pop-push dance with a single slice
   shift (N+2 pops + N pushes → 1 copy of N slots + sp adjust). Kills
   the per-call `make([]Value, nArgs)` heap alloc. Resolved function
   pointer is cached back into sym.Func so subsequent calls on the
   same Symbol skip the VM lookup entirely.
 - hbrt/vm.go GetSym(): new helper. Generated code calls it with a
   pointer to a package-level `*Symbol` slot so FindSymbol (which takes
   the VM RWMutex + map lookup) runs at most once per symbol per
   process. Nil results are intentionally NOT cached — an init-order
   miss becomes a retry on the next call instead of a permanent sticky
   failure.
 - hbrt/thread.go pushPendingSym(): scalar fast slot for depth=1 call
   nesting (common case). Nil syms still go through the slice so the
   "empty vs stored nil" ambiguity can't produce a false pop.
 - compiler/gengo/gengo.go: emit `t.PushSymbol(t.GetSym(&_sym_<file>_<NAME>, "NAME"))`
   for every function call site, with a per-file prefix so multi-PRG
   builds don't collide on identical symbol names.
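
The GetSym() caching rule described above (cache hits, never cache nil) can be sketched roughly as follows. This is a minimal single-goroutine illustration, not the hbrt code: the `Symbol` and `VM` types, `Register`, the `lookups` counter and the `_sym_demo_QOUT` slot are all stand-ins invented for the example, and a concurrent version would need its own synchronization on the slot.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Symbol and VM are simplified stand-ins (assumptions), not the hbrt types.
type Symbol struct{ Name string }

type VM struct {
	mu      sync.RWMutex
	symbols map[string]*Symbol
	lookups atomic.Int64 // counts slow-path lookups, for the demo only
}

// Register is a hypothetical helper that adds a symbol to the table.
func (vm *VM) Register(name string) {
	vm.mu.Lock()
	defer vm.mu.Unlock()
	vm.symbols[name] = &Symbol{Name: name}
}

// FindSymbol is the slow path: read lock plus map lookup.
func (vm *VM) FindSymbol(name string) *Symbol {
	vm.lookups.Add(1)
	vm.mu.RLock()
	defer vm.mu.RUnlock()
	return vm.symbols[name]
}

// GetSym returns the symbol cached in *slot, resolving it on first use.
// A nil result is deliberately not stored, so a miss is retried later.
func (vm *VM) GetSym(slot **Symbol, name string) *Symbol {
	if s := *slot; s != nil {
		return s // fast path: no lock, no map lookup
	}
	s := vm.FindSymbol(name)
	if s != nil {
		*slot = s
	}
	return s
}

// Hypothetical package-level slot in the style of the generated variables.
var _sym_demo_QOUT *Symbol

func main() {
	vm := &VM{symbols: map[string]*Symbol{}}

	// Init-order miss: nothing registered yet, slot stays empty.
	fmt.Println(vm.GetSym(&_sym_demo_QOUT, "QOUT") == nil) // true

	vm.Register("QOUT")
	fmt.Println(vm.GetSym(&_sym_demo_QOUT, "QOUT").Name) // QOUT

	// Third call hits the cached slot, so the lookup count stays at 2.
	vm.GetSym(&_sym_demo_QOUT, "QOUT")
	fmt.Println(vm.lookups.Load()) // 2
}
```

The point of not storing nil is visible in the first two calls: the early miss costs one extra lookup but does not poison the slot.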

Bugs fixed during bring-up
 - pendingSymFast == nil was ambiguous ("unused" vs "nil stored"). Nil
   syms now spill to the slice, preserving distinguishability.
 - The old varName-reuse branch at the PushSymbol emit site skipped
   the GetSym wrapper, emitting a raw `t.PushSymbol(varName)` against
   an uninitialized package-level *Symbol. Every call path now funnels
   through emitPushSymbol.
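
One way to picture the fast-slot fix is the sketch below. It is an assumption about the shape of the logic, not the code in hbrt/thread.go: the `pendingSyms` type and its `fast`/`spill` fields are hypothetical names. The fast slot only ever holds a non-nil symbol, so `fast == nil` can only mean "unused"; nil symbols, and anything pushed while the structure is non-empty, go to the slice.

```go
package main

import "fmt"

// Symbol is a simplified stand-in for the hbrt type (assumption).
type Symbol struct{ Name string }

// pendingSyms is a hypothetical LIFO of pending call symbols.
// fast is the bottom entry when set; spill holds everything above it.
type pendingSyms struct {
	fast  *Symbol // nil always means "fast slot unused"
	spill []*Symbol
}

// push stores sym. Only a non-nil symbol pushed onto an empty structure
// takes the fast slot; everything else is appended to the slice.
func (p *pendingSyms) push(sym *Symbol) {
	if sym != nil && p.fast == nil && len(p.spill) == 0 {
		p.fast = sym
		return
	}
	p.spill = append(p.spill, sym)
}

// pop removes and returns the most recently pushed entry.
// It returns nil when the structure is empty.
func (p *pendingSyms) pop() *Symbol {
	if n := len(p.spill); n > 0 {
		sym := p.spill[n-1]
		p.spill[n-1] = nil // drop the reference held by the backing array
		p.spill = p.spill[:n-1]
		return sym
	}
	sym := p.fast
	p.fast = nil
	return sym
}

func main() {
	a := &Symbol{Name: "A"}
	var p pendingSyms

	p.push(a)   // depth 1: fast slot
	p.push(nil) // stored nil: spills to the slice

	fmt.Println(p.pop() == nil) // true: the stored nil comes back first
	fmt.Println(p.pop().Name)   // A: the fast slot was not popped early
}
```

With a single scalar slot and no spill rule, the second pop could not tell whether the slot was empty or held a nil, which is the ambiguity the commit describes.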

bench_sql deltas vs prior build
 - B1  SELECT *          114 →  97 µs   (15%)
 - B4  GROUP_HAVING      584 → 554 µs   (5%)
 - B8  RECURSIVE CTE     150 → 141 µs   (6%)
 - B10 RANK PARTITION    310 → 296 µs   (5%)
 - B11 SUM OVER          335 → 320 µs   (4%)
 - B14 COUNT             295 → 281 µs   (5%)
 - B15 CTE+WIN+JOIN     1891 → 1826 µs  (3%)

Verification
 - go test ./...               ALL PASS
 - FiveSql2 test_sql1999       43/43
 - tests/compat_harbour        56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 20:41:48 +09:00

83 lines
2.5 KiB
Go

// Copyright (c) 2026 Charles KWON OhJun (charleskwonohjun@gmail.com)
// All rights reserved.

package hbrt

import "strings"

// pendingCall stores the symbol for the next Function/Do call.
// This avoids storing Go pointers in Value.data (which GC can't trace).

// PushSymbol records the function symbol for the next call.
// The actual symbol is stored in Thread, not on the eval stack.
// A marker NIL is pushed to keep stack positions correct.
// Harbour: hb_xvmPushSymbol
func (t *Thread) PushSymbol(sym *Symbol) {
	t.pushPendingSym(sym)
	t.push(MakeNil()) // placeholder for symbol position
}

// Function calls the function with nArgs arguments.
// Stack layout before: [sym_placeholder] [nil/self] [arg1] ... [argN]
// Stack after: [retval]
// Harbour: hb_xvmFunction
func (t *Thread) Function(nArgs int) {
	sym := t.popPendingSym()
	if sym == nil {
		panic(t.runtimeError("no function symbol for call"))
	}

	// Resolve function. The first call for an external/lazy symbol misses
	// sym.Func and walks the VM symbol table. Cache the resolved Func
	// back into the Symbol so subsequent calls skip the ToUpper +
	// RWMutex + map lookup. Symbols are shared and read-mostly, and
	// concurrent resolvers store the same Func pointer, so the
	// unsynchronized write is treated as benign here. Note that it is
	// still a data race under the Go memory model and the race detector
	// may report it.
	fn := sym.Func
	if fn == nil && t.vm != nil {
		found := t.vm.FindSymbol(strings.ToUpper(sym.Name))
		if found != nil {
			fn = found.Func
			sym.Func = fn
		}
	}
	if fn == nil {
		panic(t.runtimeError("undefined function: " + sym.Name))
	}

	// Stack at entry (bottom → top):
	// [sym placeholder] [self/NIL] [arg1] … [argN]
	// Frame() expects only [arg1..argN] on the eval stack so it can
	// copy them into the callee's locals. The old code achieved this
	// by popping the args into a temporary slice, popping the two
	// placeholders, then pushing the args back: N+2 pops, N pushes,
	// and a heap allocation per call.
	// Shift the args two slots left in place instead: one slice move,
	// zero heap.
	if nArgs > 0 {
		base := t.sp - nArgs - 2
		copy(t.stack[base:base+nArgs], t.stack[t.sp-nArgs:t.sp])
	}

	// The two slots freed at the top are reset to nil so the GC can
	// release any references they held (matches pop()'s clearing
	// semantics).
	t.stack[t.sp-1] = cachedNil
	t.stack[t.sp-2] = cachedNil
	t.sp -= 2

	// Set pending params count and symbol for Frame()
	t.pendingParams = nArgs
	t.pendingCallSym = sym

	// Call
	fn(t)

	// Push return value
	t.push(t.retVal)
}

// Do calls the function but discards the return value.
// Harbour: hb_xvmDo
func (t *Thread) Do(nArgs int) {
	t.Function(nArgs)
	t.pop() // discard return value
}