Senior-engineer / QA audit landed 13 silent-miscompile and data-
integrity fixes spanning the whole compiler+runtime+storage stack.
Each fix is paired with either an integration test in the suite or
a focused regression check; all 6 release gates stay green:
go test ./..., FiveSql2 43/43, Harbour compat 56/56, std.ch 17/17,
FRB 7/7, examples 65/71.
Compiler
--------
* genpc IF/ELSEIF jumpEnd2 patching (compiler/genpc/genpc.go).
Each ELSEIF branch's terminating jump was discarded via
`_ = jumpEnd2` and never patched, so its relative offset stayed 0
and the runtime walked the next ELSEIF's PcOpJumpFalse opcode as
if it were jump-offset data: bytecode-level corruption in pcode
mode. The jumps are now collected into a slice and patched at
end-of-IF. Verified via Grade(95..50) cases 11a-e added to
tests/frb/test_frb_pcode_sweep.
* countLocalsInStmts / scanBodyLocals missing bodies
(compiler/gengo/gen_util.go, compiler/gengo/gengo.go). Frame-size
counter skipped WATCH/TIMEOUT/PARALLEL FOR bodies, so a LOCAL
declared inside one of those constructs got a slot index past
the runtime's allocated count — silent NIL reads or out-of-range
stomps.
* emitMethodDeclStandalone nested LOCAL (compiler/gengo/gen_class.go).
Same bug class but on the *method* side. Pre-fix repro:

    METHOD Stomp(n) CLASS T
       LOCAL a := 1, b := 2
       IF n > 0
          LOCAL c := 30, d := 40, e := 50, f := 60
          Inner( n )
          IF c != 30 .OR. d != 40 .OR. e != 50 .OR. f != 60 ...

printed `c, d, e, f = 5, NIL, NIL, NIL` because Inner's frame
collided with Stomp's underallocated slot range. Now counts
body-nested LOCALs into the frame and pre-allocates indices via
scanBodyLocals.
* genpc unsupported-AST diagnostic surface (compiler/genpc/genpc.go,
hbrt/pcode.go, cmd/five/main.go, hbrtl/frb.go). The `default`
cases in emitStmt / emitExpr silently emitted PushNil / no-op
for nodes the pcode generator doesn't implement (ClassDecl,
MethodDecl, xBase commands, concurrency primitives, …). Added
`PcodeModule.Warnings []string` populated by noteUnsupported,
surfaced on stderr from the build pipeline. Users now see
"pcode: AST node not supported in --pcode/FRB-pcode mode: stmt
*ast.GoBlockStmt" instead of getting a silently broken module.
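
The IF/ELSEIF fix above is a form of forward-jump backpatching. A minimal sketch of that idea follows, assuming the offset convention the interpreter uses for PcOpJump (a 4-byte little-endian offset measured from the byte after the operand). The `emitter` type, the `opJump` value, and the helper names are hypothetical and are not taken from genpc.go.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// opJump is a hypothetical opcode value used only for this illustration.
const opJump byte = 0x30

// emitter accumulates bytecode and remembers unpatched forward jumps.
type emitter struct {
	code     []byte
	endJumps []int // operand positions of jumps that must land at end-of-IF
}

// emitJumpPlaceholder writes opJump followed by a zero 4-byte offset and
// returns the position of that operand so it can be patched later.
func (e *emitter) emitJumpPlaceholder() int {
	e.code = append(e.code, opJump, 0, 0, 0, 0)
	return len(e.code) - 4
}

// patchTo rewrites the operand at pos so the jump lands on target.
// The offset is relative to the byte after the 4-byte operand.
func (e *emitter) patchTo(pos, target int) {
	binary.LittleEndian.PutUint32(e.code[pos:], uint32(int32(target-(pos+4))))
}

// emitBranches emits n branch bodies, each ending with a jump to end-of-IF,
// then patches every collected jump once the end position is known.
func emitBranches(n int) *emitter {
	e := &emitter{}
	for i := 0; i < n; i++ {
		e.code = append(e.code, 0x00) // one-byte stand-in for the branch body
		e.endJumps = append(e.endJumps, e.emitJumpPlaceholder())
	}
	end := len(e.code)
	for _, pos := range e.endJumps {
		e.patchTo(pos, end)
	}
	return e
}

// jumpTargets decodes each recorded jump and returns where it lands.
func jumpTargets(e *emitter) []int {
	out := make([]int, 0, len(e.endJumps))
	for _, pos := range e.endJumps {
		off := int32(binary.LittleEndian.Uint32(e.code[pos:]))
		out = append(out, pos+4+int(off))
	}
	return out
}

func main() {
	e := emitBranches(3)
	fmt.Println(len(e.code), jumpTargets(e))
}
```

Under this convention an unpatched placeholder of 0 falls straight through into the next instruction, which is why keeping only the last jump position (or none) is not enough for a chain of branches.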
Runtime
-------
* class.go Send/tryBinaryOp t.self defer-restore (hbrt/class.go).
Restoration was a plain `t.self = oldSelf` after `fn(t)`. Any
panic in the method body skipped the line, so the next BEGIN
SEQUENCE / RECOVER handler ran with the THROWING object's Self
— `::field` resolved against the wrong receiver. Wrapped both
restore sites in `defer func() { t.self = oldSelf }()`.
Verified: pre-fix RECOVER saw "THROWER", post-fix "OUTER".
* hbfunc.go HB_FUNC parameter Frame() (hbrt/hbfunc.go). The
RegisterDynamicFunc wrapper called `fn(ctx)` without ever
calling Frame, so `ctx.ParC(1)` / `ctx.Local(n)` read through
`t.curFrame.localBase + n - 1` against the *caller's* frame.
Every #pragma BEGINDUMP HB_FUNC taking parameters silently
returned "" / 0 / "" for them — masked by ParNIDef-style
defaults. Wrapper now does `t.Frame(t.pendingParams, 0); defer
t.EndProc()` before dispatch.
* pcode codeblock closure capture (hbrt/pcinterp.go, hbrt/pcode.go,
hbrt/thread.go, compiler/genpc/genpc.go). PcOpPushBlock recorded
`nDetached` but never copied enclosing locals; free vars in the
block body fell through to memvar lookup → NIL. Wired full
capture pipeline:
- New opcodes PcOpPushDetached (0x59) / PcOpPopDetached (0x5A).
- PushBlock now reads per-slot source-local indices and
snapshots into bb.Detached at construction time.
- New detachedMap in genpc auto-promotes any free var that
resolves to an enclosing-frame local into a capture slot.
- emitAssignAsExpr leaves the assigned value on the eval stack
so SeqExpr items like `{|v| acc += v, acc }` work.
- Thread tracks curBlock with paired Set/restore in the block's
Fn wrapper for nested-block evaluation.
Mutating capture (acc += v across successive Evals) now works.
* vm.NewThread statics + waFactory propagation (hbrt/vm.go).
GoLaunch / GoLaunchBlock call NewThread directly. Previously
the statics map and WA factory were applied only in Run(), so
goroutine-spawned PRG code panicked on STATIC access ("static
index out of range") and crashed dereferencing nil WA on any
DB call. Both now happen inside NewThread under the same lock
as TID assignment.
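
The t.self fix above relies on Go running deferred calls while a panic unwinds. The sketch below contrasts the two restore styles, assuming a hypothetical `thread` type with a string receiver slot. It is not the code in hbrt/class.go, and the names `sendPlain`, `sendDeferred`, and `selfAfterPanic` are illustrative.

```go
package main

import "fmt"

// thread is a hypothetical stand-in for the runtime Thread; only the
// receiver slot matters here.
type thread struct{ self string }

// sendPlain restores self with a plain assignment, which a panic skips.
func sendPlain(t *thread, recv string, fn func(*thread)) {
	old := t.self
	t.self = recv
	fn(t)
	t.self = old
}

// sendDeferred restores self in a deferred call, which still runs while
// a panic unwinds through this frame.
func sendDeferred(t *thread, recv string, fn func(*thread)) {
	old := t.self
	t.self = recv
	defer func() { t.self = old }()
	fn(t)
}

// selfAfterPanic calls send with a method that panics and reports which
// receiver a recovering handler would observe afterwards.
func selfAfterPanic(send func(*thread, string, func(*thread))) (seen string) {
	t := &thread{self: "OUTER"}
	defer func() {
		recover()
		seen = t.self
	}()
	send(t, "THROWER", func(*thread) { panic("boom") })
	return t.self
}

func main() {
	fmt.Println(selfAfterPanic(sendPlain))    // THROWER
	fmt.Println(selfAfterPanic(sendDeferred)) // OUTER
}
```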
Data layer
----------
* dbf concurrent Append lock (hbrdd/dbf/dbf.go,
hbrdd/dbf/locks_posix.go, hbrdd/dbf/locks_windows.go). Append
bumped a local recCount with no file-system serialization. Two
shared-mode processes both wrote at the same RecordOffset; one
record silently overwrote the other. Added an append-intent
byte-range lock at offset 0x7FFFFFFE + bounded retry, on-disk
header refresh inside the locked region, and immediate header
write so peers refresh past our slot.
* indexer negative numeric key encoding (hbrdd/dbf/indexer.go +
new hbrdd/dbf/encode_numeric_test.go). `%20.10f` formats `-100`
as `" -100.0000000000"` and `99` as `" 99.0000000000"`.
ASCII ' ' (0x20) < '-' (0x2D), so `99` lex-compared LESS than
`-100` — every NTX/CDX index over a column that ever held a
negative number returned wrong rows for SEEK / range scans.
Replaced with a 1-byte sign prefix + 21-byte zero-padded
magnitude (negatives use digit-complement) so byte order
matches numeric order across signs and magnitudes. Format
change: existing indexes built with the old encoding must be
REINDEXed. Three unit tests pin the order.
* dbf Append index maintenance hooks (hbrdd/dbf/dbf.go,
hbrdd/dbf/indexer.go). Append never inserted into open NTX/CDX
indexes — the audit's canonical scenario `SET INDEX TO …;
APPEND BLANK; REPLACE …; dbSeek …` silently missed the new
record. Added optional IndexWriter interface, queue the new
recNo in pendingIdxInserts, drain after flushRecord by calling
InsertKey on every open writer-supporting engine. NTX
participates (its existing rebuild-on-insert is correct);
CDX online maintenance is deferred to a follow-up — those
indexes still need REINDEX. Verified: post-fix SEEK("Charlie")
after APPEND BLANK + REPLACE finds the new record.
* dbf PACK crash-safety (hbrdd/dbf/dbf.go). The old in-place
rewrite read record N, overwrote slot M<N, then truncated.
Power loss after partial loop left a file with overwritten
prefix and no original copies of the records already advanced
past — silent data loss. Rewrote to:
1) drop mmap, build `<file>.pack.tmp` with all surviving
records,
2) Sync(),
3) close original handle + os.Rename(tmp, orig) (atomic on
same FS),
4) reopen + re-mmap.
TestComp_Pack passes; readers always see either the pre-PACK
or post-PACK contents, never a half-state.
* mem RDD torn reads (hbrdd/mem/memrdd.go). The comment claimed
in-place PutValue was safe because hbrt.Value "fits in a
single machine word + pointer". hbrt.Value is 24 bytes (3
words) — a concurrent reader could observe new type tag with
stale scalar/ptr and type-confuse on the next AsXxx() call.
Switched mu to sync.RWMutex; GetValue takes RLock,
Append/PutValue/Delete/Recall take Lock. `go test -race
./hbrdd/mem/` clean.
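
The negative-key ordering problem and one way to avoid it can be sketched as follows. `oldKey` uses the `%20.10f` format quoted above. `newKey` is only an illustration of a sign prefix plus zero-padded, digit-complemented magnitude: the prefix bytes ('0' for negative, '1' otherwise) and the `%021.10f` layout are assumptions, not the encoding in hbrdd/dbf/indexer.go, and the sketch only handles magnitudes below 1e10.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// oldKey mirrors the space-padded format described in the commit message.
func oldKey(v float64) string { return fmt.Sprintf("%20.10f", v) }

// newKey is an illustrative order-preserving encoding: one sign byte
// followed by a 21-byte zero-padded magnitude. Negative values have each
// digit replaced by its nines' complement so larger magnitudes sort first.
// The sign bytes and field width are assumptions made for this sketch.
func newKey(v float64) string {
	mag := []byte(fmt.Sprintf("%021.10f", math.Abs(v)))
	if v >= 0 {
		return "1" + string(mag)
	}
	for i, c := range mag {
		if c >= '0' && c <= '9' {
			mag[i] = '9' - (c - '0')
		}
	}
	return "0" + string(mag)
}

// sortedBy returns a copy of vals ordered by the byte-wise order of their keys.
func sortedBy(key func(float64) string, vals []float64) []float64 {
	out := append([]float64(nil), vals...)
	sort.Slice(out, func(i, j int) bool { return key(out[i]) < key(out[j]) })
	return out
}

func main() {
	vals := []float64{99, -100, 0, -0.5, 1234.25, -7}
	fmt.Println(oldKey(99) < oldKey(-100)) // true: the old format misorders them
	fmt.Println(sortedBy(newKey, vals))
}
```

With the illustrative encoding, byte-wise comparison of the keys agrees with numeric comparison for the sample values, which is the property an index needs for SEEK and range scans.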
Files touched
-------------
compiler/gengo/gen_class.go, gen_util.go, gengo.go
compiler/genpc/genpc.go
hbrt/class.go, hbfunc.go, pcinterp.go, pcode.go, thread.go, vm.go
hbrdd/dbf/dbf.go, indexer.go, locks_posix.go, locks_windows.go
hbrdd/dbf/encode_numeric_test.go (new)
hbrdd/mem/memrdd.go
cmd/five/main.go
hbrtl/frb.go
tests/frb/test_frb_pcode_sweep.prg
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
385 lines
9.9 KiB
Go
// Copyright (c) 2026 Charles KWON OhJun (charleskwonohjun@gmail.com)
// All rights reserved.

// Five pcode interpreter — executes pcode bytecode on a Thread.
// Each opcode directly calls the corresponding Thread method,
// so pcode execution is semantically identical to gengo-compiled code.

package hbrt

import (
	"encoding/binary"
	"fmt"
	"math"
)

// ExecPcode runs a pcode function on the given thread.
// Full variant — installs a defer/recover so panics from inside the
// pcode body (HbError, BreakValue, user Break) are re-panicked with
// proper frame unwinding. Used for general-purpose pcode evaluation.
func ExecPcode(t *Thread, fn *PcodeFunc, mod *PcodeModule) {
	t.Frame(fn.Params, fn.Locals)
	defer t.EndProc()
	execPcodeBody(t, fn, mod)
}

// ExecPcodeFast is a hot-path variant for short, pure expressions
// (FiveSql2 WHERE predicates, inline lambdas) where the caller has
// already guaranteed that the body will not panic with HbError /
// BreakValue. Skips the defer+recover dance in EndProc, saving ~15ns
// per call × tens of thousands of rows in scan loops.
//
// Contract: caller is responsible for panic discipline. If the pcode
// body panics, the frame stack is still cleaned up (EndProcFast) but
// no diagnostic is logged and SEQUENCE/RECOVER will not see the panic.
func ExecPcodeFast(t *Thread, fn *PcodeFunc, mod *PcodeModule) {
	t.Frame(fn.Params, fn.Locals)
	execPcodeBody(t, fn, mod)
	t.EndProcFast()
}

// execPcodeBody is the shared opcode dispatch loop.
func execPcodeBody(t *Thread, fn *PcodeFunc, mod *PcodeModule) {
	code := fn.Code
	pc := 0 // program counter

	for pc < len(code) {
		op := code[pc]
		pc++

		switch op {
		case PcOpNop:
			// do nothing

		// --- Stack ---
		case PcOpPushNil:
			t.PushNil()
		case PcOpPushTrue:
			t.PushBool(true)
		case PcOpPushFalse:
			t.PushBool(false)
		case PcOpPushInt:
			v := int64(binary.LittleEndian.Uint64(code[pc:]))
			pc += 8
			t.PushLong(v)
		case PcOpPushDouble:
			bits := binary.LittleEndian.Uint64(code[pc:])
			pc += 8
			t.PushDouble(math.Float64frombits(bits), 0, 0)
		case PcOpPushString:
			slen := int(binary.LittleEndian.Uint16(code[pc:]))
			pc += 2
			t.PushString(string(code[pc : pc+slen]))
			pc += slen
		case PcOpPushBool:
			t.PushBool(code[pc] != 0)
			pc++
		case PcOpPushLocal:
			idx := int(binary.LittleEndian.Uint16(code[pc:]))
			pc += 2
			t.PushLocal(idx)
		case PcOpPushMemvar:
			slen := int(binary.LittleEndian.Uint16(code[pc:]))
			pc += 2
			name := string(code[pc : pc+slen])
			pc += slen
			// Resolve through Memvars (PRIVATE shadows PUBLIC).
			// Unknown names push NIL — matches Harbour behavior for
			// undeclared memvars inside `&(expr)`.
			if t.Memvars != nil {
				if v, ok := t.Memvars.Get(name); ok {
					t.push(v)
					continue
				}
			}
			t.PushNil()
		case PcOpPopLocal:
			idx := int(binary.LittleEndian.Uint16(code[pc:]))
			pc += 2
			t.PopLocal(idx)
		case PcOpPop:
			t.Pop()
		case PcOpDup:
			t.Dup()

		// --- Arithmetic ---
		case PcOpPlus:
			t.Plus()
		case PcOpMinus:
			t.Minus()
		case PcOpMult:
			t.Mult()
		case PcOpDivide:
			t.Divide()
		case PcOpMod:
			t.Modulus()
		case PcOpPower:
			t.Power()
		case PcOpNegate:
			t.Negate()

		// --- Comparison ---
		case PcOpEqual:
			t.Equal()
		case PcOpNotEqual:
			t.NotEqual()
		case PcOpLess:
			t.Less()
		case PcOpGreater:
			t.Greater()
		case PcOpLessEq:
			t.LessEqual()
		case PcOpGreaterEq:
			t.GreaterEqual()
		case PcOpInString:
			t.InString()

		// --- Logical ---
		case PcOpAnd:
			t.And()
		case PcOpOr:
			t.Or()
		case PcOpNot:
			t.Not()

		// --- Flow control ---
		case PcOpJump:
			offset := int32(binary.LittleEndian.Uint32(code[pc:]))
			pc += 4
			pc += int(offset)
		case PcOpJumpFalse:
			offset := int32(binary.LittleEndian.Uint32(code[pc:]))
			pc += 4
			if !t.PopLogical() {
				pc += int(offset)
			}
		case PcOpJumpTrue:
			offset := int32(binary.LittleEndian.Uint32(code[pc:]))
			pc += 4
			if t.PopLogical() {
				pc += int(offset)
			}
		case PcOpReturn:
			return
		case PcOpRetValue:
			t.RetValue()
			return

		// --- Frame ---
		case PcOpFrame:
			// Already called at function entry; skip if re-encountered
			pc += 4 // params + locals
		case PcOpEndProc:
			return

		// --- Workarea field access (peephole for FieldGet(literal)) ---
		case PcOpFieldGet:
			fIdx := int(binary.LittleEndian.Uint16(code[pc:]))
			pc += 2
			// Hot path — SqlScan plugs a direct field getter closure into
			// t.FastFieldGetter before running the predicate, so we skip
			// PushSymbol + Function dispatch + FieldGet RTL's own Frame.
			if fg := t.FastFieldGetter; fg != nil {
				t.PushValue(fg(fIdx))
			} else {
				// Generic fallback: resolve through RTL symbol table
				t.PushSymbol(t.VM().FindSymbol("FIELDGET"))
				t.PushNil()
				t.PushLong(int64(fIdx))
				t.Function(1)
			}

		// --- AllTrim(FieldGet(n)) fused peephole ---
		case PcOpFieldTrim:
			fIdx := int(binary.LittleEndian.Uint16(code[pc:]))
			pc += 2
			// Fast path: use direct field getter, trim inline.
			var v Value
			if fg := t.FastFieldGetter; fg != nil {
				v = fg(fIdx)
			} else {
				// Fallback: resolve via FIELDGET RTL
				t.PushSymbol(t.VM().FindSymbol("FIELDGET"))
				t.PushNil()
				t.PushLong(int64(fIdx))
				t.Function(1)
				v = t.Pop2()
			}
			if v.IsString() {
				s := v.AsString()
				// ASCII-space trim — DBF CHAR fields pad with 0x20 only
				lo, hi := 0, len(s)
				for lo < hi && s[lo] == ' ' {
					lo++
				}
				for hi > lo && s[hi-1] == ' ' {
					hi--
				}
				if lo == 0 && hi == len(s) {
					t.PushValue(v)
				} else {
					t.PushString(s[lo:hi])
				}
			} else {
				t.PushValue(v)
			}

		// --- Function calls ---
		case PcOpPushSymbol:
			slen := int(binary.LittleEndian.Uint16(code[pc:]))
			pc += 2
			name := string(code[pc : pc+slen])
			pc += slen
			sym := t.VM().FindSymbol(name)
			t.PushSymbol(sym)
		case PcOpPushNilArg:
			t.PushNil()
		case PcOpFunction:
			nArgs := int(binary.LittleEndian.Uint16(code[pc:]))
			pc += 2
			t.Function(nArgs)
		case PcOpDo:
			nArgs := int(binary.LittleEndian.Uint16(code[pc:]))
			pc += 2
			t.Do(nArgs)

		// --- Self / OOP ---
		case PcOpPushSelf:
			t.PushSelf()
		case PcOpPushSelfField:
			slen := int(binary.LittleEndian.Uint16(code[pc:]))
			pc += 2
			name := string(code[pc : pc+slen])
			pc += slen
			t.PushSelfField(name)
		case PcOpSetSelfField:
			slen := int(binary.LittleEndian.Uint16(code[pc:]))
			pc += 2
			name := string(code[pc : pc+slen])
			pc += slen
			t.SetSelfField(name)
		case PcOpSend:
			slen := int(binary.LittleEndian.Uint16(code[pc:]))
			pc += 2
			name := string(code[pc : pc+slen])
			pc += slen
			nArgs := int(binary.LittleEndian.Uint16(code[pc:]))
			pc += 2
			t.Send(name, nArgs)

		// --- Array ---
		case PcOpArrayGen:
			count := int(binary.LittleEndian.Uint16(code[pc:]))
			pc += 2
			t.ArrayGen(count)
		case PcOpArrayPush:
			t.ArrayPush()
		case PcOpArrayPop:
			t.ArrayPop()

		// --- Hash --- (PcOpHashGen has been declared since the
		// initial pcode design but its dispatch case was missing,
		// so any pcode body that built a hash literal panicked
		// with "unknown pcode opcode: 0x51".)
		case PcOpHashGen:
			count := int(binary.LittleEndian.Uint16(code[pc:]))
			pc += 2
			t.HashGen(count)

		// --- Block ---
		case PcOpPushBlock:
			codeLen := int(binary.LittleEndian.Uint32(code[pc:]))
			pc += 4
			blockCode := make([]byte, codeLen)
			copy(blockCode, code[pc:pc+codeLen])
			pc += codeLen
			nParams := int(binary.LittleEndian.Uint16(code[pc:]))
			pc += 2
			nDetached := int(binary.LittleEndian.Uint16(code[pc:]))
			pc += 2

			// Snapshot closure-captured locals from the *current*
			// frame into the new block's Detached slice. The body
			// reads/writes them via PcOpPushDetached / PcOpPopDetached
			// at the indices the compiler reserved. Without this,
			// `{|x| x + outer }` saw `outer` as NIL because the
			// block fn ran with its own frame and the body's lookup
			// for `outer` fell through to the memvar table.
			captured := make([]Value, nDetached)
			for i := 0; i < nDetached; i++ {
				srcIdx := int(binary.LittleEndian.Uint16(code[pc:]))
				pc += 2
				captured[i] = t.Local(srcIdx)
			}

			// Create a Go function that interprets the block's pcode.
			// Params count must be threaded through so ExecPcode's
			// Frame() pulls Eval()'s args off the stack into the
			// block's locals — without it, `{|x| x*x }` saw x=NIL
			// and `x * x` panicked on the multiplication.
			blockFn := &PcodeFunc{Code: blockCode, Params: nParams}
			modCopy := mod
			blockVal := MakeBlock(nil, nDetached) // Fn patched below
			bb := (*HbBlock)(blockVal.ptr)
			if nDetached > 0 {
				copy(bb.Detached, captured)
			}
			bb.Fn = func(t2 *Thread) {
				// Install this block as the currently-executing
				// block so PcOpPushDetached / PcOpPopDetached can
				// resolve their slots. Restore the previous one on
				// exit so nested-block evaluation (`{|| eval(b2) }`)
				// pops back to the outer block.
				prev := t2.CurBlock()
				t2.SetCurBlock(bb)
				defer t2.SetCurBlock(prev)
				ExecPcode(t2, blockFn, modCopy)
			}
			t.PushValue(blockVal)

		case PcOpPushDetached:
			slot := int(binary.LittleEndian.Uint16(code[pc:]))
			pc += 2
			bb := t.CurBlock()
			if bb != nil && slot < len(bb.Detached) {
				t.PushValue(bb.Detached[slot])
			} else {
				t.PushNil()
			}

		case PcOpPopDetached:
			slot := int(binary.LittleEndian.Uint16(code[pc:]))
			pc += 2
			val := t.pop()
			bb := t.CurBlock()
			if bb != nil && slot < len(bb.Detached) {
				bb.Detached[slot] = val
			}

		// --- Local ops ---
		case PcOpLocalAddInt:
			idx := int(binary.LittleEndian.Uint16(code[pc:]))
			pc += 2
			val := int32(binary.LittleEndian.Uint32(code[pc:]))
			pc += 4
			t.LocalAddInt(idx, int64(val))
		case PcOpInc:
			t.Inc()
		case PcOpDec:
			t.Dec()

		case PcOpPopLogical:
			t.PopLogical()

		case PcOpLine:
			pc += 2 // skip line number (for debugging)

		case PcOpHalt:
			return

		default:
			panic(fmt.Sprintf("unknown pcode opcode: 0x%02X at pc=%d", op, pc-1))
		}
	}
}