five/compiler/genpc/genpc.go
CharlesKWON cde86730b8 fix(compiler,hbrt,hbrdd,cli): pre-1.0 audit — 13 critical fixes
Senior-engineer / QA audit landed 13 silent-miscompile and data-
integrity fixes spanning the whole compiler+runtime+storage stack.
Each fix is paired with either an integration test in the suite or
a focused regression check; all 6 release gates stay green:
go test ./..., FiveSql2 43/43, Harbour compat 56/56, std.ch 17/17,
FRB 7/7, examples 65/71.

Compiler
--------

* genpc IF/ELSEIF jumpEnd2 patching (compiler/genpc/genpc.go).
  Per-ELSEIF branch terminators were stashed into `_ = jumpEnd2`
  and never patched — the relative offset stayed 0 and the runtime
  walked the next ELSEIF's PcOpJumpFalse opcode as if it were
  jump-offset data. Bytecode-level corruption in pcode mode. Now
  collected into a slice and patched at end-of-IF. Verified via
  Grade(95..50) cases 11a-e added to tests/frb/test_frb_pcode_sweep.

* countLocalsInStmts / scanBodyLocals missing bodies
  (compiler/gengo/gen_util.go, compiler/gengo/gengo.go). Frame-size
  counter skipped WATCH/TIMEOUT/PARALLEL FOR bodies, so a LOCAL
  declared inside one of those constructs got a slot index past
  the runtime's allocated count — silent NIL reads or out-of-range
  stomps.

* emitMethodDeclStandalone nested LOCAL (compiler/gengo/gen_class.go).
  Same bug class but on the *method* side. Pre-fix repro:

      METHOD Stomp(n) CLASS T
         LOCAL a := 1, b := 2
         IF n > 0
            LOCAL c := 30, d := 40, e := 50, f := 60
            Inner( n )
            IF c != 30 .OR. d != 40 .OR. e != 50 .OR. f != 60 ...

  printed `c, d, e, f = 5, NIL, NIL, NIL` because Inner's frame
  collided with Stomp's underallocated slot range. Now counts
  body-nested LOCALs into the frame and pre-allocates indices via
  scanBodyLocals.

* genpc unsupported-AST diagnostic surface (compiler/genpc/genpc.go,
  hbrt/pcode.go, cmd/five/main.go, hbrtl/frb.go). The `default`
  cases in emitStmt / emitExpr silently emitted PushNil / no-op
  for nodes the pcode generator doesn't implement (ClassDecl,
  MethodDecl, xBase commands, concurrency primitives, …). Added
  `PcodeModule.Warnings []string` populated by noteUnsupported,
  surfaced on stderr from the build pipeline. Users now see
  "pcode: AST node not supported in --pcode/FRB-pcode mode: stmt
  *ast.GoBlockStmt" instead of getting a silently broken module.
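
The jumpEnd2 fix in the first bullet follows a placeholder-then-patch pattern that can be sketched in isolation. This is a minimal sketch, not the genpc source: the opcode values, the `emitter` type and the `emitChain` helper are hypothetical. Only the 4-byte little-endian offset, relative to the byte after the offset, mirrors what the commit describes.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Hypothetical opcode values, for illustration only.
const (
	opJump      byte = 0x01
	opJumpFalse byte = 0x02
	opNop       byte = 0x03
)

type emitter struct{ code []byte }

// placeholder emits op followed by a zeroed 4-byte offset and returns
// the position of the offset so it can be patched later.
func (e *emitter) placeholder(op byte) int {
	e.code = append(e.code, op)
	pos := len(e.code)
	e.code = append(e.code, 0, 0, 0, 0)
	return pos
}

// patch points the jump at pos to the current end of code. The offset
// is relative to the byte after the 4 offset bytes.
func (e *emitter) patch(pos int) {
	off := int32(len(e.code) - pos - 4)
	binary.LittleEndian.PutUint32(e.code[pos:], uint32(off))
}

// target decodes the absolute address a patched jump at pos lands on.
func (e *emitter) target(pos int) int {
	off := int32(binary.LittleEndian.Uint32(e.code[pos:]))
	return pos + 4 + int(off)
}

// emitChain emits n branches (IF plus n-1 ELSEIFs), each with a
// one-byte body, and returns the position of every end-of-branch jump.
func emitChain(e *emitter, n int) []int {
	var jumpEnds []int
	for i := 0; i < n; i++ {
		next := e.placeholder(opJumpFalse) // skip this branch if false
		e.code = append(e.code, opNop)     // branch body
		jumpEnds = append(jumpEnds, e.placeholder(opJump))
		e.patch(next) // false: continue at the next condition
	}
	for _, j := range jumpEnds { // patch every terminator at end-of-IF
		e.patch(j)
	}
	return jumpEnds
}

func main() {
	e := &emitter{}
	for _, pos := range emitChain(e, 3) {
		fmt.Println(e.target(pos) == len(e.code)) // true for every branch
	}
}
```

Dropping any position from `jumpEnds` leaves that jump's offset at 0, which is the pre-fix behavior described above.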

Runtime
-------

* class.go Send/tryBinaryOp t.self defer-restore (hbrt/class.go).
  Restoration was a plain `t.self = oldSelf` after `fn(t)`. Any
  panic in the method body skipped the line, so the next BEGIN
  SEQUENCE / RECOVER handler ran with the THROWING object's Self
  — `::field` resolved against the wrong receiver. Wrapped both
  restore sites in `defer func() { t.self = oldSelf }()`.
  Verified: pre-fix RECOVER saw "THROWER", post-fix "OUTER".

* hbfunc.go HB_FUNC parameter Frame() (hbrt/hbfunc.go). The
  RegisterDynamicFunc wrapper called `fn(ctx)` without ever
  calling Frame, so `ctx.ParC(1)` / `ctx.Local(n)` read through
  `t.curFrame.localBase + n - 1` against the *caller's* frame.
  Every #pragma BEGINDUMP HB_FUNC taking parameters silently
  returned "" / 0 / "" for them — masked by ParNIDef-style
  defaults. Wrapper now does `t.Frame(t.pendingParams, 0); defer
  t.EndProc()` before dispatch.

* pcode codeblock closure capture (hbrt/pcinterp.go, hbrt/pcode.go,
  hbrt/thread.go, compiler/genpc/genpc.go). PcOpPushBlock recorded
  `nDetached` but never copied enclosing locals; free vars in the
  block body fell through to memvar lookup → NIL. Wired full
  capture pipeline:
  - New opcodes PcOpPushDetached (0x59) / PcOpPopDetached (0x5A).
  - PushBlock now reads per-slot source-local indices and
    snapshots into bb.Detached at construction time.
  - New detachedMap in genpc auto-promotes any free var that
    resolves to an enclosing-frame local into a capture slot.
  - emitAssignAsExpr leaves the assigned value on the eval stack
    so SeqExpr items like `{|v| acc += v, acc }` work.
  - Thread tracks curBlock with paired Set/restore in the block's
    Fn wrapper for nested-block evaluation.
  Mutating capture (acc += v across successive Evals) now works.

* vm.NewThread statics + waFactory propagation (hbrt/vm.go).
  GoLaunch / GoLaunchBlock call NewThread directly. Previously
  the statics map and WA factory were applied only in Run(), so
  goroutine-spawned PRG code panicked on STATIC access ("static
  index out of range") and crashed dereferencing nil WA on any
  DB call. Both now happen inside NewThread under the same lock
  as TID assignment.
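
The t.self defer-restore fix in the first Runtime bullet can be illustrated with a small stand-alone program. This is a sketch under assumptions: `thread`, `sendPlain`, `sendDeferred` and `selfAfterRecover` are hypothetical names, and a string stands in for the receiver object. It shows only why a plain assignment after the call is skipped by a panic while a deferred restore is not.

```go
package main

import "fmt"

// thread is a hypothetical stand-in for the runtime's per-thread state.
type thread struct{ self string }

// sendPlain restores self with a plain assignment, which a panic skips.
func sendPlain(t *thread, recv string, fn func(*thread)) {
	old := t.self
	t.self = recv
	fn(t)
	t.self = old
}

// sendDeferred restores self in a deferred call, which also runs while
// a panic unwinds the stack.
func sendDeferred(t *thread, recv string, fn func(*thread)) {
	old := t.self
	t.self = recv
	defer func() { t.self = old }()
	fn(t)
}

// selfAfterRecover calls a panicking method through send and reports
// which receiver the recovery handler observes.
func selfAfterRecover(send func(*thread, string, func(*thread))) (seen string) {
	t := &thread{self: "OUTER"}
	defer func() {
		recover()
		seen = t.self
	}()
	send(t, "THROWER", func(*thread) { panic("boom") })
	return t.self
}

func main() {
	fmt.Println(selfAfterRecover(sendPlain))    // THROWER
	fmt.Println(selfAfterRecover(sendDeferred)) // OUTER
}
```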

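The snapshot-at-construction capture from the codeblock bullet above can be sketched as follows. Everything here is hypothetical (`block`, `pushBlock`, integer-only values); it only illustrates copying enclosing locals into per-block slots by 1-based source index and mutating the slot across successive evaluations. Whether the real runtime also writes back to the enclosing local is not stated in this commit, and this sketch does not.

```go
package main

import "fmt"

// block is a hypothetical stand-in for a runtime code block: detached
// holds the values captured when the block was built.
type block struct {
	detached []int
	body     func(b *block, arg int) int
}

// pushBlock snapshots the enclosing locals named by src (1-based
// indices) into the block's detached slots, in capture order.
func pushBlock(locals []int, src []int, body func(*block, int) int) *block {
	b := &block{detached: make([]int, len(src)), body: body}
	for slot, idx := range src {
		b.detached[slot] = locals[idx-1]
	}
	return b
}

// eval runs the block body with one argument.
func (b *block) eval(arg int) int { return b.body(b, arg) }

func main() {
	locals := []int{7, 100} // local 1 = 7, local 2 (acc) = 100

	// Rough equivalent of {|v| acc += v, acc } with acc in slot 0.
	b := pushBlock(locals, []int{2}, func(b *block, v int) int {
		b.detached[0] += v
		return b.detached[0]
	})

	fmt.Println(b.eval(1), b.eval(2)) // 101 103
}
```
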
Data layer
----------

* dbf concurrent Append lock (hbrdd/dbf/dbf.go,
  hbrdd/dbf/locks_posix.go, hbrdd/dbf/locks_windows.go). Append
  bumped a local recCount with no file-system serialization. Two
  shared-mode processes both wrote at the same RecordOffset; one
  record silently overwrote the other. Added an append-intent
  byte-range lock at offset 0x7FFFFFFE + bounded retry, on-disk
  header refresh inside the locked region, and immediate header
  write so peers refresh past our slot.

* indexer negative numeric key encoding (hbrdd/dbf/indexer.go +
  new hbrdd/dbf/encode_numeric_test.go). `%20.10f` formats `-100`
  as `"     -100.0000000000"` and `99` as `"       99.0000000000"`.
  ASCII ' ' (0x20) < '-' (0x2D), so `99` lex-compared LESS than
  `-100` — every NTX/CDX index over a column that ever held a
  negative number returned wrong rows for SEEK / range scans.
  Replaced with a 1-byte sign prefix + 21-byte zero-padded
  magnitude (negatives use digit-complement) so byte order
  matches numeric order across signs and magnitudes. Format
  change: existing indexes built with the old encoding must be
  REINDEXed. Three unit tests pin the order.

* dbf Append index maintenance hooks (hbrdd/dbf/dbf.go,
  hbrdd/dbf/indexer.go). Append never inserted into open NTX/CDX
  indexes — the audit's canonical scenario `SET INDEX TO …;
  APPEND BLANK; REPLACE …; dbSeek …` silently missed the new
  record. Added optional IndexWriter interface, queue the new
  recNo in pendingIdxInserts, drain after flushRecord by calling
  InsertKey on every open writer-supporting engine. NTX
  participates (its existing rebuild-on-insert is correct);
  CDX online maintenance is deferred to a follow-up — those
  indexes still need REINDEX. Verified: post-fix SEEK("Charlie")
  after APPEND BLANK + REPLACE finds the new record.

* dbf PACK crash-safety (hbrdd/dbf/dbf.go). The old in-place
  rewrite read record N, overwrote slot M<N, then truncated.
  Power loss after partial loop left a file with overwritten
  prefix and no original copies of the records already advanced
  past — silent data loss. Rewrote to:
    1) drop mmap, build `<file>.pack.tmp` with all surviving
       records,
    2) Sync(),
    3) close original handle + os.Rename(tmp, orig) (atomic on
       same FS),
    4) reopen + re-mmap.
  TestComp_Pack passes; readers always see either the pre-PACK
  or post-PACK contents, never a half-state.

* mem RDD torn reads (hbrdd/mem/memrdd.go). The comment claimed
  in-place PutValue was safe because hbrt.Value "fits in a
  single machine word + pointer". hbrt.Value is 24 bytes (3
  words) — a concurrent reader could observe new type tag with
  stale scalar/ptr and type-confuse on the next AsXxx() call.
  Switched mu to sync.RWMutex; GetValue takes RLock,
  Append/PutValue/Delete/Recall take Lock. `go test -race
  ./hbrdd/mem/` clean.
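
The ordering problem in the indexer bullet, and one way a sign-prefixed key avoids it, can be sketched as below. `oldKey` reproduces the `%20.10f` format named above. `newKey` is a hypothetical layout chosen for illustration (sign byte '0' or '1', then a 21-byte zero-padded magnitude, digits complemented for negatives); the actual byte layout in hbrdd/dbf/indexer.go may differ.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// oldKey reproduces the space-padded format from the bug report.
func oldKey(v float64) string { return fmt.Sprintf("%20.10f", v) }

// newKey is a hypothetical order-preserving layout: a sign byte ('0'
// for negative, '1' for non-negative) followed by the magnitude
// zero-padded to 21 bytes. For negatives each digit d becomes 9-d so
// that larger magnitudes sort first.
func newKey(v float64) string {
	mag := []byte(fmt.Sprintf("%021.10f", math.Abs(v)))
	if v >= 0 {
		return "1" + string(mag)
	}
	for i, c := range mag {
		if c >= '0' && c <= '9' {
			mag[i] = '9' - (c - '0')
		}
	}
	return "0" + string(mag)
}

// sortedBy returns a copy of vals ordered by the byte order of key.
func sortedBy(key func(float64) string, vals []float64) []float64 {
	out := append([]float64(nil), vals...)
	sort.Slice(out, func(i, j int) bool { return key(out[i]) < key(out[j]) })
	return out
}

func main() {
	vals := []float64{99, -100, 0, -1, 5}
	fmt.Println(oldKey(99) < oldKey(-100)) // true: 99 sorts before -100
	fmt.Println(sortedBy(newKey, vals))    // [-100 -1 0 5 99]
}
```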

Files touched
-------------

  compiler/gengo/gen_class.go, gen_util.go, gengo.go
  compiler/genpc/genpc.go
  hbrt/class.go, hbfunc.go, pcinterp.go, pcode.go, thread.go, vm.go
  hbrdd/dbf/dbf.go, indexer.go, locks_posix.go, locks_windows.go
  hbrdd/dbf/encode_numeric_test.go  (new)
  hbrdd/mem/memrdd.go
  cmd/five/main.go
  hbrtl/frb.go
  tests/frb/test_frb_pcode_sweep.prg

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 05:29:56 +09:00


// Copyright (c) 2026 Charles KWON OhJun (charleskwonohjun@gmail.com)
// All rights reserved.
// genpc — Five pcode generator. Compiles AST to bytecode for FRB interpreter mode.
// Mirrors gengo's logic but emits bytecode opcodes instead of Go source code.
package genpc
import (
"encoding/binary"
"five/compiler/ast"
"five/compiler/token"
"five/hbrt"
"fmt"
"math"
"sort"
"strconv"
"strings"
)
// Generate compiles an AST file to a PcodeModule.
func Generate(file *ast.File) *hbrt.PcodeModule {
g := &generator{
mod: &hbrt.PcodeModule{
Name: file.Name,
Funcs: make(map[string]*hbrt.PcodeFunc),
},
}
for _, d := range file.Decls {
switch decl := d.(type) {
case *ast.FuncDecl:
g.emitFunc(decl)
default:
// ClassDecl, MethodDecl, top-level VarDecl, etc. are not
// expressible in pcode form today — record the kind so the
// caller can surface a clear "rebuild without --pcode"
// diagnostic instead of silently dropping the declaration.
g.noteUnsupported(fmt.Sprintf("%T", decl))
}
}
g.mod.Warnings = g.Warnings()
return g.mod
}
// CompileExpr compiles a single expression AST to a standalone PcodeFunc
// that, when executed, leaves the expression's value on the stack as a
// return value. Used by FiveSql2 for prepared-statement-style caching:
// compile WHERE / SELECT expressions once per query, execute per row.
//
// The returned function takes zero parameters and zero locals.
// Caller provides field access context via the current workarea.
func CompileExpr(expr ast.Expr) *hbrt.PcodeFunc {
g := &generator{
mod: &hbrt.PcodeModule{Funcs: make(map[string]*hbrt.PcodeFunc)},
locals: make(map[string]int),
}
// Note: ExecPcode emits its own Frame/EndProc around this code.
// We just emit the expression evaluation + RetValue.
g.emitExpr(expr)
g.emit(hbrt.PcOpRetValue)
return &hbrt.PcodeFunc{
Name: "_EXPR",
Code: g.code,
Params: 0,
Locals: 0,
}
}
type generator struct {
mod *hbrt.PcodeModule
code []byte
locals map[string]int
// detached, when non-nil, intercepts IdentExpr/AssignExpr lookups
// inside a block body. Names found in the *enclosing* locals are
// auto-promoted to capture slots in declaration order so the body
// emits PcOpPushDetached / PcOpPopDetached against the slot index
// rather than falling through to the runtime memvar table.
detached *detachedMap
// Unsupported AST node kinds encountered during emit, recorded
// once per kind. Exposed via PcodeModule.Warnings so the build
// pipeline can surface a clear "node X not supported in pcode
// mode" diagnostic instead of silently emitting PushNil/no-op
// (the previous behavior, which masked bugs as wrong results).
unsupported map[string]bool
}
// noteUnsupported records that an AST node kind was hit by the
// silent-fallback path. Caller emits PushNil/Pop to keep the stack
// shape valid; the diagnostic itself is collected and reported once
// per kind at module-level after Generate completes.
func (g *generator) noteUnsupported(kind string) {
if g.unsupported == nil {
g.unsupported = map[string]bool{}
}
g.unsupported[kind] = true
}
// Warnings returns the accumulated unsupported-node diagnostics in
// stable (sorted) order so build output is deterministic.
func (g *generator) Warnings() []string {
if len(g.unsupported) == 0 {
return nil
}
out := make([]string, 0, len(g.unsupported))
for k := range g.unsupported {
out = append(out, "pcode: AST node not supported in --pcode/FRB-pcode mode: "+k+
" (emitted as no-op; rebuild without --pcode to keep this construct)")
}
sort.Strings(out)
return out
}
// detachedMap accumulates the closure captures requested by a block
// body in encounter order. The enclosing scope's locals map is read
// to translate a free-variable name into the source-local index that
// the PushBlock op must snapshot.
type detachedMap struct {
enclosing map[string]int // outer scope's local name -> 1-based local index
slot map[string]int // captured name -> 0-based detached slot
srcOrder []int // 0-based slot -> source local index (1-based)
}
func newDetachedMap(enclosing map[string]int) *detachedMap {
return &detachedMap{
enclosing: enclosing,
slot: map[string]int{},
}
}
// resolve returns (slot, true) if `name` resolves through the
// enclosing scope and reserves a capture slot for it on first use.
// Returns (0, false) for names not in the enclosing scope — caller
// falls back to the memvar lookup or another resolution path.
func (d *detachedMap) resolve(name string) (int, bool) {
if d == nil {
return 0, false
}
if s, ok := d.slot[name]; ok {
return s, true
}
src, ok := d.enclosing[name]
if !ok {
return 0, false
}
s := len(d.srcOrder)
d.slot[name] = s
d.srcOrder = append(d.srcOrder, src)
return s, true
}
func (d *detachedMap) sources() []int {
if d == nil {
return nil
}
return d.srcOrder
}
func (g *generator) emit(b ...byte) {
g.code = append(g.code, b...)
}
func (g *generator) emitU16(v uint16) {
var buf [2]byte
binary.LittleEndian.PutUint16(buf[:], v)
g.code = append(g.code, buf[:]...)
}
func (g *generator) emitI32(v int32) {
var buf [4]byte
binary.LittleEndian.PutUint32(buf[:], uint32(v))
g.code = append(g.code, buf[:]...)
}
func (g *generator) emitI64(v int64) {
var buf [8]byte
binary.LittleEndian.PutUint64(buf[:], uint64(v))
g.code = append(g.code, buf[:]...)
}
func (g *generator) emitF64(v float64) {
var buf [8]byte
binary.LittleEndian.PutUint64(buf[:], math.Float64bits(v))
g.code = append(g.code, buf[:]...)
}
func (g *generator) emitString(op byte, s string) {
g.emit(op)
g.emitU16(uint16(len(s)))
g.code = append(g.code, []byte(s)...)
}
func (g *generator) pc() int {
return len(g.code)
}
// placeholder for jump offset, returns position to patch
func (g *generator) emitJumpPlaceholder(op byte) int {
g.emit(op)
pos := g.pc()
g.emitI32(0) // placeholder
return pos
}
func (g *generator) patchJump(pos int) {
offset := int32(g.pc() - pos - 4) // relative to after the offset bytes
binary.LittleEndian.PutUint32(g.code[pos:], uint32(offset))
}
// --- Function ---
func (g *generator) emitFunc(fn *ast.FuncDecl) {
g.code = nil
g.locals = make(map[string]int)
// Build local map. PRG is case-insensitive so all keys are
// uppercased here; every lookup site below must mirror this.
idx := 1
for _, p := range fn.Params {
g.locals[strings.ToUpper(p.Name)] = idx
idx++
}
for _, d := range fn.Decls {
if vd, ok := d.(*ast.VarDecl); ok && vd.Scope == ast.ScopeLocal {
for _, v := range vd.Vars {
g.locals[strings.ToUpper(v.Name)] = idx
idx++
}
}
}
for _, s := range fn.Body {
if vd, ok := s.(*ast.VarDecl); ok && vd.Scope == ast.ScopeLocal {
for _, v := range vd.Vars {
g.locals[strings.ToUpper(v.Name)] = idx
idx++
}
}
}
nLocals := idx - 1 - len(fn.Params)
// Emit LOCAL initializers
localIdx := len(fn.Params) + 1
for _, d := range fn.Decls {
vd, ok := d.(*ast.VarDecl)
if !ok || vd.Scope != ast.ScopeLocal {
continue
}
for _, v := range vd.Vars {
if v.Init != nil {
g.emitExpr(v.Init)
g.emit(hbrt.PcOpPopLocal)
g.emitU16(uint16(localIdx))
}
localIdx++
}
}
// Emit body
for _, s := range fn.Body {
g.emitStmt(s)
}
// Implicit return NIL
g.emit(hbrt.PcOpPushNil)
g.emit(hbrt.PcOpRetValue)
pf := &hbrt.PcodeFunc{
Name: fn.Name,
Code: make([]byte, len(g.code)),
Params: len(fn.Params),
Locals: nLocals,
}
copy(pf.Code, g.code)
g.mod.Funcs[strings.ToUpper(fn.Name)] = pf
}
// --- Statements ---
func (g *generator) emitStmt(stmt ast.Stmt) {
switch s := stmt.(type) {
case *ast.ReturnStmt:
if s.Value != nil {
g.emitExpr(s.Value)
g.emit(hbrt.PcOpRetValue)
} else {
g.emit(hbrt.PcOpPushNil)
g.emit(hbrt.PcOpRetValue)
}
case *ast.ExprStmt:
if assign, ok := s.X.(*ast.AssignExpr); ok {
g.emitAssign(assign)
} else if call, ok := s.X.(*ast.CallExpr); ok {
g.emitCallStmt(call)
} else {
g.emitExpr(s.X)
g.emit(hbrt.PcOpPop)
}
case *ast.IfStmt:
g.emitIf(s)
case *ast.DoWhileStmt:
g.emitDoWhile(s)
case *ast.ForStmt:
g.emitFor(s)
case *ast.ExitStmt:
// EXIT is not yet wired to the enclosing loop's exit jump;
// PcOpHalt is emitted as a placeholder.
g.emit(hbrt.PcOpHalt)
case *ast.QOutStmt:
g.emitQOut(s)
case *ast.VarDecl:
// Mid-function LOCAL
for _, v := range s.Vars {
if v.Init != nil {
g.emitExpr(v.Init)
if idx, ok := g.locals[strings.ToUpper(v.Name)]; ok {
g.emit(hbrt.PcOpPopLocal)
g.emitU16(uint16(idx))
} else {
g.emit(hbrt.PcOpPop)
}
}
}
default:
// Unsupported statement — record once per kind so the build
// pipeline can surface a clear "AST node not supported in
// pcode mode" warning instead of silently dropping the stmt.
g.noteUnsupported(fmt.Sprintf("stmt %T", stmt))
}
}
func (g *generator) emitIf(s *ast.IfStmt) {
g.emitExpr(s.Cond)
jumpFalse := g.emitJumpPlaceholder(hbrt.PcOpJumpFalse)
for _, stmt := range s.Body {
g.emitStmt(stmt)
}
if len(s.ElseIfs) > 0 || len(s.ElseBody) > 0 {
// `jumpEnds` collects every "branch-taken → skip rest of IF"
// jump that has to be patched once the entire IF chain ends.
// Original code only stashed each ELSEIF's terminator in `_ =
// jumpEnd2` and never patched it, so the offset stayed 0 and
// the runtime kept walking into the next ELSEIF's
// PcOpJumpFalse opcode as if it were data — silent bytecode
// corruption in pcode mode.
jumpEnds := []int{g.emitJumpPlaceholder(hbrt.PcOpJump)}
g.patchJump(jumpFalse)
for _, elif := range s.ElseIfs {
g.emitExpr(elif.Cond)
nextJump := g.emitJumpPlaceholder(hbrt.PcOpJumpFalse)
for _, stmt := range elif.Body {
g.emitStmt(stmt)
}
jumpEnds = append(jumpEnds, g.emitJumpPlaceholder(hbrt.PcOpJump))
g.patchJump(nextJump)
}
for _, stmt := range s.ElseBody {
g.emitStmt(stmt)
}
for _, j := range jumpEnds {
g.patchJump(j)
}
} else {
g.patchJump(jumpFalse)
}
}
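func (g *generator) emitDoWhile(s *ast.DoWhileStmt) {
// DO WHILE is a pre-test loop: evaluate the condition first and
// skip the body entirely when it is false (same shape as emitFor).
loopStart := g.pc()
g.emitExpr(s.Cond)
jumpOut := g.emitJumpPlaceholder(hbrt.PcOpJumpFalse)
for _, stmt := range s.Body {
g.emitStmt(stmt)
}
// Jump back to the condition
g.emit(hbrt.PcOpJump)
g.emitI32(int32(loopStart - g.pc() - 4))
g.patchJump(jumpOut)
}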
func (g *generator) emitFor(s *ast.ForStmt) {
idx, ok := g.locals[strings.ToUpper(s.Var)]
if !ok {
return
}
// Init: var := start
g.emitExpr(s.Start)
g.emit(hbrt.PcOpPopLocal)
g.emitU16(uint16(idx))
// Detect step direction statically (matches gengo's emitFor):
// * no Step → +1, ascending
// * literal -N → descending
// * unary MINUS → descending
// Anything else (variable, expression) defaults to ascending.
// Without this we always emitted `var <= to`, which made `FOR
// i := 5 TO 1 STEP -1` exit on the first iteration; and we always
// stepped by a hardcoded +1, which made `FOR i := 1 TO 10 STEP
// 2` sum 1+2+...+10 (55) instead of 1+3+5+7+9 (25).
negStep := false
if s.Step != nil {
if lit, ok := s.Step.(*ast.LiteralExpr); ok {
if lit.Kind == token.INT && len(lit.Value) > 0 && lit.Value[0] == '-' {
negStep = true
}
}
if un, ok := s.Step.(*ast.UnaryExpr); ok && un.Op == token.MINUS {
negStep = true
}
}
loopStart := g.pc()
// Comparison: ascending → var <= to; descending → var >= to.
g.emit(hbrt.PcOpPushLocal)
g.emitU16(uint16(idx))
g.emitExpr(s.To)
if negStep {
g.emit(hbrt.PcOpGreaterEq)
} else {
g.emit(hbrt.PcOpLessEq)
}
jumpOut := g.emitJumpPlaceholder(hbrt.PcOpJumpFalse)
// Body
for _, stmt := range s.Body {
g.emitStmt(stmt)
}
// Increment: var := var + step (re-evaluating step per iter is
// fine; constant-folding can hoist it later). Push var, push
// step, add, store back.
g.emit(hbrt.PcOpPushLocal)
g.emitU16(uint16(idx))
if s.Step != nil {
g.emitExpr(s.Step)
} else {
g.emit(hbrt.PcOpPushInt)
g.emitI64(1)
}
g.emit(hbrt.PcOpPlus)
g.emit(hbrt.PcOpPopLocal)
g.emitU16(uint16(idx))
// Jump back to comparison
g.emit(hbrt.PcOpJump)
g.emitI32(int32(loopStart - g.pc() - 4))
g.patchJump(jumpOut)
}
func (g *generator) emitQOut(s *ast.QOutStmt) {
sym := "QOUT"
if s.IsQQ {
sym = "QQOUT"
}
g.emitString(hbrt.PcOpPushSymbol, sym)
g.emit(hbrt.PcOpPushNil)
for _, expr := range s.Exprs {
g.emitExpr(expr)
}
g.emit(hbrt.PcOpFunction)
g.emitU16(uint16(len(s.Exprs)))
}
// --- Expressions ---
func (g *generator) emitExpr(expr ast.Expr) {
switch e := expr.(type) {
case *ast.LiteralExpr:
switch e.Kind {
case token.INT:
g.emit(hbrt.PcOpPushInt)
v := parseInt64(e.Value)
g.emitI64(v)
case token.DOUBLE:
g.emit(hbrt.PcOpPushDouble)
v := parseFloat64(e.Value)
g.emitF64(v)
case token.STRING:
g.emitString(hbrt.PcOpPushString, e.Value)
case token.TRUE:
g.emit(hbrt.PcOpPushTrue)
case token.FALSE:
g.emit(hbrt.PcOpPushFalse)
case token.NIL_LIT:
g.emit(hbrt.PcOpPushNil)
}
case *ast.IdentExpr:
upper := strings.ToUpper(e.Name)
if upper == "SELF" {
g.emit(hbrt.PcOpPushSelf)
return
}
// Locals are keyed case-insensitively. Look up via uppercase
// (also covers blocks: their params are stored ToUpper). The
// previous raw `e.Name` lookup missed any caller that wrote
// the identifier in different case from the declaration —
// `{|x| x * x }` invoked via Eval(b, 7) silently saw x=NIL.
if idx, ok := g.locals[upper]; ok {
g.emit(hbrt.PcOpPushLocal)
g.emitU16(uint16(idx))
} else if slot, ok := g.detached.resolve(upper); ok {
// Free variable that resolves to an enclosing-frame
// local — promote to a closure capture slot and read it
// from this block's Detached at runtime.
g.emit(hbrt.PcOpPushDetached)
g.emitU16(uint16(slot))
} else {
// Unknown at compile time → runtime memvar lookup. This
// makes `&(expr)` and the debugger's `p` see PRIVATEs
// (including the frame-local injection the debugger does).
g.emitString(hbrt.PcOpPushMemvar, upper)
}
case *ast.BinaryExpr:
g.emitExpr(e.Left)
g.emitExpr(e.Right)
g.emitBinaryOp(e.Op)
case *ast.UnaryExpr:
g.emitExpr(e.X)
switch e.Op {
case token.MINUS:
g.emit(hbrt.PcOpNegate)
case token.NOT:
g.emit(hbrt.PcOpNot)
}
case *ast.CallExpr:
g.emitCall(e)
case *ast.IIfExpr:
g.emitExpr(e.Cond)
jumpFalse := g.emitJumpPlaceholder(hbrt.PcOpJumpFalse)
g.emitExpr(e.True)
jumpEnd := g.emitJumpPlaceholder(hbrt.PcOpJump)
g.patchJump(jumpFalse)
g.emitExpr(e.False)
g.patchJump(jumpEnd)
case *ast.SelfExpr:
g.emit(hbrt.PcOpPushSelf)
case *ast.SendExpr:
g.emitExpr(e.Object)
if e.HasParens {
for _, arg := range e.Args {
g.emitExpr(arg)
}
g.emitString(hbrt.PcOpSend, strings.ToUpper(e.Method))
g.emitU16(uint16(len(e.Args)))
} else {
if _, isSelf := e.Object.(*ast.SelfExpr); isSelf {
// Field read on Self: drop the Self pushed above and use
// the dedicated PushSelfField op instead.
g.emit(hbrt.PcOpPop)
g.emitString(hbrt.PcOpPushSelfField, strings.ToUpper(e.Method))
}
}
case *ast.ArrayLitExpr:
for _, item := range e.Items {
g.emitExpr(item)
}
g.emit(hbrt.PcOpArrayGen)
g.emitU16(uint16(len(e.Items)))
case *ast.BlockExpr:
// `{|p| body }` — compile body to its own pcode buffer with
// the block's params occupying locals 1..len(Params). Free
// variables in the body that resolve to an enclosing-frame
// local are routed through Detached[i]: PcOpPushDetached /
// PcOpPopDetached. The block creator (PcOpPushBlock) records
// each captured slot's source-local index so the interpreter
// snapshots the enclosing value into Detached[i] at block
// construction time.
//
// Without this, every closure that referenced a caller local
// fell through to the runtime memvar lookup and silently
// returned NIL, breaking AEval/Eval/SqlScan predicates in
// --pcode / FRB-pcode mode.
savedCode := g.code
savedLocals := g.locals
savedDet := g.detached
g.code = nil
g.locals = make(map[string]int, len(e.Params))
g.detached = newDetachedMap(savedLocals) // capture-on-demand
for i, p := range e.Params {
g.locals[strings.ToUpper(p)] = i + 1
}
g.emitExpr(e.Body)
g.emit(hbrt.PcOpRetValue)
body := g.code
captureIdx := g.detached.sources() // src indices in capture order
g.code = savedCode
g.locals = savedLocals
g.detached = savedDet
g.emit(hbrt.PcOpPushBlock)
g.emitI32(int32(len(body)))
g.code = append(g.code, body...)
g.emitU16(uint16(len(e.Params))) // nParams
g.emitU16(uint16(len(captureIdx))) // nDetached
for _, srcIdx := range captureIdx {
g.emitU16(uint16(srcIdx))
}
case *ast.SeqExpr:
// Comma-separated expression list inside a code block:
// `{|| e1, e2, e3 }`. Evaluate each in order, pop intermediate
// results so only the last value remains. Same semantics as
// gengo's SeqExpr handler.
for i, item := range e.Items {
g.emitExpr(item)
if i < len(e.Items)-1 {
g.emit(hbrt.PcOpPop)
}
}
case *ast.HashLitExpr:
// `{ "k" => 1, ... }` — push each key+value pair, HashGen
// builds the hash from the top-N stack pairs. Without this
// case, the hash literal silently fell through to PushNil
// and any subsequent `h[key]` panicked at ArrayPush with
// "argument error (op: [])".
for i, k := range e.Keys {
g.emitExpr(k)
g.emitExpr(e.Values[i])
}
g.emit(hbrt.PcOpHashGen)
g.emitU16(uint16(len(e.Keys)))
case *ast.IndexExpr:
// arr[idx] — push array, push index, ArrayPush reads element.
// (ArrayPush is the "get" op; ArrayPop is the "set" op — names
// kept to match the Harbour stack-machine convention.)
// Without this case, indexed reads in pcode silently emitted
// PushNil via the default fallback, so `arr[i]` always
// returned NIL and `n + arr[i]` panicked at the +.
g.emitExpr(e.X)
g.emitExpr(e.Index)
g.emit(hbrt.PcOpArrayPush)
case *ast.PostfixExpr:
// `x++` / `x--` — read current value (becomes the expression
// result), apply Inc/Dec to the LOCAL slot, leave the
// pre-modification value on the stack so it round-trips
// correctly when used as an expression. As a statement the
// caller does Pop afterward.
// Without this case, postfix on pcode-mode silently emitted
// PushNil → `n++` was a no-op, breaking DO WHILE / FOR
// patterns that mutate the loop counter.
if id, isIdent := e.X.(*ast.IdentExpr); isIdent {
if idx, found := g.locals[strings.ToUpper(id.Name)]; found {
g.emit(hbrt.PcOpPushLocal)
g.emitU16(uint16(idx))
delta := int64(1)
if e.Op == token.DEC {
delta = -1
}
g.emit(hbrt.PcOpLocalAddInt)
g.emitU16(uint16(idx))
g.emitI32(int32(delta))
return
}
}
// Anything else (memvar, alias->field, arr[i]) — emit the
// expression as a no-op for now and document the gap.
g.emitExpr(e.X)
case *ast.AliasExpr:
// Pcode mode: only the M-> / MEMVAR-> namespace (memvar
// access) is wired up. The general workarea-alias form
// (`FOO->bar`, `(expr)->(body)`) needs new opcodes for
// alias dispatch + workarea context save/restore — until
// then it falls through to the generic NIL fallback so
// callers see "missing data" rather than crash.
if aliasIdent, ok1 := e.Alias.(*ast.IdentExpr); ok1 {
if fieldIdent, ok2 := e.Field.(*ast.IdentExpr); ok2 {
upper := strings.ToUpper(aliasIdent.Name)
if upper == "M" || upper == "MEMVAR" {
g.emitString(hbrt.PcOpPushMemvar, fieldIdent.Name)
return
}
}
}
g.emit(hbrt.PcOpPushNil)
case *ast.AssignExpr:
// Assignment as an expression: perform the store and leave
// the assigned value on the eval stack so a containing
// expression (e.g. SeqExpr inside `{|| acc += v, acc }`) can
// consume it. emitAssign is statement-shaped and leaves
// nothing behind, so emitAssignAsExpr duplicates the value
// (PcOpDup) before the store.
g.emitAssignAsExpr(e)
default:
// Record the unsupported kind and emit PushNil so the stack
// shape stays valid — callers can keep compiling but the
// build pipeline raises a clear pcode-mode-incompat warning.
g.noteUnsupported(fmt.Sprintf("expr %T", expr))
g.emit(hbrt.PcOpPushNil)
}
}
// emitAssignAsExpr emits an assignment whose value remains on the
// eval stack (expression context). Mirrors emitAssign's storage
// paths but duplicates the value (PcOpDup) before each store so
// callers, typically SeqExpr items inside a code block body, can chain.
func (g *generator) emitAssignAsExpr(a *ast.AssignExpr) {
// Local / detached compound op.
if a.Op != token.ASSIGN {
if op, ok := compoundBinOp(a.Op); ok {
if ident, isIdent := a.Left.(*ast.IdentExpr); isIdent {
up := strings.ToUpper(ident.Name)
if idx, found := g.locals[up]; found {
g.emit(hbrt.PcOpPushLocal)
g.emitU16(uint16(idx))
g.emitExpr(a.Right)
g.emit(op)
g.emit(hbrt.PcOpDup) // keep value as expression result
g.emit(hbrt.PcOpPopLocal)
g.emitU16(uint16(idx))
return
}
if slot, ok := g.detached.resolve(up); ok {
g.emit(hbrt.PcOpPushDetached)
g.emitU16(uint16(slot))
g.emitExpr(a.Right)
g.emit(op)
g.emit(hbrt.PcOpDup)
g.emit(hbrt.PcOpPopDetached)
g.emitU16(uint16(slot))
return
}
}
}
}
// Plain assignment.
if ident, ok := a.Left.(*ast.IdentExpr); ok {
up := strings.ToUpper(ident.Name)
if idx, found := g.locals[up]; found {
g.emitExpr(a.Right)
g.emit(hbrt.PcOpDup)
g.emit(hbrt.PcOpPopLocal)
g.emitU16(uint16(idx))
return
}
if slot, ok := g.detached.resolve(up); ok {
g.emitExpr(a.Right)
g.emit(hbrt.PcOpDup)
g.emit(hbrt.PcOpPopDetached)
g.emitU16(uint16(slot))
return
}
}
// Self field setter (:=). PcOpSetSelfField consumes the value
// and pushes nothing, so duplicate it first to leave a copy behind.
if send, ok := a.Left.(*ast.SendExpr); ok {
if _, isSelf := send.Object.(*ast.SelfExpr); isSelf {
g.emitExpr(a.Right)
g.emit(hbrt.PcOpDup)
g.emitString(hbrt.PcOpSetSelfField, strings.ToUpper(send.Method))
return
}
}
// Fallback: evaluate Right and leave it as the expression value
// (no destination wired). Mirrors the statement-form fallback
// minus the trailing Pop.
g.emitExpr(a.Right)
}
func (g *generator) emitBinaryOp(op token.Kind) {
switch op {
case token.PLUS:
g.emit(hbrt.PcOpPlus)
case token.MINUS:
g.emit(hbrt.PcOpMinus)
case token.STAR:
g.emit(hbrt.PcOpMult)
case token.SLASH:
g.emit(hbrt.PcOpDivide)
case token.PERCENT:
g.emit(hbrt.PcOpMod)
case token.POWER:
g.emit(hbrt.PcOpPower)
case token.EQ, token.EXEQ:
g.emit(hbrt.PcOpEqual)
case token.NEQ:
g.emit(hbrt.PcOpNotEqual)
case token.LT:
g.emit(hbrt.PcOpLess)
case token.GT:
g.emit(hbrt.PcOpGreater)
case token.LTE:
g.emit(hbrt.PcOpLessEq)
case token.GTE:
g.emit(hbrt.PcOpGreaterEq)
case token.AND:
g.emit(hbrt.PcOpAnd)
case token.OR:
g.emit(hbrt.PcOpOr)
case token.DOLLAR:
g.emit(hbrt.PcOpInString)
}
}
func (g *generator) emitCall(e *ast.CallExpr) {
if ident, ok := e.Func.(*ast.IdentExpr); ok {
// Peephole: FieldGet(<int literal>) → PcOpFieldGet <idx>.
// Skips the entire PushSymbol + Function + Frame + RTL path in
// favor of a direct workarea field access. Huge win for WHERE
// predicates on scan loops where this is the per-row hot op.
if strings.EqualFold(ident.Name, "FieldGet") && len(e.Args) == 1 {
if lit, ok := e.Args[0].(*ast.LiteralExpr); ok && lit.Kind == token.INT {
if n, err := strconv.Atoi(lit.Value); err == nil && n > 0 && n <= 0xFFFF {
g.emit(hbrt.PcOpFieldGet)
g.emitU16(uint16(n))
return
}
}
}
// Peephole: AllTrim(FieldGet(<int literal>)) → PcOpFieldTrim <idx>.
// Fuses the character-field CHAR-trim normalization that
// SqlExprToPrg auto-wraps into one opcode, saving one Function
// dispatch + one intermediate string allocation per row.
if strings.EqualFold(ident.Name, "AllTrim") && len(e.Args) == 1 {
if inner, ok := e.Args[0].(*ast.CallExpr); ok {
if innerIdent, ok := inner.Func.(*ast.IdentExpr); ok &&
strings.EqualFold(innerIdent.Name, "FieldGet") &&
len(inner.Args) == 1 {
if lit, ok := inner.Args[0].(*ast.LiteralExpr); ok && lit.Kind == token.INT {
if n, err := strconv.Atoi(lit.Value); err == nil && n > 0 && n <= 0xFFFF {
g.emit(hbrt.PcOpFieldTrim)
g.emitU16(uint16(n))
return
}
}
}
}
}
g.emitString(hbrt.PcOpPushSymbol, strings.ToUpper(ident.Name))
g.emit(hbrt.PcOpPushNil)
for _, arg := range e.Args {
g.emitExpr(arg)
}
g.emit(hbrt.PcOpFunction)
g.emitU16(uint16(len(e.Args)))
} else {
g.emitExpr(e.Func)
for _, arg := range e.Args {
g.emitExpr(arg)
}
g.emit(hbrt.PcOpDo)
g.emitU16(uint16(len(e.Args)))
}
}
func (g *generator) emitCallStmt(e *ast.CallExpr) {
if ident, ok := e.Func.(*ast.IdentExpr); ok {
g.emitString(hbrt.PcOpPushSymbol, strings.ToUpper(ident.Name))
g.emit(hbrt.PcOpPushNil)
for _, arg := range e.Args {
g.emitExpr(arg)
}
g.emit(hbrt.PcOpDo)
g.emitU16(uint16(len(e.Args)))
} else {
g.emitExpr(e.Func)
for _, arg := range e.Args {
g.emitExpr(arg)
}
g.emit(hbrt.PcOpDo)
g.emitU16(uint16(len(e.Args)))
}
}
func (g *generator) emitAssign(a *ast.AssignExpr) {
// Compound operators (+=, -=, *=, /=, %=, ^=) need to fold the
// existing left-hand value with the right. Without this they got
// emitted as plain `:=`, dropping the accumulator: `n += i`
// behaved as `n := i`. So the FOR loop reduce idiom (e.g.
// `n := 0 ; FOR i := 1 TO 10 ; n += i ; NEXT`) returned only
// the LAST iteration's increment.
if a.Op != token.ASSIGN {
op, ok := compoundBinOp(a.Op)
if ok {
if ident, isIdent := a.Left.(*ast.IdentExpr); isIdent {
up := strings.ToUpper(ident.Name)
if idx, found := g.locals[up]; found {
g.emit(hbrt.PcOpPushLocal)
g.emitU16(uint16(idx))
g.emitExpr(a.Right)
g.emit(op)
g.emit(hbrt.PcOpPopLocal)
g.emitU16(uint16(idx))
return
}
if slot, ok := g.detached.resolve(up); ok {
// Compound on a captured outer local — read/
// write through Detached so the closure mutates
// the captured snapshot.
g.emit(hbrt.PcOpPushDetached)
g.emitU16(uint16(slot))
g.emitExpr(a.Right)
g.emit(op)
g.emit(hbrt.PcOpPopDetached)
g.emitU16(uint16(slot))
return
}
}
}
}
if ident, ok := a.Left.(*ast.IdentExpr); ok {
up := strings.ToUpper(ident.Name)
if idx, found := g.locals[up]; found {
g.emitExpr(a.Right)
g.emit(hbrt.PcOpPopLocal)
g.emitU16(uint16(idx))
return
}
if slot, ok := g.detached.resolve(up); ok {
g.emitExpr(a.Right)
g.emit(hbrt.PcOpPopDetached)
g.emitU16(uint16(slot))
return
}
}
// Self field assignment
if send, ok := a.Left.(*ast.SendExpr); ok {
if _, isSelf := send.Object.(*ast.SelfExpr); isSelf {
g.emitExpr(a.Right)
g.emitString(hbrt.PcOpSetSelfField, strings.ToUpper(send.Method))
return
}
}
g.emitExpr(a.Right)
g.emit(hbrt.PcOpPop)
}
// compoundBinOp maps an `<op>=` token to the binary opcode it
// produces against the left-hand value. Returns false for ASSIGN
// (the caller should take the plain-store path).
func compoundBinOp(k token.Kind) (byte, bool) {
switch k {
case token.PLUSEQ:
return hbrt.PcOpPlus, true
case token.MINUSEQ:
return hbrt.PcOpMinus, true
case token.STAREQ:
return hbrt.PcOpMult, true
case token.SLASHEQ:
return hbrt.PcOpDivide, true
case token.PERCENTEQ:
return hbrt.PcOpMod, true
case token.POWEREQ:
return hbrt.PcOpPower, true
}
return 0, false
}
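// parseInt64 converts a decimal integer literal to int64. Non-digit
// characters are skipped, a leading '-' negates the result, and
// overflow is not detected.
func parseInt64(s string) int64 {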
var v int64
for _, c := range s {
if c >= '0' && c <= '9' {
v = v*10 + int64(c-'0')
}
}
if len(s) > 0 && s[0] == '-' {
v = -v
}
return v
}
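// parseFloat64 converts a numeric literal to float64.
// strconv.ParseFloat is tried first because it rounds correctly and
// understands exponents. The digit-by-digit loop stays as a lenient
// fallback for input strconv rejects; it accumulates rounding error
// in the fraction (e.g. "0.3" yields 0.30000000000000004).
func parseFloat64(s string) float64 {
if v, err := strconv.ParseFloat(s, 64); err == nil {
return v
}
var v float64
var dec float64
inDec := false
for _, c := range s {
if c == '.' {
inDec = true
dec = 0.1
continue
}
if c >= '0' && c <= '9' {
if inDec {
v += float64(c-'0') * dec
dec *= 0.1
} else {
v = v*10 + float64(c-'0')
}
}
}
if len(s) > 0 && s[0] == '-' {
v = -v
}
return v
}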