five/hbrt/pcode.go
CharlesKWON b1d89b9783 perf(FiveSql2): PcOpFieldTrim fused peephole — string WHERE at raw RDD parity
Second pcode peephole to match the one added for FieldGet(literal).
SqlExprToPrg auto-wraps CHAR column references with AllTrim() to
match SqlCmpEq's CHAR-padding trim semantics, so every string WHERE
predicate evaluates `AllTrim(FieldGet(n)) == 'literal'` per row.

Before this commit each of those per-row evaluations did:
  1. PushSymbol ALLTRIM
  2. PushSymbol FIELDGET → Function(1)  [1 RTL Frame]
  3. parseCharField → MakeString       [alloc: copies raw bytes]
  4. Function(1) → AllTrim RTL         [1 RTL Frame]
  5. strings.TrimSpace                  [alloc: new string]
  6. Return, continue

New opcode `PcOpFieldTrim <idx>` (0x47) fuses the two RTL calls into
a single opcode that:
  1. Calls FastFieldGetter directly (no Frame/Function dispatch).
  2. Scans the returned string for leading and trailing ASCII spaces, in place.
  3. Pushes `s[lo:hi]`, a sub-slice that shares the original bytes (no new allocation).
  4. Short-circuits to the original string when there is nothing to trim.
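
The trim in steps 2-4 can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the helper name `trimASCIISpaces` is hypothetical, and it assumes only the space byte (0x20) counts as padding. It relies on the fact that slicing a Go string shares the underlying bytes rather than copying them.

```go
package main

import "fmt"

// trimASCIISpaces is a hypothetical helper. It returns s without leading or
// trailing ASCII spaces (0x20). The result is a sub-slice of s, so no bytes
// are copied.
func trimASCIISpaces(s string) string {
	lo, hi := 0, len(s)
	for lo < hi && s[lo] == ' ' {
		lo++
	}
	for hi > lo && s[hi-1] == ' ' {
		hi--
	}
	if lo == 0 && hi == len(s) {
		return s // nothing to trim: hand back the same string
	}
	return s[lo:hi]
}

func main() {
	// A CHAR(10) column value is padded with spaces up to the field width.
	fmt.Printf("%q\n", trimASCIISpaces("SMITH     ")) // prints "SMITH"
}
```

Unlike `strings.TrimSpace`, this sketch ignores tabs, newlines and Unicode whitespace, which is assumed to be sufficient for space-padded CHAR fields.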

genpc recognizes the shape `AllTrim(FieldGet(<int-literal>))` in
emitCall and emits the fused opcode automatically — no SQL-side
API change. Matches the existing FieldGet peephole's shape.
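
For illustration, the shape check might look something like the sketch below. Everything except the two opcode values is an assumption made for the example: the AST types (`Expr`, `Call`, `IntLit`), the helpers `fieldLiteral` and `tryPeephole`, the case-insensitive name match, and the little-endian encoding of the uint16 operand. The real emitCall in genpc may be structured quite differently.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	pcOpFieldGet  byte = 0x46 // value taken from pcode.go
	pcOpFieldTrim byte = 0x47 // value taken from pcode.go
)

// Hypothetical AST node types, for illustration only.
type Expr interface{}

type IntLit struct{ Value int64 }

type Call struct {
	Name string
	Args []Expr
}

// fieldLiteral reports the field position when e has the shape
// FieldGet(<int-literal>) and the literal fits a 1-based uint16.
func fieldLiteral(e Expr) (uint16, bool) {
	c, ok := e.(*Call)
	if !ok || !strings.EqualFold(c.Name, "FIELDGET") || len(c.Args) != 1 {
		return 0, false
	}
	lit, ok := c.Args[0].(*IntLit)
	if !ok || lit.Value < 1 || lit.Value > 0xFFFF {
		return 0, false
	}
	return uint16(lit.Value), true
}

// tryPeephole appends a fused opcode when c matches a known shape. It returns
// false when the caller should emit the generic PushSymbol/Function sequence.
func tryPeephole(code []byte, c *Call) ([]byte, bool) {
	if pos, ok := fieldLiteral(c); ok {
		return append(code, pcOpFieldGet, byte(pos), byte(pos>>8)), true
	}
	if strings.EqualFold(c.Name, "ALLTRIM") && len(c.Args) == 1 {
		if pos, ok := fieldLiteral(c.Args[0]); ok {
			return append(code, pcOpFieldTrim, byte(pos), byte(pos>>8)), true
		}
	}
	return code, false
}

func main() {
	expr := &Call{Name: "AllTrim", Args: []Expr{
		&Call{Name: "FieldGet", Args: []Expr{&IntLit{Value: 3}}},
	}}
	code, ok := tryPeephole(nil, expr)
	fmt.Printf("% x %v\n", code, ok) // prints: 47 03 00 true
}
```

Any call that does not match, such as `AllTrim(FieldGet(n))` with a non-literal `n`, falls through to the generic path in this sketch, so the peephole never changes behavior for shapes it does not recognize.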

Bench impact (50k rows, 3-run steady state; ratio is time relative to the
raw RDD baseline of 6.2ms):

  Query           Time                          vs raw RDD
  String WHERE    before 7.9ms → after 6.2ms    1.00x (parity!)
  Numeric WHERE   6.9ms (unchanged)             1.11x
  No WHERE        9.1ms (unchanged)             1.47x

String WHERE is now at parity with the raw Harbour-style RDD scan.
Compared to session start (119ms), that's a 19x speedup.

Validation:
  - FiveSql2 43/43
  - Harbour compat 51/51
  - go test ./... ALL PASS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 14:03:03 +09:00

126 lines
3.6 KiB
Go

// Copyright (c) 2026 Charles KWON OhJun (charleskwonohjun@gmail.com)
// All rights reserved.
// Five pcode — stack-based bytecode for FRB interpreter mode.
// Each opcode maps 1:1 to a Thread method call, making the pcode
// a direct serialization of what gengo generates as Go code.
//
// Format: [opcode:1byte] [operands:variable]
// Strings: [len:uint16 LE] [bytes]
// Numbers: int64 = 8 bytes LE, float64 = 8 bytes LE
package hbrt

// Opcode definitions
const (
	// Stack operations
	PcOpNop        byte = 0x00
	PcOpPushNil    byte = 0x01
	PcOpPushTrue   byte = 0x02
	PcOpPushFalse  byte = 0x03
	PcOpPushInt    byte = 0x04 // + int64 LE
	PcOpPushDouble byte = 0x05 // + float64 LE (8 bytes)
	PcOpPushString byte = 0x06 // + uint16 len + bytes
	PcOpPushLocal  byte = 0x07 // + uint16 index
	PcOpPopLocal   byte = 0x08 // + uint16 index
	PcOpPop        byte = 0x09
	PcOpDup        byte = 0x0A

	// Arithmetic
	PcOpPlus   byte = 0x10
	PcOpMinus  byte = 0x11
	PcOpMult   byte = 0x12
	PcOpDivide byte = 0x13
	PcOpMod    byte = 0x14
	PcOpPower  byte = 0x15
	PcOpNegate byte = 0x16

	// Comparison
	PcOpEqual     byte = 0x20
	PcOpNotEqual  byte = 0x21
	PcOpLess      byte = 0x22
	PcOpGreater   byte = 0x23
	PcOpLessEq    byte = 0x24
	PcOpGreaterEq byte = 0x25
	PcOpInString  byte = 0x26

	// Logical
	PcOpAnd byte = 0x28
	PcOpOr  byte = 0x29
	PcOpNot byte = 0x2A

	// String
	PcOpConcat byte = 0x2C // same as Plus for strings

	// Flow control
	PcOpJump      byte = 0x30 // + int32 LE (relative offset)
	PcOpJumpFalse byte = 0x31 // + int32 LE
	PcOpJumpTrue  byte = 0x32 // + int32 LE
	PcOpReturn    byte = 0x33
	PcOpRetValue  byte = 0x34

	// Frame
	PcOpFrame   byte = 0x38 // + uint16 params + uint16 locals
	PcOpEndProc byte = 0x39

	// Function calls
	PcOpPushSymbol byte = 0x40 // + uint16 string len + name
	PcOpPushNilArg byte = 0x41 // push NIL for function self
	PcOpFunction   byte = 0x42 // + uint16 nArgs
	PcOpDo         byte = 0x43 // + uint16 nArgs

	// Workarea field access — skips PushSymbol + Function dispatch
	// for `FieldGet(n)` where n is a literal. Emitted by genpc as a
	// peephole optimization. Operand: uint16 1-based field position.
	PcOpFieldGet byte = 0x46

	// `AllTrim(FieldGet(n))` peephole — fetch the field, trim the
	// result in place, push one string. Skips two Function dispatches
	// (FieldGet + AllTrim) and one intermediate string allocation
	// per invocation. Operand: uint16 1-based field position.
	PcOpFieldTrim byte = 0x47

	// Self / OOP
	PcOpPushSelf      byte = 0x48
	PcOpPushSelfField byte = 0x49 // + uint16 len + name
	PcOpSetSelfField  byte = 0x4A // + uint16 len + name
	PcOpSend          byte = 0x4B // + uint16 len + name + uint16 nArgs

	// Array / Hash
	PcOpArrayGen  byte = 0x50 // + uint16 count
	PcOpHashGen   byte = 0x51 // + uint16 count
	PcOpArrayPush byte = 0x52
	PcOpArrayPop  byte = 0x53

	// Block
	PcOpPushBlock byte = 0x58 // + uint32 codeLen + pcode bytes + uint16 nDetached

	// Local operations
	PcOpLocalAddInt byte = 0x60 // + uint16 index + int32 value
	PcOpInc         byte = 0x61
	PcOpDec         byte = 0x62

	// Special
	PcOpPopLogical byte = 0x70 // pop and store logical result
	PcOpPushBool   byte = 0x71 // + 1 byte (0 or 1)

	// Line info (for debugging)
	PcOpLine byte = 0xFE // + uint16 lineNo
	PcOpHalt byte = 0xFF
)

// PcodeFunc represents a pcode-compiled function.
type PcodeFunc struct {
	Name   string
	Code   []byte // bytecode
	Params int    // number of parameters
	Locals int    // number of locals
}

// PcodeModule represents a compiled pcode module (multiple functions).
type PcodeModule struct {
	Name    string
	Funcs   map[string]*PcodeFunc
	Strings []string // string constant pool
}
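
To make the byte layout described in the file header concrete, here is a small, hypothetical disassembler for a handful of the opcodes above. It is a sketch for illustration and is not part of package hbrt: the function name `disassemble` and the sample byte stream are invented, and it assumes the uint16 operand of PcOpFieldTrim is little-endian like the other documented operands.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Opcode values copied from pcode.go; only a subset is handled here.
const (
	pcOpPushInt    byte = 0x04 // + int64 LE
	pcOpPushString byte = 0x06 // + uint16 len + bytes
	pcOpEqual      byte = 0x20
	pcOpRetValue   byte = 0x34
	pcOpFieldTrim  byte = 0x47 // + uint16 field position (LE assumed)
)

// disassemble is a hypothetical helper that renders a pcode stream as text.
// It stops with an error on a truncated operand or an opcode it does not know.
func disassemble(code []byte) ([]string, error) {
	var out []string
	for pc := 0; pc < len(code); {
		start := pc
		op := code[pc]
		pc++
		switch op {
		case pcOpPushInt:
			if pc+8 > len(code) {
				return out, fmt.Errorf("truncated PushInt at offset %d", start)
			}
			v := int64(binary.LittleEndian.Uint64(code[pc:]))
			out = append(out, fmt.Sprintf("PushInt %d", v))
			pc += 8
		case pcOpPushString:
			if pc+2 > len(code) {
				return out, fmt.Errorf("truncated PushString at offset %d", start)
			}
			n := int(binary.LittleEndian.Uint16(code[pc:]))
			pc += 2
			if pc+n > len(code) {
				return out, fmt.Errorf("truncated PushString at offset %d", start)
			}
			out = append(out, fmt.Sprintf("PushString %q", code[pc:pc+n]))
			pc += n
		case pcOpFieldTrim:
			if pc+2 > len(code) {
				return out, fmt.Errorf("truncated FieldTrim at offset %d", start)
			}
			idx := binary.LittleEndian.Uint16(code[pc:])
			out = append(out, fmt.Sprintf("FieldTrim %d", idx))
			pc += 2
		case pcOpEqual:
			out = append(out, "Equal")
		case pcOpRetValue:
			out = append(out, "RetValue")
		default:
			return out, fmt.Errorf("unknown opcode 0x%02X at offset %d", op, start)
		}
	}
	return out, nil
}

func main() {
	// Invented stream for a predicate like AllTrim(FieldGet(2)) == 'ABC'.
	code := []byte{0x47, 0x02, 0x00, 0x06, 0x03, 0x00, 'A', 'B', 'C', 0x20, 0x34}
	lines, err := disassemble(code)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```

With the sample stream, the fused predicate occupies 11 bytes: 3 for the field fetch and trim, 6 for the string literal, and 1 each for the comparison and the return.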