Commit Graph

195 Commits

Author SHA1 Message Date
0e80b93d0a docs(pgserver): Phase 7 — bootstrap example + CI gate documentation
Wraps the v1.0 PG-wire deliverable with the two pieces operators
actually look for: a runnable example PRG and an updated CI gate
list in CLAUDE.md.

* examples/pgserver_demo.prg — full bootstrap PRG demonstrating
  every HB_FUNC composed in the order a production deployment
  needs:
    PG_TLS_SELF_SIGNED → PG_ADD_ROLE × N → PG_ALLOW_IP × N →
    PG_SERVER_START( ":5432", "md5" )
  Comments cover the SHARED-DBF integration point and the SPAWN
  idiom for non-blocking server startup. Builds cleanly under
  the examples_build sweep (now 66/72; was 65/71).

* CLAUDE.md: the "after modifying any file" ("어떤 파일이든 수정한 후")
  mandatory test list goes from 3 gates → 6:
    1. go test ./...
    2. FiveSql2 SQL:1999 43/43
    3. Harbour compat 56/56
    4. std.ch 17/17 (added)
    5. FRB 7/7 (added)
    6. pgserver integration 6/6 (added — psql required)
  Aligns the rule-of-thumb with reality. The five suites already
  ran on every audit-era commit; pgserver/run.sh is new in
  Phases 3-6 and now joins them.

This completes the v1.0 PostgreSQL-wire frontend. End-to-end
checklist:

  Phase 1: per-session state isolation         [93cf5c8]
  Phase 2: SimpleQuery wire MVP                [d98f5e1 7083297]
  Phase 3: DML + transactions                  [a556764]
  Phase 4: Extended Protocol (Parse/Bind/Exec) [8472928]
  Phase 5: password + MD5 auth                 [90eafcf]
  Phase 6: TLS + IP allowlist                  [3b2dd36]
  Phase 7: example + docs                      [this commit]

Open follow-ups (Phase 7.x):
  - hbrdd workarea per-thread isolation (audit Top-Risk #2):
    ≥3 concurrent connections doing in-flight INSERT/SELECT in
    their own transactions can race at the workarea layer. Fix
    is a separate workstream against hbrtl/database.go +
    hbrdd/dbf/. Documented limitation in tests/pgserver/run.sh.
  - SCRAM-SHA-256 auth (Phase 5.1).
  - pg_catalog shim for BI-tool introspection (Phase 1.1+ of the
    original audit plan).
  - Binary parameter format for NUMERIC/TIMESTAMP (Phase 4.1).

All gates green:
  go test ./...               ✓
  FiveSql2 SQL:1999 43/43     ✓
  Harbour compat 56/56        ✓
  std.ch 17/17                ✓
  FRB 7/7                     ✓
  examples 66/72              ✓ (+1 from new pgserver_demo)
  pgserver integration 6/6    ✓

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 15:20:44 +09:00
3b2dd365ad feat(pgserver): Phase 6 — TLS + source-IP allowlist
Closes the v1.0 hardening surface: encrypted transport + a
coarse pg_hba.conf-equivalent CIDR allowlist. Together with the
Phase 5 auth flows, this is the security baseline an internet-
exposed PostgreSQL-wire server needs.

TLS subsystem
-------------

`hbrtl/pgserver/tls.go`:

* `LoadTLSFromFiles(certPath, keyPath)` — cert/key PEM pair load
  with tls.VersionTLS12 floor. Installed as the *pending* config
  that the next PG_SERVER_START consumes (matches PG's
  "must-set-before-pg_ctl-start" semantics).

* `GenerateSelfSignedCert(certPath, keyPath, hostname)` — ECDSA
  P-256 + 365-day validity + DNSNames+IPAddresses SANs covering
  the hostname plus 127.0.0.1 / ::1. Dev/CI helper; production
  ships a CA-signed cert via the loader.

* `upgradeToTLS()` wraps `tls.Server(conn, cfg).Handshake()` so
  pgproto3 reads plaintext on top of the encrypted stream.

Source-IP allowlist
-------------------

* `AllowIP(cidr)` parses a CIDR and appends it to a per-server
  list snapshotted at PG_SERVER_START time.
* `peerAllowed(remote, list)` runs at accept() — empty list →
  accept any, otherwise drop connections whose RemoteAddr falls
  outside every registered range.
* `ClearAllowList()` resets to allow-all.

Coarse but compatible with the "host alice 10.0.0.0/8 md5"-style
entries every pg_hba.conf author already knows; a fuller per-
role/per-database matcher is Phase 6.1+.
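
The allowlist rule above (empty list accepts everyone, otherwise the
peer must fall inside at least one range) can be sketched with the
standard `net` package. `parseAllowList` and `allowed` are
hypothetical names; the real `AllowIP` / `peerAllowed` signatures may
differ:

```go
package main

import (
	"fmt"
	"net"
)

// parseAllowList (hypothetical) turns CIDR strings into networks.
func parseAllowList(cidrs ...string) ([]*net.IPNet, error) {
	var list []*net.IPNet
	for _, c := range cidrs {
		_, n, err := net.ParseCIDR(c)
		if err != nil {
			return nil, err
		}
		list = append(list, n)
	}
	return list, nil
}

// allowed reports whether remoteAddr ("host:port") passes the list.
// An empty list means allow-all, as described in the commit message.
func allowed(remoteAddr string, list []*net.IPNet) bool {
	if len(list) == 0 {
		return true
	}
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		return false
	}
	ip := net.ParseIP(host)
	if ip == nil {
		return false
	}
	for _, n := range list {
		if n.Contains(ip) {
			return true
		}
	}
	return false
}

func main() {
	list, err := parseAllowList("127.0.0.1/32", "10.0.0.0/8")
	if err != nil {
		panic(err)
	}
	fmt.Println(allowed("10.1.2.3:40000", list))
	fmt.Println(allowed("192.168.1.5:40000", list))
	fmt.Println(allowed("192.168.1.5:40000", nil))
}
```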

PRG bindings (register.go)
--------------------------

New HB_FUNCs, all idempotent and composable in any order before
PG_SERVER_START:

  pg_tls_load( certPath, keyPath )           → .T. | cErr
  pg_tls_self_signed( cert, key, hostname )  → .T. | cErr
  pg_allow_ip( cidr )                        → .T. | cErr
  pg_clear_allowlist()                       → NIL

Bootstrap idiom:

  PROCEDURE Main()
     PG_TLS_SELF_SIGNED( "/tmp/cert.pem", "/tmp/key.pem", "localhost" )
     PG_ADD_ROLE( "alice", "swordfish" )
     PG_ALLOW_IP( "127.0.0.1/32" )
     PG_ALLOW_IP( "10.0.0.0/8" )
     PG_SERVER_START( ":5432", "md5" )

The startup banner now reports TLS + allowlist state so the PRG
operator sees the security posture at a glance:

  pgserver: listening on :5432 (auth=md5 tls=on allowlist=2)

Verification
------------

End-to-end via real psql against a self-signed server:

  $ PGPASSWORD=swordfish psql \
        "postgres://alice@127.0.0.1:15432/alice?sslmode=require" \
        -c "SELECT 'tls-works' AS x" -At
  tls-works

  $ # off-allowlist source (192.168.x.x mock) → connection refused
  $ # (verified manually; psql can't easily spoof src IP for CI)

Integration script gates expanded to 6/6:
  PASS  Simple Query
  PASS  Multi-statement Simple Query
  PASS  Transaction control
  PASS  MD5 auth: wrong password rejected
  PASS  MD5 auth: correct password accepted
  PASS  TLS handshake + MD5 auth via sslmode=require

All six release gates green:
  go test ./...               ✓
  FiveSql2 SQL:1999 43/43     ✓
  Harbour compat 56/56        ✓
  std.ch 17/17                ✓
  FRB 7/7                     ✓
  pgserver integration 6/6    ✓

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 14:07:19 +09:00
90eafcfc06 feat(pgserver): Phase 5 — password + MD5 authentication
Trust mode (v1.0 default) accepts anyone; that's fine for an embedded
demo, but shipping a multi-client database without credentials
would be irresponsible. This commit adds two of libpq's three
standard auth flows. SCRAM-SHA-256 is Phase 5.1 — pgx/psql both
fall back to MD5 cleanly when the server advertises only md5, so
v1.0's functional coverage is complete with the pair landed here.

Auth subsystem
--------------

`hbrtl/pgserver/auth.go` adds:

* An in-memory role registry: `roleMap map[string]*role` guarded by
  sync.RWMutex. Reads (lookupRole) are hot-path during connection
  startup so the RWMutex lets multiple sessions auth in parallel
  without serialising through a plain Mutex.

* `AddRole(name, password)` / `RemoveRole(name)` Go API consumed
  by the new HB_FUNCs `PG_ADD_ROLE` / `PG_REMOVE_ROLE` (see
  register.go). Bootstrap PRG idiom:

      PG_ADD_ROLE("alice", "swordfish")
      PG_ADD_ROLE("bob",   "hunter2")
      PG_SERVER_START(":5432", "md5")

* `authPassword()` — cleartext PasswordMessage exchange. The wire
  payload is plaintext, so use it on TLS-protected links only;
  Phase 6 ties the warning to actual TLS detection on the session.

* `authMD5()` — libpq's md5 challenge:

      server → AuthenticationMD5Password{salt: 4 random bytes}
      client → "md5" || md5_hex( md5_hex(password || user) || salt )

  We recompute the canonical hash from the stored plaintext and
  compare. md5Challenge() is exported for pinning by a Go unit
  test (vector cross-checked against libpq's fe-auth-md5.c).

Salt is sourced from crypto/rand on every challenge so replay
attacks against a captured wire trace can't reuse a prior hash.
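
For reference, the md5 exchange above can be sketched in a few lines
of standard-library Go. `md5Response` and `verify` are hypothetical
names (the commit's own helper is `md5Challenge()`, whose signature
is not shown here), and the constant-time comparison is an assumption
rather than something the commit states:

```go
package main

import (
	"crypto/md5"
	"crypto/rand"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

func md5Hex(b []byte) string {
	sum := md5.Sum(b)
	return hex.EncodeToString(sum[:])
}

// md5Response computes the client's reply to AuthenticationMD5Password:
//   "md5" + md5_hex( md5_hex(password + user) + salt )
func md5Response(user, password string, salt [4]byte) string {
	inner := md5Hex([]byte(password + user))
	return "md5" + md5Hex(append([]byte(inner), salt[:]...))
}

// verify (hypothetical) recomputes the hash from the stored plaintext
// and compares it with what the client sent.
func verify(user, stored, clientResp string, salt [4]byte) bool {
	want := md5Response(user, stored, salt)
	return subtle.ConstantTimeCompare([]byte(want), []byte(clientResp)) == 1
}

func main() {
	var salt [4]byte
	if _, err := rand.Read(salt[:]); err != nil { // fresh salt per challenge
		panic(err)
	}
	resp := md5Response("alice", "swordfish", salt)
	fmt.Println(verify("alice", "swordfish", resp, salt))
	fmt.Println(verify("alice", "wrong", resp, salt))
}
```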

Dispatch matrix (Config.AuthMode → flow):
  "" / "trust" → AuthenticationOk immediately, no lookup
  "password"   → authPassword()
  "md5"        → authMD5()
  anything else→ 28000 + connection close

Tests
-----

Unit (hbrtl/pgserver/pgserver_test.go):
  PASS  TestMD5Challenge           (vector + determinism + diff)
  PASS  TestRoleRegistry           (add/replace/remove/lookup)

Integration (tests/pgserver/run.sh):
  PASS  Simple Query: SELECT 1, 'hello'
  PASS  Multi-statement Simple Query
  PASS  Transaction control: BEGIN/COMMIT round-trip
  PASS  MD5 auth: wrong password rejected
  PASS  MD5 auth: correct password accepted

End-to-end matrix with real psql:
  wrong password   → "ERROR: md5 authentication failed for user 'alice'"
  correct password → SELECT returns row
  unknown user     → "ERROR: md5 authentication failed for user 'eve'"
  password mode    → cleartext exchange works equivalently

All six release gates green:
  go test ./...               ✓
  FiveSql2 SQL:1999 43/43     ✓
  Harbour compat 56/56        ✓
  std.ch 17/17                ✓
  FRB 7/7                     ✓
  pgserver integration 5/5    ✓ (up from 3/3 in Phase 4)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 14:01:30 +09:00
8472928102 feat(pgserver): Phase 4 — Extended Protocol (Parse/Bind/Execute)
pgx and most drivers default to PostgreSQL's Extended Protocol
(named prepared statements). Phase 2 only handled Simple Query,
so every pgx caller had to force `QueryExecModeSimpleProtocol` —
unworkable for a production deployment. This commit lands the
full Parse → Bind → Describe → Execute → Sync state machine,
enough that pgx (and any other libpq-protocol-v3 client) works
without any client-side knobs.

Implementation lives in `hbrtl/pgserver/extended.go`:

* Per-session caches `stmts map[string]*preparedStmt` and
  `portals map[string]*portal`, lazily allocated on first use.
  Stored as fields on `session` so they don't leak across
  connections.

* Parameters are inlined at Bind time via `substituteParams` —
  the resolved SQL is a normal Simple-Query-shaped string the
  engine sees through the existing `five_SQL(cSQL, …, oSession)`
  pipeline. Avoids teaching FiveSql2 a second param-shape; the
  trade-off is that binary timestamps/numerics round-trip through
  text (Phase 4.1 will plumb `?`-params through aParams for the
  binary fast path).

* `paramToLiteral` decodes the binary-format encodings pgx uses
  by default for INT4/INT8/BOOL (big-endian fixed-width). Other
  binary OIDs fall back to a hex-escaped quoted literal which
  errors loudly rather than silently misparsing.

* `countPgPlaceholders` scans the SQL outside string literals for
  the highest `$N` so the server can answer Describe-statement
  with a correctly-sized ParameterDescription even when the
  client didn't pre-declare param OIDs. Without this, pgx errored
  with "expected 0 arguments, got 2" on the very first prepared
  query.

* RowDescription emission: Describe-statement still returns NoData
  (we can't infer row shape without execution). When Execute fires
  on a portal the client never Described, we emit RowDescription
  inline from the cached result before DataRow streams. pgx and
  psql both tolerate this ordering.

* Execute → CommandComplete tag derives from the SQL verb via the
  existing `commandTagFor` helper. Row counts in the tag remain
  "VERB 0" for v1.0; threading real counters through the engine
  is Phase 5.
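
Two of the helpers described above lend themselves to a short sketch:
counting `$N` placeholders outside string literals, and decoding the
big-endian binary encodings of INT4/INT8/BOOL. `highestPlaceholder`
and `binaryToLiteral` are hypothetical stand-ins for
`countPgPlaceholders` and `paramToLiteral`. The sketch ignores
dollar-quoted strings and comments, and it returns an error for
unknown binary OIDs where the commit describes a hex-escaped literal
fallback:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"strconv"
)

// highestPlaceholder returns the largest $N outside single-quoted
// literals. A doubled quote ('') toggles the state twice, so it is
// treated as staying inside the literal.
func highestPlaceholder(sql string) int {
	highest, inStr := 0, false
	for i := 0; i < len(sql); i++ {
		c := sql[i]
		switch {
		case c == '\'':
			inStr = !inStr
		case c == '$' && !inStr:
			j := i + 1
			for j < len(sql) && sql[j] >= '0' && sql[j] <= '9' {
				j++
			}
			if n, err := strconv.Atoi(sql[i+1 : j]); err == nil && n > highest {
				highest = n
			}
			i = j - 1
		}
	}
	return highest
}

// binaryToLiteral decodes fixed-width big-endian parameters.
// PostgreSQL type OIDs: BOOL=16, INT8=20, INT4=23.
func binaryToLiteral(oid uint32, b []byte) (string, error) {
	switch {
	case oid == 23 && len(b) == 4:
		return strconv.FormatInt(int64(int32(binary.BigEndian.Uint32(b))), 10), nil
	case oid == 20 && len(b) == 8:
		return strconv.FormatInt(int64(binary.BigEndian.Uint64(b)), 10), nil
	case oid == 16 && len(b) == 1:
		if b[0] != 0 {
			return "TRUE", nil
		}
		return "FALSE", nil
	}
	return "", fmt.Errorf("unsupported binary parameter: oid=%d len=%d", oid, len(b))
}

func main() {
	fmt.Println(highestPlaceholder("SELECT $1 AS n, $2 AS s"))
	lit, err := binaryToLiteral(23, []byte{0, 0, 0, 42})
	fmt.Println(lit, err)
}
```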

Wire dispatch in `session.go:queryLoop` now handles Parse, Bind,
Describe, Execute, Close, Sync, Flush — the full v3 message set.

Verification
------------

End-to-end pgx (default mode, no SimpleProtocol flag) successfully
runs:
  SELECT $1 AS n, $2 AS s with 42 + "hi" → [42 hi]
  Same statement re-executed with different bound values → reuses
    the cached prepared statement
  SELECT $1 AS b, $2 AS s with true + "binary-bool" → [t binary-bool]

`tests/pgserver/run.sh` expanded from 1 → 3 integration assertions:

  PASS  Simple Query: SELECT 1, 'hello'
  PASS  Multi-statement Simple Query
  PASS  Transaction control: BEGIN/COMMIT round-trip

(Extended Protocol can't be driven from psql's -c CLI directly
because psql's PREPARE/EXECUTE is a separate SQL-level feature
that FiveSql2 doesn't parse; the pgx-driven path verifies it
manually, and a self-contained Go integration that drives pgx
from inside a process bootstrap is Phase 7 work.)

All six release gates green:
  go test ./...                       ✓
  FiveSql2 SQL:1999 43/43             ✓
  Harbour compat 56/56                ✓
  std.ch 17/17                        ✓
  FRB 7/7                             ✓
  pgserver integration 3/3            ✓

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 12:55:41 +09:00
a5567648e9 test(pgserver): Phase 3 — DML + transaction integration harness
Adds tests/pgserver/run.sh, the integration gate for the wire
layer. Builds a minimal bootstrap PRG that opens nothing and just
calls PG_SERVER_START on an ephemeral port, then drives psql with
a Simple Query to confirm the end-to-end pipeline (TCP accept →
startup handshake → Query → five_SQL → RowDescription + DataRow
→ ReadyForQuery) still works after every change.

Phase 3 verified scope (driven via a separate pgx harness during
development):

  * CREATE TABLE / INSERT / UPDATE / DELETE over Simple Query
  * BEGIN / COMMIT / ROLLBACK from the wire
  * Two-connection cross-visibility on a shared DBF
  * Per-session ROLLBACK leaves the *other* connection's data
    intact — the Phase 1 STATIC → TSqlSession refactor is what
    makes this hold; pre-refactor, both connections would have
    shared one s_aTxnLog and A's ROLLBACK would have collapsed
    B's COMMIT.

Known limitation captured in the script header (deferred to
Phase 7 follow-up):

  * ≥3 concurrent connections doing in-flight INSERT/SELECT in
    their own transactions occasionally race at the hbrdd
    workarea layer — surfaces as one worker's just-inserted row
    missing from its own SELECT. 2-way concurrent + N-way serial
    are both reliable. Root cause is multi-thread workarea
    arbitration during dbUseArea/dbAppend, which the pre-1.0
    audit flagged as Top-Risk #2 ("WorkArea collision under
    multi-session"). Tracking for a dedicated fix.

Gate count now reads:
  go test ./...                       ✓
  FiveSql2 SQL:1999 43/43             ✓
  Harbour compat 56/56                ✓
  std.ch 17/17                        ✓
  FRB 7/7                             ✓
  examples 65/71                      ✓ (unchanged baseline)
  pgserver integration 1/1            ✓ (new)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 07:25:13 +09:00
708329785a test(pgserver): wire-protocol roundtrip via net.Pipe
Adds an in-process startup-handshake test using net.Pipe so we
can pin the protocol envelope (StartupMessage → AuthenticationOk
→ ParameterStatus×N → BackendKeyData → ReadyForQuery) without
binding a real TCP port. Runs in <1ms; safe for CI.
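
A simplified sketch of the net.Pipe technique, using only the
standard library rather than pgproto3. `fakeBackend` is a
hypothetical stand-in for the real session code and replies with
just AuthenticationOk and ReadyForQuery; ParameterStatus and
BackendKeyData are omitted for brevity:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"io"
	"net"
)

// startupMessage builds a protocol-3.0 StartupMessage: Int32 length
// (self-inclusive), Int32 196608, "user\0<name>\0", then a final \0.
func startupMessage(user string) []byte {
	body := binary.BigEndian.AppendUint32(nil, 196608)
	body = append(body, "user\x00"+user+"\x00\x00"...)
	msg := binary.BigEndian.AppendUint32(nil, uint32(len(body)+4))
	return append(msg, body...)
}

// fakeBackend (hypothetical) consumes the StartupMessage and answers
// with AuthenticationOk followed by ReadyForQuery('I').
func fakeBackend(c net.Conn) error {
	defer c.Close()
	var hdr [4]byte
	if _, err := io.ReadFull(c, hdr[:]); err != nil {
		return err
	}
	rest := make([]byte, binary.BigEndian.Uint32(hdr[:])-4)
	if _, err := io.ReadFull(c, rest); err != nil {
		return err
	}
	_, err := c.Write([]byte{
		'R', 0, 0, 0, 8, 0, 0, 0, 0, // AuthenticationOk
		'Z', 0, 0, 0, 5, 'I', // ReadyForQuery, idle
	})
	return err
}

// handshake drives the client side over an in-memory pipe and returns
// the two message-type bytes it received.
func handshake(user string) (string, error) {
	client, server := net.Pipe()
	defer client.Close()
	go fakeBackend(server)
	if _, err := client.Write(startupMessage(user)); err != nil {
		return "", err
	}
	reply := make([]byte, 15) // 9 bytes for 'R' + 6 bytes for 'Z'
	if _, err := io.ReadFull(client, reply); err != nil {
		return "", err
	}
	return string([]byte{reply[0], reply[9]}), nil
}

func main() {
	types, err := handshake("alice")
	fmt.Println(types, err)
}
```

Because net.Pipe is synchronous and in-memory, no port is bound and
the exchange needs no real network.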

The PRG-dispatch path (runSQL → FIVE_SQL → row encoding) is
already covered manually by spinning a `five run` of
`pg_server_start(":15432")` and connecting with pgx — that flow
verified post-MVP that a real PostgreSQL client receives
`{ONE (INT4), GREET (TEXT)}` + row `[1 hello]` for
`SELECT 1 AS one, 'hello' AS greet` over the wire. An automated
shell harness will land in Phase 7 with the psql integration
tests.

Also rolls go.mod / go.sum forward with the pgx v5 toolchain pulled
in by Phase 2's pgproto3 dependency. Module bump 1.21.13 → 1.25.0
matches what `go get github.com/jackc/pgx/v5/pgproto3` selected;
cross-builds for windows/linux/darwin all still succeed (verified
locally).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 22:13:40 +09:00
d98f5e1767 feat(pgserver): PostgreSQL-wire MVP — psql can SELECT from FiveSql2
First end-to-end working version of the PostgreSQL-wire-compatible
TCP server frontend. A standard `psql` client now connects, runs
`SELECT * FROM employees`, and gets back a properly typed result
set rendered by psql with the right column alignment:

     ID |         NAME         |  SALARY
    ----+----------------------+----------
      1 | Alice                | 50000.00
      2 | Bob                  | 42000.50
      3 | Cho                  | 77500.00

This is the Phase 2 deliverable from the approved plan at
/Users/charleskwon/.claude/plans/compiled-launching-shore.md.
Builds on the session-state refactor in 93cf5c8 — each connection
gets its own TSqlSession on the PRG side via the new PG_NEW_SESSION
HB_FUNC, so concurrent psql clients won't share transaction logs
or plan caches.

Scope
-----

v1.0 MVP: Simple Query only, trust auth, no TLS yet. SELECT works
against the full FiveSql2 surface (CTEs, window functions, JOINs,
aggregates). DML + per-session transactions are Phase 3, extended
protocol is Phase 4, auth + TLS are Phases 5/6.

Architecture
------------

  psql/pgx/JDBC ──TCP:5432──▶ pgserver.Listener
                                  │ accept()
                                  ▼ go handleConn(net.Conn)
                             ┌─────────────────────────────┐
                             │ Session goroutine            │
                             │  1. SSLRequest peek          │
                             │  2. StartupMessage           │
                             │  3. AuthenticationOk (trust) │
                             │  4. ParameterStatus×7        │
                             │  5. BackendKeyData           │
                             │  6. ReadyForQuery('I')       │
                             │  7. loop: Receive() →        │
                             │     dispatchSimpleQuery →    │
                             │     hbrt.Thread.Function(    │
                             │       FIVE_SQL,sql,...,sess) │
                             │     emit RowDescription      │
                             │     emit DataRow×N           │
                             │     emit CommandComplete     │
                             │     emit ReadyForQuery       │
                             └─────────────────────────────┘

One goroutine per connection, each owning its own *hbrt.Thread and
TSqlSession instance. Uses the existing audit-fixed NewThread()
(cde8673) so statics + WA factory propagate.

New files (hbrtl/pgserver/)
---------------------------

* server.go — Config, Server, Serve loop with MaxConnections gate
  via semaphore, Close drains in-flight sessions.
* session.go — full lifecycle: SSLRequest peek + prefixedConn
  byte-injection trick for StartupMessage, ParameterStatus
  broadcast (server_version "14.0 (FiveSql2)" so pgx negotiates),
  BackendKeyData (random pid+secret per session, no CancelRequest
  yet), query loop dispatching only Simple Query in v1.0 with a
  loud "0A000 not supported" for Extended messages.
* dispatch.go — runSQL invokes FIVE_SQL via PushSymbol+Function,
  unpacks the engine's `{aFieldNames, aRows}` envelope or the
  `{{"__error__"}, {{nCode, cMsg, cSQL}}}` error shape, emits
  RowDescription with text-format OIDs and DataRow per row.
* typemap.go — pgTypeFor() picks INT4 / INT8 / NUMERIC / TEXT /
  DATE / TIMESTAMP / BOOL by sampling the first row's value type;
  encodeText() formats each cell, returning nil-slice for NULL
  (the PG length=-1 convention).
* errmap.go — sqlStateFor() maps FiveSql2 SQL_ERR_* codes to
  canonical PG SQLSTATEs (42601/42P01/42703/42804/23505/23514/
  23503/25P02/42501/02000/XX000).
* auth.go — trust mode in v1.0; password/MD5/SCRAM lands Phase 5
  but the dispatch sentinel is already in place.
* tls.go — upgradeToTLS stub for SSLRequest handling; the byte-
  ordering is already wired so Phase 6 just plugs in tls.Config.
* register.go — package init() registers pg_server_start /
  pg_server_stop HB_FUNCs. Importing the package (done from
  hbrtl/register.go via blank import) is enough to enable them.
* pgserver_test.go — unit tests for encodeText (numeric, string,
  NIL), pgTypeFor (OID dispatch), sqlStateFor (error mapping),
  commandTagFor (SELECT/INSERT/UPDATE/DELETE/BEGIN/COMMIT).

Other changes
-------------

* _FiveSql2/src/TSqlSession.prg — added PG_NEW_SESSION() factory
  used by the Go dispatcher to allocate a per-connection session
  bypassing the embedded process default.
* hbrtl/register.go — blank-import five/hbrtl/pgserver so its
  init() fires and the HB_FUNCs land in the global dynamic-func
  table for VM symbol lookup.
* go.mod / go.sum — github.com/jackc/pgx/v5 v5.9.2 (pgproto3
  subpackage). MIT license. Same library pgx itself uses, so
  protocol coverage matches the de-facto Go PG ecosystem.

Verification
------------

  $ pg_server_start(15432, "trust")     /* PRG one-liner */
  $ psql -h 127.0.0.1 -p 15432 -U fiveuser -c 'SELECT * FROM employees'
  → 3 rows rendered correctly by psql (ID as INT4, NAME as TEXT,
    SALARY as NUMERIC(10,2) with 2 decimal places)

All six release gates green:
  go test ./...               ✓ (incl. new hbrtl/pgserver tests)
  FiveSql2 SQL:1999 43/43     ✓
  Harbour compat 56/56        ✓
  std.ch 17/17                ✓
  FRB 7/7                     ✓
  examples 65/71              ✓ (unchanged baseline)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 18:40:32 +09:00
93cf5c8bfa refactor(FiveSql2): per-session state — TSqlSession isolates txn + plan cache
Foundation for the upcoming PostgreSQL-wire server. The SQL engine
previously held transaction state and the plan cache in module-level
STATICs:

  TSqlTxn.prg:16-18
    STATIC s_aTxnLog := {}
    STATIC s_lInTxn  := .F.
    STATIC s_hSavepoints := NIL

  TFiveSQL.prg:37
    STATIC s_hPlanCache := { => }

gengo emits PRG STATIC as Go *package* variables, so two clients
sharing one process serialised through a single transaction log:
client A's `BEGIN; INSERT;` followed by client B's `ROLLBACK`
would silently undo A's insert. Acceptable for embedded single-
caller use; show-stopper for a multi-connection daemon.

Moved each of those into instance fields on a new TSqlSession class.
Every executor instance now carries an oSession pointer that's
inherited by nested subquery executors. A process-default session
is lazy-initialised by SqlDefaultSession() so embedded
`five_SQL(cSQL)` callers (today's only consumer) keep working
unchanged.
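
The failure mode and the fix can be illustrated with a deliberately
tiny Go model. The `session` and `store` types are hypothetical
stand-ins, not the generated code: aliasing one session for both
clients mimics the old package-level STATICs, while giving each
client its own session mimics TSqlSession:

```go
package main

import "fmt"

// session is a simplified stand-in for TSqlSession: it owns the
// transaction log that used to live in a module-level STATIC.
type session struct {
	inTxn  bool
	txnLog []string
}

// store stands in for the committed table contents.
type store struct{ rows []string }

func (s *session) begin()            { s.inTxn, s.txnLog = true, nil }
func (s *session) insert(row string) { s.txnLog = append(s.txnLog, row) }
func (s *session) rollback()         { s.inTxn, s.txnLog = false, nil }
func (s *session) commit(db *store) {
	db.rows = append(db.rows, s.txnLog...)
	s.inTxn, s.txnLog = false, nil
}

// run replays "A: BEGIN; INSERT" / "B: ROLLBACK" / "A: COMMIT".
// With shared=true both clients alias one log (the pre-refactor
// behaviour); with shared=false each client has its own session.
func run(shared bool) []string {
	db := &store{}
	a := &session{}
	b := a
	if !shared {
		b = &session{}
	}
	a.begin()
	a.insert("a-row")
	b.rollback() // B rolls back what it believes is its own txn
	a.commit(db)
	return db.rows
}

func main() {
	fmt.Println("shared log:  ", run(true))
	fmt.Println("per-session: ", run(false))
}
```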

Changes
-------

* `_FiveSql2/src/TSqlSession.prg` (new) — class holding the four
  ex-STATICs plus seats for auth/ACL state and a list of workareas
  the session opened (used later for disconnect cleanup). Module-
  level `SqlDefaultSession()` lazily creates one process-wide
  default for embedded callers.

* `_FiveSql2/src/TSqlTxn.prg` — added `oSession` DATA; New() takes
  an optional oSession and falls back to the default. All STATIC
  reads/writes rewritten as `::oSession:aTxnLog`,
  `::oSession:lInTxn`, etc.

* `_FiveSql2/src/TFiveSQL.prg` — added `oSession` DATA; New() takes
  an optional second arg. Plan-cache reads/writes route through
  `::oSession:hPlanCache`. SQL_PLAN_CACHE_MAX now caps each session
  independently (a chatty client only flushes its own cache, not
  the shared one).

* `_FiveSql2/src/TSqlExecutor.prg` — added `oSession` DATA; New()
  takes an optional third arg; `::oTxn := TSqlTxn():New(::oSession)`
  propagates the binding. Every in-class `TSqlExecutor():New(...)`
  call site for subqueries / UNION / IN-list materialisation /
  EXISTS / lifted subqueries now passes `::oSession` through, so a
  child executor inherits the parent's session. Standalone helper
  functions (SqlEvalExprNode / SqlFetchRowArr / SqlJoinRecurse /
  SqlMaterializeSubquery) intentionally fall back to the default
  session — they don't BEGIN/COMMIT and the plan cache is keyed by
  schema-version anyway.

* `_FiveSql2/src/FiveSqlCls.prg` — `five_SQL()` gains an optional
  fourth arg `oSession`. Existing 1-/2-/3-arg callers keep working;
  pgserver will create one TSqlSession per connection and pass it.

Verification
------------

Per-session isolation pinned by a fresh PRG-level regression
(reproducer not committed yet — will land with pgserver test
suite). The scenario:

  oSessA := TSqlSession():New()
  oSessB := TSqlSession():New()
  oSqlA  := TFiveSQL():New(NIL, oSessA)
  oSqlB  := TFiveSQL():New(NIL, oSessB)
  oSqlA:Execute("BEGIN")              -- A in txn
  oSqlB:Execute("BEGIN")              -- B in txn, A unaffected
  oSqlB:Execute("INSERT ... VALUES(2,'b-row')")
  oSqlB:Execute("COMMIT")             -- B committed, A still in txn
  oSqlA:Execute("ROLLBACK")           -- A's empty rollback, B's row survives

All four assertions pass post-refactor; pre-refactor they would fail
because both sessions wrote the same `s_aTxnLog`.

All six release gates green:
  go test ./...               ✓
  FiveSql2 SQL:1999 43/43     ✓
  Harbour compat 56/56        ✓
  std.ch 17/17                ✓
  FRB 7/7                     ✓
  examples 65/71              ✓ (unchanged baseline)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 17:47:00 +09:00
cde86730b8 fix(compiler,hbrt,hbrdd,cli): pre-1.0 audit — 13 critical fixes
Senior-engineer / QA audit landed 13 silent-miscompile and data-
integrity fixes spanning the whole compiler+runtime+storage stack.
Each fix is paired with either an integration test in the suite or
a focused regression check; all 6 release gates stay green:
go test ./..., FiveSql2 43/43, Harbour compat 56/56, std.ch 17/17,
FRB 7/7, examples 65/71.

Compiler
--------

* genpc IF/ELSEIF jumpEnd2 patching (compiler/genpc/genpc.go).
  Per-ELSEIF branch terminators were stashed into `_ = jumpEnd2`
  and never patched — the relative offset stayed 0 and the runtime
  walked the next ELSEIF's PcOpJumpFalse opcode as if it were
  jump-offset data. Bytecode-level corruption in pcode mode. Now
  collected into a slice and patched at end-of-IF. Verified via
  Grade(95..50) cases 11a-e added to tests/frb/test_frb_pcode_sweep.

* countLocalsInStmts / scanBodyLocals missing bodies
  (compiler/gengo/gen_util.go, compiler/gengo/gengo.go). Frame-size
  counter skipped WATCH/TIMEOUT/PARALLEL FOR bodies, so a LOCAL
  declared inside one of those constructs got a slot index past
  the runtime's allocated count — silent NIL reads or out-of-range
  stomps.

* emitMethodDeclStandalone nested LOCAL (compiler/gengo/gen_class.go).
  Same bug class but on the *method* side. Pre-fix repro:

      METHOD Stomp(n) CLASS T
         LOCAL a := 1, b := 2
         IF n > 0
            LOCAL c := 30, d := 40, e := 50, f := 60
            Inner( n )
            IF c != 30 .OR. d != 40 .OR. e != 50 .OR. f != 60 ...

  printed `c, d, e, f = 5, NIL, NIL, NIL` because Inner's frame
  collided with Stomp's underallocated slot range. Now counts
  body-nested LOCALs into the frame and pre-allocates indices via
  scanBodyLocals.

* genpc unsupported-AST diagnostic surface (compiler/genpc/genpc.go,
  hbrt/pcode.go, cmd/five/main.go, hbrtl/frb.go). The `default`
  cases in emitStmt / emitExpr silently emitted PushNil / no-op
  for nodes the pcode generator doesn't implement (ClassDecl,
  MethodDecl, xBase commands, concurrency primitives, …). Added
  `PcodeModule.Warnings []string` populated by noteUnsupported,
  surfaced on stderr from the build pipeline. Users now see
  "pcode: AST node not supported in --pcode/FRB-pcode mode: stmt
  *ast.GoBlockStmt" instead of getting a silently broken module.

Runtime
-------

* class.go Send/tryBinaryOp t.self defer-restore (hbrt/class.go).
  Restoration was a plain `t.self = oldSelf` after `fn(t)`. Any
  panic in the method body skipped the line, so the next BEGIN
  SEQUENCE / RECOVER handler ran with the THROWING object's Self
  — `::field` resolved against the wrong receiver. Wrapped both
  restore sites in `defer func() { t.self = oldSelf }()`.
  Verified: pre-fix RECOVER saw "THROWER", post-fix "OUTER".

* hbfunc.go HB_FUNC parameter Frame() (hbrt/hbfunc.go). The
  RegisterDynamicFunc wrapper called `fn(ctx)` without ever
  calling Frame, so `ctx.ParC(1)` / `ctx.Local(n)` read through
  `t.curFrame.localBase + n - 1` against the *caller's* frame.
  Every #pragma BEGINDUMP HB_FUNC taking parameters silently
  returned "" / 0 / "" for them — masked by ParNIDef-style
  defaults. Wrapper now does `t.Frame(t.pendingParams, 0); defer
  t.EndProc()` before dispatch.

* pcode codeblock closure capture (hbrt/pcinterp.go, hbrt/pcode.go,
  hbrt/thread.go, compiler/genpc/genpc.go). PcOpPushBlock recorded
  `nDetached` but never copied enclosing locals; free vars in the
  block body fell through to memvar lookup → NIL. Wired full
  capture pipeline:
  - New opcodes PcOpPushDetached (0x59) / PcOpPopDetached (0x5A).
  - PushBlock now reads per-slot source-local indices and
    snapshots into bb.Detached at construction time.
  - New detachedMap in genpc auto-promotes any free var that
    resolves to an enclosing-frame local into a capture slot.
  - emitAssignAsExpr leaves the assigned value on the eval stack
    so SeqExpr items like `{|v| acc += v, acc }` work.
  - Thread tracks curBlock with paired Set/restore in the block's
    Fn wrapper for nested-block evaluation.
  Mutating capture (acc += v across successive Evals) now works.

* vm.NewThread statics + waFactory propagation (hbrt/vm.go).
  GoLaunch / GoLaunchBlock call NewThread directly. Previously
  the statics map and WA factory were applied only in Run(), so
  goroutine-spawned PRG code panicked on STATIC access ("static
  index out of range") and crashed dereferencing nil WA on any
  DB call. Both now happen inside NewThread under the same lock
  as TID assignment.
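
The `t.self` defer-restore fix in the first Runtime item above comes
down to a general Go property: a plain assignment after a call is
skipped when the callee panics, while a deferred function still
runs. A minimal sketch with a hypothetical `thread` type (not the
hbrt one):

```go
package main

import "fmt"

// thread is a hypothetical stand-in holding only the current receiver.
type thread struct{ self string }

// sendUnsafe restores self with a plain assignment, which a panic skips.
func (t *thread) sendUnsafe(recv string, fn func()) {
	old := t.self
	t.self = recv
	fn()
	t.self = old
}

// sendSafe restores self in a deferred function, which runs on panic too.
func (t *thread) sendSafe(recv string, fn func()) {
	old := t.self
	t.self = recv
	defer func() { t.self = old }()
	fn()
}

// selfAfterPanic invokes send with a method body that panics, recovers
// in the caller, and reports the receiver the recovery handler sees.
func selfAfterPanic(send func(*thread, string, func())) (seen string) {
	t := &thread{self: "OUTER"}
	defer func() {
		recover()
		seen = t.self
	}()
	send(t, "THROWER", func() { panic("boom") })
	return
}

func main() {
	fmt.Println("plain restore:", selfAfterPanic((*thread).sendUnsafe))
	fmt.Println("defer restore:", selfAfterPanic((*thread).sendSafe))
}
```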

Data layer
----------

* dbf concurrent Append lock (hbrdd/dbf/dbf.go,
  hbrdd/dbf/locks_posix.go, hbrdd/dbf/locks_windows.go). Append
  bumped a local recCount with no file-system serialization. Two
  shared-mode processes both wrote at the same RecordOffset; one
  record silently overwrote the other. Added an append-intent
  byte-range lock at offset 0x7FFFFFFE + bounded retry, on-disk
  header refresh inside the locked region, and immediate header
  write so peers refresh past our slot.

* indexer negative numeric key encoding (hbrdd/dbf/indexer.go +
  new hbrdd/dbf/encode_numeric_test.go). `%20.10f` formats `-100`
  as `"     -100.0000000000"` and `99` as `"       99.0000000000"`.
  ASCII ' ' (0x20) < '-' (0x2D), so `99` lex-compared LESS than
  `-100` — every NTX/CDX index over a column that ever held a
  negative number returned wrong rows for SEEK / range scans.
  Replaced with a 1-byte sign prefix + 21-byte zero-padded
  magnitude (negatives use digit-complement) so byte order
  matches numeric order across signs and magnitudes. Format
  change: existing indexes built with the old encoding must be
  REINDEXed. Three unit tests pin the order.

* dbf Append index maintenance hooks (hbrdd/dbf/dbf.go,
  hbrdd/dbf/indexer.go). Append never inserted into open NTX/CDX
  indexes — the audit's canonical scenario `SET INDEX TO …;
  APPEND BLANK; REPLACE …; dbSeek …` silently missed the new
  record. Added optional IndexWriter interface, queue the new
  recNo in pendingIdxInserts, drain after flushRecord by calling
  InsertKey on every open writer-supporting engine. NTX
  participates (its existing rebuild-on-insert is correct);
  CDX online maintenance is deferred to a follow-up — those
  indexes still need REINDEX. Verified: post-fix SEEK("Charlie")
  after APPEND BLANK + REPLACE finds the new record.

* dbf PACK crash-safety (hbrdd/dbf/dbf.go). The old in-place
  rewrite read record N, overwrote slot M<N, then truncated.
  Power loss after partial loop left a file with overwritten
  prefix and no original copies of the records already advanced
  past — silent data loss. Rewrote to:
    1) drop mmap, build `<file>.pack.tmp` with all surviving
       records,
    2) Sync(),
    3) close original handle + os.Rename(tmp, orig) (atomic on
       same FS),
    4) reopen + re-mmap.
  TestComp_Pack passes; readers always see either the pre-PACK
  or post-PACK contents, never a half-state.

* mem RDD torn reads (hbrdd/mem/memrdd.go). The comment claimed
  in-place PutValue was safe because hbrt.Value "fits in a
  single machine word + pointer". hbrt.Value is 24 bytes (3
  words) — a concurrent reader could observe new type tag with
  stale scalar/ptr and type-confuse on the next AsXxx() call.
  Switched mu to sync.RWMutex; GetValue takes RLock,
  Append/PutValue/Delete/Recall take Lock. `go test -race
  ./hbrdd/mem/` clean.
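
The indexer ordering bug in the Data layer list above is easy to
reproduce, and the shape of the fix can be sketched alongside it.
`oldKey` uses the `%20.10f` format quoted in the commit. `newKey` is
a hypothetical encoding that follows the description (1-byte sign
prefix, 21-byte zero-padded magnitude, digit-complement for
negatives); the actual byte layout in indexer.go may differ:

```go
package main

import (
	"fmt"
	"strings"
)

// oldKey reproduces the space-padded encoding with the ordering bug.
func oldKey(v float64) string { return fmt.Sprintf("%20.10f", v) }

// newKey (hypothetical) prefixes a sign byte and zero-pads the
// magnitude to 21 bytes. Negatives complement every digit so a larger
// magnitude sorts earlier. Negative zero is not special-cased here.
func newKey(v float64) string {
	if v >= 0 {
		return "1" + fmt.Sprintf("%021.10f", v)
	}
	mag := fmt.Sprintf("%021.10f", -v)
	comp := strings.Map(func(r rune) rune {
		if r >= '0' && r <= '9' {
			return '9' - (r - '0')
		}
		return r // keep the decimal point
	}, mag)
	return "0" + comp
}

func main() {
	fmt.Printf("old: %q < %q is %v\n", oldKey(99), oldKey(-100), oldKey(99) < oldKey(-100))
	fmt.Printf("new: %q < %q is %v\n", newKey(-100), newKey(99), newKey(-100) < newKey(99))
}
```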

Files touched
-------------

  compiler/gengo/gen_class.go, gen_util.go, gengo.go
  compiler/genpc/genpc.go
  hbrt/class.go, hbfunc.go, pcinterp.go, pcode.go, thread.go, vm.go
  hbrdd/dbf/dbf.go, indexer.go, locks_posix.go, locks_windows.go
  hbrdd/dbf/encode_numeric_test.go  (new)
  hbrdd/mem/memrdd.go
  cmd/five/main.go
  hbrtl/frb.go
  tests/frb/test_frb_pcode_sweep.prg

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 05:29:56 +09:00
c5dd74c044 fix(pp): codeblock-in-macro + multi-line ;-continuation for #command
Three silent-miscompile fixes in the preprocessor that were
masking real bugs in Harbour-style PRG.

1. Brace tokenizer (compiler/pp/command.go)

`{` and `}` now tokenize as standalone separator tokens. The
matcher previously only split on `,()[]"'` etc., so a codeblock
literal `{|| ... }` in a macro argument became the tokens `{||`,
`""`, `}`. The capture-depth tracker only matched exact `{`/`}`,
so `{||` was invisible as an opener while the standalone `}`
wrongly decremented depth — `TEST_LINE( o:VarPut({|| "" }) )`
truncated mid-argument and the parser later choked at the inner
`}` with `expected ), got } "}"`.

Fix: add `{` and `}` to tokenizeLine's separator set. Now
`{|| ... }` lexes as `{`, `||`, `""`, `}` and balances cleanly.

2. ;-continuation join for non-`#` lines (compiler/pp/pp.go)

The existing line-joiner only collapsed trailing `;` continuations
on `#`-prefixed directives. Plain source code using the same
convention — e.g. Harbour's TEST macro:

   TEST t004 STATIC s_once := NIL, S_C ;
             INIT hb_threadOnce( @s_once, {|| ... } ) ;
             CODE x := S_C

was processed one physical line at a time, so the TEST pattern
never matched the full logical statement. The first row passed
through unrewritten, fell through to the parser as an expression,
and gengo silently absorbed it as part of the *previous*
function's body. Six TEST macros' STATIC declarations all ended
up tagged with t003's function name, producing duplicate
`static_T003_S_ONCE` decls and a Go compile failure.

Fix: add the same trailing-`;` join logic to user code, with
blank-line fillers inserted post-join so source line numbers in
parser errors still align with the original file.
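
A rough sketch of the join-with-fillers idea described above, not
the code in compiler/pp/pp.go: `joinContinuations` is a hypothetical
name, and the sketch ignores block comments and the `;;` convention.

```go
package main

import (
	"fmt"
	"strings"
)

// joinContinuations folds every line ending in ';' into the next one
// and appends one blank line per folded row, so the output has the
// same number of lines as the input.
func joinContinuations(lines []string) []string {
	out := make([]string, 0, len(lines))
	for i := 0; i < len(lines); i++ {
		cur := lines[i]
		folded := 0
		for i+1 < len(lines) {
			trimmed := strings.TrimRight(cur, " \t")
			if !strings.HasSuffix(trimmed, ";") {
				break
			}
			// Drop the trailing ';' and splice in the next physical line.
			head := strings.TrimRight(trimmed[:len(trimmed)-1], " \t")
			cur = head + " " + strings.TrimLeft(lines[i+1], " \t")
			i++
			folded++
		}
		out = append(out, cur)
		for k := 0; k < folded; k++ {
			out = append(out, "") // filler keeps later line numbers aligned
		}
	}
	return out
}

func main() {
	src := []string{"TEST t004 STATIC s_once ;", "   INIT f() ;", "   CODE x := 1", "next()"}
	for n, l := range joinContinuations(src) {
		fmt.Printf("%d: %q\n", n+1, l)
	}
}
```

With this shape `next()` stays on line 4 of the output, which is what
keeps parser diagnostics pointing at the original file positions.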

3. Block-comment-aware continuation join

Inline `/* ... */` at the end of a continuation row hid the
trailing `;` from the joiner's HasSuffix check. The fix calls
stripBlockComments on the next-line peek before testing for `;`,
so chains like

   AAdd( aResult, { cChildBase, ;
                    aRefs[ "fk" ][ j ][ 1 ], ;     /* child col */
                    aRefs[ "fk" ][ j ][ 3 ], ;     /* parent col */
                    ...

keep folding instead of stopping after one row and leaving a
dangling `,` at end of line.

Results
-------
Harbour-core compat sweep: 25/30 → 28/30 (remaining lnlenli1 +
keywords are //NOTEST stress files, intentionally unbalanced).
All 6 release gates green: go test ./..., FiveSql2 43/43,
Harbour compat 56/56, std.ch 17/17, FRB 7/7, examples 65/71.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 05:28:54 +09:00
ce7b067785 fix(cli): multi-PRG build adds every input dir to the include path
Each PRG file's preprocessor instance was set up with only its OWN
directory on the include search path (`filepath.Dir(prgFile)`).
That worked for self-contained files but broke any multi-file
build where one PRG `#include`s a header that lives next to a
SIBLING PRG — the other file's directory wasn't on the path, so
the include silently failed and PP just skipped it ("// #include
\"FiveSqlDef.ch\" — not found (skipped)").

This was the root cause behind test_sql_standards's mass-failure
pattern. The test does

   #include "FiveSqlDef.ch"
   ...
   Assert( ..., h["columns"][1][1][1] == ND_FN .AND. ... )

`FiveSqlDef.ch` lives in `_FiveSql2/src/` (next to TSqlExecutor.prg
and friends), but the test source sits in `_FiveSql2/test/`.
Building with `./five build _FiveSql2/test/test_sql_standards.prg
_FiveSql2/src/*.prg` should resolve the header from a sibling
input file's directory — but only the test's own dir was searched,
so ND_FN / ND_LIT / ND_BIN / ND_UNI all stayed undefined and the
identifiers fell through to runtime memvar lookup, returning NIL.
Every assertion that compared against the constants therefore
silently failed (24 / 64 passing because non-constant assertions
still worked).

buildMultiPRGWithIncludes now seeds the user-include list with the
directory of every input PRG before handing off to buildMultiPRG.
A test under one directory can now resolve a `#include` that lives
next to a sibling source file in the same multi-file build.
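
The seeding step can be pictured roughly as below. This is a sketch
under assumptions: `includeDirs` is a hypothetical helper, and
`Other.prg` is a made-up file name; the real logic lives in
buildMultiPRGWithIncludes.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// includeDirs returns the directory of every input PRG, de-duplicated
// and in first-seen order, for use as the #include search path.
func includeDirs(prgFiles []string) []string {
	seen := make(map[string]bool)
	var dirs []string
	for _, f := range prgFiles {
		d := filepath.Dir(f)
		if !seen[d] {
			seen[d] = true
			dirs = append(dirs, d)
		}
	}
	return dirs
}

func main() {
	inputs := []string{
		"_FiveSql2/test/test_sql_standards.prg",
		"_FiveSql2/src/TSqlExecutor.prg",
		"_FiveSql2/src/Other.prg", // hypothetical sibling
	}
	fmt.Println(includeDirs(inputs))
}
```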

Result: test_sql_standards goes from 24 / 64 to **64 / 64**. The
parser was already correct end-to-end — every SQL:2003-2023
construct it had been advertising actually worked; the test just
couldn't read the constants it was asserting against.

Wired test_sql_standards into the std.ch runner with a per-test
override so it picks up the FiveSql2 src files. Suite stands at
17/17.

Other gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  FiveSql2 standards : 64/64  (was 24/64)
  Harbour compat     : 56/56
  std.ch suite       : 17/17
  FRB suite          : 7/7

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 19:21:45 +09:00
af0d54d352 fix(lexer): {array}[index] no longer mis-tokenises [ as bracket-string
The lexer's isStringBracket disambiguator decides whether `[` opens
an indexing operator or a Harbour bracket-string literal. The
heuristic checks the previous token's kind and treats the bracket
as indexing only when preceded by an IDENT, RPAREN, RBRACKET, or a
literal. RBRACE was missing — so

   FieldPut(3, {"Kim","Lee","Park","Choi","Yoon"}[Int(Mod(i-1,5))+1])

tokenised the `[` after `}` as a bracket-string opener, swallowed
through the first `]` it found, and produced bogus parse errors
("expected ), got STRING …"). RBRACE is now in the indexing-context
set, so an inline array-literal followed by `[index]` works.
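
As an illustration of the heuristic, a simplified sketch; the token
kind names and `bracketIsIndex` are assumptions for this example and
do not mirror the real lexer's types:

```go
package main

import "fmt"

// Kind is a stand-in for the lexer's token kind.
type Kind int

const (
	IDENT Kind = iota
	RPAREN
	RBRACKET
	RBRACE
	NUMBER
	STRING
	COMMA
	LPAREN
)

// bracketIsIndex reports whether a '[' that follows a token of kind
// prev should open an indexing operator instead of a bracket-string.
func bracketIsIndex(prev Kind) bool {
	switch prev {
	case IDENT, RPAREN, RBRACKET, RBRACE, NUMBER, STRING:
		return true // value-like context: x[1], f()[1], a[1][2], {..}[1]
	}
	return false
}

func main() {
	fmt.Println(bracketIsIndex(RBRACE), bracketIsIndex(COMMA))
}
```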

Surfaced by the examples/ build sweep — fixed test_all_rdd,
test_index_adv, test_multi_rdd, test_rdd_full all in one go.

The sweep itself is committed as tests/examples_build.sh — builds
every PRG under examples/ and reports any compiler / preprocessor
errors. Run it after compiler changes to catch regressions in
broad-coverage user-style code that the focused suites don't
exercise.

Current sweep state: 65 / 71 examples build cleanly. The remaining
6 failures are all #pragma BEGINDUMP blocks that import external
Go packages (http, websocket, sqlite, time) — not Five-side bugs.

Other gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56
  std.ch suite       : 16/16
  FRB suite          : 7/7
  examples build     : 65/71 (rest = external Go deps)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 21:06:13 +09:00
2008266da7 feat(pp,rtl): Tier 2 audit followups — JOIN hash + PP validation + C heuristic
Three medium-priority audit items in one commit, each independently
revertible.

  * **#18 JOIN hash-join fast path.** New std.ch shape:
        JOIN WITH <alias> TO <file> [FIELDS ...] ON <mfield> = <dfield>
    expands to a 6-arg __dbJoin call with the master/detail key
    field names. Runtime detects the extra args, builds an O(M)
    hash over the detail's key column, then probes per master row
    for O(N+M) total — vs the FOR form's O(N*M). For 1k×1k that's
    2k vs 1M operations; the gap widens with N. The original FOR
    form is unchanged and stays the fallback for arbitrary
    predicates. New helper dbHashKey type-tags the key string so
    `1` (numeric), `"1"` (string), and `.T.` (logical) don't
    collide in the bucket map.

  * **#38 PP rule result-marker validation.** ParseRule now walks
    the result template after parseMarkers and warns about every
    `<name>` (or `<(name)>` / `<.name.>` / `<{name}>` / `#<name>`
    / `<"name">`) that doesn't match a pattern marker. Warnings
    flow into pp.errors via handleDirective with the directive's
    filename:line, so a typo'd `<NaMe>` in an `#xcommand`
    case-sensitive rule fails the build with a clear diagnostic
    instead of silently producing broken expansions.

  * **#44 looksLikeInlineC heuristic strengthened.** Catches more
    of the common Harbour-PRG-with-C-inline-block shapes that
    used to fall through and produce cryptic Go-side errors:
    function-like #define, `extern "C"` linkage blocks, C return-
    type declarations (`int foo(`, `static char* bar(`), and the
    hb_ret*() helper family used by Harbour's C FFI return
    setters. Two small predicate helpers (allLetters,
    allIdentChars) keep the C-vs-Go disambiguation tight enough
    that legit Go code (`func name() int { ... }`) doesn't trip.

  * **#28 LIST/DISPLAY pagination** — explicitly deferred. Proper
    pagination requires interactive terminal handling (Inkey(0)
    for the keypress) which would hang in CI / batch mode. Will
    revisit when an interactive terminal layer needs it for
    other reasons.
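
To make the #18 hash-join shape concrete, a small sketch: build a
bucket map over the detail rows once, then probe it per master row.
`Row`, `hashKey`, and `hashJoin` are hypothetical stand-ins (the
runtime helper is dbHashKey), and the tag letters are assumptions.

```go
package main

import (
	"fmt"
	"strconv"
)

// Row is a stand-in for a workarea record.
type Row map[string]any

// hashKey type-tags the key so 1, "1" and true land in different buckets.
func hashKey(v any) string {
	switch x := v.(type) {
	case string:
		return "C:" + x
	case bool:
		if x {
			return "L:T"
		}
		return "L:F"
	case int:
		return "N:" + strconv.Itoa(x)
	case float64:
		return "N:" + strconv.FormatFloat(x, 'g', -1, 64)
	}
	return fmt.Sprintf("?:%v", v)
}

// hashJoin pairs master and detail rows whose key fields are equal,
// in O(N+M) bucket operations plus the size of the output.
func hashJoin(master, detail []Row, mField, dField string) [][2]Row {
	buckets := make(map[string][]Row, len(detail))
	for _, d := range detail {
		k := hashKey(d[dField])
		buckets[k] = append(buckets[k], d)
	}
	var out [][2]Row
	for _, m := range master {
		for _, d := range buckets[hashKey(m[mField])] {
			out = append(out, [2]Row{m, d})
		}
	}
	return out
}

func main() {
	master := []Row{{"id": 1}, {"id": "1"}, {"id": true}}
	detail := []Row{{"cust": 1, "item": "pen"}}
	fmt.Println(len(hashJoin(master, detail, "id", "cust")))
}
```

Only the numeric master row matches here; without the type tag the
string `"1"` would collide with the numeric `1`.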

Test fixtures: tests/std_ch/test_join_hash.prg verifies the new
ON-form path produces the same output as the FOR form would.
std.ch runner now stands at 16/16.

Other gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56
  std.ch suite       : 16/16
  FRB suite          : 7/7

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 19:21:19 +09:00
29ca02e1bc fix(genpc,parser,pcinterp): pcode wider regression sweep (Tier 1 #3)
Seven more fixes in the pcode path (six silent miscompiles plus a
missing SeqExpr emit), all uncovered by a new pcode regression
sweep that exercises the full PRG surface a dynamic FrbCompile
body could legitimately use.

  * **xBase-keyword shadowing of variable names.** parseIdentStmt
    and parseExprStmt's fallback switches consumed an entire line
    when the leading IDENT matched LABEL / REPORT / ACCEPT / INPUT
    / NOTE / etc. Those words are also extremely common LOCAL /
    PRIVATE names — `LOCAL label ; label := "x"` had the
    assignment swallowed because the switch didn't peek at the
    next token. Both switches now look at peek(1): an assignment
    operator, [], (, -, ++, --, or `.` means it's a variable /
    call / member access, not the xBase command, and we fall
    through to expression parsing. This was a real silent bug: it
    broke test_frb_pcode_sweep's `LOCAL label` declaration.

  * **`arr[i]` indexing not implemented in genpc.** ast.IndexExpr
    fell through to the default PushNil path, so any indexed read
    in a pcode-mode body returned NIL. New case emits the array,
    the index, and PcOpArrayPush (the get-op; PcOpArrayPop is the
    set-op — naming follows Harbour convention). Hashes go
    through the same opcode, which already special-cases
    IsHash() in ops_collection.go.

  * **Hash literals not implemented in genpc + dispatch missing
    in pcinterp.** `{ "k" => v, ... }` fell to PushNil. Added
    HashLitExpr emit (Push key, Push value pairs, then PcOpHashGen
    with count). Also wired up the PcOpHashGen dispatch in
    execPcodeBody — it had been declared in pcode.go since the
    initial design but the case statement was never added, so
    even hand-written modules couldn't use hashes.

  * **`x++` / `x--` postfix were silent no-ops.** PostfixExpr fell
    to PushNil and the surrounding ExprStmt then popped the NIL.
    DO WHILE loops with `n--` couldn't terminate; FOR loops with
    `i++` in the body were broken too. New case: PushLocal +
    LocalAddInt(±1).

  * **BlockExpr (`{|p| body }`) wasn't compiled.** Eval(b, n)
    inside a pcode body returned NIL. Added: build the body in a
    sub-codebuffer with the block's params occupying its locals,
    emit PcOpRetValue at the end, then PushBlock with the
    serialized bytes. Format extended with a uint16 nParams field
    so the runtime's PcOpPushBlock dispatch can set
    PcodeFunc.Params correctly — without it, ExecPcode's
    Frame(0, 0) pulled none of Eval's args and the block saw
    every parameter as NIL.

  * **All g.locals accesses were case-sensitive.** PRG is case-
    insensitive, but the pcode generator stored block params via
    strings.ToUpper while every other lookup site (function decl,
    mid-decl, ForStmt, IdentExpr read, AssignExpr write,
    PostfixExpr) used the raw .Name. So `{|x| x*x }` stored "X"
    but read "x" and missed. Normalized: all insertions and all
    lookups now go through strings.ToUpper.

  * **SeqExpr in pcode** — added the matching emit for comma-
    separated expression lists in code blocks (`{|| a, b, c }`).
    Same shape as the gengo SeqExpr case from Wave 1.
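
The case-normalization fix in the list above comes down to routing
every insert and lookup through one key function. A minimal sketch,
assuming a map-backed table with 1-based slots; `localTable` and its
methods are hypothetical, not the genpc types:

```go
package main

import (
	"fmt"
	"strings"
)

// localTable maps upper-cased variable names to frame slots.
type localTable struct {
	slots map[string]int
}

func newLocalTable() *localTable {
	return &localTable{slots: make(map[string]int)}
}

// declare returns the slot for name, allocating one if needed.
func (t *localTable) declare(name string) int {
	key := strings.ToUpper(name)
	if idx, ok := t.slots[key]; ok {
		return idx
	}
	idx := len(t.slots) + 1
	t.slots[key] = idx
	return idx
}

// lookup finds a slot regardless of how the name is cased.
func (t *localTable) lookup(name string) (int, bool) {
	idx, ok := t.slots[strings.ToUpper(name)]
	return idx, ok
}

func main() {
	t := newLocalTable()
	t.declare("X") // block param stored upper-cased
	fmt.Println(t.lookup("x"))
}
```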

Test fixture: tests/frb/test_frb_pcode_sweep.prg covers 14 shapes
(string ops, arithmetic, comparison chains, array indexing, DO
WHILE with postfix, nested IF, IIf, hash literal + indexing,
block + Eval, character iteration). All 14 pass. Wired into the
FRB runner — suite now stands at 7/7.

Other gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56
  std.ch suite       : 15/15
  FRB suite          : 7/7

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 11:32:38 +09:00
dca7bb22e5 fix(gengo): count nested LOCALs into the function frame
Function-entry Frame() allocation counted only top-level LOCAL
declarations from fn.Body. Mid-function LOCALs hidden inside an
IF / FOR / WHILE / DO CASE / SWITCH / SEQUENCE block weren't
included, so the runtime allocated a frame too small to hold them.
Subsequent reads/writes via PopLocalFast / PushLocalFast / LocalAdd
to those slot indices then either silently scribbled past the frame
(read-back saw NIL) or panicked with "local variable index out of
range" once the index exceeded the underlying slice.

This is the underlying bug behind frb_demo Section 4 — the
`LOCAL ch := Channel(1)` declared inside `IF pAsync != NIL` got
slot N+1 from the codegen but the runtime only allocated N. The
Channel value was scribbled past the frame, ChReceive then read
NIL from a non-existent slot, and the goroutine's ChSend(49) had
nowhere to land.

New helper gen_util.go::countLocalsInStmts walks every nested body
(IF + ElseIfs + ElseBody, ForStmt, ForEachStmt, DoWhileStmt,
SeqStmt's Body + RecoverBody, SwitchStmt's Cases + Otherwise) and
totals every ScopeLocal VarDecl. The function-emit caller adds this
to the top-level count before sizing the Frame.
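
For illustration, the recursive walk can be sketched over a toy AST.
The node types below are simplified assumptions (the real ast
package has more statement kinds), and unlike the helper described
above this version counts top-level declarations too.

```go
package main

import "fmt"

// Stmt is a stand-in for the compiler's statement interface.
type Stmt any

type VarDecl struct {
	Scope string // "LOCAL", "PRIVATE", ...
	Names []string
}

type IfStmt struct {
	Body, ElseBody []Stmt
}

type ForStmt struct {
	Body []Stmt
}

// countLocals totals LOCAL declarations at every nesting depth.
func countLocals(stmts []Stmt) int {
	n := 0
	for _, s := range stmts {
		switch v := s.(type) {
		case VarDecl:
			if v.Scope == "LOCAL" {
				n += len(v.Names)
			}
		case IfStmt:
			n += countLocals(v.Body) + countLocals(v.ElseBody)
		case ForStmt:
			n += countLocals(v.Body)
		}
	}
	return n
}

func main() {
	body := []Stmt{
		VarDecl{Scope: "LOCAL", Names: []string{"pAsync"}},
		IfStmt{Body: []Stmt{VarDecl{Scope: "LOCAL", Names: []string{"ch"}}}},
	}
	// The frame must hold both pAsync and the nested ch.
	fmt.Println(countLocals(body))
}
```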

Test fixture (tests/frb/test_frb_goroutine.prg) reproduces the
demo Section 4 shape — `LOCAL ch := Channel(1)` inside IF, then
`Go("WORKER", ch, 7)`, then ChReceive(ch). Wired into the FRB
runner so it stands at 6/6.

Other gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56
  std.ch suite       : 15/15
  FRB suite          : 6/6

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 07:05:22 +09:00
6a30c4e50e fix(gengo): compound assign for non-LOCAL LHS
Audit follow-up after Wave 1's pcode `+=` fix surfaced a parallel
class of silent miscompiles in the *gengo* (native-Go) emit path.
Four real bugs hiding behind happy-path test coverage:

  * `arr[i] += x` was ASSIGN-only — the IndexExpr branch returned
    after emitting `arr[i] := x`, dropping the original element.
    Now: PushArray + Push index, ArrayPush to read, fold with RHS,
    re-do PushArray + index, ArrayPop to store.

  * `alias->field += x` (and the M-> / MEMVAR-> namespace variants)
    were ASSIGN-only too. Same shape of bug — `x->v += 7` compiled
    as `x->v := 7`. Compound branch reads via PushAliasField (or
    PushMemvar for M->), folds, stores via SetAliasField (or
    PopMemvar).

  * PRIVATE / PUBLIC mid-function declarations were treated as
    extra LOCAL slots. emitMidVarDecl extended `locals` past the
    function's declared count and emitted `PopLocalFast(idx)` for
    the init. The slot didn't exist at runtime, so the init either
    silently scribbled past the frame (small N) or panicked with
    "local variable index out of range" once exercised. New logic:
    PRIVATE/PUBLIC declarations bypass the locals table and emit
    `PopMemvar(name)` for the init expression. The runtime auto-
    creates the memvar.

  * Memvar assignment fallback. After the LOCAL/STATIC checks miss
    in emitAssign, the bottom path used to be a one-line WARN that
    emitted RHS + `Pop()` — silently discarding the value. PRIVATE
    pSum stayed at its initial value forever. Now: ASSIGN goes
    through PopMemvar; compound forms read via PushMemvar, fold,
    write back via PopMemvar.

Test fixture (tests/std_ch/test_compound_lhs.prg) covers all four
shapes. The std.ch runner picks it up so the regression suite now
stands at 15/15.

Other gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56
  std.ch suite       : 15/15
  FRB suite          : 5/5

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 05:14:28 +09:00
efb615bed9 fix(frb,genpc): in-process compile + 4 pcode bugs
Compiling _FiveSql2/test/test_sql_extreme.prg + a sweep of the FRB
demos surfaced four real bugs in the dynamic-compilation pipeline.
All fixes shipped together because they were on the same critical
path; each is independently revertible.

  * **pcode FOR loop ignored STEP and direction.** emitFor in
    compiler/genpc emitted a fixed `<= to` comparison and a hardcoded
    `+1` increment, then deleted the actual step expression with
    slice arithmetic on the byte buffer. Result: `FOR 5 TO 1 STEP
    -1` exited on the first iteration; `FOR 1 TO 10 STEP 2` summed
    1..10 (55) instead of 1+3+5+7+9 (25). Rewritten to mirror
    gengo's emitFor: detect negative step from a literal `-N` or
    unary MINUS, pick `<=` vs `>=` accordingly, and emit a clean
    `var := var + step` increment per iteration.

  * **pcode compound `+=` operator stored only the RHS.** emitAssign
    looked at AssignExpr.Op only for the := case; +=/-=/etc.
    silently took the same path, so `n += i` compiled as `n := i`,
    discarding the accumulator. Loop reductions were wrong: `Reverse`
    returned "" and `n := 0; FOR i ... n += i; NEXT` returned only
    the last increment. New compoundBinOp helper maps PLUSEQ /
    MINUSEQ / STAREQ / SLASHEQ / PERCENTEQ / POWEREQ to their
    matching binary opcode; emitAssign emits `local + rhs ; pop
    local` for compound forms.

  * **Pcode body stack leaks polluted the caller's frame.** A pcode
    function whose body left intermediate values on the data stack
    (FOR control values, etc.) returned with extra entries past
    its declared retVal. FrbDoFunc / FrbExecFunc / FrbRunFunc then
    pushed retVal on top of those leaks, so the caller saw the
    leaked values where its own preceding arguments should have
    been: `? "Fibonacci(10) =", FrbDo(...), "(expect 55)"` printed
    `1 55 (expect 55)` because the FOR loop's `1` lived in arg-1's
    slot. Two new Thread methods (`SP()` / `SetSP(int)`) let the
    three FRB dispatchers snapshot stack depth before the inner
    call and clamp it back afterward, so the leaks evaporate before
    they reach the caller's frame.

  * **FrbExec / FrbRun recursed into the host's Main forever.** Both
    looked up "MAIN" via t.VM().FindSymbol, which always resolved
    to the OUTER program's Main since FRB modules deliberately keep
    Main local. Compile + run + unload became compile + recurse +
    OOM. Both now look up Main via mod.FindFunc("MAIN") (module
    scope) — Frbload's policy of leaving Main module-local now
    actually has the intended effect.
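
The FOR/STEP semantics from the first bullet above can be stated as
a few lines of Go. This is a sketch of the intended loop behavior
only, not the emitted pcode; `sumFor` is a hypothetical helper and
treating a zero step as "run nothing" is an assumption.

```go
package main

import "fmt"

// sumFor adds up the values FOR i := from TO to STEP step would visit.
// A non-negative step compares with <=, a negative step with >=.
func sumFor(from, to, step int) int {
	if step == 0 {
		return 0 // assumption: avoid the infinite loop in this sketch
	}
	sum := 0
	for i := from; (step > 0 && i <= to) || (step < 0 && i >= to); i += step {
		sum += i
	}
	return sum
}

func main() {
	fmt.Println(sumFor(1, 10, 2))  // 1+3+5+7+9
	fmt.Println(sumFor(5, 1, -1))  // 5+4+3+2+1
}
```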

Plus an architectural improvement: in-memory compilation no longer
depends on shelling out to an external `five` binary. New
hbrtl.frbCompileInProc parses + preprocesses + generates pcode in
process, building a FrbModule directly. FrbCompile and FrbExec use
this exclusively, which means dynamic compilation works from any
directory regardless of PATH and without a second process. The
plugin-mode path (with its runtime-version-mismatch fragility) is
left available via hbrt.FrbCompileSource for callers that want it,
but FrbCompile no longer reaches for it by default.

Test suite: tests/frb/ holds five fixtures + a runner. 5/5 pass:
test_frb_simple / test_frb_pcode_load / test_frb_compile /
test_frb_loop / test_frb_step.

Other gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56
  std.ch suite       : 14/14
  FRB suite          : 5/5

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 10:25:35 +09:00
3ce0eceed5 fix(pp): apply rules to every ;-separated statement on a line
Until now applyRules looked at the *first* token of each physical
line. PRG legitimately packs multiple statements on a single line
with `;` as an intra-line separator (e.g. `dbCommit(); CLOSE ALL`),
and after Wave 1 removed the parser's xBase fallback for CLOSE/
COMMIT/etc., a `;`-separated `CLOSE ALL` on a line that started
with another statement would slip past std.ch entirely. The parser
then saw `CLOSE` / `ALL` as IDENTifiers, the runtime tried to
dispatch `CLOSE` as a function, and the user got a "no function
symbol for call" panic at execution time.

Fix: at applyRules entry, check for top-level `;` (paren / bracket
/ brace / string-literal balanced), split the line into statement
segments, recursively apply rules to each, rejoin with `;`. Two
new helpers (`hasTopLevelSemi` / `splitTopLevelSemi`) keep the
balancing logic small and self-contained.
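
A simplified take on the top-level split, for illustration only. It
reuses the helper name from above but is not the committed code: it
tracks `"`/`'` strings and paren/bracket/brace depth, and does not
handle bracket-strings or comments.

```go
package main

import (
	"fmt"
	"strings"
)

// splitTopLevelSemi splits line on ';' characters that sit outside
// string literals and outside any (), [] or {} nesting.
func splitTopLevelSemi(line string) []string {
	var parts []string
	depth := 0
	var quote byte // 0 when not inside a string literal
	start := 0
	for i := 0; i < len(line); i++ {
		c := line[i]
		if quote != 0 {
			if c == quote {
				quote = 0
			}
			continue
		}
		switch c {
		case '"', '\'':
			quote = c
		case '(', '[', '{':
			depth++
		case ')', ']', '}':
			if depth > 0 {
				depth--
			}
		case ';':
			if depth == 0 {
				parts = append(parts, strings.TrimSpace(line[start:i]))
				start = i + 1
			}
		}
	}
	return append(parts, strings.TrimSpace(line[start:]))
}

func main() {
	for _, seg := range splitTopLevelSemi(`dbCommit(); CLOSE ALL`) {
		fmt.Printf("%q\n", seg)
	}
}
```

Each returned segment can then be matched against the rule table on
its own, so `CLOSE ALL` is seen with `CLOSE` as its first token.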

Found by compiling _FiveSql2/test/test_sql_extreme.prg, which packs
the typical xBase one-liner DBF setup `dbAppend(); FieldPut(...);
...; dbCommit(); CLOSE ALL` across many rows of test data. The
test was panicking at the first such line; with this fix it now
runs to completion: 15/15 PASS.

All FiveSql2 SQL tests green together for the first time:
  test_sql1999       : 43/43
  test_sql1999_hard  : 10/10
  test_sql_extreme   : 15/15
  test_sql_challenge : 15/15
                       --
                       83 / 83

Other gates green:
  go test ./...      : PASS
  Harbour compat     : 56/56
  std.ch suite       : 14/14

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 08:27:47 +09:00
412351b67d feat(rtl): LIST/DISPLAY TO FILE — text output redirection
Wire up TO FILE for both LIST and DISPLAY: __dbList grows a 9th
parameter cFile, opens it (truncating any prior content) when non-
empty, and writes the formatted rows there via fmt.Fprintln. Default
behavior (no TO FILE) still goes to stdout.

std.ch gets two new rules placed *before* the regular LIST/DISPLAY
patterns so they win when TO FILE is present:

  LIST    [<v,...>] TO FILE <(f)> [OFF] [FOR] [WHILE] [NEXT] ...
  DISPLAY [<v,...>] TO FILE <(f)> [OFF] [FOR] [WHILE] [NEXT] ...

Open failure raises a clear *HbError ("LIST/DISPLAY TO FILE: cannot
create <path> — <syscall reason>") so callers know exactly what went
wrong instead of getting partial-or-empty output.

TO PRINTER stays rejected via __dbNotImpl — Five doesn't drive a
printer port. Test coverage: tests/std_ch/test_list_to_file.prg
exercises four shapes (full LIST, single-row DISPLAY, OFF + FOR with
explicit fields, and confirms TO PRINTER still raises). Wired into
the std.ch runner so the regression suite now stands at 14/14.

Gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56
  std.ch suite       : 14/14

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 08:15:32 +09:00
3a7f1dea72 feat(rtl,tests): pre-release UX round (Wave 5)
Three audit findings around polish + a release-readiness commit:

  * #UX1 LIST/DISPLAY output: dropped \r\n (Unix terminals showed a
    stray ^M), moved the newline to AFTER each row (no more leading
    blank line), and added the `*` deleted-record marker after the
    record number — matches xBase LIST/DISPLAY convention. With
    SET DELETED ON the marker is unreachable since the row would
    have been skipped at Area.Skip level; with SET DELETED OFF the
    user now sees which rows are tombstoned.

  * #26 temp aliases: `__copytmp` / `__sorttmp` / `__totaltmp` /
    `__jointmp` were process-global string constants. A nested
    invocation (e.g., COPY inside a FOR clause whose expression
    runs another COPY) collided on the alias and the inner Open
    failed with "alias already in use" — surfacing as `.F.` with
    no clear cause. Each Open now goes through a new helper
    `nextTmpAlias(prefix)` backed by an atomic counter, so every
    call gets `__copytmp_1`, `__copytmp_2`, etc. — no collisions.

  * #J test coverage gap: the 13 std.ch regression tests were all
    sitting in `/tmp` — lost on tmpfs reboot, never in git, never
    in CI. Move them into `tests/std_ch/` and add a simple
    `run.sh` runner that builds + executes each one in a temp
    scratch directory and grep-asserts on FAIL / NOT REJECTED /
    expectation-mismatch markers. 13/13 pass against the current
    head:

       PASS  test_pp_stdch       PASS  test_count
       PASS  test_sum_avg        PASS  test_sum_multi
       PASS  test_copy           PASS  test_sort
       PASS  test_list           PASS  test_total
       PASS  test_join           PASS  test_update
       PASS  test_set_deleted    PASS  test_unsupported
       PASS  test_block_comma

    test_block_comma in particular guards the gengo SeqExpr fix
    from Wave 1 — without it the comma-in-block miscompile would
    silently come back.
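
The #26 alias helper can be sketched as below. Whether the real
counter is shared across prefixes or kept per prefix is not stated
here, so the single shared counter is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"sync/atomic"
)

// tmpSeq is a process-wide sequence; atomic so that concurrent
// callers never receive the same number.
var tmpSeq uint64

// nextTmpAlias returns prefix plus a unique numeric suffix.
func nextTmpAlias(prefix string) string {
	n := atomic.AddUint64(&tmpSeq, 1)
	return prefix + "_" + strconv.FormatUint(n, 10)
}

func main() {
	outer := nextTmpAlias("__copytmp")
	inner := nextTmpAlias("__copytmp") // nested COPY gets its own alias
	fmt.Println(outer, inner)
}
```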

Gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56
  std.ch suite       : 13/13

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 08:07:50 +09:00
1a9e509ee2 perf(rtl): SORT TO swaps insertion sort for sort.SliceStable (Wave 4)
Drop the toy O(n²) insertion-sort that __dbSort had been using and
delegate to the stdlib's sort.SliceStable. Reasoning: SORT TO is an
operation a user reaches for *because* their dataset is too big to
just iterate manually — interactive DBFs routinely have 10k–1M rows,
which the old impl would chew on for minutes to hours. SliceStable
gives O(n log n) and preserves the original-input ordering for
equal keys, which is what the previous implementation also tried to
do.

The function signature is unchanged (`stableSort(rows, less)`), so
all the multi-key / /D / /C dispatch logic from earlier waves keeps
working unmodified.
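
For reference, a thin wrapper of this kind over sort.SliceStable
might look as follows. The `row` type and the shape of `less` are
assumptions made for the example; only the stdlib call is certain.

```go
package main

import (
	"fmt"
	"sort"
)

// row is a stand-in for a buffered record.
type row struct {
	Name string
	Seq  int // original input position
}

// stableSort orders rows by less, keeping input order for equal keys.
func stableSort(rows []row, less func(a, b row) bool) {
	sort.SliceStable(rows, func(i, j int) bool {
		return less(rows[i], rows[j])
	})
}

func main() {
	rows := []row{{"b", 1}, {"a", 2}, {"b", 3}, {"a", 4}}
	stableSort(rows, func(a, b row) bool { return a.Name < b.Name })
	fmt.Println(rows) // equal names keep their Seq order
}
```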

Gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 08:03:13 +09:00
5b1d3fb32f feat(pp,rtl): pre-release accuracy round (Wave 3)
Four audit findings around correctness/consistency in std.ch and the
SORT/UPDATE/TOTAL handlers:

  * #13: TOTAL/UPDATE key idiom inconsistency documented as inherent.
    TOTAL evaluates `<key>` only in the source workarea so verbatim
    `<{key}>` (alias-qualified or `_FIELD->`-prefixed by the user)
    works. UPDATE evaluates the same block in BOTH master and detail
    context, so it must wrap as `_FIELD-><key>` to dispatch to
    whichever WA is selected at eval time. The two rules look alike
    but their evaluation contexts differ — also documented in
    std.ch alongside both rules so the asymmetry isn't a surprise.
    Plus: TOTAL TO and ON are now mandatory (matching the COUNT/
    UPDATE pattern from Wave 1) — bare TOTAL would have produced
    broken syntax via the unconditional `<(f)>`/`<{key}>` template
    references.

  * #15/#16: SDF / DELIMITED variants of COPY and TO PRINTER /
    TO FILE variants of LIST / DISPLAY are now matched by stub
    rules (placed *before* the regular rules so they win) that
    expand to a new `__dbNotImpl(reason)` RTL primitive raising a
    clear `&hbrt.HbError`. BEGIN SEQUENCE / RECOVER catches the
    panic, so callers get a real error instead of the previous
    silent dispatch-to-regular-DBF-copy.

  * #19: SORT /C (case-insensitive) now actually folds case before
    the string compare, instead of being silently treated as
    ascending. Suffix parser also rebuilt as a multi-letter scanner
    so `name/CD`, `name/DC`, `name/C/D`, `name/D/C` all parse the
    same way — combine /C and /D freely. Unknown suffix letters
    (e.g., `name/X`) leave the suffix attached to the field name
    so a stray slash in user input doesn't get silently mangled
    into a broken field reference.

  * #27 SET DELETED: verified with a regression test that
    `SET DELETED ON` causes COUNT/COPY (and by extension
    SORT/TOTAL/JOIN/UPDATE — all of which iterate via Area.Skip)
    to skip rows marked deleted. The filtering is implemented at
    the workarea level (skipFilter in dbf.go honors hbrdd.IsSetDeleted)
    so no RTL changes were needed; this commit just adds the
    coverage so the behavior doesn't silently regress.
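
The #19 suffix rules can be illustrated with a small scanner. This
is a sketch under assumptions: `parseSortKey` is a hypothetical
name, and only the /C and /D letters mentioned above are recognized.

```go
package main

import (
	"fmt"
	"strings"
)

// parseSortKey peels trailing /C and /D suffixes (in any grouping or
// order) off a SORT key. An unknown letter stops the scan and leaves
// the remaining text, slash included, in the field name.
func parseSortKey(spec string) (field string, desc, fold bool) {
	field = spec
	for {
		i := strings.LastIndexByte(field, '/')
		if i < 0 || i == len(field)-1 {
			return field, desc, fold
		}
		d, c := false, false
		for _, r := range strings.ToUpper(field[i+1:]) {
			switch r {
			case 'D':
				d = true
			case 'C':
				c = true
			default:
				return field, desc, fold
			}
		}
		desc, fold = desc || d, fold || c
		field = field[:i]
	}
}

func main() {
	for _, s := range []string{"name/CD", "name/D/C", "name/X"} {
		f, d, c := parseSortKey(s)
		fmt.Println(f, d, c)
	}
}
```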

Gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 08:01:42 +09:00
f30704a854 fix(rtl,pp): pre-release safety round (Wave 2)
Five concrete gaps the audit flagged in the new __dbCopy / __dbSort /
__dbTotal / __dbJoin / PP code:

  * wam.Close() errors were dropped on the floor. Caller saw `.T.`
    even when the just-written DBF wasn't durable, leading to the
    classic "delete the source after the COPY succeeds" data-loss
    pattern. All four functions now capture the close error and
    return `.F.` if it fired.

  * drv.Create succeeded → wam.Open failed → orphaned-on-disk DBF.
    The user-named target file was left around with zero records,
    and the next call's drv.Create silently truncated it instead of
    surfacing the original error. Add `os.Remove(cFile)` on the
    Open-failure cleanup path for COPY/SORT/TOTAL/JOIN.

  * __dbTotal would write the DBF codec's overflow sentinel
    (`*****`) into the destination's sum-fields when a group total
    didn't fit in the source's declared field width, and still
    return `.T.`. Now: precompute each sum-field's max representable
    magnitude (10^(Len-Dec)) at start, mark the run as overflowed if
    any flush sees an out-of-range or NaN value, and propagate
    `.F.` to the caller so they don't trust the file.

  * cleanUnreferencedMarkers walked byte-by-byte and stripped any
    `<ident>` token in the result, INCLUDING ones that appear
    inside `"..."` / `'...'` string literals. A user expression
    like `LIST FOR url == "<a>x</a>"` got the `<a>` and `</a>`
    eaten on output. Now: track string-literal state and skip the
    cleanup pass while inside one. Bracket-strings `[…]` are
    intentionally not treated as strings here — the result template
    uses `[...]` as the optional-repeat marker, and disambiguating
    needs context the cleanup pass doesn't have.

  * (#8 SET SAFETY honoring) deferred. Harbour default is SAFETY
    OFF, so the current always-overwrite behavior matches default
    Harbour. The divergence only matters when the user explicitly
    sets `SET SAFETY ON`, which Five doesn't support yet, so the
    no-overwrite-protection is consistent end-to-end. Tracked as a
    separate followup.

Gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 07:54:41 +09:00
000500e034 fix(pp,parser,gengo): pre-release blocker round (Wave 1)
Six audit-driven blockers landed together because they're tangled:

  * MENU TO removed from std.ch — the rule expanded to a call to a
    nonexistent __MenuTo() RTL symbol, so any user code with `MENU
    TO choice` compiled clean and panicked at runtime. Before this
    round the parser treated MENU TO as a silent no-op, which is at
    least consistent. Restore that until @ PROMPT (the companion
    command) actually lands.

  * COUNT now requires `TO <var>`. The earlier `[TO <v>]` optional
    bracket was a Harbour-pattern transcription error: the result
    template references `<v>` unconditionally, so a bare `COUNT`
    expanded to ungrammatical ` := 0 ; dbEval(...)` and the
    PRG parser rejected it. Match Harbour's std.ch which makes TO
    mandatory.

  * UPDATE FROM ... REPLACE now requires all three of `FROM`, `ON`,
    and `REPLACE`. Same root cause as COUNT: the result template
    uses `<key>`, `<f1>`, `<x1>` unconditionally; missing any of
    them produced broken syntax. Tightened to fail loudly rather
    than silently mis-expand.

  * CLOSE <unknown_alias> no longer closes the *current* workarea.
    SelectByAlias was a silent no-op when the alias was missing,
    leaving WASaveAndSelectAlias to evaluate the inner DbCloseArea()
    against the originally-selected WA — a real data-loss footgun.
    SelectByAlias now returns bool; WASaveAndSelectAlias switches to
    the no-area sentinel (0) on miss so the inner expression's
    Current() returns nil and short-circuits.

  * SUM <x1>, <xN> TO <v1>, <vN> — multi-pair form supported.
    Required two pieces:

       1. matchSegment's regular-marker stop-boundary now combines
          outerTail literals AND the segment's repeat boundary so
          `[, <xN>]` doesn't let `<xN>` swallow past the next ','.

       2. **Five parser miscompiled comma-separated expressions in
          code blocks.** `{|| e1, e2, e3 }` kept only the last expr
          and threw away earlier ones at *AST level*, so all their
          side effects vanished. New SeqExpr AST node + emitter
          (emit each, pop intermediate results) + folding/walk
          updates fix the underlying bug, which also unbreaks any
          other block that relied on comma sequencing.

  * pp.go's `;` continuation joiner now strips exactly one trailing
    `;` per iteration, preserving Harbour's `;;` convention (literal
    `;` followed by a continuation marker). Without this the SUM
    rule's chained `<v1> :=[ <vN> :=] 0 ; ; dbEval(...)` collapsed
    to a missing statement separator.

  * parseExprStmt's xBase fallback switch is back in sync with
    parseIdentStmt — COPY/SORT/COUNT/SUM/AVERAGE/TOTAL/UPDATE/JOIN/
    DISPLAY/LIST removed (std.ch handles all of them now). Leaving
    them in the fallback masked typos as silent no-ops.
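
The `;` continuation rule from the list above (strip exactly one trailing `;` per join, so `;;` leaves a literal `;` behind) can be sketched as follows. This is a minimal illustration under assumed names, not the joiner in pp.go; `joinContinuations` and its line-slice interface are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// joinContinuations merges physical lines that end in ';' with the line
// that follows. Exactly one ';' is removed per join, so a line ending in
// ";;" keeps one literal ';' as a statement separator.
// Hypothetical helper, written for illustration.
func joinContinuations(lines []string) []string {
	var out []string
	for i := 0; i < len(lines); i++ {
		cur := strings.TrimRight(lines[i], " \t")
		for strings.HasSuffix(cur, ";") && i+1 < len(lines) {
			// Drop one continuation marker, then append the next line.
			cur = strings.TrimRight(cur[:len(cur)-1], " \t")
			i++
			cur += " " + strings.TrimSpace(lines[i])
		}
		out = append(out, cur)
	}
	return out
}

func main() {
	for _, l := range joinContinuations([]string{"v := 0 ;;", "dbEval( b )"}) {
		fmt.Println(l)
	}
}
```

With a single `;` the two lines merge with no separator; with `;;` the merged line keeps one `;` between the assignment and the dbEval call, which is the case the SUM rule depends on.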

Gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 07:45:20 +09:00
e79ced2e0c docs: log PP/std.ch round + LABEL/REPORT deferred
Record the 9-commit Phase B run that landed Harbour-style #command
rewrites for ERASE/RENAME/CLOSE/COMMIT/UNLOCK/LOCATE/CONTINUE/
REINDEX/PACK/ZAP/KEYBOARD/RUN plus COUNT/SUM/AVERAGE/COPY/SORT/
LIST/DISPLAY/TOTAL/JOIN/UPDATE — 13 commands that were silent
no-ops in the parser before this round.

Also catalog the 14 PP completeness fixes the rules surfaced
(partial-pattern false-match, blockify substitution, list-aware
smart-stringify and blockify, MarkerList/MarkerWordList in optional
clauses, multi-delimiter capture, line-continuation in directives,
no-progress iteration leak, unreferenced logify/blockify cleanup,
nested `[...]`).

LABEL / REPORT explicitly deferred — niche xBase output-formatting
engines whose `.lbl` / `.frm` binary readers and pagination/group
machinery would be ~800–1500 LOC for near-zero modern users. Parser
keeps the silent no-op behavior for both keywords; entry points
documented in OPTIMIZATION_TODO.md if a real demand ever appears.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 17:52:30 +09:00
80a18daf8d feat(pp): UPDATE FROM via std.ch + nested-bracket fix in matchSegment
`UPDATE [FROM <alias>] [ON <key>] [RANDOM] REPLACE <f1> WITH <x1>
[, <fN> WITH <xN>]` becomes a preprocessor rewrite to a new RTL
primitive __dbUpdate. For each detail record, find the master
record with matching key (forward-walk if both sorted, full scan
when RANDOM) and apply the REPLACE clauses in master's context.

Same shape as harbour-core/src/rdd/dbupdat.prg. The REPLACE clauses
expand to comma-separated assignments inside one block —
`{|| _FIELD->total := del->amt, _FIELD->status := "OK" }` — using
the multi-pair `[, <fN> WITH <xN>]` optional-repeat that std.ch
already establishes for SUM and DEFAULT.

Five-specific tweak: ON <key> wraps as `{|| _FIELD-><key> }` rather
than Harbour's bare `<{key}>`. Five doesn't auto-resolve a bare
identifier in a code block to the current workarea's field, and the
UPDATE block must evaluate against both detail and master so an
explicit alias prefix won't do — _FIELD-> dispatches to whichever
area is selected at eval time, which is what's needed.

Wiring up UPDATE surfaced one further matchSegment gap that fell
out of the multi-pair `[REPLACE ... [, ...]]` shape:

  * matchSegment didn't handle nested `[...]` inside its body.
    `[REPLACE <f1> WITH <x1> [, <fN> WITH <xN>]]` gave the inner
    `[` as a literal token to match against the line, so even the
    single-pair `REPLACE total WITH del->amt` form failed and f1/x1
    came back empty. Now matchSegment runs the same repeat-loop on
    inner `[...]` blocks that the top-level matcher uses, with its
    own outer-tail computed from the segment tail past the inner
    `]`.

Parser cleanup: UPDATE removed from the IDENT-statement no-op switch.

Gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 17:49:33 +09:00
ebe12e1108 feat(pp): JOIN WITH ... TO via std.ch + __dbJoin RTL
`JOIN WITH <alias> TO <file> [FIELDS <list>] [FOR <expr>]` becomes a
preprocessor rewrite to a new RTL primitive __dbJoin. Cartesian
product of the current ("master") workarea and the named "detail"
alias, filtered by the FOR expression.

Output structure:
  * No FIELDS clause: master's fields followed by detail's, dropping
    any detail-side name that clashes with master.
  * FIELDS list: one column per name in declaration order, resolved
    against master first then detail.
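
A rough sketch of the output-structure and Cartesian-product rules just described. It is an in-memory illustration, not __dbJoin itself: the `Row` type, `joinColumns`, and `join` are hypothetical names, and field-name matching is case-sensitive here for brevity.

```go
package main

import "fmt"

// Row is one record keyed by field name (a stand-in for a workarea record).
type Row map[string]any

// joinColumns returns the output column order: the FIELDS list when given,
// otherwise master's fields followed by detail's minus any clashing names.
func joinColumns(master, detail, fields []string) []string {
	if len(fields) > 0 {
		return fields
	}
	seen := map[string]bool{}
	var cols []string
	for _, f := range master {
		seen[f] = true
		cols = append(cols, f)
	}
	for _, f := range detail {
		if !seen[f] {
			cols = append(cols, f)
		}
	}
	return cols
}

// join pairs every master row with every detail row, keeps the pairs the
// FOR predicate accepts, and resolves each column against master first.
func join(master, detail []Row, cols []string, keep func(m, d Row) bool) []Row {
	var out []Row
	for _, m := range master {
		for _, d := range detail {
			if keep != nil && !keep(m, d) {
				continue
			}
			r := Row{}
			for _, c := range cols {
				if v, ok := m[c]; ok {
					r[c] = v
				} else {
					r[c] = d[c]
				}
			}
			out = append(out, r)
		}
	}
	return out
}

func main() {
	master := []Row{{"ID": 1, "NAME": "ann"}, {"ID": 2, "NAME": "bob"}}
	detail := []Row{{"ID": 1, "AMT": 10}, {"ID": 2, "AMT": 20}}
	cols := joinColumns([]string{"ID", "NAME"}, []string{"ID", "AMT"}, nil)
	rows := join(master, detail, cols, func(m, d Row) bool { return m["ID"] == d["ID"] })
	fmt.Println(cols, len(rows))
}
```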

Same shape as harbour-core/src/rdd/dbjoin.prg. Five-specific
simplifications: alias->name in FIELDS not yet supported (bare
names with master-precedence lookup); RDD/codepage args dropped
since Five only has DBFNTX.

Note for callers: don't name a workarea `M` or `MEMVAR` — both are
Harbour-reserved memvar aliases, so `M->field` and `MEMVAR->field`
always go through the memory-variable namespace, not the workarea.
This is gengo behavior matching Harbour, not new in this commit.

Parser cleanup: JOIN removed from the IDENT-statement no-op switch.

Gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 16:42:06 +09:00
699ea90156 feat(pp): TOTAL TO via std.ch + __dbTotal RTL
`TOTAL TO <file> ON <key> [FIELDS <list>] [FOR ...] [WHILE ...]
[NEXT ...] [RECORD ...] [REST] [ALL]` joins the family of std.ch
DML rewrites. New RTL primitive __dbTotal:

  * Walk the source under dbEval-style FOR/WHILE/NEXT/RECORD/REST
    bounds. The source must already be sorted/indexed on the key —
    same precondition as Harbour's dbtotal.prg.
  * Track the current group key. On each key change, flush the
    accumulated row to the destination (writing the running totals
    back into the most recently appended record's sum-fields,
    preserving each field's declared length/decimals).
  * On the *first* record of every group, append a fresh dst row
    and copy all non-memo source fields into it; subsequent records
    in the group only contribute to the sums. Net effect: non-summed
    fields take the first record's value, summed fields hold the
    group total. Same shape as harbour-core/src/rdd/dbtotal.prg.
  * Memo fields are dropped from the destination structure (Harbour
    does the same).
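
The group-flush walk above can be sketched on an in-memory slice. This is a simplified illustration, not __dbTotal: the `Rec` struct with one key, one non-summed field, and one sum-field is an assumption, and the input is assumed to be already sorted on the key.

```go
package main

import "fmt"

// Rec is a stand-in record: Key is the ON expression, Name a non-summed
// field, Amt the single sum-field. Hypothetical shape for illustration.
type Rec struct {
	Key  string
	Name string
	Amt  float64
}

// total walks src (sorted on Key). The first record of each group is
// appended to the destination; later records only add to the running sum,
// which is written back into the last appended row on every key change.
func total(src []Rec) []Rec {
	var dst []Rec
	var sum float64
	flush := func() {
		if len(dst) > 0 {
			dst[len(dst)-1].Amt = sum
		}
	}
	for i, r := range src {
		if i == 0 || r.Key != dst[len(dst)-1].Key {
			flush()
			dst = append(dst, r) // first record supplies non-summed fields
			sum = 0
		}
		sum += r.Amt
	}
	flush()
	return dst
}

func main() {
	src := []Rec{{"A", "first", 1}, {"A", "second", 2}, {"B", "third", 5}}
	for _, r := range total(src) {
		fmt.Println(r.Key, r.Name, r.Amt)
	}
}
```

Group `A` keeps the name from its first record and the sum of both amounts, matching the "first record's value, group total" behavior described above.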

Parser cleanup: TOTAL removed from the IDENT-statement no-op switch.

Gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 15:24:41 +09:00
1cc2d94927 feat(pp): LIST / DISPLAY via std.ch + four PP completeness fixes
`LIST [<fields>] [OFF] [FOR ...] [WHILE ...] [NEXT ...] [RECORD ...]
[REST] [ALL]` and `DISPLAY [<fields>] [OFF] [FOR ...] ... [ALL]`
reach the parser as plain function calls to a new RTL primitive
__dbList (rtlDbList in hbrtl/database.go).

Implementation: walk the workarea under dbEval-style FOR/WHILE/NEXT/
RECORD/REST bounds. For each visible record, evaluate each column
block and emit the rendered values via valueToDisplay (the same
formatter QOut already uses). Empty fields list defaults to
"all fields". OFF suppresses the record-number prefix.
LIST always emits the full filtered range; DISPLAY without ALL emits
only the current record (encoded as nCount=1). TO PRINTER / TO FILE
clauses are not yet wired through — for now everything goes to
stdout.

Wiring up LIST/DISPLAY surfaced four further gaps in PP that were
silently masking bugs in any rule with multiple word-list / list /
optional clauses chained together:

  * matchSegment refused MarkerWordList inside `[...]`. The LIST
    rule's `[<off:OFF>]` clause therefore never set the off
    capture, and `<.off.>` substituted to nothing instead of .T./.F.
    matchSegment now matches WordList markers the same way the
    top-level matcher does.

  * `<v,...>` and `<(f)>` capture stop boundaries didn't include the
    values of following MarkerWordList markers. For
    `[<v,...>] [<off:OFF>] [<all:ALL>]` against `LIST id, name OFF`,
    the v list would happily eat OFF. New addStopFrom helper
    contributes both literal keywords and word-list values; both
    matchSegment's MarkerList branch and captureExpression now use
    it.

  * Optional-repeat loop in matchPattern merged a no-progress
    iteration's empty capture into the running multi-capture string
    (with the `\x01` separator) before the no-progress break check
    fired. So a successful first iteration's value got contaminated
    and the substitution loop then skipped it as multi-capture
    garbage. The merge now happens after the progress check.

  * Unreferenced `<.name.>` markers (optional clauses that didn't
    match in the input) were getting cleaned up to empty by the
    generic marker scrubber instead of the .F. sentinel Harbour's
    std.ch expects. New replaceUnreferencedLogify pass mirrors the
    existing replaceUnreferencedBlockify and runs just before the
    cleanup.

Parser cleanup: LIST and DISPLAY removed from the IDENT-statement
no-op switch in both parseIdentStmt and parseExprStmt.

Gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 15:19:36 +09:00
6dbc34b34b fix(pp): per-element blockify for list captures
`<{name}>` previously wrapped a list-typed capture's whole
comma-joined string in one code block: `{|| id , name }`. Harbour's
std.ch expects per-element wrapping so `{ <{v}> }` against
`LIST id, name` yields `{ {|| id }, {|| name } }` — an array of
column blocks the call site can evaluate per row.

applyResult now consults the marker table for blockify the same way
it already does for smart-stringify, splits the captured list on
top-level commas, and emits one `{|| expr }` per element.
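
A minimal sketch of the per-element wrapping, assuming a captured list arrives as one comma-joined string. `splitTopLevel` and `blockifyList` are hypothetical names for this example and are not taken from applyResult.

```go
package main

import (
	"fmt"
	"strings"
)

// splitTopLevel splits s on commas that are not nested inside (), [], {}
// or a quoted string, trimming whitespace around each element.
func splitTopLevel(s string) []string {
	var parts []string
	depth, start := 0, 0
	var quote byte
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case quote != 0:
			if c == quote {
				quote = 0
			}
		case c == '"' || c == '\'':
			quote = c
		case c == '(' || c == '[' || c == '{':
			depth++
		case c == ')' || c == ']' || c == '}':
			depth--
		case c == ',' && depth == 0:
			parts = append(parts, strings.TrimSpace(s[start:i]))
			start = i + 1
		}
	}
	return append(parts, strings.TrimSpace(s[start:]))
}

// blockifyList wraps each list element in its own code block.
func blockifyList(captured string) string {
	elems := splitTopLevel(captured)
	for i, e := range elems {
		elems[i] = "{|| " + e + " }"
	}
	return strings.Join(elems, ", ")
}

func main() {
	fmt.Println("{ " + blockifyList("id , name") + " }")
}
```

A comma inside a call such as `Upper(name, 1)` stays within its element because the split only happens at nesting depth zero.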

Prereq for the upcoming LIST / DISPLAY rules; no user-visible
behavior change for the rules already in std.ch (their `<{for}>` /
`<{while}>` markers are scalar).

Gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 15:05:50 +09:00
989138d12e feat(pp): SORT TO via std.ch + __dbSort RTL
`SORT TO <file> [ON <key-list>] [FOR ...] [WHILE ...] [NEXT ...]
[RECORD ...] [REST] [ALL]` joins COPY in being a real preprocessor
rewrite to a function call. New RTL primitive __dbSort:

  * Buffer visible source records (FOR/WHILE/NEXT/RECORD/REST same
    as __dbCopy).
  * Multi-key stable insertion sort. Each key may carry `/D` for
    descending; ascending otherwise. /A and unknown suffixes fall
    through as ascending. Comparison delegates to the existing
    compareValues helper in sqlscan.go (numeric / string / NIL-aware).
  * Create destination DBF with the source's struct, append rows in
    sorted order, restore source selection.
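
The key handling can be sketched as below. Note the swap: the commit describes a stable insertion sort, while this example uses Go's `sort.SliceStable`, which is also stable. The `Rec` type, `parseKey`, and `sortRecords` are hypothetical, and values are compared as plain strings rather than through compareValues.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Rec is a stand-in record with string-valued fields.
type Rec map[string]string

type sortKey struct {
	field string
	desc  bool
}

// parseKey reads "name" or "name/D". Only /D means descending; /A and any
// unknown suffix fall through as ascending.
func parseKey(spec string) sortKey {
	field, suffix, _ := strings.Cut(spec, "/")
	return sortKey{
		field: strings.TrimSpace(field),
		desc:  strings.EqualFold(strings.TrimSpace(suffix), "D"),
	}
}

// sortRecords orders recs in place by each key in turn; records that tie
// on every key keep their original relative order.
func sortRecords(recs []Rec, specs []string) {
	keys := make([]sortKey, len(specs))
	for i, s := range specs {
		keys[i] = parseKey(s)
	}
	sort.SliceStable(recs, func(a, b int) bool {
		for _, k := range keys {
			c := strings.Compare(recs[a][k.field], recs[b][k.field])
			if c == 0 {
				continue
			}
			if k.desc {
				return c > 0
			}
			return c < 0
		}
		return false
	})
}

func main() {
	recs := []Rec{{"dept": "B", "name": "z"}, {"dept": "A", "name": "y"}, {"dept": "B", "name": "x"}}
	sortRecords(recs, []string{"dept/D", "name"})
	for _, r := range recs {
		fmt.Println(r["dept"], r["name"])
	}
}
```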

Parser cleanup: SORT removed from the IDENT-statement no-op switch.

Gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 15:04:18 +09:00
e961660f61 feat(pp): COPY TO via std.ch + four PP completeness fixes
`COPY TO <file> [FIELDS <list>] [FOR ...] [WHILE ...] [NEXT ...]
[RECORD ...] [REST] [ALL]` reaches the parser as a plain function
call to a new RTL primitive __dbCopy (rtlDbCopy in hbrtl/database.go).

Implementation: project the field list (case-insensitive name match
against the source's structure, full copy when omitted), dbCreate the
target file with that struct, open it under a temp alias, walk the
source under dbEval-style FOR/WHILE/NEXT/RECORD/REST bounds, and
GetValue/Append/PutValue per record into the target. SDF / DELIMITED
variants stay parser no-ops until those backends arrive.

Wiring up COPY surfaced four longstanding gaps in the PP that had to
be fixed for the rule to even reach the runtime:

  * `<(name)>` *pattern* marker was treated as a regular `<name>`
    with the parens baked into the captured key, so the matching
    result substitution `<(name)>` couldn't find it. parseOneMarker
    now strips the parens at parse time so capture key and result
    marker share the bare name. The smart-stringify result behavior
    is unchanged.
  * matchSegment (the optional-clause matcher) bailed on every
    non-Regular marker. `[FIELDS <fields,...>]` therefore failed to
    match at all and the fields list arrived empty in the result
    template. matchSegment now handles MarkerList with paren-balanced
    capture and segment+outer literal stop boundaries.
  * captureExpression only used the first literal in the pattern
    tail as a stop boundary. With std.ch's chain of optional
    clauses (`[TO <(f)>] [FIELDS ...] [FOR ...] [WHILE ...] ...`)
    the file-name marker was happy to gobble a trailing FOR clause
    when FIELDS was absent. It now stops at *any* of the remaining
    pattern literals.
  * `<(name)>` smart-stringify on a list-typed capture wrapped the
    whole comma-joined string in one set of quotes — `{ "a , b" }` —
    instead of `{ "a", "b" }`. New helper quoteListElements splits on
    top-level commas (paren / bracket / brace / string-balanced) and
    quotes each element. applyResult now consults the rule's marker
    table to know which captures came from `<name,...>`.

Parser cleanup: COPY removed from the IDENT-statement no-op switch in
both parseIdentStmt and parseExprStmt.

Gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 15:00:18 +09:00
c2e7f7ea27 feat(pp): Phase B — COUNT / SUM / AVERAGE via std.ch
Three xBase analytical commands that were silent no-ops in the
parser now execute as Harbour-style PP rewrites:

  COUNT [TO <v>]   [FOR <for>] [WHILE <while>] ... -> dbEval()
  SUM <x> TO <v>   [FOR <for>] [WHILE <while>] ... -> dbEval()
  AVERAGE <x> TO <v> [FOR ...]                     -> __dbAverage()

COUNT and SUM expand to a `<v> := 0 ; dbEval( {|| ... } )` pair
matching harbour-core/include/std.ch verbatim. AVERAGE delegates to
a new RTL function rtlDbAverage (sum + count + divide; returns 0 on
empty match) — the chained-private-variable trick Harbour uses to
keep AVERAGE inline doesn't translate cleanly through Five's PP.

Wiring up these rules surfaced four PP issues that had to be fixed
for the rewrite to even reach the parser:

  * Result template did not implement <{name}> blockify. So a rule
    body like `{|| x := x + <x> }, <{for}>` left the literal text
    `<{for}>` in the output. Added blockify substitution: captured
    -> `{|| <captured> }`, missing -> NIL.
  * findMarkerEnd did not recognise `{`/`}` so unreferenced
    blockify markers were not cleaned up either. Added `{`/`}` to
    its prefix/suffix sets.
  * Optional-clause matching had no view of the outer pattern, so a
    regular marker at the end of `[TO <v>]` would swallow the rest
    of the line — `COUNT TO n FOR x>5` captured `<v>` as
    "n FOR x>5". matchSegment now takes outerTail and stops at its
    first literal.
  * `#command` directives could not span multiple physical lines.
    A trailing `;` is harbour-core's line-continuation marker for
    std.ch and now joins the next line into the directive before
    parsing.

Parser cleanup: COUNT, SUM, AVERAGE removed from the IDENT-statement
no-op switch in parseIdentStmt + parseExprStmt. The remaining xBase
verbs (COPY, SORT, TOTAL, JOIN, LIST, DISPLAY, LABEL, REPORT, ...)
stay in the parser until their RTL backends arrive.

Gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 14:11:20 +09:00
c4f85f494c feat(pp): Phase A — preprocessor std.ch as single source of truth
Introduce compiler/pp/std.ch with 19 #command rules so that ERASE,
RENAME, DELETE FILE, CLOSE [<a>|ALL|DATABASES], COMMIT, UNLOCK,
LOCATE/CONTINUE, REINDEX, PACK, ZAP, KEYBOARD, RUN, MENU TO, and
CLEAR GETS reach the parser pre-rewritten as plain function calls.
Embedded into the compiler binary via //go:embed so it auto-loads
without an explicit #include in user code, exactly the way Harbour
auto-loads its std.ch.

This is a pure dispatch move, not a behavior change for the
already-working forms: the same Five RTL functions get called.
But it does fix three regressions that the parser was masking:

  * ERASE / RENAME / DELETE FILE used to be silent no-ops — the
    parser swallowed the entire line and returned NIL. They now
    actually delete/rename files (FErase / FRename).
  * CLOSE <alias> used to silently ignore the alias and close the
    current area. It now switches to the named area first
    (<a>->( DbCloseArea() )).
  * Two latent #command matcher bugs that surfaced while wiring
    std.ch up:
      - bare `CLOSE` would match rule `CLOSE ALL` because the tail
        of the pattern wasn't checked for unconsumed literals.
      - bare `CLOSE` would match rule `CLOSE <a>` because all
        unconsumed pattern markers were unconditionally treated as
        optional. They are only optional when nested inside `[...]`.

Parser cleanup: parseIdentStmt + parseExprStmt no longer hardcode
ERASE / RENAME / RUN / KEYBOARD / REINDEX / LOCATE / CONTINUE /
COMMIT / CLOSE — the rewriter handles them. Other xBase verbs
(COPY / SORT / COUNT / SUM / AVERAGE / TOTAL / JOIN / LIST /
DISPLAY / LABEL / REPORT / DIR ...) still no-op in the parser
because their RTL backends aren't implemented yet — once the
backends land they move into std.ch the same way.

Gates green:
  go test ./...      : PASS
  FiveSql2 SQL:1999  : 43/43
  Harbour compat     : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 12:03:30 +09:00
f4ed42556b checkpoint: season-wide bug fix campaign + infra
A season's worth of silent-bug hunting (~62 fixes) across the FiveSql2
SQL engine, the Five compiler/runtime, and the hbrdd RDD layer. Saved
as a single checkpoint before refactoring the parser to delegate xBase
command translation to the preprocessor.

Highlights:

FiveSql2 engine (_FiveSql2/src/)
- prefix-glob index attach -> explicit convention (<table>_pk.ntx,
  <table>_uq.ntx, <table>.cdx) — fixes silent multi-row INSERT row-drop
- DROP/CREATE TABLE FErase chain extended (.cdx, .fsc, .fsv, .dbt, .fpt)
- COUNT(DISTINCT col) parsed + aggregated via hSeen hash
- UNION column-count mismatch returns SQL_ERR_GRAMMAR (was silent)
- DISTINCT + ORDER BY hidden-col leak fixed (trim before DISTINCT)
- Derived table FROM (SELECT...) + JOIN right-side derived
- Self-FK CASCADE depth 2+ via SqlGetSingleColPK pre-collect
- LAG/LEAD default arg uses SqlEvalRowExpr (handles -N const exprs)
- DATE literal round-trip validation (Feb 29 non-leap rejected)
- CREATE OR REPLACE VIEW; CREATE VIEW errors on already-exists
- AlterTable type dispatcher comma-wrapped (1-char type "A" no longer
  matches CHARACTER)

Compiler / runtime
- gengo: HB_ -> FV_ prefix on emitted Go function names (Five identity)
- gengo split: emit_block.go, emit_stmt.go, folding.go extracted
- parser/stmtreg.go nudges
- hbrt: debug TUI/CLI restructure (debugcmd, debugkey, termios_*),
  windows debug stubs collapsed
- thread/vm/value/class/pcinterp tightening from panic traces

RDD layer (hbrdd/)
- dbf: null bitmap support (null.go + null_test.go), mmap split
  (mmap_posix.go / mmap_windows.go), byte-level numeric parse
- ntx/cdx: windows mmap parity
- workarea + mem RDD: cross-area state-bleed fixes

RTL (hbrtl/)
- errorlog rewrite with platform-specific FD (errorlog_fd_unix /
  errorlog_fd_other)
- sqlscan, sqlhelpers, indexrtl, datetime extensions

Gates green at checkpoint:
- go test ./...        : PASS
- FiveSql2 SQL:1999    : 43/43
- Harbour compat       : 56/56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 09:26:25 +09:00
8a3f296e9a perf(dbf): byte-level numeric parse + RecCount cache
Two hot-path fixes for DBF reads surfaced by the bulk-bench profile.

1. parseNumericField decimal path — was 23% of flat CPU on BULK_CTE.
   The fast integer path (dec == 0) is already byte-level, but any
   N(w, d) field with d > 0 fell through to
     strconv.ParseFloat(string(raw[start:end]), 64)
   allocating per-row. A 10k-row CTE insert ran this 200k+ times.
   Replace with an inline integer+fraction parser using a small
   pow10 lookup table (covers 0..19 decimal places). Unexpected
   characters still fall back to strconv for correctness.
   Result:
     BULK_CTE_10k_20iter  187 → 83 ms  (2.25x)
     BULK_SUBQ_10k_20iter 102 → 22 ms  (4.6x)

2. DBFArea.RecCount in shared mode was doing Seek(0, 2) on every
   call. SqlScan calls it once per query for its result-array
   pre-allocation (~0.2 ms × 1000 queries = 0.2s of CPU on the
   bench). Cache the count per-area, keyed by a process-wide
   generation counter. Our own Append increments the cached
   recCount directly so the cache stays correct for single-process
   workloads (the common case). Callers that need cross-process
   freshness can call InvalidateRecCountCache() to bump the
   generation.
   SQL bench: modest 1-3 ms drops on B1/B2/B3/B6/B7.
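
A sketch of the inline integer+fraction parse from item 1, with the strconv fallback for unexpected input. It is not the real parseNumericField: `parseDecimal` and `parseSlow` are hypothetical names, and the result may differ from strconv in the last bit for some inputs because the fraction is divided rather than correctly rounded.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// pow10 covers 0..19 decimal places.
var pow10 = [...]float64{
	1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6, 1e7, 1e8, 1e9,
	1e10, 1e11, 1e12, 1e13, 1e14, 1e15, 1e16, 1e17, 1e18, 1e19,
}

// parseSlow is the allocating fallback; unparseable input yields 0.
func parseSlow(raw []byte) float64 {
	v, err := strconv.ParseFloat(strings.TrimSpace(string(raw)), 64)
	if err != nil {
		return 0
	}
	return v
}

// parseDecimal parses a space-padded ASCII numeric field without
// allocating on the common path.
func parseDecimal(raw []byte) float64 {
	i := 0
	for i < len(raw) && raw[i] == ' ' {
		i++
	}
	neg := false
	if i < len(raw) && (raw[i] == '-' || raw[i] == '+') {
		neg = raw[i] == '-'
		i++
	}
	var intPart, fracPart uint64
	digits, fracDigits := 0, 0
	seenDot := false
	for ; i < len(raw); i++ {
		c := raw[i]
		switch {
		case c >= '0' && c <= '9':
			if seenDot {
				fracPart = fracPart*10 + uint64(c-'0')
				fracDigits++
			} else {
				intPart = intPart*10 + uint64(c-'0')
			}
			digits++
		case c == '.' && !seenDot:
			seenDot = true
		default:
			return parseSlow(raw) // unexpected character
		}
	}
	// Guard against empty fields and uint64 overflow.
	if digits == 0 || digits > 18 || fracDigits >= len(pow10) {
		return parseSlow(raw)
	}
	v := float64(intPart) + float64(fracPart)/pow10[fracDigits]
	if neg {
		v = -v
	}
	return v
}

func main() {
	fmt.Println(parseDecimal([]byte("   12.50")), parseDecimal([]byte("-3.25")))
}
```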

Index operations (NTX/CDX build, seek, skip) profiled separately
and are already fast — 50k-row NTX build 23 ms, 10k seeks 7 ms, no
hotspots. Left untouched.

FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 23:38:54 +09:00
325fe51656 fix(fivesql2): DML transaction + constraint ordering
Three correctness bugs in the DML executor that the 4.7 audit
surfaced:

1. RunInsert logged the transaction BEFORE dbAppend() and validation.
   LogRecord captured the PREVIOUS row's RecNo, and a CHECK/FK
   violation that rolled back via dbDelete() still left a spurious
   INSERT entry in the log pointing at the wrong record. Move
   LogRecord to after all field puts and all validators pass, so
   the log only records committed INSERTs at the correct RecNo.

2. RunUpdate (fallback path) skipped CHECK and FK validation entirely
   — only RunInsert validated. An UPDATE could violate the same
   constraints INSERT protects against. Add the same validator calls
   after FieldPut, with a captured aPrevVals snapshot so the in-
   memory record can roll back cleanly on failure. Gated by
   SqlLoadConstraints to skip the validator (and its recursive
   five_SQL) for tables without SQL-level metadata — tables created
   via plain dbCreate see no change.

3. RunDelete had no transaction logging at all — a BEGIN / DELETE /
   ROLLBACK cycle silently lost the row. Add LogRecord("DELETE")
   before dbDelete so undo can re-surface it. (A full FK-cascade
   check on delete would require parent→child scanning; deferred.)

The fast-path SqlBulkUpdate branch still bypasses per-record
validation by design (documented) — it's gated by
`! ::oTxn:IsActive()`, so txn-active queries always take the
validated fallback.

FiveSql2 43/43 (including SAVEPOINT + ROLLBACK TO and all four CHECK/
FK tests), Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 23:24:14 +09:00
e368402682 chore: audit cleanup — remove orphan parser + dead TSqlIndex methods
Opus 4.7 audit of the codebase surfaced several items that Opus 4.6
sessions left behind. This pass removes what's definitively dead and
fixes one trivial defensive bug; the real logic bugs (transaction
ordering, missing RunUpdate/RunDelete validation) come in a separate
commit.

Deletions:

- `_FiveSql2/src/TSqlParser_orig.prg` (1173 lines) — superseded by
  `TSqlParser2.prg` (Pratt). Production never instantiates the old
  parser; the only callers were the comparison/benchmark test files
  also being removed.
- `_FiveSql2/test/test_parser_cmp.prg` — compared orig vs Pratt AST,
  useless now that orig is gone.
- `_FiveSql2/test/bench_parser.prg` — benched both, same reason.
- `_FiveSql2/Makefile` `test_cmp:` and `bench:` targets referenced
  the removed files.
- `TSqlIndex.prg` methods `ApplyScope`, `ClearScope`, `ApplySeek`,
  `IndexInfo`, `CreateTempIndex`, `DropTempIndex` — each declared in
  the class header and implemented (~165 lines total) but zero
  callers anywhere in `_FiveSql2/` or `hbrtl/`. Class declarations
  removed alongside the bodies.

Small fixes:

- `TSqlDDL.prg:179-180` stale comment claiming Five doesn't support
  `@byref` — false since commit e95afad (2026-04-13) wired @byref
  via RefCell. The same method uses @nPos correctly elsewhere.
- `hbrt/class.go:tryBinaryOp` defensive nil-check on AsArray().
  IsObject() checks the type tag; a corrupted Value with tag=Object
  but ptr=nil would crash on `.Class`. Correct construction paths
  never hit this, but the guard is cheap.

Compat tests: FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 22:46:17 +09:00
e5843bdde4 docs: refresh Phase-C TODO — audit results + remaining edge cases
Update the 1.0-readiness document with:
- 2026-04-18 compatibility audit results: 47 of 50 build (94%),
  up from 34 of 40. Lists every fix commit this session.
- Four remaining low-priority edge cases from the audit (xcommand
  nested-comma args, u64 overflow, USE with ../ paths, legacy
  inline-C syntax) — none block a realistic 1.0.
- Revised Phase-C scope: user clarified contrib PRGs can be
  imported as-is so long as underlying RTL exists, so the work is
  "audit each contrib's low-level deps, fill gaps, copy .prg"
  rather than porting every function.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 18:32:45 +09:00
4a1bbdb1fe feat(pp): optional-repeat [...] blocks — DEFAULT / UPDATE from common.ch
Harbour's `#xcommand DEFAULT <v1> TO <x1> [, <vn> TO <xn>] => ...`
uses an optional, repeatable trailing `[...]` block to accept any
number of `var TO default` pairs on a single line. Five's PP
skipped bracket bodies during pattern matching and treated them
as no-ops in result templates, so

  DEFAULT a TO 10, b TO 20, c TO 30

expanded (at best) the first pair and dropped the rest — and
common.ch itself was documented as "not yet supported".

Three concrete changes:

1. matchPattern now matches the `[...]` body repeatedly against
   remaining line tokens via a new matchSegment helper. Each
   successful iteration appends captures for the interior markers
   under the same name, joined with a \x01 sentinel.

2. matchSegment, when capturing the last marker in a body with no
   following literal, uses the body's opening literal (e.g. the `,`
   in `[, <vn> TO <xn>]`) as the iteration boundary. Otherwise
   captureExpression would greedily eat the rest of the line and
   collapse every remaining pair into one capture.

3. applyResult's new expandOptionalRepeat walks the result template
   for top-level `[...]` blocks. When a referenced marker is multi-
   captured it emits the body N times (substituting per-iter value);
   when it's single-captured it emits the body once; otherwise drops
   the block. A separate referencedMarkers scanner and an inMarker
   guard keep literal `[` / `]` inside PP markers (like `<.x.>`)
   from being mistaken for bracket delimiters.
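
The repeat expansion in change 3 can be sketched like this, assuming captures are stored as strings joined by the `\x01` sentinel from change 1. `expandRepeat` is a hypothetical, much-reduced stand-in for expandOptionalRepeat: it handles one bracket body and plain `<name>` markers only.

```go
package main

import (
	"fmt"
	"strings"
)

const sep = "\x01" // separator between per-iteration captures

// expandRepeat emits body once per captured iteration of the markers it
// references, substituting the matching per-iteration value. If no
// referenced marker was captured, the block is dropped (empty string).
func expandRepeat(body string, captures map[string]string) string {
	n := 0
	for name, val := range captures {
		if val != "" && strings.Contains(body, "<"+name+">") {
			if c := strings.Count(val, sep) + 1; c > n {
				n = c
			}
		}
	}
	var b strings.Builder
	for i := 0; i < n; i++ {
		iter := body
		for name, val := range captures {
			parts := strings.Split(val, sep)
			if i < len(parts) {
				iter = strings.ReplaceAll(iter, "<"+name+">", parts[i])
			}
		}
		b.WriteString(iter)
	}
	return b.String()
}

func main() {
	captures := map[string]string{"vn": "b" + sep + "c", "xn": "20" + sep + "30"}
	fmt.Println(expandRepeat("; <vn> := <xn>", captures))
}
```

A single-captured marker has no separator, so the body is emitted once; an uncaptured marker yields zero iterations and the block disappears.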

Side fix: ParseRule previously stripped every ` ;` as a Harbour
line-continuation marker, but that also destroyed in-line PRG
statement separators in result templates. Line joining is the
preprocessor's job upstream — keep semicolons intact here.

common.ch now ships real DEFAULT and UPDATE #xcommands. Verified
1-, 2-, and 3-pair DEFAULT expansion plus `common.ch` inclusion
from user code. FiveSql2 43/43, Harbour compat 56/56, Go test ALL
PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 18:20:11 +09:00
b1024c5244 fix(gengo): hoist #pragma BEGINDUMP imports + wire HB_FUNC registration
Two bugs blocked Five's own inline-Go feature:

1. Inline Go blocks placed mid-file couldn't carry an `import` list
   because Go rejects declarations before imports in the same file.
   examples/godump_demo.prg and friends (real Five demos) hit
   "syntax error: imports must appear before other declarations"
   during compile of the generated Go.

   hoistGoImports parses the raw dump body for `import (...)` blocks
   and single-form `import "path"` lines, registers each path into
   the generator's imports map, and returns the body with those
   directives stripped. The top-of-file import block then carries
   everything the dump needs.

2. HB_FUNC() calls inside the inline block's init() enqueue
   registrations into hbrt.dynamicFuncs, but the VM only promotes
   them to its symbol table when RegisterLibModules() is called.
   gengo's generated main() skipped that step, so dispatch on the
   inline-defined names panicked with "no function symbol for call".
   Emit vm.RegisterLibModules() after RegisterModule(symbols).
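
The import hoisting in item 1 can be sketched with a line-based scan. This is an assumption-laden illustration rather than hoistGoImports itself: `hoistImports` is a hypothetical name, it returns the paths instead of registering them in a generator, and aliased imports such as `foo "path"` are not handled.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// unquote returns the import path inside a Go string literal, or "".
func unquote(s string) string {
	p, err := strconv.Unquote(s)
	if err != nil {
		return ""
	}
	return p
}

// hoistImports removes `import (...)` blocks and single `import "path"`
// lines from body and returns the remaining text plus the collected paths.
func hoistImports(body string) (string, []string) {
	var kept, paths []string
	inBlock := false
	for _, line := range strings.Split(body, "\n") {
		t := strings.TrimSpace(line)
		switch {
		case inBlock:
			if t == ")" {
				inBlock = false
			} else if p := unquote(t); p != "" {
				paths = append(paths, p)
			}
		case t == "import (":
			inBlock = true
		case strings.HasPrefix(t, `import "`):
			if p := unquote(strings.TrimPrefix(t, "import ")); p != "" {
				paths = append(paths, p)
			} else {
				kept = append(kept, line) // not a form this sketch understands
			}
		default:
			kept = append(kept, line)
		}
	}
	return strings.Join(kept, "\n"), paths
}

func main() {
	body := "import \"strings\"\nfunc GoUpper(s string) string { return strings.ToUpper(s) }"
	stripped, paths := hoistImports(body)
	fmt.Println(paths)
	fmt.Println(stripped)
}
```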

Verified: examples/godump_demo.prg builds and runs; the inline
GoUpper / GoFib / GoGCD / GoSplit / GoSquare / GoTypeOf functions
all dispatch. Matches the feature's original design intent.

FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 17:58:49 +09:00
5514780b11 feat(pp): detect Harbour inline C in #pragma BEGINDUMP and fail fast
Harbour's #pragma BEGINDUMP ... #pragma ENDDUMP blocks carry C source
that the Harbour toolchain embeds verbatim. Five takes the same
directive but targets Go — any `.prg` ported from Harbour that ships
inline C gets its C shoveled into the Go codegen pipeline and fails
with opaque errors like "invalid character U+0023 '#'" from the Go
compiler, dozens of lines downstream of the actual cause.

Detect the C shape at PP time and report a clear, actionable error:

  pp: file.prg:N: #pragma BEGINDUMP contains C code — Five accepts
  inline Go only. Port the block to Go (or use an RTL function),
  then wrap in #pragma BEGINDUMP ... #pragma ENDDUMP.

looksLikeInlineC uses conservative signals that don't false-positive
on legitimate inline Go (which calls `hbrt.HB_FUNC("NAME", fn)` with
a package prefix and a quoted string, distinct from C's bare
`HB_FUNC(NAME)` macro). Signals:

  - `#include <...>` / `#include "..."` — unambiguous C preprocessor
  - line-starting `HB_FUNC(` / `HB_FUNC_STATIC(` — C FFI macro
  - `typedef ` / `struct ` / `int main(` / `void main(` at line start
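
A rough sketch of how such a check could look, assuming every signal is
tested against the whitespace-trimmed start of each line. `looksLikeC`
and `cLinePrefixes` are hypothetical names for illustration; the real
looksLikeInlineC may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// cLinePrefixes lists the signals named above. Treating all of them as
// line-start prefixes is an assumption of this sketch.
var cLinePrefixes = []string{
	"#include <", `#include "`,
	"HB_FUNC(", "HB_FUNC_STATIC(",
	"typedef ", "struct ", "int main(", "void main(",
}

// looksLikeC reports whether any line of the dump body starts with one of
// the C signals. Inline Go registers functions as hbrt.HB_FUNC("NAME", fn),
// so its lines start with the package prefix and do not match "HB_FUNC(".
func looksLikeC(dump string) bool {
	for _, line := range strings.Split(dump, "\n") {
		trimmed := strings.TrimSpace(line)
		for _, prefix := range cLinePrefixes {
			if strings.HasPrefix(trimmed, prefix) {
				return true
			}
		}
	}
	return false
}

func main() {
	cDump := "#include <hbapi.h>\nHB_FUNC( MYFUNC )\n{\n}"
	goDump := "func init() {\n\thbrt.HB_FUNC(\"GOUPPER\", goUpper)\n}"
	fmt.Println(looksLikeC(cDump), looksLikeC(goDump))
}
```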

main.go now aborts the build when PP returns errors, matching how
the parser's own errors are already handled (previously PP errors
were printed and the build continued). Keeps build output short:
one pp line + one summary line, no gengo noise.

Verified:
  - harbour-core/tests/inline_c.prg → clean PP error, exit 1
  - examples/godump_demo.prg (legitimate inline Go) → passes PP
    (hits a separate pre-existing gengo import-ordering bug, not
    related to this change)

FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 17:53:44 +09:00
85002df6b9 feat(parser+pp): USE with macros and paren-balanced PP capture
Two related fixes for Harbour's data-driven `USE &cFile ALIAS &cAlias
INDEX &cNdx` idiom — common in any app that dispatches table names
at runtime.

Parser (compiler/parser/parser.go parseUse):
- `USE &cFile` / `USE &(expr)` previously triggered a
  skipToEndOfLine short-circuit, emitting an empty UseCmd (equivalent
  to bare USE = close current area). Now parseMacro runs and the
  MacroExpr becomes the File node, so codegen emits MacroPush +
  dbUseArea.
- `ALIAS &cAlias` / `ALIAS &a.1` similarly dropped the macro result;
  now captures it into UseCmd.AliasExpr so codegen evaluates the
  alias at runtime. Both the IDENT-path ("ALIAS") and keyword-path
  (token.ALIAS) handlers fixed.

PP (compiler/pp/command.go):
- captureExpression and the MarkerList branch now paren-balance
  `(`/`[`/`{` so nested grouping inside a macro argument doesn't let
  an inner `)` terminate the capture. Example:
      _REGULAR_(&(a))
  previously captured `&(a` (missing inner `)`) and left the outer
  `)` dangling, producing parse errors in the expanded output.
- MarkerList capture still joins tokens with " " for raw `<z>`
  substitution — comma tokens stay in the stream, so `s(<z>)`
  re-emits them as argument separators and the list expands cleanly.
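
The paren-balancing idea can be illustrated with a small standalone
function. This is a sketch under simplifying assumptions: it works on a
raw string rather than PP tokens, ignores brackets inside string
literals, and does not check that closers match their openers.
`captureUntilClose` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// captureUntilClose returns the text of s up to (not including) the first
// closing delimiter that is not nested inside an inner (...), [...] or
// {...} group. ok is false when no such delimiter exists.
func captureUntilClose(s string) (captured string, ok bool) {
	depth := 0
	for i, r := range s {
		switch r {
		case '(', '[', '{':
			depth++
		case ')', ']', '}':
			if depth == 0 {
				return s[:i], true
			}
			depth--
		}
	}
	return "", false
}

func main() {
	// Argument text of `_REGULAR_(&(a))`, starting after the outer "(".
	arg := "&(a))"

	naive := arg[:strings.Index(arg, ")")] // stops at the inner ")"
	balanced, _ := captureUntilClose(arg)

	fmt.Printf("naive=%q balanced=%q\n", naive, balanced)
}
```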

Bench: harbour-core/tests/pp.prg 2 errors → 0 for the realistic
`USE &macro` / `&(expr)` patterns. Remaining parse errors on line 70
are a pathological `_REGULAR_L` list that includes `&a.  [2]`
(space between macro's terminating dot and an array index) — the
PP expands it correctly but Five's lexer refuses the expanded
result. That form doesn't occur in real code.

/tmp/test_use_macro.prg — all four patterns (`USE &f`, `USE &f ALIAS
&f`, `USE &f ALIAS &f INDEX &i`, dot-terminated) now compile. FiveSql2
43/43, Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 17:38:15 +09:00
e9522772a7 fix(pp): stringify markers + paren-attached calls — pp.prg 26→2 errors
Three cumulative fixes for Harbour's preprocessor stringify forms
surfaced by harbour-core/tests/pp.prg:

1. Token alignment — tokenizePattern and tokenizeLine now both
   split on parens and brackets, so `DUMB(a)` (no space) tokenises
   as `DUMB`, `(`, `a`, `)` on both sides. Previously the line
   tokenizer kept `DUMB(a)` as one token while the pattern split
   it three ways, and the match never engaged. Fixes `_DUMB_(a)`-
   style calls in pp.prg line 57+.

2. Substitution order — applyResult was replacing the bare `<z>`
   marker first, eating the inner `<z>` of `#<z>`, `<"z">`, `<(z)>`
   and `<.z.>` and leaving stray `#` / `<` / `.` characters that
   the lexer reported as ILLEGAL tokens. Run all compound forms
   first, bare `<z>` last.

3. Quote delimiter picker — ppQuote wraps a captured value in a
   legal PRG string literal by trying `"..."` first, then `'...'`,
   then `[...]`. Harbour's #<z> dumb-stringify needs this because
   the capture may already contain `"`, and Five was producing
   malformed `""world""` literals.

Bonus: smart-stringify `<(z)>` now recognises input that's already
a string literal (`"x"` / `'x'` / `[x]`) and keeps it verbatim
instead of double-quoting.
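
The delimiter picker from item 3 might look like the sketch below.
`quotePRG` is a hypothetical name, and the behaviour when all three
delimiters collide with the captured text is not described above, so
the sketch simply reports failure in that case.

```go
package main

import (
	"fmt"
	"strings"
)

// quotePRG wraps v in the first PRG string delimiter whose terminator does
// not occur in v, trying "...", then '...', then [...].
func quotePRG(v string) (string, bool) {
	switch {
	case !strings.Contains(v, `"`):
		return `"` + v + `"`, true
	case !strings.Contains(v, "'"):
		return "'" + v + "'", true
	case !strings.Contains(v, "]"):
		return "[" + v + "]", true
	}
	return "", false // no safe delimiter; handling here is an assumption
}

func main() {
	for _, v := range []string{`hello`, `"world"`, `"it's"`} {
		quoted, ok := quotePRG(v)
		fmt.Println(quoted, ok)
	}
}
```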

pp.prg 26 parse errors → 2 (remaining: `USE &b ALIAS &a.1` macro-
inside-command at line 21 and one related line, unrelated to this
fix). FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 17:26:16 +09:00
385a4ec6a2 fix(gengo): M-> and MEMVAR-> route to memvar table, not workarea
Harbour reserves the aliases `M` and `MEMVAR` for the memvar
namespace — `M->cVar` reads a PUBLIC/PRIVATE memvar, not a DBF
field in a workarea named M. Five's emitAliasExpr and emitAssign
treated all aliases identically, emitting:

  t.PushAliasField("M", "cVar")              // read
  _wa := t.WA.(*hbrdd.WorkAreaManager); _wa.SetAliasField("M", ...) // write

which triggered a spurious hbrdd import on programs using memvars
and attempted a workarea lookup that couldn't find an "M" area at
runtime.

Detect the reserved aliases (case-insensitive) at the three
AliasExpr call sites — the read path (emitAliasExpr) and both
assign paths (emitAssign for statements, emitAssignExpr for
expression context) — and route to t.PushMemvar / t.PopMemvar
instead. The existing Thread helpers hash into the MemvarTable
populated by PUBLIC/PRIVATE declarations.
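
A minimal sketch of the routing decision on the read path. The helper
names `isMemvarAlias` and `emitAliasRead` are hypothetical, and the
argument list shown for t.PushMemvar is an assumption; only the
PushAliasField form is taken from the snippet above.

```go
package main

import (
	"fmt"
	"strings"
)

// isMemvarAlias reports whether alias is one of the reserved memvar
// aliases, compared case-insensitively.
func isMemvarAlias(alias string) bool {
	return strings.EqualFold(alias, "M") || strings.EqualFold(alias, "MEMVAR")
}

// emitAliasRead returns the Go source a code generator could emit for
// reading alias->name. The PushMemvar call shape is assumed.
func emitAliasRead(alias, name string) string {
	if isMemvarAlias(alias) {
		return fmt.Sprintf("t.PushMemvar(%q)", name)
	}
	return fmt.Sprintf("t.PushAliasField(%q, %q)", alias, name)
}

func main() {
	fmt.Println(emitAliasRead("M", "cVar"))
	fmt.Println(emitAliasRead("memvar", "cVar"))
	fmt.Println(emitAliasRead("CUST", "NAME"))
}
```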

Unblocks harbour-core/tests/macro.prg build (runtime still needs
the TVALUE test helper, unrelated). FiveSql2 43/43, Harbour compat
56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 17:14:18 +09:00
65b2edc906 fix(gengo): SWITCH edge cases — empty body, OTHERWISE-only, EXIT semantics
Three SWITCH codegen bugs surfaced by harbour-core/tests/switch.prg:

1. Empty SWITCH (`SWITCH x ENDSWITCH`) — legal Harbour, produced by
   conditional-compile files like switch.prg:13. Previous code
   emitted `_sw := t.Pop2()` followed by `}` with no matching `{`,
   closing the enclosing procedure body and producing "syntax error:
   non-declaration statement outside function body".

2. OTHERWISE-only (no CASE arms) emitted `} else {` with no opening
   `if`, producing a Go syntax error in the same category
   ("unexpected keyword else").

3. `EXIT` inside a CASE should break out of the SWITCH — but Five
   lowers SWITCH to an if/else-if chain, so the generated `break`
   had nowhere to land ("break is not in a loop, switch, or select").

Fix all three by wrapping every SWITCH in a one-iteration `for`
loop. `break` inside a case targets the wrapper, matching Harbour
semantics. Empty / OTHERWISE-only bodies still emit valid Go
because the for-loop provides the scope boundary regardless of
whether any if-chain opened. A trailing `break` keeps the loop
one-shot.

Also:
- `_ = _sw` silences unused-var for empty SWITCH.
- Emit the if-chain's closing `}` only when at least one CASE arm
  was generated.
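
For illustration, the wrapper shape described above looks roughly like
this when written by hand. The functions are hypothetical and use a
plain int where the generated code pops a VM value; every arm ends in
EXIT so the sketch makes no claim about other cases.

```go
package main

import "fmt"

// classify mirrors the lowered shape of
//
//	SWITCH n
//	CASE 1 ; r := "one" ; EXIT
//	CASE 2 ; r := "two" ; EXIT
//	OTHERWISE ; r := "other"
//	ENDSWITCH
func classify(n int) string {
	r := ""
	for { // one-iteration wrapper gives `break` a target
		_sw := n
		_ = _sw
		if _sw == 1 {
			r = "one"
			break // EXIT
		} else if _sw == 2 {
			r = "two"
			break // EXIT
		} else {
			r = "other"
		}
		break // trailing break keeps the loop one-shot
	}
	return r
}

// emptySwitch mirrors `SWITCH n ENDSWITCH`: no if-chain is opened, yet the
// wrapper still yields valid Go.
func emptySwitch(n int) {
	for {
		_sw := n
		_ = _sw // silences the unused-variable error
		break
	}
}

func main() {
	emptySwitch(0)
	fmt.Println(classify(1), classify(2), classify(9))
}
```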

All 15 SWITCH blocks in harbour-core/tests/switch.prg now build
and run to completion. FiveSql2 43/43, Harbour compat 56/56,
Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 17:11:47 +09:00
4b629f7e7a fix(pp): #xcommand/#xtranslate patterns with paren-attached keyword
Real Harbour headers write parameterised commands with no space
between the keyword and its opening paren:

  #xcommand MAKE_TEST( <obj>, <v> ) => ...

ParseRule stored the rule keyword as `MAKE_TEST(` (stripping only
<>, [] marker wrappers), but firstToken normalised source lines by
stopping the first-word scan at `(` — so `MAKE_TEST( o, 42 )`
produced `MAKE_TEST` for the lookup. The two strings didn't match
and the fast-path keyword check rejected every invocation, leaving
the macro unexpanded and the call site as a bare undeclared
identifier.

Trim everything from the first `(` onward during keyword
extraction so both halves agree on the dispatch key. The marker
tokens inside the parens are still parsed normally by
parseMarkers / matchPattern.
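
A sketch of a keyword extraction that both the rule side and the source
side could share. `dispatchKey` is a hypothetical name, and the
upper-casing is an assumption made here for case-insensitive lookup.

```go
package main

import (
	"fmt"
	"strings"
)

// dispatchKey returns the first word of s, cut at the first "(" and
// upper-cased, so `MAKE_TEST( <obj>, <v> )` and `MAKE_TEST( o, 42 )`
// produce the same lookup key.
func dispatchKey(s string) string {
	fields := strings.Fields(s)
	if len(fields) == 0 {
		return ""
	}
	word := fields[0]
	if i := strings.IndexByte(word, '('); i >= 0 {
		word = word[:i]
	}
	return strings.ToUpper(word)
}

func main() {
	rule := dispatchKey("MAKE_TEST( <obj>, <v> )")
	call := dispatchKey("MAKE_TEST( o, 42 )")
	fmt.Println(rule, call, rule == call)
}
```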

Verified with /tmp/test_xcmd2.prg (`MAKE_TEST( o, 99 )` expands
and dispatches to the object's :hVar access). FiveSql2 43/43,
Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 17:07:06 +09:00
d6c26104c9 feat(rtl): common.ch aliases — ISNIL/ISARRAY/ISNUMBER and friends
Harbour's common.ch exposes classic Clipper type-check shorthands
via #translate rules that map to HB_IS* RTL functions:

  #translate ISNIL(<x>)       => ((<x>) == NIL)
  #translate ISARRAY(<x>)     => HB_ISARRAY(<x>)
  #translate ISCHARACTER(<x>) => HB_ISSTRING(<x>)
  ... etc.

Five's preprocessor currently supports #translate only for lines
whose FIRST word is the rule keyword, not for substring matches
inside expressions. Real usage like `IF ISNIL(x)` fails the keyword
check (first word is IF, not ISNIL) and the rule never fires.

Rather than rewrite the PP substring engine (A2 scope), register
the nine short names as direct RTL symbols in register.go, each
pointing at the same Go function as its HB_IS* twin. ISMEMO maps
to HB_ISSTRING as a reasonable approximation for Five (no distinct
memo type at the VM level).
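
The aliasing approach can be sketched with a plain symbol map. The
`register` helper, the `rtlFunc` type and the Go-level type checks are
hypothetical stand-ins, not the contents of register.go; only the
alias-to-twin pairings shown are taken from the text above.

```go
package main

import "fmt"

// rtlFunc is a simplified stand-in for an RTL function signature.
type rtlFunc func(v any) bool

func isString(v any) bool { _, ok := v.(string); return ok }
func isArray(v any) bool  { _, ok := v.([]any); return ok }

// symbols is a stand-in for the runtime symbol table.
var symbols = map[string]rtlFunc{}

func register(name string, fn rtlFunc) { symbols[name] = fn }

func init() {
	register("HB_ISSTRING", isString)
	register("HB_ISARRAY", isArray)

	// Short names point at the same Go function as their HB_IS* twin.
	register("ISCHARACTER", isString)
	register("ISMEMO", isString) // approximation: no distinct memo type
	register("ISARRAY", isArray)
}

func main() {
	fmt.Println(symbols["ISCHARACTER"]("x"), symbols["ISARRAY"]("x"))
}
```

Because the alias is an ordinary symbol, a call such as `IF ISCHARACTER(x)`
resolves at dispatch time without any preprocessor rewrite.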

common.ch becomes a short stub that just #defines TRUE/FALSE/YES/NO
and documents where the ISxxx aliases live. DEFAULT / UPDATE
#xcommand forms remain unsupported pending A2.

Verified with /tmp/test_common.prg — ISNUMBER(42), ISCHARACTER("x"),
ISNIL(nilVar) all dispatch correctly. Analyzer still emits
"undeclared variable" warnings for the short names (the static
checker doesn't see runtime-registered RTL symbols) but the
generated code links and runs.

FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 17:01:50 +09:00
d3c4447198 feat(parser): keyword-as-identifier at stmt-block boundaries
Harbour permits keywords (CASE, DO, WHILE, etc.) to be used as
variable/array names. In most expression contexts Five already
handles this via expr.go:362 which whitelists keywords when used
as bare identifiers. But parseStmtBlock was stopping on any stop
token unconditionally, so a line like

  case[ n ] := x       // 'case' is a LOCAL array

terminated the enclosing stmt block at `case` and left `[ n ] := x`
unparsable.

Add isIdentSuffix(): peeks one ahead and reports whether the next
token is something that can only follow an identifier ([, :=, +=,
-=, *=, /=, %=, ^=, ++, --, :, .). parseStmtBlock now treats the
stop token as a statement-start when its suffix matches, so the
block keeps going.
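
A simplified sketch of the lookahead, using plain strings in place of
the parser's token type. `endsBlock` and the `stops` set are
hypothetical; only the suffix list is taken from the description above.

```go
package main

import (
	"fmt"
	"strings"
)

// identSuffixes holds tokens that can only follow an identifier.
var identSuffixes = map[string]bool{
	"[": true, ":=": true, "+=": true, "-=": true, "*=": true, "/=": true,
	"%=": true, "^=": true, "++": true, "--": true, ":": true, ".": true,
}

// isIdentSuffix reports whether next can only follow an identifier.
func isIdentSuffix(next string) bool { return identSuffixes[next] }

// endsBlock reports whether tokens[i] should terminate the current
// statement block: it must be a stop keyword that is not being used as an
// identifier.
func endsBlock(tokens []string, i int, stops map[string]bool) bool {
	if !stops[strings.ToUpper(tokens[i])] {
		return false
	}
	if i+1 < len(tokens) && isIdentSuffix(tokens[i+1]) {
		return false // keyword used as a variable name; keep parsing
	}
	return true
}

func main() {
	stops := map[string]bool{"CASE": true, "OTHERWISE": true, "ENDCASE": true}
	asIdent := []string{"case", "[", "n", "]", ":=", "x"}
	asArm := []string{"CASE", "n", "==", "1"}
	fmt.Println(endsBlock(asIdent, 0, stops), endsBlock(asArm, 0, stops))
}
```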

Verified with /tmp/test_kwident.prg (`case[...]` outside DO CASE,
`arr[...]` inside DO CASE body), /tmp/test_kwident2.prg (both the
`case case[n] == "two"` arm and `case[1] := "updated"` assignment
after ENDCASE). Pathological harbour-core/tests/keywords.prg still
fails — it places `case[...]` in the arm-expected position of a
DO CASE block with no leading arm, which no sane parser can
disambiguate.

FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 16:56:44 +09:00
0a5482b6aa feat(parser): implicit class binding for standalone METHOD bodies
The classic Clipper/Harbour form writes method implementations as bare
`METHOD Name(params)` statements following a `CLASS X ... ENDCLASS`
declaration, with the binding inferred from the most recent class:

  CREATE CLASS Shape
     METHOD Area
  ENDCLASS

  METHOD Area             // binds to Shape
     RETURN 0

Five was requiring `METHOD Area CLASS Shape` explicitly. Without it,
parseMethodDecl left MethodDecl.ClassName empty, gengo skipped the
body emission, and the link step failed with `undefined: HB_SHAPE_AREA`.
The class registration had AddMethod("AREA", HB_SHAPE_AREA) pointing
at the missing symbol.

Parser tracks p.lastClassName at parseClassDecl, and parseMethodDecl
falls back to that value when no CLASS clause is supplied. Each new
CLASS declaration updates the tracker, so multi-class files still
dispatch correctly — verified with /tmp/test_implicit_class.prg
(Shape + Box both resolve their own Name/Area methods).
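
The fallback can be sketched in a few lines. The `parser` and
`methodDecl` types here are stripped-down hypothetical stand-ins that
take names as arguments instead of reading tokens.

```go
package main

import "fmt"

// methodDecl is a reduced stand-in for the AST node.
type methodDecl struct {
	Name      string
	ClassName string
}

// parser carries only the state relevant to implicit class binding.
type parser struct {
	lastClassName string
}

// parseClassDecl records the most recently declared class.
func (p *parser) parseClassDecl(name string) {
	p.lastClassName = name
}

// parseMethodDecl binds the method to classClause when given, and falls
// back to the most recent class otherwise.
func (p *parser) parseMethodDecl(name, classClause string) methodDecl {
	if classClause == "" {
		classClause = p.lastClassName
	}
	return methodDecl{Name: name, ClassName: classClause}
}

func main() {
	p := &parser{}
	p.parseClassDecl("Shape")
	fmt.Println(p.parseMethodDecl("Area", "")) // binds to Shape
	p.parseClassDecl("Box")
	fmt.Println(p.parseMethodDecl("Area", ""))      // binds to Box
	fmt.Println(p.parseMethodDecl("Name", "Shape")) // explicit CLASS wins
}
```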

Unblocks harbour-core/tests/clsscope.prg and other OOP compat
tests that use this form. FiveSql2 43/43, Harbour compat 56/56,
Go test ALL PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 16:52:23 +09:00