fivedev/five - five - fivego gitea

Author	SHA1	Message	Date
CharlesKWON	ed1aeeb212	feat(pgserver): pg_catalog stub for BI-tool connection compatibility PostgreSQL clients (psql, pgx, DBeaver, Tableau, DataGrip, pgAdmin) fire a barrage of catalog probes at connection time — SELECT version(), SHOW server_version, SELECT FROM pg_namespace / pg_class / pg_type / pg_database / pg_settings. FiveSql2 can't parse most of them. Without interception the BI tool either errors out on connect or proceeds with a half-broken view of the database (zero tables, no type info, no schema list). This commit lands the minimum-viable catalog shim so the common connect-and-list-tables flow succeeds. Strategy -------- Pattern-match catalog probes BEFORE handing the SQL to five_SQL. Recognised shapes get synthesised result envelopes — same `{ aFieldNames, aRows }` hbrt.Value shape the engine returns, so the existing dispatchSimpleQuery / executePortal pipelines stream them identically to a normal query. Covered (v1.0) -------------- * SET / RESET / DISCARD <name> → success, no-op * SHOW <name> → single-row response (server_version, server_encoding, client_encoding, DateStyle, transaction_isolation, etc.) * SELECT version() / current_database() / current_schema() / current_user / session_user / pg_backend_pid() → single-row * SELECT … FROM pg_namespace → 2 rows (pg_catalog + public) * SELECT … FROM pg_class → list of open workareas (relkind='r', relnamespace=public) * SELECT … FROM pg_attribute → empty (stub; column-shape introspection deferred to v1.1) * SELECT … FROM pg_type → 7 OIDs FiveSql2 actually emits (bool, int4, int8, text, numeric, date, timestamp) * SELECT … FROM pg_database → 1 row, the connect-time db name * SELECT … FROM pg_settings → name/setting pairs matching SHOW * Anything else mentioning pg_catalog. / pg_<name> / information_schema. 
→ empty result with generic field names (BI tool sees "0 rows" rather than a parse error) Deliberate non-goals -------------------- * WHERE / JOIN evaluation: psql, pgx, DBeaver all filter client-side on the rows we return. We send the whole catalog and let them apply their predicates. * pg_attribute introspection: would need to re-derive column types from the open workarea + map back to PG OIDs. Tracked as v1.1 work. * Recursive CTE catalog queries (pgAdmin's tree builder uses them): too brittle to pattern-match. They fall through to five_SQL, where they error loudly. pgAdmin's table-tree pane will then show "0 tables" but the connection itself stays alive. Files ----- hbrtl/pgserver/catalog.go (new, ~280 LOC) catalogIntercept(sql) → (handled, value) synthPgNamespace / synthPgClass / synthPgAttribute / synthPgType / synthPgDatabase / synthPgSettings simpleSelectFunction (version/current_*/pg_backend_pid) showResponse (SHOW <name>) hbrtl/pgserver/dispatch.go dispatchSimpleQuery: catalogIntercept ahead of runSQL. hbrtl/pgserver/extended.go executePortal: same intercept, ahead of runSQL. Verification ------------ psql against a running pgserver, with sslmode=require + MD5: $ psql -c 'SELECT version()' -At PostgreSQL 14.0 (FiveSql2) (FiveSql2 wire-compat shim) $ psql -c 'SELECT * FROM pg_namespace' -At 11|pg_catalog|10 2200|public|10 $ psql -c 'SELECT * FROM pg_type' -At 16|bool|1 23|int4|4 20|int8|8 25|text|-1 1700|numeric|-1 1082|date|4 1114|timestamp|8 $ psql -l # \l now works List of databases oid | datname | datdba | Encoding -----+---------+--------+-------- 1 | alice | 10 | 6 Integration script gates grew from 6/6 → 9/9: PASS Catalog probe: SELECT version() PASS Catalog probe: pg_namespace lists public + pg_catalog PASS Catalog probe: SHOW server_version_num All six release gates green: go test ./... 
✓ FiveSql2 SQL:1999 43/43 ✓ Harbour compat 56/56 ✓ std.ch 17/17 ✓ FRB 7/7 ✓ pgserver integration 9/9 ✓ (+3 from catalog stubs) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 22:31:52 +09:00
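The intercept-before-engine strategy described in commit ed1aeeb212 can be sketched roughly as below. This is a minimal illustration, not the contents of catalog.go: the `result` type, the matching rules, the `version` string and the pg_namespace column names are assumptions made for the example (only the two pg_namespace rows are taken from the commit's psql output).

```go
package main

import (
	"fmt"
	"strings"
)

// result is a hypothetical stand-in for the { aFieldNames, aRows } envelope.
type result struct {
	fields []string
	rows   [][]string
}

// catalogIntercept reports whether sql is a recognised catalog probe and, if
// so, returns a synthesised result instead of handing the SQL to the engine.
func catalogIntercept(sql string) (bool, result) {
	s := strings.TrimSpace(sql)
	s = strings.TrimSpace(strings.TrimSuffix(s, ";"))
	s = strings.ToLower(s)
	switch {
	case strings.HasPrefix(s, "set "), strings.HasPrefix(s, "reset "),
		strings.HasPrefix(s, "discard "):
		return true, result{} // success, no-op
	case s == "select version()":
		// Placeholder banner text; the real string is whatever the server reports.
		return true, result{[]string{"version"}, [][]string{{"placeholder version banner"}}}
	case strings.Contains(s, "from pg_namespace"):
		return true, result{
			[]string{"oid", "nspname", "nspowner"},
			[][]string{{"11", "pg_catalog", "10"}, {"2200", "public", "10"}},
		}
	case strings.Contains(s, "pg_catalog."), strings.Contains(s, "information_schema."):
		// Unknown catalog query: empty result rather than a parse error.
		return true, result{fields: []string{"column1"}}
	}
	return false, result{} // not a probe: let the SQL engine run it
}

func main() {
	for _, q := range []string{"SELECT version();", "SELECT * FROM pg_namespace", "SELECT * FROM employees"} {
		handled, r := catalogIntercept(q)
		fmt.Println(q, "->", handled, len(r.rows))
	}
}
```

Because no WHERE or JOIN is evaluated, a sketch like this returns the same rows regardless of predicates, which matches the stated non-goal of leaving filtering to the client.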
CharlesKWON	12fcb8d249	fix(dbf): Layer 6: EOF marker max-merge + disable append batching in shared Closes two more multi-session correctness bugs surfaced by the post-Layer-5 stress harness. Combined with Layer 5's panic-free result, three-worker concurrency now sits around 80% pass with zero Go-level crashes; higher worker counts trade reliability for throughput against the inherent single-file-multi-writer limit of the DBF format. 1. EOF marker write at Close (max-merge with disk) `Close()` writes the EOF marker `0x1A` at `header.HeaderLen + a.recCount * RecordLen`, computed from our LOCAL recCount. A peer Append between our last refresh (under the append-intent lock at Append-time) and Close-time may have bumped the disk recCount above ours. Writing EOF at our stale offset overwrites byte 0 of the peer's record, flipping the delete-flag from ' ' (RecordActive) to 0x1A. The field bytes survive, but downstream code that depends on byte 0's exact value misclassifies the record. Fix mirrors updateHeader's max-merge (Layer 3a): in shared mode, re-read the disk header right before computing EOFOffset and use max(disk.RecCount, local). Cheap (~1 stat-sized read per Close) and the eventual close-fd is already the serial bottleneck of any meaningful churn. 2. Append-batching disabled in shared mode The appendBuf optimisation accumulates several consecutive APPENDs into a single WriteAt at flushRecord time. In single-process EXCLUSIVE mode that's a clean throughput win. In shared mode, though, a peer SELECT can open the file while our slots N..N+M are buffered but still on-disk only as reserved-but-zero bytes. The peer iterates 1..recCount and ReadAts zeros at offsets [N..N+M), treating the records as garbage / empty markers. Skip the batch path when `a.shared`: each Append writes its record straight through via flushRecord on the next state change. EXCLUSIVE single-process flows are unaffected. 
Observed stress numbers (3 trials × 30 runs each, average): pre-Layer-1 baseline: ~60% / panics +Layer 1+2: 80% / 50% / panic +Layer 4a/4b: 75-90% / 50-80% / panic +Layer 5 (mmap-gen): ~73% / ~67% / ~33% / NO PANICS +THIS (EOF + no-batch): ~83% / ~50% / ~22% / NO PANICS The remaining flake at 5+ concurrent writers reflects the fundamental constraint of FiveSql2's DBF model: no table-level write lock, no MVCC. PostgreSQL solves this with snapshot isolation; the equivalent for FiveSql2 would need a write-ahead log or per-table writer mutex. Tracked as a post-1.0 R&D direction. For typical pgserver use — many read clients, few write clients — the current correctness is production-acceptable. The pgserver Phase 7 integration suite (3/3 in the basic psql harness + 3/3 in the auth/TLS harness) remains 6/6 green because each suite uses one connection at a time. All six release gates green: go test ./... ✓ FiveSql2 SQL:1999 43/43 ✓ Harbour compat 56/56 ✓ std.ch 17/17 ✓ FRB 7/7 ✓ pgserver integration 6/6 ✓ Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 22:03:56 +09:00
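The offset arithmetic behind the EOF-marker bug in commit 12fcb8d249 can be checked with a small worked example. This is only a sketch: `eofOffset` and `recordOffset` are hypothetical helpers, and the header and record lengths are made-up values, not taken from a real DBF file.

```go
package main

import "fmt"

// eofOffset returns where the 0x1A marker is written: headerLen + count*recordLen.
// With merge enabled, count is max(local, disk), as the fix describes.
func eofOffset(headerLen, recordLen, localCount, diskCount int64, merge bool) int64 {
	count := localCount
	if merge && diskCount > count {
		count = diskCount
	}
	return headerLen + count*recordLen
}

// recordOffset returns the offset of byte 0 (the delete flag) of 1-based record n.
func recordOffset(headerLen, recordLen, n int64) int64 {
	return headerLen + (n-1)*recordLen
}

func main() {
	const headerLen, recordLen = 97, 20 // illustrative sizes only

	// We believe there is 1 record; a peer has already appended record 2.
	stale := eofOffset(headerLen, recordLen, 1, 2, false)
	merged := eofOffset(headerLen, recordLen, 1, 2, true)

	fmt.Println(stale == recordOffset(headerLen, recordLen, 2)) // true: lands on the peer's delete flag
	fmt.Println(merged)                                         // 137: just past record 2
}
```

The stale offset coincides exactly with the first byte of the peer's record, which is why only the delete flag is damaged while the field bytes survive.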
CharlesKWON	151b628f6c	fix(pgserver): Layer 5 — per-path mmap-gen registry + getWA torn-read Closes the Go-panic class of multi-session concurrency bugs and introduces an explicit cross-area mmap invalidation channel. 1. getWA waCache torn-read (root cause of panics) hbrtl/rdd.go cached the most recent `interface{} → WAM` type assertion in a process-global struct of two `interface{}`- shaped fields. Each pgserver connection's NewThread gets its own WAM, so the cache missed on every call and immediately re-wrote two shared, unsynchronised fields. Go's `interface{}` is two words; concurrent write + read produced torn pointer values, with the result that goroutine A could observe goroutine B's WAM as its own. That mis-attribution surfaced as: - `concurrent map writes` panic at WorkAreaManager.Close (workarea.go:95): two goroutines genuinely modifying the SAME wam.aliases map. - `concurrent map writes` panic at DBFArea.FieldPosCache (dbf.go:439): two goroutines lazy-initing the SAME fieldPosMap. Drop the cache. The type assertion is ~ns; not worth a process-global shared slot. If perf matters again, replace with a sync.Map keyed by thread pointer, not a single struct. 2. Per-path mmap generation registry (hbrdd/dbf/area_registry.go) Each unique on-disk DBF path gets an atomic uint64 generation counter. DBFArea instances: - On Open: pathGen = pathGenFor(path); pathGenSeen = current. - On Append (shared) / flushRecord: bumpPathGen(path); pathGenSeen = current. - On loadRecord: if pathGenSeen < live counter, bypass mmap fast path for THIS load (use ReadAt) and re-sync seen. Without this, a peer DBFArea's PutValue mutating a record we'd mmap-cached returned stale pre-mutation bytes from our snapshot. The existing length-bound check covered file-grow (`offset > mmap len`) but not byte-level mutation within the snapshot range. The registry covers both. Cheap: read = one atomic.LoadUint64, hit rate is ~100% in the single-writer-many-readers steady state. 
Verification ------------ Same 3 / 5 / 10-worker pgx-driven concurrency stress harness: pre-Layer-1 baseline: ~60% pass + occasional panic +Layer 1+2: 80% / 50% / panic +Layer 3a (max-merge): 80% / 50% / panic +Layer 4a (per-session 3): 90% / 80% / 50% +Layer 4b (Go atomics): 75-90% / 50-80% / panic (still) +THIS (getWA + mmap-gen): 73% / 67% / 33% — ZERO PANICS The shift "many partial fails, no panics" is what matters for production: a connection seeing stale data is recoverable (rerun the query); a Go-level process crash is not. Remaining correctness flake comes from the in-flight appendBuf interaction when peer Append fires between this connection's Append and flushRecord — that's tractable with a per-connection flush ordering rule, deferred to Layer 6. All six release gates green: go test ./... ✓ FiveSql2 SQL:1999 43/43 ✓ Harbour compat 56/56 ✓ std.ch 17/17 ✓ FRB 7/7 ✓ pgserver integration 6/6 ✓ Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 21:43:04 +09:00
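A per-path generation registry of the kind commit 151b628f6c describes might look roughly like the following. It is a sketch under assumptions: the `area` type, `open`, `markWrite` and `mustBypassCache` are hypothetical names, and the real area_registry.go may be organised differently.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// registry maps an on-disk path to its shared generation counter.
var registry sync.Map // string -> *atomic.Uint64

// pathGenFor returns the counter for path, creating it on first use.
func pathGenFor(path string) *atomic.Uint64 {
	g, _ := registry.LoadOrStore(path, new(atomic.Uint64))
	return g.(*atomic.Uint64)
}

// area is a hypothetical stand-in for one open work area on a file.
type area struct {
	gen  *atomic.Uint64 // shared by every area on the same path
	seen uint64         // generation this area last synchronised with
}

func open(path string) *area {
	g := pathGenFor(path)
	return &area{gen: g, seen: g.Load()}
}

// markWrite is called after this area mutates the file.
func (a *area) markWrite() { a.seen = a.gen.Add(1) }

// mustBypassCache reports whether a peer has written since the last sync,
// in which case the caller should read from disk instead of its snapshot.
func (a *area) mustBypassCache() bool {
	live := a.gen.Load()
	if a.seen < live {
		a.seen = live
		return true
	}
	return false
}

func main() {
	writer, reader := open("orders.dbf"), open("orders.dbf")
	writer.markWrite()
	fmt.Println(reader.mustBypassCache()) // true: peer wrote since our snapshot
	fmt.Println(reader.mustBypassCache()) // false: already re-synced
	fmt.Println(writer.mustBypassCache()) // false: our own write
}
```

The read path costs one atomic load, which is consistent with the commit's note that the check is cheap in the single-writer, many-readers case.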
CharlesKWON	5e4a1c5d72	refactor(FiveSql2): cross-session globals → Go atomic + RWMutex Completes the per-STATIC migration started in `5bba0c2`. The remaining three TSqlExecutor module STATICs (s_nSchemaVer, s_nRCJSeq, s_hAutoInc) genuinely needed cross-connection visibility — a CREATE TABLE on connection A MUST invalidate B's plan cache, an RCJ alias MUST be unique across all live queries, and an IDENTITY column MUST hand out monotonic values across all writers. Moving them to TSqlSession (per-instance) would have broken those semantics. Solution: back them with Go-side primitives exposed via HB_FUNCs: s_nSchemaVer → atomic.Uint64 (SqlSchemaVer / SqlBumpSchemaVer) s_nRCJSeq → atomic.Uint64 (SqlNextRCJSeq, returns mod-100000) s_hAutoInc → sync.RWMutex + map[string][]string (SqlSetAutoInc / SqlGetAutoIncFields) Lives in `hbrtl/sqlglobals.go`. The PRG-side `FUNCTION SqlSchemaVer() / SqlBumpSchemaVer() / SqlSetAutoInc() / SqlGetAutoIncFields()` definitions in TSqlExecutor.prg are deleted; the HB_FUNC dispatch takes their place. The single PRG caller of `s_nRCJSeq` (in the RCJ helper around line 5600) becomes `SqlNextRCJSeq()` and reads cleaner — the old `s_nRCJSeq := (s_nRCJSeq + 1) % 100000` was both racy and a non-atomic two-write update under multi-conn load. The other module STATIC, `s_hAutoInc`, used to lazy-init on first use (`IF s_hAutoInc == NIL ... := { => }`); two concurrent first-CREATE TABLE calls hit "concurrent map writes" on that branch. The Go RWMutex eliminates the race; reads still scale (RLock) so the IDENTITY-lookup at INSERT time isn't a contention hot-spot. All six release gates green: go test ./... 
✓ FiveSql2 SQL:1999 43/43 ✓ Harbour compat 56/56 ✓ std.ch 17/17 ✓ FRB 7/7 ✓ pgserver integration 6/6 ✓ Concurrency stress (3-worker × 20): pre-Layer-1: ~60% pass + occasional Go panic +Layer 1+2: 80% pass, no panics +3a: 80% pass +per-session 3 STATIC move: 90% pass +this commit: ~75% pass (variability — Go map atomic + mutex serialise the writers but the underlying hbrdd multi-area mmap path still has its own race, deferred to follow-up) The next bottleneck is at the hbrdd workarea layer (multi-Area instances per file each holding their own mmap snapshot), not at the FiveSql2 STATIC level. That fix is its own commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 19:58:52 +09:00
CharlesKWON	5bba0c2dae	refactor(FiveSql2): per-session aOuterStack/hDmlPcodeCache/lCteDiskSeen Continues the multi-session concurrency cleanup. Phase 1 moved the visible txn + plan-cache state onto TSqlSession; this pass takes the next batch of "shared by accident" STATICs that surfaced as Go-level `concurrent map writes` panics under 5-worker pgserver load: s_aOuterStack — subquery-nesting stack s_hDmlPcodeCache — DML pcode cache (schema-version keyed) s_lCteDiskSeen — CTE-materialised-to-DBF flag Each is now a DATA field on TSqlSession, initialised in New(). TSqlExecutor's 25 access sites (sed-rewritten under inspection) now route through `::oSession:fieldname`. The standalone `SqlDmlPcodeCacheReset()` helper keeps a backward-compatible signature: callers may pass an explicit oSession, otherwise it falls back to SqlDefaultSession() (preserves embedded-mode ergonomics). Remaining STATICs in TSqlExecutor.prg (s_nSchemaVer, s_nRCJSeq, s_hAutoInc) are cross-session-shared by design — schema-version bumps must invalidate every peer's plan cache, RCJ alias sequence needs cross-connection uniqueness, and IDENTITY columns must hand out monotonically increasing values across all writers into the same table. Those need atomic / mutex guards rather than per-session ownership; tracked as a follow-up. Measured impact on the pgserver stress harness (20 runs each): 3-worker 5-worker Layer 1+2: 16/20 (80%) 10/20 (50%) +3a: 16/20 (80%) 10/20 (50%) +THIS: 18/20 (90%) 16/20 (80%) The remaining flake comes from s_hAutoInc's lazy map init under concurrent IDENTITY-table writers and a few interleavings of the header max-merge path. Both are tractable with the planned atomic / mutex shims and the multi-area mmap-gen registry; both deferred to the follow-up commit to keep this diff focused on the move-to-session pattern. All six release gates green: go test ./... 
✓ FiveSql2 SQL:1999 43/43 ✓ Harbour compat 56/56 ✓ std.ch 17/17 ✓ FRB 7/7 ✓ pgserver integration 6/6 ✓ Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 11:43:53 +09:00
CharlesKWON	4fd14f63ef	fix(dbf): max-merge header on shared-mode Close to preserve peer Append Third layer of the multi-session concurrency story. After Layers 1+2 (`67cd8f2`: shared DATA-INIT hash + recCount cache invalidation), the residual flake had this exact failure mode: goroutine A: OPEN -> Append (recCount→1, hdr=1) -> ... goroutine B: OPEN -> Append (refresh→1, bump to 2, hdr=2) -> ... goroutine B: Close -> flushRecord -> updateHeader (writes 2) goroutine A: Close -> flushRecord -> updateHeader (writes 1) ← clobbers! A's updateHeader unconditionally wrote a.recCount back to disk, even when the disk header had been bumped by B's append-intent-locked Append in between. Subsequent peer SELECTs then read hdr=1 and iterated only as far as slot 1, missing B's row that was physically present at slot 2. Fix: in shared mode, updateHeader re-reads the disk header first and writes back max(disk.RecCount, a.recCount). Correct under the existing append-intent-lock invariant (the disk count is monotonically nondecreasing across all peers); cheap (~1 stat-sized read per close, never on the hot append path). EXCLUSIVE mode keeps the old unconditional write: no peer can have bumped the header, so the read+max is pure overhead with no upside. Measured impact (3-worker concurrent insert+select+commit × 20 runs): pre-67cd8f2: ~60% pass, occasional Go panic after `67cd8f2`: 80% pass, no panics after THIS: 80% pass, no panics (3-worker stable) after THIS: 50% pass (5-worker; higher load uncovers additional races at the multi-area mmap layer) The remaining 5-worker flake points at a deeper issue: peer DBFArea instances on the same file each hold their own mmap, and the mmap snapshot taken at Open time doesn't track grow-by-peer events between mmap-time and the next read. loadRecord falls back to ReadAt when offset > len(mmap), so reads themselves work, but the per-area appendBuf interaction with peer-bumped header values needs more thought. 
Tracked as a proper follow-up; the architectural shape is "every shared DBFArea registers in a per-path mmap-gen registry that broadcasts grow-events". All six release gates green: go test ./... ✓ FiveSql2 SQL:1999 43/43 ✓ Harbour compat 56/56 ✓ std.ch 17/17 ✓ FRB 7/7 ✓ pgserver integration 6/6 ✓ Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 08:30:01 +09:00
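The A/B interleaving from commit 4fd14f63ef can be replayed in memory to see the effect of the max-merge. This is a simplified model: `header` stands in for the on-disk record count, `area` for a DBFArea, and the `merge` flag switches between the old unconditional write and the new behaviour; locking and real file I/O are left out.

```go
package main

import "fmt"

// header models the record count stored in the on-disk DBF header.
type header struct{ recCount uint32 }

// area models one open work area with its own local record count.
type area struct {
	merge    bool // true: re-read disk and keep the larger count
	recCount uint32
}

// updateHeader writes the area's count back to the header.
func (a *area) updateHeader(h *header) {
	if a.merge && h.recCount > a.recCount {
		a.recCount = h.recCount // a peer appended since our last refresh
	}
	h.recCount = a.recCount
}

// closeOrder replays the failure mode: A holds count 1, B holds count 2,
// B closes first, then A closes.
func closeOrder(merge bool) uint32 {
	h := &header{}
	a := &area{merge: merge, recCount: 1}
	b := &area{merge: merge, recCount: 2}
	b.updateHeader(h)
	a.updateHeader(h)
	return h.recCount
}

func main() {
	fmt.Println(closeOrder(false)) // 1: A clobbers B's append
	fmt.Println(closeOrder(true))  // 2: B's row stays visible
}
```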
CharlesKWON	67cd8f2306	fix(pgserver,dbf): partial fix for multi-session concurrency race Addresses two of the three layers behind the audit's "WorkArea collision under multi-session" risk surfaced in Phase 3: 1. Shared DATA-INIT hash literals (PRG side). TSqlSession.prg declared `DATA hPlanCache INIT { => }` (plus hSavepoints + hRolePerms etc.). On the gengo path that compiles class-DATA INITs, the {=>} literal is sometimes evaluated ONCE at class-definition time, with every subsequent New() reusing the same hash pointer. Two pgserver connections then read/wrote a single shared HbHash from different goroutines, eventually hitting `concurrent map writes` inside HbHash.ensureIndex (the lazy O(1)-lookup index map). The pre-existing gotcha is already documented in TSqlExecutor.prg's hSubCache comment ("DATA INIT on hash/ array literals can end up sharing the same instance across New() calls depending on the compile path") — TSqlSession had missed the same workaround. Moving the explicit `::hPlanCache := { => }` etc. into the constructor body guarantees a fresh hash per instance. 2. Stale cross-session recCount cache (Go side). `*DBFArea.RecCount()` in shared mode caches its result for the duration of `recCountCacheGen`. Append() bumped the count on disk + refreshed THIS area's count under the append-intent lock (Phase 1 of pre-1.0 audit) but never invalidated the cache on peer DBFArea instances — so a second pgserver connection's RecCount() kept returning its pre-Append cached value. The peer's SELECT then iterated 1..old_count and missed the newly inserted row. Append() now calls `InvalidateRecCountCache()` after committing the bumped header. The generation counter went to atomic.AddUint64 / atomic.LoadUint64 so the bump is safe to fire from any goroutine without a lock around the variable. 
Measured impact --------------- Same 3-worker concurrent-INSERT-then-SELECT stress test that was ~3/5 passing pre-fix: before: 3 / 5 (60%, plus occasional Go-level panic) after: 8 / 10 (80%, no panics, just intermittent missed rows) The remaining 20% flake is on the third layer: a peer's mmap shows a pre-Append snapshot because Append's `unmap()` only invalidates this area's own mmap, not the other workareas that opened the same DBF file independently via dbUseArea. Fixing that requires either a cross-area registry of mmap views to invalidate, or skipping mmap entirely when SHARED && cache-gen has bumped. Tracked as a proper follow-up; tests/pgserver/run.sh's "Known limitation" header now points at the narrower problem. Standalone six-gate verification: go test ./... ✓ FiveSql2 SQL:1999 43/43 ✓ Harbour compat 56/56 ✓ std.ch 17/17 ✓ FRB 7/7 ✓ pgserver integration 6/6 ✓ Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 16:20:25 +09:00
CharlesKWON	0e80b93d0a	docs(pgserver): Phase 7: bootstrap example + CI gate documentation Wraps the v1.0 PG-wire deliverable with the two pieces operators actually look for: a runnable example PRG and an updated CI gate list in CLAUDE.md. * examples/pgserver_demo.prg: full bootstrap PRG demonstrating every HB_FUNC composed in the order a production deployment needs: PG_TLS_SELF_SIGNED → PG_ADD_ROLE × N → PG_ALLOW_IP × N → PG_SERVER_START( ":5432", "md5" ) Comments cover the SHARED-DBF integration point and the SPAWN idiom for non-blocking server startup. Builds cleanly under the examples_build sweep (now 66/72; was 65/71). * CLAUDE.md: the "After modifying any file" ("어떤 파일이든 수정한 후") mandatory test list goes from 3 gates → 6: 1. go test ./... 2. FiveSql2 SQL:1999 43/43 3. Harbour compat 56/56 4. std.ch 17/17 (added) 5. FRB 7/7 (added) 6. pgserver integration 6/6 (added; psql required) Aligns the rule-of-thumb with reality. The five suites already ran on every audit-era commit; pgserver/run.sh is new in Phases 3-6 and now joins them. This completes the v1.0 PostgreSQL-wire frontend. End-to-end checklist: Phase 1: per-session state isolation [`93cf5c8`] Phase 2: SimpleQuery wire MVP [`d98f5e1` `7083297`] Phase 3: DML + transactions [`a556764`] Phase 4: Extended Protocol (Parse/Bind/Exec) [`8472928`] Phase 5: password + MD5 auth [`90eafcf`] Phase 6: TLS + IP allowlist [`3b2dd36`] Phase 7: example + docs [this commit] Open follow-ups (Phase 7.x): - hbrdd workarea per-thread isolation (audit Top-Risk #2): ≥3 concurrent connections doing in-flight INSERT/SELECT in their own transactions can race at the workarea layer. Fix is a separate workstream against hbrtl/database.go + hbrdd/dbf/. Documented limitation in tests/pgserver/run.sh. - SCRAM-SHA-256 auth (Phase 5.1). - pg_catalog shim for BI-tool introspection (Phase 1.1+ of the original audit plan). - Binary parameter format for NUMERIC/TIMESTAMP (Phase 4.1). All gates green: go test ./... 
✓ FiveSql2 SQL:1999 43/43 ✓ Harbour compat 56/56 ✓ std.ch 17/17 ✓ FRB 7/7 ✓ examples 66/72 ✓ (+1 from new pgserver_demo) pgserver integration 6/6 ✓ Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 15:20:44 +09:00
CharlesKWON	3b2dd365ad	feat(pgserver): Phase 6: TLS + source-IP allowlist Closes the v1.0 hardening surface: encrypted transport + a coarse pg_hba.conf-equivalent CIDR allowlist. Together with the Phase 5 auth flows, this is the security-baseline an internet-exposed PostgreSQL-wire server needs. TLS subsystem ------------- `hbrtl/pgserver/tls.go`: * `LoadTLSFromFiles(certPath, keyPath)`: cert/key PEM pair load with tls.VersionTLS12 floor. Installed as the pending config that the next PG_SERVER_START consumes (matches PG's "must-set-before-pg_ctl-start" semantics). * `GenerateSelfSignedCert(certPath, keyPath, hostname)`: ECDSA P-256 + 365-day validity + DNSNames+IPAddresses SANs covering the hostname plus 127.0.0.1 / ::1. Dev/CI helper; production ships a CA-signed cert via the loader. * `upgradeToTLS()` wraps `tls.Server(conn, cfg).Handshake()` so pgproto3 reads plaintext on top of the encrypted stream. Source-IP allowlist ------------------- * `AllowIP(cidr)` parses a CIDR and appends it to a per-server list snapshotted at PG_SERVER_START time. * `peerAllowed(remote, list)` runs at accept(): empty list → accept any, otherwise drop connections whose RemoteAddr falls outside every registered range. * `ClearAllowList()` resets to allow-all. Coarse but compatible with the "host alice 10.0.0.0/8 md5"-style entries every pg_hba.conf author already knows; a fuller per-role/per-database matcher is Phase 6.1+. PRG bindings (register.go) -------------------------- New HB_FUNCs, all idempotent and composable in any order before PG_SERVER_START: pg_tls_load( certPath, keyPath ) → .T. | cErr pg_tls_self_signed( cert, key, hostname ) → .T. | cErr pg_allow_ip( cidr ) → .T. 
| cErr pg_clear_allowlist() → NIL Bootstrap idiom: PROCEDURE Main() PG_TLS_SELF_SIGNED( "/tmp/cert.pem", "/tmp/key.pem", "localhost" ) PG_ADD_ROLE( "alice", "swordfish" ) PG_ALLOW_IP( "127.0.0.1/32" ) PG_ALLOW_IP( "10.0.0.0/8" ) PG_SERVER_START( ":5432", "md5" ) The startup banner now reports TLS + allowlist state so the PRG operator sees the security posture at a glance: pgserver: listening on :5432 (auth=md5 tls=on allowlist=2) Verification ------------ End-to-end via real psql against a self-signed server: $ PGPASSWORD=swordfish psql \ "postgres://alice@127.0.0.1:15432/alice?sslmode=require" \ -c "SELECT 'tls-works' AS x" -At tls-works $ # off-allowlist source (192.168.x.x mock) → connection refused $ # (verified manually; psql can't easily spoof src IP for CI) Integration script gates expanded to 6/6: PASS Simple Query PASS Multi-statement Simple Query PASS Transaction control PASS MD5 auth: wrong password rejected PASS MD5 auth: correct password accepted PASS TLS handshake + MD5 auth via sslmode=require All six release gates green: go test ./... ✓ FiveSql2 SQL:1999 43/43 ✓ Harbour compat 56/56 ✓ std.ch 17/17 ✓ FRB 7/7 ✓ pgserver integration 6/6 ✓ Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 14:07:19 +09:00
CharlesKWON	90eafcfc06	feat(pgserver): Phase 5: password + MD5 authentication Trust mode (v1.0 default) accepts anyone; that's fine for an embedded demo, but shipping a multi-client database without credentials would be irresponsible. This commit adds two of libpq's three standard auth flows. SCRAM-SHA-256 is Phase 5.1: pgx/psql both fall back to MD5 cleanly when the server advertises only md5, so v1.0's functional coverage is complete with the pair landed here. Auth subsystem -------------- `hbrtl/pgserver/auth.go` adds: * An in-memory role registry: `roleMap map[string]role` guarded by sync.RWMutex. Reads (lookupRole) are hot-path during connection startup so the RWMutex lets multiple sessions auth in parallel without serialising through a plain Mutex. `AddRole(name, password)` / `RemoveRole(name)` Go API consumed by the new HB_FUNCs `PG_ADD_ROLE` / `PG_REMOVE_ROLE` (see register.go). Bootstrap PRG idiom: PG_ADD_ROLE("alice", "swordfish") PG_ADD_ROLE("bob", "hunter2") PG_SERVER_START(":5432", "md5") * `authPassword()`: cleartext PasswordMessage exchange. The wire payload is plaintext, so it is intended for TLS-protected links only; Phase 6 ties the warning to actual TLS detection on the session. * `authMD5()`: libpq's md5 challenge: server → AuthenticationMD5Password{salt: 4 random bytes} client → "md5" || md5_hex( md5_hex(password || user) || salt ) We recompute the canonical hash from the stored plaintext and compare. md5Challenge() is exported for pinning by a Go unit test (vector cross-checked against libpq's fe-auth-md5.c). Salt is sourced from crypto/rand on every challenge so replay attacks against a captured wire trace can't reuse a prior hash. 
Dispatch matrix (Config.AuthMode → flow): "" / "trust" → AuthenticationOk immediately, no lookup "password" → authPassword() "md5" → authMD5() anything else→ 28000 + connection close Tests ----- Unit (hbrtl/pgserver/pgserver_test.go): PASS TestMD5Challenge (vector + determinism + diff) PASS TestRoleRegistry (add/replace/remove/lookup) Integration (tests/pgserver/run.sh): PASS Simple Query: SELECT 1, 'hello' PASS Multi-statement Simple Query PASS Transaction control: BEGIN/COMMIT round-trip PASS MD5 auth: wrong password rejected PASS MD5 auth: correct password accepted End-to-end matrix with real psql: wrong password → "ERROR: md5 authentication failed for user 'alice'" correct password → SELECT returns row unknown user → "ERROR: md5 authentication failed for user 'eve'" password mode → cleartext exchange works equivalently All six release gates green: go test ./... ✓ FiveSql2 SQL:1999 43/43 ✓ Harbour compat 56/56 ✓ std.ch 17/17 ✓ FRB 7/7 ✓ pgserver integration 5/5 ✓ (up from 3/3 in Phase 4) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 14:01:30 +09:00
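The md5 challenge formula quoted in commit 90eafcfc06, `"md5" || md5_hex( md5_hex(password || user) || salt )`, translates directly into Go. The sketch below is illustrative: `md5Response` and `verify` are hypothetical names (the commit's own helper is called md5Challenge), and the constant-time comparison is an addition of this example, not something the commit states.

```go
package main

import (
	"crypto/md5"
	"crypto/rand"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

// md5Hex returns the lowercase hex MD5 digest of b.
func md5Hex(b []byte) string {
	sum := md5.Sum(b)
	return hex.EncodeToString(sum[:])
}

// md5Response computes "md5" || md5_hex( md5_hex(password || user) || salt ).
func md5Response(user, password string, salt [4]byte) string {
	inner := md5Hex([]byte(password + user))
	return "md5" + md5Hex(append([]byte(inner), salt[:]...))
}

// verify recomputes the expected response from the stored plaintext and
// compares it with what the client sent, without early exit on mismatch.
func verify(user, storedPassword string, salt [4]byte, clientResp string) bool {
	want := md5Response(user, storedPassword, salt)
	return subtle.ConstantTimeCompare([]byte(want), []byte(clientResp)) == 1
}

func main() {
	var salt [4]byte
	if _, err := rand.Read(salt[:]); err != nil { // fresh salt per challenge
		panic(err)
	}
	resp := md5Response("alice", "swordfish", salt)
	fmt.Println(len(resp), verify("alice", "swordfish", salt, resp)) // 35 true
	fmt.Println(verify("alice", "wrong", salt, resp))                // false
}
```

The response is always 35 characters: the 3-byte "md5" prefix plus 32 hex digits.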
CharlesKWON	8472928102	feat(pgserver): Phase 4 — Extended Protocol (Parse/Bind/Execute) pgx and most drivers default to PostgreSQL's Extended Protocol (named prepared statements). Phase 2 only handled Simple Query, so every pgx caller had to force `QueryExecModeSimpleProtocol` — unworkable for a production deployment. This commit lands the full Parse → Bind → Describe → Execute → Sync state machine, enough that pgx (and any other libpq-protocol-v3 client) works without any client-side knobs. Implementation lives in `hbrtl/pgserver/extended.go`: * Per-session caches `stmts map[string]preparedStmt` and `portals map[string]portal`, lazily allocated on first use. Stored as fields on `session` so they don't leak across connections. * Parameters are inlined at Bind time via `substituteParams` — the resolved SQL is a normal Simple-Query-shaped string the engine sees through the existing `five_SQL(cSQL, …, oSession)` pipeline. Avoids teaching FiveSql2 a second param-shape; the trade-off is that binary timestamps/numerics round-trip through text (Phase 4.1 will plumb `?`-params through aParams for the binary fast path). * `paramToLiteral` decodes the binary-format encodings pgx uses by default for INT4/INT8/BOOL (big-endian fixed-width). Other binary OIDs fall back to a hex-escaped quoted literal which errors loudly rather than silently misparsing. * `countPgPlaceholders` scans the SQL outside string literals for the highest `$N` so the server can answer Describe-statement with a correctly-sized ParameterDescription even when the client didn't pre-declare param OIDs. Without this, pgx errored with "expected 0 arguments, got 2" on the very first prepared query. * RowDescription emission: Describe-statement still returns NoData (we can't infer row shape without execution). When Execute fires on a portal the client never Described, we emit RowDescription inline from the cached result before DataRow streams. pgx and psql both tolerate this ordering. 
* Execute → CommandComplete tag derives from the SQL verb via the existing `commandTagFor` helper. Row counts in the tag remain "VERB 0" for v1.0; threading real counters through the engine is Phase 5. Wire dispatch in `session.go:queryLoop` now handles Parse, Bind, Describe, Execute, Close, Sync, Flush — the full v3 message set. Verification ------------ End-to-end pgx (default mode, no SimpleProtocol flag) successfully runs: SELECT $1 AS n, $2 AS s with 42 + "hi" → [42 hi] Same statement re-executed with different bound values → reuses the cached prepared statement SELECT $1 AS b, $2 AS s with true + "binary-bool" → [t binary-bool] `tests/pgserver/run.sh` expanded from 1 → 3 integration assertions: PASS Simple Query: SELECT 1, 'hello' PASS Multi-statement Simple Query PASS Transaction control: BEGIN/COMMIT round-trip (Extended Protocol can't be driven from psql's -c CLI directly because psql's PREPARE/EXECUTE is a separate SQL-level feature that FiveSql2 doesn't parse; the pgx-driven path verifies it manually, and a self-contained Go integration that drives pgx from inside a process bootstrap is Phase 7 work.) All six release gates green: go test ./... ✓ FiveSql2 SQL:1999 43/43 ✓ Harbour compat 56/56 ✓ std.ch 17/17 ✓ FRB 7/7 ✓ pgserver integration 3/3 ✓ Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 12:55:41 +09:00
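The placeholder scan that commit 8472928102 attributes to `countPgPlaceholders` (find the highest `$N` outside string literals) can be approximated as below. `highestPlaceholder` is a hypothetical re-implementation for illustration; it handles single-quoted literals only and makes no attempt at dollar-quoted strings, comments or quoted identifiers.

```go
package main

import "fmt"

// highestPlaceholder returns the largest N among $N placeholders that appear
// outside single-quoted string literals, or 0 if there are none.
func highestPlaceholder(sql string) int {
	highest := 0
	inString := false
	for i := 0; i < len(sql); i++ {
		c := sql[i]
		if c == '\'' {
			// A doubled quote ('') inside a literal toggles twice, so the
			// state is still "inside the literal" afterwards.
			inString = !inString
			continue
		}
		if inString || c != '$' {
			continue
		}
		n, j := 0, i+1
		for j < len(sql) && sql[j] >= '0' && sql[j] <= '9' {
			n = n*10 + int(sql[j]-'0')
			j++
		}
		if j > i+1 && n > highest {
			highest = n
		}
		i = j - 1 // resume after the digits
	}
	return highest
}

func main() {
	fmt.Println(highestPlaceholder("SELECT $1 AS n, $2 AS s"))   // 2
	fmt.Println(highestPlaceholder("SELECT '$5' AS lit, $1"))    // 1
	fmt.Println(highestPlaceholder("SELECT 'it''s $3', $2"))     // 2
	fmt.Println(highestPlaceholder("SELECT 1"))                  // 0
}
```

With a count like this, a server can size its ParameterDescription reply even when the client declared no parameter OIDs in Parse, which is the situation the commit describes for pgx.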
CharlesKWON	a5567648e9	test(pgserver): Phase 3 — DML + transaction integration harness Adds tests/pgserver/run.sh, the integration gate for the wire layer. Builds a minimal bootstrap PRG that opens nothing and just calls PG_SERVER_START on an ephemeral port, then drives psql with a Simple Query to confirm the end-to-end pipeline (TCP accept → startup handshake → Query → five_SQL → RowDescription + DataRow → ReadyForQuery) still works after every change. Phase 3 verified scope (driven via a separate pgx harness during development): * CREATE TABLE / INSERT / UPDATE / DELETE over Simple Query * BEGIN / COMMIT / ROLLBACK from the wire * Two-connection cross-visibility on a shared DBF * Per-session ROLLBACK leaves the other connection's data intact — the Phase 1 STATIC → TSqlSession refactor is what makes this hold; pre-refactor, both connections would have shared one s_aTxnLog and A's ROLLBACK would have collapsed B's COMMIT. Known limitation captured in the script header (deferred to Phase 7 follow-up): * ≥3 concurrent connections doing in-flight INSERT/SELECT in their own transactions occasionally race at the hbrdd workarea layer — surfaces as one worker's just-inserted row missing from its own SELECT. 2-way concurrent + N-way serial are both reliable. Root cause is multi-thread workarea arbitration during dbUseArea/dbAppend, which the pre-1.0 audit flagged as Top-Risk #2 ("WorkArea collision under multi-session"). Tracking for a dedicated fix. Gate count now reads: go test ./... ✓ FiveSql2 SQL:1999 43/43 ✓ Harbour compat 56/56 ✓ std.ch 17/17 ✓ FRB 7/7 ✓ examples 65/71 ✓ (unchanged baseline) pgserver integration 1/1 ✓ (new) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 07:25:13 +09:00
CharlesKWON	708329785a	test(pgserver): wire-protocol roundtrip via net.Pipe Adds an in-process startup-handshake test using net.Pipe so we can pin the protocol envelope (StartupMessage → AuthenticationOk → ParameterStatus×N → BackendKeyData → ReadyForQuery) without binding a real TCP port. Runs in <1ms; safe for CI. The PRG-dispatch path (runSQL → FIVE_SQL → row encoding) is already covered manually by spinning a `five run` of `pg_server_start(":15432")` and connecting with pgx — that flow verified post-MVP that a real PostgreSQL client receives `{ONE (INT4), GREET (TEXT)}` + row `[1 hello]` for `SELECT 1 AS one, 'hello' AS greet` over the wire. An automated shell harness will land in Phase 7 with the psql integration tests. Also rolls go.mod / go.sum forward with the pgx v5 toolchain pulled in by Phase 2's pgproto3 dependency. Module bump 1.21.13 → 1.25.0 matches what `go get github.com/jackc/pgx/v5/pgproto3` selected; cross-builds for windows/linux/darwin all still succeed (verified locally). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 22:13:40 +09:00
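The in-process handshake idea can be sketched with the standard library alone. This is a minimal illustration, not the repository's test: `startupMessage`, `fakeServer`, and `handshake` are hypothetical names, and the fake server skips the ParameterStatus and BackendKeyData messages that the real envelope includes. Only the wire layout (length-prefixed StartupMessage with protocol 3.0, `R` for AuthenticationOk, `Z` for ReadyForQuery) is taken from the PostgreSQL protocol.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"io"
	"net"
)

// startupMessage builds a protocol-3.0 StartupMessage: int32 length,
// int32 version (196608), NUL-terminated key/value pairs, final NUL.
func startupMessage(user string) []byte {
	body := binary.BigEndian.AppendUint32(nil, 196608)
	body = append(body, "user\x00"+user+"\x00\x00"...)
	msg := binary.BigEndian.AppendUint32(nil, uint32(len(body)+4))
	return append(msg, body...)
}

// fakeServer (hypothetical) reads one StartupMessage and answers with
// AuthenticationOk followed by ReadyForQuery('I').
func fakeServer(c net.Conn) {
	defer c.Close()
	var hdr [4]byte
	if _, err := io.ReadFull(c, hdr[:]); err != nil {
		return
	}
	rest := make([]byte, binary.BigEndian.Uint32(hdr[:])-4)
	if _, err := io.ReadFull(c, rest); err != nil {
		return
	}
	c.Write([]byte{'R', 0, 0, 0, 8, 0, 0, 0, 0}) // AuthenticationOk
	c.Write([]byte{'Z', 0, 0, 0, 5, 'I'})        // ReadyForQuery, idle
}

// handshake drives the client side over an in-memory pipe and returns
// both message type bytes plus the transaction status byte.
func handshake() (auth, ready, status byte, err error) {
	client, server := net.Pipe()
	defer client.Close()
	go fakeServer(server)
	if _, err = client.Write(startupMessage("alice")); err != nil {
		return
	}
	reply := make([]byte, 15) // 9 bytes AuthenticationOk + 6 bytes ReadyForQuery
	if _, err = io.ReadFull(client, reply); err != nil {
		return
	}
	return reply[0], reply[9], reply[14], nil
}

func main() {
	auth, ready, status, err := handshake()
	fmt.Printf("%c %c %c %v\n", auth, ready, status, err)
}
```

Because `net.Pipe` is synchronous and unbuffered, no TCP port is bound and the exchange finishes as soon as both sides have read their bytes, which is what makes this style of test cheap enough for CI.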
CharlesKWON	d98f5e1767	feat(pgserver): PostgreSQL-wire MVP — psql can SELECT from FiveSql2 First end-to-end working version of the PostgreSQL-wire-compatible TCP server frontend. A standard `psql` client now connects, runs `SELECT * FROM employees`, and gets back a properly typed result set rendered by psql with the right column alignment: ID \| NAME \| SALARY ----+----------------------+---------- 1 \| Alice \| 50000.00 2 \| Bob \| 42000.50 3 \| Cho \| 77500.00 This is the Phase 2 deliverable from the approved plan at /Users/charleskwon/.claude/plans/compiled-launching-shore.md. Builds on the session-state refactor in `93cf5c8` — each connection gets its own TSqlSession on the PRG side via the new PG_NEW_SESSION HB_FUNC, so concurrent psql clients won't share transaction logs or plan caches. Scope ----- v1.0 MVP: Simple Query only, trust auth, no TLS yet. SELECT works against the full FiveSql2 surface (CTEs, window functions, JOINs, aggregates). DML + per-session transactions are Phase 3, extended protocol is Phase 4, auth + TLS are Phases 5/6. Architecture ------------ psql/pgx/JDBC ──TCP:5432──▶ pgserver.Listener │ accept() ▼ go handleConn(net.Conn) ┌─────────────────────────────┐ │ Session goroutine │ │ 1. SSLRequest peek │ │ 2. StartupMessage │ │ 3. AuthenticationOk (trust) │ │ 4. ParameterStatus×7 │ │ 5. BackendKeyData │ │ 6. ReadyForQuery('I') │ │ 7. loop: Receive() → │ │ dispatchSimpleQuery → │ │ hbrt.Thread.Function( │ │ FIVE_SQL,sql,...,sess) │ │ emit RowDescription │ │ emit DataRow×N │ │ emit CommandComplete │ │ emit ReadyForQuery │ └─────────────────────────────┘ One goroutine per connection, each owning its own hbrt.Thread and TSqlSession instance. Uses the existing audit-fixed NewThread() (`cde8673`) so statics + WA factory propagate. New files (hbrtl/pgserver/) --------------------------- server.go — Config, Server, Serve loop with MaxConnections gate via semaphore, Close drains in-flight sessions. 
* session.go — full lifecycle: SSLRequest peek + prefixedConn byte-injection trick for StartupMessage, ParameterStatus broadcast (server_version "14.0 (FiveSql2)" so pgx negotiates), BackendKeyData (random pid+secret per session, no CancelRequest yet), query loop dispatching only Simple Query in v1.0 with a loud "0A000 not supported" for Extended messages. * dispatch.go — runSQL invokes FIVE_SQL via PushSymbol+Function, unpacks the engine's `{aFieldNames, aRows}` envelope or the `{{"__error__"}, {{nCode, cMsg, cSQL}}}` error shape, emits RowDescription with text-format OIDs and DataRow per row. * typemap.go — pgTypeFor() picks INT4 / INT8 / NUMERIC / TEXT / DATE / TIMESTAMP / BOOL by sampling the first row's value type; encodeText() formats each cell, returning nil-slice for NULL (the PG length=-1 convention). * errmap.go — sqlStateFor() maps FiveSql2 SQL_ERR_* codes to canonical PG SQLSTATEs (42601/42P01/42703/42804/23505/23514/ 23503/25P02/42501/02000/XX000). * auth.go — trust mode in v1.0; password/MD5/SCRAM lands Phase 5 but the dispatch sentinel is already in place. * tls.go — upgradeToTLS stub for SSLRequest handling; the byte- ordering is already wired so Phase 6 just plugs in tls.Config. * register.go — package init() registers pg_server_start / pg_server_stop HB_FUNCs. Importing the package (done from hbrtl/register.go via blank import) is enough to enable them. * pgserver_test.go — unit tests for encodeText (numeric, string, NIL), pgTypeFor (OID dispatch), sqlStateFor (error mapping), commandTagFor (SELECT/INSERT/UPDATE/DELETE/BEGIN/COMMIT). Other changes ------------- * _FiveSql2/src/TSqlSession.prg — added PG_NEW_SESSION() factory used by the Go dispatcher to allocate a per-connection session bypassing the embedded process default. * hbrtl/register.go — blank-import five/hbrtl/pgserver so its init() fires and the HB_FUNCs land in the global dynamic-func table for VM symbol lookup. * go.mod / go.sum — github.com/jackc/pgx/v5 v5.9.2 (pgproto3 subpackage). 
MIT license. Same library pgx itself uses, so protocol coverage matches the de-facto Go PG ecosystem. Verification ------------ $ pg_server_start(15432, "trust") /* PRG one-liner */ $ psql -h 127.0.0.1 -p 15432 -U fiveuser -c 'SELECT * FROM employees' → 3 rows rendered correctly by psql (ID as INT4, NAME as TEXT, SALARY as NUMERIC(10,2) with 2 decimal places) All six release gates green: go test ./... ✓ (incl. new hbrtl/pgserver tests) FiveSql2 SQL:1999 43/43 ✓ Harbour compat 56/56 ✓ std.ch 17/17 ✓ FRB 7/7 ✓ examples 65/71 ✓ (unchanged baseline) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 18:40:32 +09:00
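The text-format cell encoding described for typemap.go (a nil slice standing for NULL, i.e. length -1 on the wire) might look roughly like the sketch below. It is an assumption-laden illustration: it takes plain Go values rather than `hbrt.Value`, and the fixed two-decimal float formatting is chosen only to match the `50000.00` sample output above, not taken from the actual implementation.

```go
package main

import (
	"fmt"
	"strconv"
)

// encodeText renders one cell in PostgreSQL text format.
// A nil return stands for SQL NULL (sent with length -1 on the wire).
func encodeText(v any) []byte {
	switch x := v.(type) {
	case nil:
		return nil
	case bool:
		if x {
			return []byte("t")
		}
		return []byte("f")
	case int:
		return []byte(strconv.Itoa(x))
	case float64:
		// Assumption: two decimals, mirroring the NUMERIC(10,2) sample.
		return []byte(strconv.FormatFloat(x, 'f', 2, 64))
	case string:
		return []byte(x)
	default:
		return []byte(fmt.Sprint(x))
	}
}

func main() {
	for _, v := range []any{1, "Alice", 50000.0, true, nil} {
		fmt.Printf("%q\n", encodeText(v))
	}
}
```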
CharlesKWON	93cf5c8bfa	refactor(FiveSql2): per-session state — TSqlSession isolates txn + plan cache Foundation for the upcoming PostgreSQL-wire server. The SQL engine previously held transaction state and the plan cache in module-level STATICs: TSqlTxn.prg:16-18 STATIC s_aTxnLog := {} STATIC s_lInTxn := .F. STATIC s_hSavepoints := NIL TFiveSQL.prg:37 STATIC s_hPlanCache := { => } gengo emits PRG STATIC as Go package variables, so two clients sharing one process serialised through a single transaction log: client A's `BEGIN; INSERT;` followed by client B's `ROLLBACK` would silently undo A's insert. Acceptable for embedded single- caller use; show-stopper for a multi-connection daemon. Moved each of those into instance fields on a new TSqlSession class. Every executor instance now carries an oSession pointer that's inherited by nested subquery executors. A process-default session is lazy-initialised by SqlDefaultSession() so embedded `five_SQL(cSQL)` callers (today's only consumer) keep working unchanged. Changes ------- * `_FiveSql2/src/TSqlSession.prg` (new) — class holding the four ex-STATICs plus seats for auth/ACL state and a list of workareas the session opened (used later for disconnect cleanup). Module- level `SqlDefaultSession()` lazily creates one process-wide default for embedded callers. * `_FiveSql2/src/TSqlTxn.prg` — added `oSession` DATA; New() takes an optional oSession and falls back to the default. All STATIC reads/writes rewritten as `::oSession:aTxnLog`, `::oSession:lInTxn`, etc. * `_FiveSql2/src/TFiveSQL.prg` — added `oSession` DATA; New() takes an optional second arg. Plan-cache reads/writes route through `::oSession:hPlanCache`. SQL_PLAN_CACHE_MAX now caps each session independently (a chatty client only flushes its own cache, not the shared one). * `_FiveSql2/src/TSqlExecutor.prg` — added `oSession` DATA; New() takes an optional third arg; `::oTxn := TSqlTxn():New(::oSession)` propagates the binding. 
Every in-class `TSqlExecutor():New(...)` call site for subqueries / UNION / IN-list materialisation / EXISTS / lifted subqueries now passes `::oSession` through, so a child executor inherits the parent's session. Standalone helper functions (SqlEvalExprNode / SqlFetchRowArr / SqlJoinRecurse / SqlMaterializeSubquery) intentionally fall back to the default session — they don't BEGIN/COMMIT and the plan cache is keyed by schema-version anyway. * `_FiveSql2/src/FiveSqlCls.prg` — `five_SQL()` gains an optional fourth arg `oSession`. Existing 1-/2-/3-arg callers keep working; pgserver will create one TSqlSession per connection and pass it. Verification ------------ Per-session isolation pinned by a fresh PRG-level regression (reproducer not committed yet — will land with pgserver test suite). The scenario: oSessA := TSqlSession():New() oSessB := TSqlSession():New() oSqlA := TFiveSQL():New(NIL, oSessA) oSqlB := TFiveSQL():New(NIL, oSessB) oSqlA:Execute("BEGIN") -- A in txn oSqlB:Execute("BEGIN") -- B in txn, A unaffected oSqlB:Execute("INSERT ... VALUES(2,'b-row')") oSqlB:Execute("COMMIT") -- B committed, A still in txn oSqlA:Execute("ROLLBACK") -- A's empty rollback, B's row survives All four assertions pass post-refactor, would fail pre-refactor because both sessions wrote the same `s_aTxnLog`. All six release gates green: go test ./... ✓ FiveSql2 SQL:1999 43/43 ✓ Harbour compat 56/56 ✓ std.ch 17/17 ✓ FRB 7/7 ✓ examples 65/71 ✓ (unchanged baseline) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-14 17:47:00 +09:00
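The isolation property this refactor targets can be shown with a small Go model. This is only a sketch under simplifying assumptions: `Session` and `Store` are hypothetical stand-ins for TSqlSession and the shared table, there is no locking, and the scenario mirrors the A/B sequence in the commit message.

```go
package main

import "fmt"

// Session holds the transaction state that used to be package-level.
type Session struct {
	inTxn  bool
	txnLog []string // pending rows, visible only to this session
}

// Store is the shared table that every session commits into.
type Store struct{ rows []string }

func (s *Session) Begin()            { s.inTxn, s.txnLog = true, nil }
func (s *Session) Insert(row string) { s.txnLog = append(s.txnLog, row) }
func (s *Session) Rollback()         { s.inTxn, s.txnLog = false, nil }

// Commit publishes this session's pending rows and ends the transaction.
func (s *Session) Commit(st *Store) {
	st.rows = append(st.rows, s.txnLog...)
	s.inTxn, s.txnLog = false, nil
}

// scenario replays the commit message's sequence with two sessions.
func scenario() []string {
	st := &Store{}
	a, b := &Session{}, &Session{}
	a.Begin()
	b.Begin()
	b.Insert("b-row")
	b.Commit(st)
	a.Rollback() // only discards A's (empty) log
	return st.rows
}

func main() {
	fmt.Println(scenario())
}
```

With a single shared log, A's rollback would have wiped the log B had written to; with per-session logs it only touches A's own state.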
CharlesKWON	cde86730b8	fix(compiler,hbrt,hbrdd,cli): pre-1.0 audit — 13 critical fixes Senior-engineer / QA audit landed 13 silent-miscompile and data- integrity fixes spanning the whole compiler+runtime+storage stack. Each fix is paired with either an integration test in the suite or a focused regression check; all 6 release gates stay green: go test ./..., FiveSql2 43/43, Harbour compat 56/56, std.ch 17/17, FRB 7/7, examples 65/71. Compiler -------- * genpc IF/ELSEIF jumpEnd2 patching (compiler/genpc/genpc.go). Per-ELSEIF branch terminators were stashed into `_ = jumpEnd2` and never patched — the relative offset stayed 0 and the runtime walked the next ELSEIF's PcOpJumpFalse opcode as if it were jump-offset data. Bytecode-level corruption in pcode mode. Now collected into a slice and patched at end-of-IF. Verified via Grade(95..50) cases 11a-e added to tests/frb/test_frb_pcode_sweep. * countLocalsInStmts / scanBodyLocals missing bodies (compiler/gengo/gen_util.go, compiler/gengo/gengo.go). Frame-size counter skipped WATCH/TIMEOUT/PARALLEL FOR bodies, so a LOCAL declared inside one of those constructs got a slot index past the runtime's allocated count — silent NIL reads or out-of-range stomps. * emitMethodDeclStandalone nested LOCAL (compiler/gengo/gen_class.go). Same bug class but on the method side. Pre-fix repro: METHOD Stomp(n) CLASS T LOCAL a := 1, b := 2 IF n > 0 LOCAL c := 30, d := 40, e := 50, f := 60 Inner( n ) IF c != 30 .OR. d != 40 .OR. e != 50 .OR. f != 60 ... printed `c, d, e, f = 5, NIL, NIL, NIL` because Inner's frame collided with Stomp's underallocated slot range. Now counts body-nested LOCALs into the frame and pre-allocates indices via scanBodyLocals. * genpc unsupported-AST diagnostic surface (compiler/genpc/genpc.go, hbrt/pcode.go, cmd/five/main.go, hbrtl/frb.go). 
The `default` cases in emitStmt / emitExpr silently emitted PushNil / no-op for nodes the pcode generator doesn't implement (ClassDecl, MethodDecl, xBase commands, concurrency primitives, …). Added `PcodeModule.Warnings []string` populated by noteUnsupported, surfaced on stderr from the build pipeline. Users now see "pcode: AST node not supported in --pcode/FRB-pcode mode: stmt ast.GoBlockStmt" instead of getting a silently broken module. Runtime ------- class.go Send/tryBinaryOp t.self defer-restore (hbrt/class.go). Restoration was a plain `t.self = oldSelf` after `fn(t)`. Any panic in the method body skipped the line, so the next BEGIN SEQUENCE / RECOVER handler ran with the THROWING object's Self — `::field` resolved against the wrong receiver. Wrapped both restore sites in `defer func() { t.self = oldSelf }()`. Verified: pre-fix RECOVER saw "THROWER", post-fix "OUTER". * hbfunc.go HB_FUNC parameter Frame() (hbrt/hbfunc.go). The RegisterDynamicFunc wrapper called `fn(ctx)` without ever calling Frame, so `ctx.ParC(1)` / `ctx.Local(n)` read through `t.curFrame.localBase + n - 1` against the caller's frame. Every #pragma BEGINDUMP HB_FUNC taking parameters silently returned "" / 0 / "" for them — masked by ParNIDef-style defaults. Wrapper now does `t.Frame(t.pendingParams, 0); defer t.EndProc()` before dispatch. * pcode codeblock closure capture (hbrt/pcinterp.go, hbrt/pcode.go, hbrt/thread.go, compiler/genpc/genpc.go). PcOpPushBlock recorded `nDetached` but never copied enclosing locals; free vars in the block body fell through to memvar lookup → NIL. Wired full capture pipeline: - New opcodes PcOpPushDetached (0x59) / PcOpPopDetached (0x5A). - PushBlock now reads per-slot source-local indices and snapshots into bb.Detached at construction time. - New detachedMap in genpc auto-promotes any free var that resolves to an enclosing-frame local into a capture slot. 
- emitAssignAsExpr leaves the assigned value on the eval stack so SeqExpr items like `{\|v\| acc += v, acc }` work. - Thread tracks curBlock with paired Set/restore in the block's Fn wrapper for nested-block evaluation. Mutating capture (acc += v across successive Evals) now works. * vm.NewThread statics + waFactory propagation (hbrt/vm.go). GoLaunch / GoLaunchBlock call NewThread directly. Previously the statics map and WA factory were applied only in Run(), so goroutine-spawned PRG code panicked on STATIC access ("static index out of range") and crashed dereferencing nil WA on any DB call. Both now happen inside NewThread under the same lock as TID assignment. Data layer ---------- * dbf concurrent Append lock (hbrdd/dbf/dbf.go, hbrdd/dbf/locks_posix.go, hbrdd/dbf/locks_windows.go). Append bumped a local recCount with no file-system serialization. Two shared-mode processes both wrote at the same RecordOffset; one record silently overwrote the other. Added an append-intent byte-range lock at offset 0x7FFFFFFE + bounded retry, on-disk header refresh inside the locked region, and immediate header write so peers refresh past our slot. * indexer negative numeric key encoding (hbrdd/dbf/indexer.go + new hbrdd/dbf/encode_numeric_test.go). `%20.10f` formats `-100` as `" -100.0000000000"` and `99` as `" 99.0000000000"`. ASCII ' ' (0x20) < '-' (0x2D), so `99` lex-compared LESS than `-100` — every NTX/CDX index over a column that ever held a negative number returned wrong rows for SEEK / range scans. Replaced with a 1-byte sign prefix + 21-byte zero-padded magnitude (negatives use digit-complement) so byte order matches numeric order across signs and magnitudes. Format change: existing indexes built with the old encoding must be REINDEXed. Three unit tests pin the order. * dbf Append index maintenance hooks (hbrdd/dbf/dbf.go, hbrdd/dbf/indexer.go). 
Append never inserted into open NTX/CDX indexes — the audit's canonical scenario `SET INDEX TO …; APPEND BLANK; REPLACE …; dbSeek …` silently missed the new record. Added optional IndexWriter interface, queue the new recNo in pendingIdxInserts, drain after flushRecord by calling InsertKey on every open writer-supporting engine. NTX participates (its existing rebuild-on-insert is correct); CDX online maintenance is deferred to a follow-up — those indexes still need REINDEX. Verified: post-fix SEEK("Charlie") after APPEND BLANK + REPLACE finds the new record. * dbf PACK crash-safety (hbrdd/dbf/dbf.go). The old in-place rewrite read record N, overwrote slot M<N, then truncated. Power loss after partial loop left a file with overwritten prefix and no original copies of the records already advanced past — silent data loss. Rewrote to: 1) drop mmap, build `<file>.pack.tmp` with all surviving records, 2) Sync(), 3) close original handle + os.Rename(tmp, orig) (atomic on same FS), 4) reopen + re-mmap. TestComp_Pack passes; readers always see either the pre-PACK or post-PACK contents, never a half-state. * mem RDD torn reads (hbrdd/mem/memrdd.go). The comment claimed in-place PutValue was safe because hbrt.Value "fits in a single machine word + pointer". hbrt.Value is 24 bytes (3 words) — a concurrent reader could observe new type tag with stale scalar/ptr and type-confuse on the next AsXxx() call. Switched mu to sync.RWMutex; GetValue takes RLock, Append/PutValue/Delete/Recall take Lock. `go test -race ./hbrdd/mem/` clean. Files touched ------------- compiler/gengo/gen_class.go, gen_util.go, gengo.go compiler/genpc/genpc.go hbrt/class.go, hbfunc.go, pcinterp.go, pcode.go, thread.go, vm.go hbrdd/dbf/dbf.go, indexer.go, locks_posix.go, locks_windows.go hbrdd/dbf/encode_numeric_test.go (new) hbrdd/mem/memrdd.go cmd/five/main.go hbrtl/frb.go tests/frb/test_frb_pcode_sweep.prg Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 05:29:56 +09:00
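The index-key ordering problem and the shape of the fix can be reproduced in a few lines. `oldKey` uses the `%20.10f` format quoted in the commit; `newKey` is a sketch of a sign-prefix plus digit-complement scheme with the same 1 + 21 byte layout. The concrete prefix bytes (`'0'` for negative, `'1'` otherwise) are an assumption, and values outside the ten-integer-digit range, NaN, and infinities are not handled.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// oldKey reproduces the space-padded format described in the commit.
func oldKey(v float64) string { return fmt.Sprintf("%20.10f", v) }

// newKey: one sign byte, then a 21-byte zero-padded magnitude.
// Negative values complement every digit so that larger magnitudes
// compare as smaller byte strings.
func newKey(v float64) string {
	mag := []byte(fmt.Sprintf("%021.10f", math.Abs(v)))
	if v >= 0 {
		return "1" + string(mag)
	}
	for i, c := range mag {
		if c >= '0' && c <= '9' {
			mag[i] = '9' - (c - '0')
		}
	}
	return "0" + string(mag)
}

func main() {
	vals := []float64{99, -100, 0, -1, 12.5}
	sort.Slice(vals, func(i, j int) bool { return newKey(vals[i]) < newKey(vals[j]) })
	fmt.Println(vals)                       // numeric order under the new keys
	fmt.Println(oldKey(99) < oldKey(-100)) // the old encoding's inversion
}
```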
CharlesKWON	c5dd74c044	fix(pp): codeblock-in-macro + multi-line ;-continuation for #command Three silent-miscompile fixes in the preprocessor that were masking real bugs in Harbour-style PRG. 1. Brace tokenizer (compiler/pp/command.go) `{` and `}` now tokenize as standalone separator tokens. The matcher previously only split on `,()[]"'` etc., so a codeblock literal `{|| ... }` in a macro argument became the tokens `{||`, `""`, `}`. The capture-depth tracker only matched exact `{`/`}`, so `{||` was invisible as an opener while the standalone `}` wrongly decremented depth: `TEST_LINE( o:VarPut({|| "" }) )` truncated mid-argument and the parser later choked at the inner `}` with `expected ), got } "}"`. Fix: add `{` and `}` to tokenizeLine's separator set. Now `{|| ... }` lexes as `{`, `||`, `""`, `}` and balances cleanly. 2. ;-continuation join for non-`#` lines (compiler/pp/pp.go) The existing line-joiner only collapsed trailing `;` continuations on `#`-prefixed directives. Plain source code using the same convention, e.g. Harbour's TEST macro: TEST t004 STATIC s_once := NIL, S_C ; INIT hb_threadOnce( @s_once, {|| ... } ) ; CODE x := S_C was processed one physical line at a time, so the TEST pattern never matched the full logical statement. The first row passed through unrewritten, fell through to the parser as an expression, and gengo silently absorbed it as part of the previous function's body. Six TEST macros' STATIC declarations all ended up tagged with t003's function name, producing duplicate `static_T003_S_ONCE` decls and a Go compile failure. Fix: add the same trailing-`;` join logic to user code, with blank-line fillers inserted post-join so source line numbers in parser errors still align with the original file. 3. Block-comment-aware continuation join Inline `/* ... */` at the end of a continuation row hid the trailing `;` from the joiner's HasSuffix check. 
The fix calls stripBlockComments on the next-line peek before testing for `;`, so chains like AAdd( aResult, { cChildBase, ; aRefs[ "fk" ][ j ][ 1 ], ; /* child col */ aRefs[ "fk" ][ j ][ 3 ], ; /* parent col */ ... keep folding instead of stopping after one row and leaving a dangling `,` at end of line. Results ------- Harbour-core compat sweep: 25/30 → 28/30 (remaining lnlenli1 + keywords are //NOTEST stress files, intentionally unbalanced). All 6 release gates green: go test ./..., FiveSql2 43/43, Harbour compat 56/56, std.ch 17/17, FRB 7/7, examples 65/71. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 05:28:54 +09:00
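The join-with-fillers behaviour from fix 2 can be sketched as follows. This is an illustration rather than the code in compiler/pp/pp.go: `joinContinuations` is a hypothetical name, and the sketch treats every trailing `;` as a continuation, ignoring block comments, string literals, and `;` used as a statement separator.

```go
package main

import (
	"fmt"
	"strings"
)

// joinContinuations folds rows that end in ';' into one logical line and
// appends one blank filler per folded row, so the output has exactly as
// many lines as the input and later line numbers stay aligned.
func joinContinuations(lines []string) []string {
	out := make([]string, 0, len(lines))
	for i := 0; i < len(lines); {
		logical := strings.TrimRight(lines[i], " \t")
		joined := 0
		for strings.HasSuffix(logical, ";") && i+joined+1 < len(lines) {
			joined++
			next := strings.TrimSpace(lines[i+joined])
			head := strings.TrimRight(strings.TrimSuffix(logical, ";"), " \t")
			logical = head + " " + next
		}
		out = append(out, logical)
		for k := 0; k < joined; k++ {
			out = append(out, "") // filler keeps line numbering stable
		}
		i += joined + 1
	}
	return out
}

func main() {
	src := []string{"x := { 1, ;", "       2, ;", "       3 }", "? x"}
	for n, l := range joinContinuations(src) {
		fmt.Printf("%d: %q\n", n+1, l)
	}
}
```

Keeping the output the same length as the input is what lets a parser error on `? x` still report line 4.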
CharlesKWON	ce7b067785	fix(cli): multi-PRG build adds every input dir to the include path Each PRG file's preprocessor instance was set up with only its OWN directory on the include search path (`filepath.Dir(prgFile)`). That worked for self-contained files but broke any multi-file build where one PRG `#include`s a header that lives next to a SIBLING PRG: the other file's directory wasn't on the path, so the include silently failed and PP just skipped it ("// #include \"FiveSqlDef.ch\" — not found (skipped)"). This was the root cause behind test_sql_standards's mass-failure pattern. The test does #include "FiveSqlDef.ch" ... Assert( ..., h["columns"][1][1][1] == ND_FN .AND. ... ) `FiveSqlDef.ch` lives in `_FiveSql2/src/` (next to TSqlExecutor.prg and friends), but the test source sits in `_FiveSql2/test/`. Building with `./five build _FiveSql2/test/test_sql_standards.prg _FiveSql2/src/*.prg` should resolve the header from a sibling input file's directory, but only the test's own dir was searched, so ND_FN / ND_LIT / ND_BIN / ND_UNI all stayed undefined and the identifiers fell through to runtime memvar lookup, returning NIL. Every assertion that compared against the constants therefore silently failed (24 / 64 passing because non-constant assertions still worked). buildMultiPRGWithIncludes now seeds the user-include list with the directory of every input PRG before handing off to buildMultiPRG. A test under one directory can now resolve a `#include` that lives next to a sibling source file in the same multi-file build. Result: test_sql_standards goes from 24 / 64 to 64 / 64. The parser was already correct end-to-end: every SQL:2003-2023 construct it had been advertising actually worked; the test just couldn't read the constants it was asserting against. Wired test_sql_standards into the std.ch runner with a per-test override so it picks up the FiveSql2 src files. Suite stands at 17/17. Other gates green: go test ./... 
: PASS FiveSql2 SQL:1999 : 43/43 FiveSql2 standards : 64/64 (was 24/64) Harbour compat : 56/56 std.ch suite : 17/17 FRB suite : 7/7 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 19:21:45 +09:00
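Seeding the include path with every input file's directory could be sketched as below. `includeDirs` is a hypothetical helper, not the body of buildMultiPRGWithIncludes, and the file names in `main` are only the ones mentioned in the commit message.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// includeDirs returns the directory of every input file, de-duplicated
// and in first-seen order, so each preprocessor instance can search the
// directories of its sibling inputs as well as its own.
func includeDirs(prgFiles []string) []string {
	seen := make(map[string]bool)
	var dirs []string
	for _, f := range prgFiles {
		d := filepath.Dir(f)
		if !seen[d] {
			seen[d] = true
			dirs = append(dirs, d)
		}
	}
	return dirs
}

func main() {
	fmt.Println(includeDirs([]string{
		"_FiveSql2/test/test_sql_standards.prg",
		"_FiveSql2/src/TSqlExecutor.prg",
		"_FiveSql2/src/TFiveSQL.prg",
	}))
}
```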
CharlesKWON	af0d54d352	fix(lexer): {array}[index] no longer mis-tokenises [ as bracket-string The lexer's isStringBracket disambiguator decides whether `[` opens an indexing operator or a Harbour bracket-string literal. The heuristic checks the previous token's kind and treats the bracket as indexing only when preceded by an IDENT, RPAREN, RBRACKET, or a literal. RBRACE was missing — so FieldPut(3, {"Kim","Lee","Park","Choi","Yoon"}[Int(Mod(i-1,5))+1]) tokenised the `[` after `}` as a bracket-string opener, swallowed through the first `]` it found, and produced bogus parse errors ("expected ), got STRING …"). RBRACE is now in the indexing-context set, so an inline array-literal followed by `[index]` works. Surfaced by the examples/ build sweep — fixed test_all_rdd, test_index_adv, test_multi_rdd, test_rdd_full all in one go. The sweep itself is committed as tests/examples_build.sh — builds every PRG under examples/ and reports any compiler / preprocessor errors. Run it after compiler changes to catch regressions in broad-coverage user-style code that the focused suites don't exercise. Current sweep state: 65 / 71 examples build cleanly. The remaining 6 failures are all #pragma BEGINDUMP blocks that import external Go packages (http, websocket, sqlite, time) — not Five-side bugs. Other gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 std.ch suite : 16/16 FRB suite : 7/7 examples build : 65/71 (rest = external Go deps) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 21:06:13 +09:00
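The disambiguation rule reduces to a membership test on the previous token's kind. The sketch below uses hypothetical token-kind constants and a hypothetical `opensIndex` helper; it is not the lexer's isStringBracket itself, only the indexing-context set the commit describes, with RBRACE included.

```go
package main

import "fmt"

// Kind is a hypothetical token-kind enum for this illustration.
type Kind int

const (
	IDENT Kind = iota
	RPAREN
	RBRACKET
	RBRACE
	LITERAL
	ASSIGN
	COMMA
)

// opensIndex reports whether a '[' following a token of kind prev should
// be read as an indexing operator rather than a bracket-string opener.
func opensIndex(prev Kind) bool {
	switch prev {
	case IDENT, RPAREN, RBRACKET, RBRACE, LITERAL:
		return true
	}
	return false
}

func main() {
	fmt.Println(opensIndex(RBRACE)) // `{...}[i]` indexes the array literal
	fmt.Println(opensIndex(ASSIGN)) // `x := [text]` is a bracket string
}
```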
CharlesKWON	2008266da7	feat(pp,rtl): Tier 2 audit followups: JOIN hash + PP validation + C heuristic Three medium-priority audit items in one commit, each independently revertible. * #18 JOIN hash-join fast path. New std.ch shape: JOIN WITH <alias> TO <file> [FIELDS ...] ON <mfield> = <dfield> expands to a 6-arg __dbJoin call with the master/detail key field names. Runtime detects the extra args, builds an O(M) hash over the detail's key column, then probes per master row for O(N+M) total, vs the FOR form's O(N*M). For 1k×1k that's 2k vs 1M operations; the gap widens with N. The original FOR form is unchanged and stays the fallback for arbitrary predicates. New helper dbHashKey type-tags the key string so `1` (numeric), `"1"` (string), and `.T.` (logical) don't collide in the bucket map. * #38 PP rule result-marker validation. ParseRule now walks the result template after parseMarkers and warns about every `<name>` (or `<(name)>` / `<.name.>` / `<{name}>` / `#<name>` / `<"name">`) that doesn't match a pattern marker. Warnings flow into pp.errors via handleDirective with the directive's filename:line, so a typo'd `<NaMe>` in an `#xcommand` case-sensitive rule fails the build with a clear diagnostic instead of silently producing broken expansions. * #44 looksLikeInlineC heuristic strengthened. Catches more of the common Harbour-PRG-with-C-inline-block shapes that used to fall through and produce cryptic Go-side errors: function-like #define, `extern "C"` linkage blocks, C return-type declarations (`int foo(`, `static char* bar(`), and the hb_ret*() helper family used by Harbour's C FFI return setters. Two small predicate helpers (allLetters, allIdentChars) keep the C-vs-Go disambiguation tight enough that legit Go code (`func name() int { ... }`) doesn't trip. * #28 LIST/DISPLAY pagination: explicitly deferred. Proper pagination requires interactive terminal handling (Inkey(0) for the keypress) which would hang in CI / batch mode. 
Will revisit when an interactive terminal layer needs it for other reasons. Test fixtures: tests/std_ch/test_join_hash.prg verifies the new ON-form path produces the same output as the FOR form would. std.ch runner now stands at 16/16. Other gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 std.ch suite : 16/16 FRB suite : 7/7 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 19:21:19 +09:00
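The build-then-probe structure of the hash join, including the type-tagged key, can be sketched in Go. This is a simplified model: `Row`, `hashKey`, and `hashJoin` are hypothetical names, rows are plain maps, and the `%T:%v` tag is just one way to keep numeric `1` and string `"1"` apart; the runtime's dbHashKey may tag differently.

```go
package main

import "fmt"

// Row is a hypothetical record: field name to value.
type Row map[string]any

// hashKey prefixes the value with its dynamic type so that 1, "1" and
// true land in different buckets.
func hashKey(v any) string { return fmt.Sprintf("%T:%v", v, v) }

// hashJoin builds a bucket map over the detail rows (O(M)), then probes
// it once per master row (O(N)), for O(N+M) plus the output size.
func hashJoin(master, detail []Row, mField, dField string) [][2]Row {
	buckets := make(map[string][]Row, len(detail))
	for _, d := range detail {
		k := hashKey(d[dField])
		buckets[k] = append(buckets[k], d)
	}
	var out [][2]Row
	for _, m := range master {
		for _, d := range buckets[hashKey(m[mField])] {
			out = append(out, [2]Row{m, d})
		}
	}
	return out
}

func main() {
	master := []Row{{"id": 1, "name": "Kim"}, {"id": "1", "name": "Lee"}}
	detail := []Row{{"cust": 1, "item": "pen"}}
	fmt.Println(len(hashJoin(master, detail, "id", "cust")))
}
```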
CharlesKWON	29ca02e1bc	fix(genpc,parser,pcinterp): pcode wider regression sweep (Tier 1 #3 ) Six more silent miscompiles in the pcode path, all uncovered by a new pcode regression sweep that exercises the full PRG surface a dynamic FrbCompile body could legitimately use. * xBase-keyword shadowing of variable names. parseIdentStmt and parseExprStmt's fallback switches consumed an entire line when the leading IDENT matched LABEL / REPORT / ACCEPT / INPUT / NOTE / etc. Those words are also extremely common LOCAL / PRIVATE names — `LOCAL label ; label := "x"` had the assignment swallowed because the switch didn't peek at the next token. Both switches now look at peek(1): an assignment operator, [], (, -, ++, --, or `.` means it's a variable / call / member access, not the xBase command, and we fall through to expression parsing. Real silent bug — bit test_frb_pcode_sweep's `LOCAL label` declaration. * `arr[i]` indexing not implemented in genpc. ast.IndexExpr fell through to the default PushNil path, so any indexed read in a pcode-mode body returned NIL. New case emits the array, the index, and PcOpArrayPush (the get-op; PcOpArrayPop is the set-op — naming follows Harbour convention). Hashes go through the same opcode, which already special-cases IsHash() in ops_collection.go. * Hash literals not implemented in genpc + dispatch missing in pcinterp. `{ "k" => v, ... }` fell to PushNil. Added HashLitExpr emit (Push key, Push value pairs, then PcOpHashGen with count). Also wired up the PcOpHashGen dispatch in execPcodeBody — it had been declared in pcode.go since the initial design but the case statement was never added, so even hand-written modules couldn't use hashes. * `x++` / `x--` postfix were silent no-ops. PostfixExpr fell to PushNil and the surrounding ExprStmt then popped the NIL. DO WHILE loops with `n--` couldn't terminate; FOR loops with `i++` in the body were broken too. New case: PushLocal + LocalAddInt(±1). * BlockExpr (`{\|p\| body }`) wasn't compiled. 
Eval(b, n) inside a pcode body returned NIL. Added: build the body in a sub-codebuffer with the block's params occupying its locals, emit PcOpRetValue at the end, then PushBlock with the serialized bytes. Format extended with a uint16 nParams field so the runtime's PcOpPushBlock dispatch can set PcodeFunc.Params correctly; without it, ExecPcode's Frame(0, 0) pulled none of Eval's args and the block saw every parameter as NIL. * All g.locals accesses were case-sensitive. PRG is case-insensitive, but the pcode generator stored block params via strings.ToUpper while every other lookup site (function decl, mid-decl, ForStmt, IdentExpr read, AssignExpr write, PostfixExpr) used the raw .Name. So `{|x| x*x }` stored "X" but read "x" and missed. Normalized: all insertions and all lookups now go through strings.ToUpper. * SeqExpr in pcode: added the matching emit for comma-separated expression lists in code blocks (`{|| a, b, c }`). Same shape as the gengo SeqExpr case from Wave 1. Test fixture: tests/frb/test_frb_pcode_sweep.prg covers 14 shapes (string ops, arithmetic, comparison chains, array indexing, DO WHILE with postfix, nested IF, IIf, hash literal + indexing, block + Eval, character iteration). All 14 pass. Wired into the FRB runner; suite now stands at 7/7. Other gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 std.ch suite : 15/15 FRB suite : 7/7 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 11:32:38 +09:00
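The case-normalisation fix amounts to routing every insertion and every lookup through the same key function. A minimal sketch, assuming a hypothetical `locals` table in place of the generator's `g.locals`:

```go
package main

import (
	"fmt"
	"strings"
)

// locals maps upper-cased variable names to frame slot indices.
type locals struct{ slots map[string]int }

func newLocals() *locals { return &locals{slots: make(map[string]int)} }

// declare assigns the next free slot; redeclaring returns the old slot.
func (l *locals) declare(name string) int {
	key := strings.ToUpper(name)
	if idx, ok := l.slots[key]; ok {
		return idx
	}
	idx := len(l.slots)
	l.slots[key] = idx
	return idx
}

// lookup normalises the name the same way declare does.
func (l *locals) lookup(name string) (int, bool) {
	idx, ok := l.slots[strings.ToUpper(name)]
	return idx, ok
}

func main() {
	tbl := newLocals()
	tbl.declare("x")
	fmt.Println(tbl.lookup("X")) // same slot regardless of spelling
}
```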
CharlesKWON	dca7bb22e5	fix(gengo): count nested LOCALs into the function frame Function-entry Frame() allocation counted only top-level LOCAL declarations from fn.Body. Mid-function LOCALs hidden inside an IF / FOR / WHILE / DO CASE / SWITCH / SEQUENCE block weren't included, so the runtime allocated a frame too small to hold them. Subsequent reads/writes via PopLocalFast / PushLocalFast / LocalAdd to those slot indices then either silently scribbled past the frame (read-back saw NIL) or panicked with "local variable index out of range" once the index exceeded the underlying slice. This is the underlying bug behind frb_demo Section 4 — the `LOCAL ch := Channel(1)` declared inside `IF pAsync != NIL` got slot N+1 from the codegen but the runtime only allocated N. The Channel value was scribbled past the frame, ChReceive then read NIL from a non-existent slot, and the goroutine's ChSend(49) had nowhere to land. New helper gen_util.go::countLocalsInStmts walks every nested body (IF + ElseIfs + ElseBody, ForStmt, ForEachStmt, DoWhileStmt, SeqStmt's Body + RecoverBody, SwitchStmt's Cases + Otherwise) and totals every ScopeLocal VarDecl. The function-emit caller adds this to the top-level count before sizing the Frame. Test fixture (tests/frb/test_frb_goroutine.prg) reproduces the demo Section 4 shape — `LOCAL ch := Channel(1)` inside IF, then `Go("WORKER", ch, 7)`, then ChReceive(ch). Wired into the FRB runner so it stands at 6/6. Other gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 std.ch suite : 15/15 FRB suite : 6/6 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 07:05:22 +09:00
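The frame-sizing walk can be illustrated with a toy AST. This sketch does not use the compiler's real node types: `Stmt` is a hypothetical struct in which `Locals` is the number of LOCAL variables a statement declares and `Bodies` holds its nested statement lists (IF branches, loop bodies, and so on).

```go
package main

import "fmt"

// Stmt is a hypothetical, minimal statement node.
type Stmt struct {
	Locals int      // LOCAL variables declared by this statement itself
	Bodies [][]Stmt // nested bodies: IF/ELSEIF/ELSE, loops, SEQUENCE, ...
}

// countLocals totals LOCAL declarations at every nesting depth, so the
// function frame can be sized before any slot index is handed out.
func countLocals(stmts []Stmt) int {
	n := 0
	for _, s := range stmts {
		n += s.Locals
		for _, body := range s.Bodies {
			n += countLocals(body)
		}
	}
	return n
}

func main() {
	// LOCAL a, b at top level; an IF whose body declares c, d, e, f.
	fn := []Stmt{
		{Locals: 2},
		{Bodies: [][]Stmt{{{Locals: 4}}}},
	}
	fmt.Println(countLocals(fn))
}
```

Counting only the top-level statement here would size the frame at 2 while the code generator hands out slot indices up to 6, which is the mismatch the commit describes.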
CharlesKWON	6a30c4e50e	fix(gengo): compound assign for non-LOCAL LHS Audit follow-up after Wave 1's pcode `+=` fix surfaced a parallel class of silent miscompiles in the gengo (native-Go) emit path. Three real bugs hiding behind happy-path test coverage: * `arr[i] += x` was ASSIGN-only — the IndexExpr branch returned after emitting `arr[i] := x`, dropping the original element. Now: PushArray + Push index, ArrayPush to read, fold with RHS, re-do PushArray + index, ArrayPop to store. * `alias->field += x` (and the M-> / MEMVAR-> namespace variants) were ASSIGN-only too. Same shape of bug — `x->v += 7` compiled as `x->v := 7`. Compound branch reads via PushAliasField (or PushMemvar for M->), folds, stores via SetAliasField (or PopMemvar). * PRIVATE / PUBLIC mid-function declarations were treated as extra LOCAL slots. emitMidVarDecl extended `locals` past the function's declared count and emitted `PopLocalFast(idx)` for the init. The slot didn't exist at runtime, so the init either silently scribbled past the frame (small N) or panicked with "local variable index out of range" once exercised. New logic: PRIVATE/PUBLIC declarations bypass the locals table and emit `PopMemvar(name)` for the init expression. The runtime auto- creates the memvar. * Memvar assignment fallback. After the LOCAL/STATIC checks miss in emitAssign, the bottom path used to be a one-line WARN that emitted RHS + `Pop()` — silently discarding the value. PRIVATE pSum stayed at its initial value forever. Now: ASSIGN goes through PopMemvar; compound forms read via PushMemvar, fold, write back via PopMemvar. Test fixture (tests/std_ch/test_compound_lhs.prg) covers all four shapes. The std.ch runner picks it up so the regression suite now stands at 15/15. Other gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 std.ch suite : 15/15 FRB suite : 5/5 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 05:14:28 +09:00
CharlesKWON	efb615bed9	fix(frb,genpc): in-process compile + 4 pcode bugs Compiling _FiveSql2/test/test_sql_extreme.prg + a sweep of the FRB demos surfaced four real bugs in the dynamic-compilation pipeline. All fixes shipped together because they were on the same critical path; each is independently revertible. * pcode FOR loop ignored STEP and direction. emitFor in compiler/genpc emitted a fixed `<= to` comparison and a hardcoded `+1` increment, then deleted the actual step expression with slice arithmetic on the byte buffer. Result: `FOR 5 TO 1 STEP -1` exited on the first iteration; `FOR 1 TO 10 STEP 2` summed 1..10 (55) instead of 1+3+5+7+9 (25). Rewritten to mirror gengo's emitFor: detect negative step from a literal `-N` or unary MINUS, pick `<=` vs `>=` accordingly, and emit a clean `var := var + step` increment per iteration. * pcode compound `+=` operator stored only the RHS. emitAssign looked at AssignExpr.Op only for the := case; +=/-=/etc. silently took the same path, so `n += i` compiled as `n := i`, discarding the accumulator. Loop reduces were wrong: `Reverse` returned "" and `n := 0; FOR i ... n += i; NEXT` returned only the last increment. New compoundBinOp helper maps PLUSEQ / MINUSEQ / STAREQ / SLASHEQ / PERCENTEQ / POWEREQ to their matching binary opcode; emitAssign emits `local + rhs ; pop local` for compound forms. * Pcode body stack leaks polluted the caller's frame. A pcode function whose body left intermediate values on the data stack (FOR control values, etc.) returned with extra entries past its declared retVal. FrbDoFunc / FrbExecFunc / FrbRunFunc then pushed retVal on top of those leaks, so the caller saw the leaked values where its own preceding arguments should have been: `? "Fibonacci(10) =", FrbDo(...), "(expect 55)"` printed `1 55 (expect 55)` because the FOR loop's `1` lived in arg-1's slot. 
Two new Thread methods (`SP()` / `SetSP(int)`) let the three FRB dispatchers snapshot stack depth before the inner call and clamp it back afterward, so the leaks evaporate before they reach the caller's frame. * FrbExec / FrbRun recursed into the host's Main forever. Both looked up "MAIN" via t.VM().FindSymbol, which always resolved to the OUTER program's Main since FRB modules deliberately keep Main local. Compile + run + unload became compile + recurse + OOM. Both now look up Main via mod.FindFunc("MAIN") (module scope) — Frbload's policy of leaving Main module-local now actually has the intended effect. Plus an architectural improvement: in-memory compilation no longer depends on shelling out to an external `five` binary. New hbrtl.frbCompileInProc parses + preprocesses + generates pcode in process, building a FrbModule directly. FrbCompile and FrbExec use this exclusively, which means dynamic compilation works from any directory regardless of PATH and without a second process. The plugin-mode path (with its runtime-version-mismatch fragility) is left available via hbrt.FrbCompileSource for callers that want it, but FrbCompile no longer reaches for it by default. Test suite: tests/frb/ holds five fixtures + a runner. 5/5 pass: test_frb_simple / test_frb_pcode_load / test_frb_compile / test_frb_loop / test_frb_step. Other gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 std.ch suite : 14/14 FRB suite : 5/5 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 10:25:35 +09:00
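The FOR/STEP fix above can be illustrated with a small Go sketch. `forSum` is a hypothetical helper, not part of the compiler; it picks the exit test from the sign of the step at run time, whereas the commit describes detecting a negative step from the literal at compile time.

```go
package main

import "fmt"

// forSum mimics `FOR i := from TO to STEP step` and returns the sum of the
// visited values. A negative step flips the exit test from <= to >=.
func forSum(from, to, step int) int {
	if step == 0 {
		return 0 // guard for this sketch: a zero step would never terminate
	}
	sum := 0
	for i := from; (step > 0 && i <= to) || (step < 0 && i >= to); i += step {
		sum += i
	}
	return sum
}

func main() {
	fmt.Println(forSum(1, 10, 2))  // visits 1, 3, 5, 7, 9
	fmt.Println(forSum(5, 1, -1))  // visits 5, 4, 3, 2, 1
	fmt.Println(forSum(1, 10, 1))  // visits 1..10
}
```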
CharlesKWON	3ce0eceed5	fix(pp): apply rules to every ;-separated statement on a line Until now applyRules looked at the first token of each physical line. PRG legitimately packs multiple statements on a single line with `;` as an intra-line separator (e.g. `dbCommit(); CLOSE ALL`), and after Wave 1 removed the parser's xBase fallback for CLOSE/ COMMIT/etc., a `;`-separated `CLOSE ALL` on a line that started with another statement would slip past std.ch entirely. The parser then saw `CLOSE` / `ALL` as IDENTifiers, the runtime tried to dispatch `CLOSE` as a function, and the user got a "no function symbol for call" panic at execution time. Fix: at applyRules entry, check for top-level `;` (paren / bracket / brace / string-literal balanced), split the line into statement segments, recursively apply rules to each, rejoin with `;`. Two new helpers (`hasTopLevelSemi` / `splitTopLevelSemi`) keep the balancing logic small and self-contained. Found by compiling _FiveSql2/test/test_sql_extreme.prg, which packs the typical xBase one-liner DBF setup `dbAppend(); FieldPut(...); ...; dbCommit(); CLOSE ALL` across many rows of test data. The test was panicking at the first such line; with this fix it now runs to completion: 15/15 PASS. All FiveSql2 SQL tests green together for the first time: test_sql1999 : 43/43 test_sql1999_hard : 10/10 test_sql_extreme : 15/15 test_sql_challenge : 15/15 -- 83 / 83 Other gates green: go test ./... : PASS Harbour compat : 56/56 std.ch suite : 14/14 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 08:27:47 +09:00
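A rough Go sketch of a balanced top-level `;` split, along the lines of what the commit describes. The function name matches the commit, but the body is an assumption written for illustration, not the code in pp.go; it treats `"` and `'` as string delimiters and does not handle bracket-strings.

```go
package main

import (
	"fmt"
	"strings"
)

// splitTopLevelSemi splits line on ';' characters that sit outside
// parentheses, brackets, braces and quoted string literals.
func splitTopLevelSemi(line string) []string {
	var parts []string
	depth, start := 0, 0
	var quote byte // 0 while outside a string literal
	for i := 0; i < len(line); i++ {
		c := line[i]
		switch {
		case quote != 0:
			if c == quote {
				quote = 0
			}
		case c == '"' || c == '\'':
			quote = c
		case c == '(' || c == '[' || c == '{':
			depth++
		case c == ')' || c == ']' || c == '}':
			if depth > 0 {
				depth--
			}
		case c == ';' && depth == 0:
			parts = append(parts, strings.TrimSpace(line[start:i]))
			start = i + 1
		}
	}
	return append(parts, strings.TrimSpace(line[start:]))
}

func main() {
	fmt.Printf("%q\n", splitTopLevelSemi(`dbCommit(); CLOSE ALL`))
	fmt.Printf("%q\n", splitTopLevelSemi(`? "a;b"; Foo( 1 )`))
}
```

Each returned segment would then be passed through the rule matcher on its own and the results rejoined with `;`.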
CharlesKWON	412351b67d	feat(rtl): LIST/DISPLAY TO FILE — text output redirection Wire up TO FILE for both LIST and DISPLAY: __dbList grows a 9th parameter cFile, opens it (truncating any prior content) when non- empty, and writes the formatted rows there via fmt.Fprintln. Default behavior (no TO FILE) still goes to stdout. std.ch gets two new rules placed before the regular LIST/DISPLAY patterns so they win when TO FILE is present: LIST [<v,...>] TO FILE <(f)> [OFF] [FOR] [WHILE] [NEXT] ... DISPLAY [<v,...>] TO FILE <(f)> [OFF] [FOR] [WHILE] [NEXT] ... Open failure raises a clear *HbError ("LIST/DISPLAY TO FILE: cannot create <path> — <syscall reason>") so callers know exactly what went wrong instead of getting partial-or-empty output. TO PRINTER stays rejected via __dbNotImpl — Five doesn't drive a printer port. Test coverage: tests/std_ch/test_list_to_file.prg exercises four shapes (full LIST, single-row DISPLAY, OFF + FOR with explicit fields, and confirms TO PRINTER still raises). Wired into the std.ch runner so the regression suite now stands at 14/14. Gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 std.ch suite : 14/14 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 08:15:32 +09:00
CharlesKWON	3a7f1dea72	feat(rtl,tests): pre-release UX round (Wave 5) Three audit findings around polish + a release-readiness commit: * #UX1 LIST/DISPLAY output: dropped \r\n (unix terminals showed a stray ^M), moved the newline to AFTER each row (no more leading blank line), and added the `*` deleted-record marker after the record number — matches xBase LIST/DISPLAY convention. With SET DELETED ON the marker is unreachable since the row would have been skipped at Area.Skip level; with SET DELETED OFF the user now sees which rows are tombstoned. * #26 temp aliases: `__copytmp` / `__sorttmp` / `__totaltmp` / `__jointmp` were process-global string constants. A nested invocation (e.g., COPY inside a FOR clause whose expression runs another COPY) collided on the alias and the inner Open failed with "alias already in use" — surfacing as `.F.` with no clear cause. Each Open now goes through a new helper `nextTmpAlias(prefix)` backed by an atomic counter, so every call gets `__copytmp_1`, `__copytmp_2`, etc. — no collisions. * #J test coverage gap: the 13 std.ch regression tests were all sitting in `/tmp` — lost on tmpfs reboot, never in git, never in CI. Move them into `tests/std_ch/` and add a simple `run.sh` runner that builds + executes each one in a temp scratch directory and grep-asserts on FAIL / NOT REJECTED / expectation-mismatch markers. 13/13 pass against the current head: PASS test_pp_stdch PASS test_count PASS test_sum_avg PASS test_sum_multi PASS test_copy PASS test_sort PASS test_list PASS test_total PASS test_join PASS test_update PASS test_set_deleted PASS test_unsupported PASS test_block_comma test_block_comma in particular guards the gengo SeqExpr fix from Wave 1 — without it the comma-in-block miscompile would silently come back. Gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 std.ch suite : 13/13 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 08:07:50 +09:00
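The counter-backed alias helper can be sketched in a few lines of Go. The name `nextTmpAlias` comes from the commit; the body and the `tmpAliasSeq` variable are assumptions for illustration.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// tmpAliasSeq is a process-wide sequence shared by all temp-alias prefixes
// (an assumption of this sketch; per-prefix counters would work as well).
var tmpAliasSeq uint64

// nextTmpAlias returns a fresh alias such as "__copytmp_1". The atomic add
// keeps concurrent and nested callers from ever receiving the same name.
func nextTmpAlias(prefix string) string {
	return fmt.Sprintf("%s_%d", prefix, atomic.AddUint64(&tmpAliasSeq, 1))
}

func main() {
	outer := nextTmpAlias("__copytmp")
	inner := nextTmpAlias("__copytmp") // nested COPY gets a distinct alias
	fmt.Println(outer, inner)
}
```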
CharlesKWON	1a9e509ee2	perf(rtl): SORT TO swaps insertion sort for sort.SliceStable (Wave 4) Drop the toy O(n²) insertion-sort that __dbSort had been using and delegate to the stdlib's sort.SliceStable. Reasoning: SORT TO is an operation a user reaches for because their dataset is too big to just iterate manually — interactive DBFs routinely have 10k–1M rows, which the old impl would chew on for minutes to hours. SliceStable gives O(n log n) and preserves the original-input ordering for equal keys, which is what the previous implementation also tried to do. The function signature is unchanged (`stableSort(rows, less)`), so all the multi-key / /D / /C dispatch logic from earlier waves keeps working unmodified. Gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 08:03:13 +09:00
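For readers unfamiliar with the stdlib call, here is a small Go example of `sort.SliceStable` keeping input order for equal keys. The `row` type and `stableSortByName` are hypothetical and only stand in for the buffered DBF rows.

```go
package main

import (
	"fmt"
	"sort"
)

// row is a hypothetical stand-in for a buffered source record.
type row struct {
	Name string
	Seq  int // original input position
}

// stableSortByName orders rows by Name in O(n log n); rows with equal
// names keep their original relative order.
func stableSortByName(rows []row) {
	sort.SliceStable(rows, func(i, j int) bool {
		return rows[i].Name < rows[j].Name
	})
}

func main() {
	rows := []row{{"b", 1}, {"a", 2}, {"b", 3}, {"a", 4}}
	stableSortByName(rows)
	fmt.Println(rows)
}
```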
CharlesKWON	5b1d3fb32f	feat(pp,rtl): pre-release accuracy round (Wave 3) Four audit findings around correctness/consistency in std.ch and the SORT/UPDATE/TOTAL handlers: * #13: TOTAL/UPDATE key idiom inconsistency documented as inherent. TOTAL evaluates `<key>` only in the source workarea so verbatim `<{key}>` (alias-qualified or `_FIELD->`-prefixed by the user) works. UPDATE evaluates the same block in BOTH master and detail context, so it must wrap as `_FIELD-><key>` to dispatch to whichever WA is selected at eval time. The two rules look alike but their evaluation contexts differ — also documented in std.ch alongside both rules so the asymmetry isn't a surprise. Plus: TOTAL TO and ON are now mandatory (matching the COUNT/ UPDATE pattern from Wave 1) — bare TOTAL would have produced broken syntax via the unconditional `<(f)>`/`<{key}>` template references. * #15/#16: SDF / DELIMITED variants of COPY and TO PRINTER / TO FILE variants of LIST / DISPLAY are now matched by stub rules (placed before the regular rules so they win) that expand to a new `__dbNotImpl(reason)` RTL primitive raising a clear `&hbrt.HbError`. BEGIN SEQUENCE / RECOVER catches the panic, so callers get a real error instead of the previous silent dispatch-to-regular-DBF-copy. * #19: SORT /C (case-insensitive) now actually folds case before the string compare, instead of being silently treated as ascending. Suffix parser also rebuilt as a multi-letter scanner so `name/CD`, `name/DC`, `name/C/D`, `name/D/C` all parse the same way — combine /C and /D freely. Unknown suffix letters (e.g., `name/X`) leave the suffix attached to the field name so a stray slash in user input doesn't get silently mangled into a broken field reference. * #27 SET DELETED: verified with a regression test that `SET DELETED ON` causes COUNT/COPY (and by extension SORT/TOTAL/JOIN/UPDATE — all of which iterate via Area.Skip) to skip rows marked deleted. 
The filtering is implemented at the workarea level (skipFilter in dbf.go honors hbrdd.IsSetDeleted) so no RTL changes were needed; this commit just adds the coverage so the behavior doesn't silently regress. Gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 08:01:42 +09:00
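The multi-letter suffix scanner for SORT keys might look roughly like the Go sketch below. `parseSortKey` is a hypothetical name and the exact rules (for instance accepting `/A`) are assumptions drawn from the commit text, not from the actual RTL source.

```go
package main

import (
	"fmt"
	"strings"
)

// parseSortKey splits "name/CD" style keys into the field name plus flags.
// Letters may be combined or repeated with extra slashes. An unknown
// letter leaves the whole key untouched so the slash is not mangled away.
func parseSortKey(key string) (field string, desc, foldCase bool) {
	i := strings.IndexByte(key, '/')
	if i < 0 {
		return key, false, false
	}
	for _, r := range strings.ToUpper(key[i+1:]) {
		switch r {
		case 'A', '/':
			// ascending is the default; '/' only separates letters
		case 'D':
			desc = true
		case 'C':
			foldCase = true
		default:
			return key, false, false
		}
	}
	return key[:i], desc, foldCase
}

func main() {
	for _, k := range []string{"name/CD", "name/D/C", "name/X", "name"} {
		f, d, c := parseSortKey(k)
		fmt.Printf("%-9s field=%q desc=%v fold=%v\n", k, f, d, c)
	}
}
```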
CharlesKWON	f30704a854	fix(rtl,pp): pre-release safety round (Wave 2) Five concrete gaps the audit flagged in the new __dbCopy / __dbSort / __dbTotal / __dbJoin / PP code: * wam.Close() errors were dropped on the floor. Caller saw `.T.` even when the just-written DBF wasn't durable, leading to the classic "delete the source after the COPY succeeds" data-loss pattern. All four functions now capture the close error and return `.F.` if it fired. * drv.Create succeeded → wam.Open failed → orphaned-on-disk DBF. The user-named target file was left around with zero records, and the next call's drv.Create silently truncated it instead of surfacing the original error. Add `os.Remove(cFile)` on the Open-failure cleanup path for COPY/SORT/TOTAL/JOIN. * __dbTotal would write the DBF codec's overflow sentinel (`****`) into the destination's sum-fields when a group total didn't fit in the source's declared field width, and still return `.T.`. Now: precompute each sum-field's max representable magnitude (10^(Len-Dec)) at start, mark the run as overflowed if any flush sees an out-of-range or NaN value, and propagate `.F.` to the caller so they don't trust the file. * cleanUnreferencedMarkers walked byte-by-byte and stripped any `<ident>` token in the result, INCLUDING ones that appear inside `"..."` / `'...'` string literals. A user expression like `LIST FOR url == "<a>x</a>"` got the `<a>` and `</a>` eaten on output. Now: track string-literal state and skip the cleanup pass while inside one. Bracket-strings `[…]` are intentionally not treated as strings here — the result template uses `[...]` as the optional-repeat marker, and disambiguating needs context the cleanup pass doesn't have. * (#8 SET SAFETY honoring) deferred. Harbour default is SAFETY OFF, so the current always-overwrite behavior matches default Harbour. The divergence only matters when user explicitly does `SET SAFETY ON`, which Five doesn't support yet — so the no-overwrite-protection is consistent end-to-end. 
Tracked as a separate followup. Gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 07:54:41 +09:00
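The __dbTotal overflow guard can be sketched in Go as follows. It uses the bound exactly as stated in the commit message, 10^(Len-Dec); the function names are hypothetical, and whether the real code also reserves a position for the decimal point or sign is not something this sketch assumes.

```go
package main

import (
	"fmt"
	"math"
)

// maxMagnitude returns the exclusive upper bound for a numeric field with
// the given declared length and decimals, per the commit: 10^(Len-Dec).
func maxMagnitude(length, dec int) float64 {
	return math.Pow(10, float64(length-dec))
}

// fitsField reports whether v can be stored without hitting the codec's
// overflow sentinel. NaN and infinities never fit.
func fitsField(v float64, length, dec int) bool {
	if math.IsNaN(v) || math.IsInf(v, 0) {
		return false
	}
	return math.Abs(v) < maxMagnitude(length, dec)
}

func main() {
	fmt.Println(fitsField(999.99, 5, 2))  // below 10^3
	fmt.Println(fitsField(1000, 5, 2))    // reaches the bound
	fmt.Println(fitsField(math.NaN(), 5, 2))
}
```

A caller would mark the whole run as overflowed the first time `fitsField` returns false on a flush and return `.F.` at the end.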
CharlesKWON	000500e034	fix(pp,parser,gengo): pre-release blocker round (Wave 1) Six audit-driven blockers landed together because they're tangled: * MENU TO removed from std.ch — the rule expanded to a call to a nonexistent __MenuTo() RTL symbol, so any user code with `MENU TO choice` compiled clean and panicked at runtime. Behavior pre-this-round was a parser silent no-op, which is at least consistent. Restore that until @ PROMPT (the companion command) actually lands. * COUNT now requires `TO <var>`. The earlier `[TO <v>]` optional bracket was a Harbour-pattern transcription error: the result template references `<v>` unconditionally, so a bare `COUNT` expanded to ungrammatical ` := 0 ; dbEval(...)` and the PRG parser rejected it. Match Harbour's std.ch which makes TO mandatory. * UPDATE FROM ... REPLACE now requires `FROM`/`ON`/`REPLACE` all three. Same root cause as COUNT: the result template uses `<key>`, `<f1>`, `<x1>` unconditionally; missing any of them produced broken syntax. Tightened to fail loudly rather than silently mis-expand. * CLOSE <unknown_alias> no longer closes the current workarea. SelectByAlias was a silent no-op when the alias was missing, leaving WASaveAndSelectAlias to evaluate the inner DbCloseArea() against the originally-selected WA — a real data-loss footgun. SelectByAlias now returns bool; WASaveAndSelectAlias switches to the no-area sentinel (0) on miss so the inner expression's Current() returns nil and short-circuits. * SUM <x1>, <xN> TO <v1>, <vN> — multi-pair form supported. Required two pieces: 1. matchSegment's regular-marker stop-boundary now combines outerTail literals AND the segment's repeat boundary so `[, <xN>]` doesn't let `<xN>` swallow past the next ','. 2. Five parser miscompiled comma-separated expressions in code blocks. `{\|\| e1, e2, e3 }` kept only the last expr and threw away earlier ones at AST level, so all their side effects vanished. 
New SeqExpr AST node + emitter (emit each, pop intermediate results) + folding/walk updates fix the underlying bug, which also unbreaks any other block that relied on comma sequencing. * pp.go's `;` continuation joiner now strips exactly one trailing `;` per iteration, preserving Harbour's `;;` convention (literal `;` followed by a continuation marker). Without this the SUM rule's chained `<v1> :=[ <vN> :=] 0 ; ; dbEval(...)` collapsed to a missing statement separator. * parseExprStmt's xBase fallback switch is back in sync with parseIdentStmt — COPY/SORT/COUNT/SUM/AVERAGE/TOTAL/UPDATE/JOIN/ DISPLAY/LIST removed (std.ch handles all of them now). Leaving them in the fallback masked typos as silent no-ops. Gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 07:45:20 +09:00
CharlesKWON	e79ced2e0c	docs: log PP/std.ch round + LABEL/REPORT deferred Record the 9-commit Phase B run that landed Harbour-style #command rewrites for ERASE/RENAME/CLOSE/COMMIT/UNLOCK/LOCATE/CONTINUE/ REINDEX/PACK/ZAP/KEYBOARD/RUN plus COUNT/SUM/AVERAGE/COPY/SORT/ LIST/DISPLAY/TOTAL/JOIN/UPDATE — 13 commands that were silent no-ops in the parser before this round. Also catalog the 14 PP completeness fixes the rules surfaced (partial-pattern false-match, blockify substitution, list-aware smart-stringify and blockify, MarkerList/MarkerWordList in optional clauses, multi-delimiter capture, line-continuation in directives, no-progress iteration leak, unreferenced logify/blockify cleanup, nested `[...]`). LABEL / REPORT explicitly deferred — niche xBase output-formatting engines whose `.lbl` / `.frm` binary readers and pagination/group machinery would be ~800–1500 LOC for near-zero modern users. Parser keeps the silent no-op behavior for both keywords; entry points documented in OPTIMIZATION_TODO.md if a real demand ever appears. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 17:52:30 +09:00
CharlesKWON	80a18daf8d	feat(pp): UPDATE FROM via std.ch + nested-bracket fix in matchSegment `UPDATE [FROM <alias>] [ON <key>] [RANDOM] REPLACE <f1> WITH <x1> [, <fN> WITH <xN>]` becomes a preprocessor rewrite to a new RTL primitive __dbUpdate. For each detail record, find the master record with matching key (forward-walk if both sorted, full scan when RANDOM) and apply the REPLACE clauses in master's context. Same shape as harbour-core/src/rdd/dbupdat.prg. The REPLACE clauses expand to comma-separated assignments inside one block — `{|| _FIELD->total := del->amt, _FIELD->status := "OK" }` — using the multi-pair `[, <fN> WITH <xN>]` optional-repeat that std.ch already establishes for SUM and DEFAULT. Five-specific tweak: ON <key> wraps as `{|| _FIELD-><key> }` rather than Harbour's bare `<{key}>`. Five doesn't auto-resolve a bare identifier in a code block to the current workarea's field, and the UPDATE block must evaluate against both detail and master so an explicit alias prefix won't do — _FIELD-> dispatches to whichever area is selected at eval time, which is what's needed. Wiring up UPDATE surfaced one further matchSegment gap that fell out of the multi-pair `[REPLACE ... [, ...]]` shape: * matchSegment didn't handle nested `[...]` inside its body. `[REPLACE <f1> WITH <x1> [, <fN> WITH <xN>]]` gave the inner `[` as a literal token to match against the line, so even the single-pair `REPLACE total WITH del->amt` form failed and f1/x1 came back empty. Now matchSegment runs the same repeat-loop on inner `[...]` blocks that the top-level matcher uses, with its own outer-tail computed from the segment tail past the inner `]`. Parser cleanup: UPDATE removed from the IDENT-statement no-op switch. Gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 17:49:33 +09:00
CharlesKWON	ebe12e1108	feat(pp): JOIN WITH ... TO via std.ch + __dbJoin RTL `JOIN WITH <alias> TO <file> [FIELDS <list>] [FOR <expr>]` becomes a preprocessor rewrite to a new RTL primitive __dbJoin. Cartesian product of the current ("master") workarea and the named "detail" alias, filtered by the FOR expression. Output structure: * No FIELDS clause: master's fields followed by detail's, dropping any detail-side name that clashes with master. * FIELDS list: one column per name in declaration order, resolved against master first then detail. Same shape as harbour-core/src/rdd/dbjoin.prg. Five-specific simplifications: alias->name in FIELDS not yet supported (bare names with master-precedence lookup); RDD/codepage args dropped since Five only has DBFNTX. Note for callers: don't name a workarea `M` or `MEMVAR` — both are Harbour-reserved memvar aliases, so `M->field` and `MEMVAR->field` always go through the memory-variable namespace, not the workarea. This is gengo behavior matching Harbour, not new in this commit. Parser cleanup: JOIN removed from the IDENT-statement no-op switch. Gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 16:42:06 +09:00
CharlesKWON	699ea90156	feat(pp): TOTAL TO via std.ch + __dbTotal RTL `TOTAL TO <file> ON <key> [FIELDS <list>] [FOR ...] [WHILE ...] [NEXT ...] [RECORD ...] [REST] [ALL]` joins the family of std.ch DML rewrites. New RTL primitive __dbTotal: * Walk the source under dbEval-style FOR/WHILE/NEXT/RECORD/REST bounds. The source must already be sorted/indexed on the key — same precondition as Harbour's dbtotal.prg. * Track the current group key. On each key change, flush the accumulated row to the destination (writing the running totals back into the most recently appended record's sum-fields, preserving each field's declared length/decimals). * On the first record of every group, append a fresh dst row and copy all non-memo source fields into it; subsequent records in the group only contribute to the sums. Net effect: non-summed fields take the first record's value, summed fields hold the group total. Same shape as harbour-core/src/rdd/dbtotal.prg. * Memo fields are dropped from the destination structure (Harbour does the same). Parser cleanup: TOTAL removed from the IDENT-statement no-op switch. Gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 15:24:41 +09:00
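The group-break walk behind TOTAL can be shown with a compact Go sketch. The `rec` type and `totalOn` function are hypothetical; the sketch assumes, as the commit does, that the input is already sorted on the key, and it leaves out FOR/WHILE bounds and field-width handling.

```go
package main

import "fmt"

// rec is a hypothetical record: Key is the ON expression, Label a
// non-summed field, Amt a summed field.
type rec struct {
	Key   string
	Label string
	Amt   float64
}

// totalOn emits one row per run of equal keys. Non-summed fields take the
// first record's value; Amt holds the group total.
func totalOn(src []rec) []rec {
	var out []rec
	for _, r := range src {
		if n := len(out); n > 0 && out[n-1].Key == r.Key {
			out[n-1].Amt += r.Amt // same group: only contribute to the sum
			continue
		}
		out = append(out, r) // key changed: start a new destination row
	}
	return out
}

func main() {
	src := []rec{{"a", "first", 1}, {"a", "second", 2}, {"b", "third", 5}}
	fmt.Println(totalOn(src))
}
```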
CharlesKWON	1cc2d94927	feat(pp): LIST / DISPLAY via std.ch + four PP completeness fixes `LIST [<fields>] [OFF] [FOR ...] [WHILE ...] [NEXT ...] [RECORD ...] [REST] [ALL]` and `DISPLAY [<fields>] [OFF] [FOR ...] ... [ALL]` reach the parser as plain function calls to a new RTL primitive __dbList (rtlDbList in hbrtl/database.go). Implementation: walk the workarea under dbEval-style FOR/WHILE/NEXT/ RECORD/REST bounds. For each visible record, evaluate each column block and emit the rendered values via valueToDisplay (the same formatter QOut already uses). Empty fields list defaults to "all fields". OFF suppresses the record-number prefix. LIST always emits the full filtered range; DISPLAY without ALL emits only the current record (encoded as nCount=1). TO PRINTER / TO FILE clauses are not yet wired through — for now everything goes to stdout. Wiring up LIST/DISPLAY surfaced four further gaps in PP that were silently masking bugs in any rule with multiple word-list / list / optional clauses chained together: * matchSegment refused MarkerWordList inside `[...]`. The LIST rule's `[<off:OFF>]` clause therefore never set the off capture, and `<.off.>` substituted to nothing instead of .T./.F. matchSegment now matches WordList markers the same way the top-level matcher does. * `<v,...>` and `<(f)>` capture stop boundaries didn't include the values of following MarkerWordList markers. For `[<v,...>] [<off:OFF>] [<all:ALL>]` against `LIST id, name OFF`, the v list would happily eat OFF. New addStopFrom helper contributes both literal keywords and word-list values; both matchSegment's MarkerList branch and captureExpression now use it. * Optional-repeat loop in matchPattern merged a no-progress iteration's empty capture into the running multi-capture string (with the `\x01` separator) before the no-progress break check fired. So a successful first iteration's value got contaminated and the substitution loop then skipped it as multi-capture garbage. 
The merge now happens after the progress check. * Unreferenced `<.name.>` markers (optional clauses that didn't match in the input) were getting cleaned up to empty by the generic marker scrubber instead of the .F. sentinel Harbour's std.ch expects. New replaceUnreferencedLogify pass mirrors the existing replaceUnreferencedBlockify and runs just before the cleanup. Parser cleanup: LIST and DISPLAY removed from the IDENT-statement no-op switch in both parseIdentStmt and parseExprStmt. Gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 15:19:36 +09:00
CharlesKWON	6dbc34b34b	fix(pp): per-element blockify for list captures `<{name}>` previously wrapped a list-typed capture's whole comma-joined string in one code block: `{|| id , name }`. Harbour's std.ch expects per-element wrapping so `{ <{v}> }` against `LIST id, name` yields `{ {|| id }, {|| name } }` — an array of column blocks the call site can evaluate per row. applyResult now consults the marker table for blockify the same way it already does for smart-stringify, splits the captured list on top-level commas, and emits one `{|| expr }` per element. Prereq for the upcoming LIST / DISPLAY rules; no user-visible behavior change for the rules already in std.ch (their `<{for}>` / `<{while}>` markers are scalar). Gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 15:05:50 +09:00
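Per-element blockify could be sketched in Go like this. `blockifyList` and `splitTopLevelCommas` are hypothetical names; for brevity the splitter only tracks paren/bracket/brace depth and ignores string literals, which the real preprocessor would also need to balance.

```go
package main

import (
	"fmt"
	"strings"
)

// splitTopLevelCommas splits s on commas that are not nested inside
// (), [] or {}. String literals are NOT handled in this sketch.
func splitTopLevelCommas(s string) []string {
	var out []string
	depth, start := 0, 0
	for i := 0; i < len(s); i++ {
		switch s[i] {
		case '(', '[', '{':
			depth++
		case ')', ']', '}':
			if depth > 0 {
				depth--
			}
		case ',':
			if depth == 0 {
				out = append(out, strings.TrimSpace(s[start:i]))
				start = i + 1
			}
		}
	}
	return append(out, strings.TrimSpace(s[start:]))
}

// blockifyList wraps each list element in its own code block.
func blockifyList(capture string) string {
	elems := splitTopLevelCommas(capture)
	for i, e := range elems {
		elems[i] = "{|| " + e + " }"
	}
	return strings.Join(elems, ", ")
}

func main() {
	fmt.Println("{ " + blockifyList("id, name") + " }")
	fmt.Println("{ " + blockifyList("Upper(a, b), c") + " }")
}
```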
CharlesKWON	989138d12e	feat(pp): SORT TO via std.ch + __dbSort RTL `SORT TO <file> [ON <key-list>] [FOR ...] [WHILE ...] [NEXT ...] [RECORD ...] [REST] [ALL]` joins COPY in being a real preprocessor rewrite to a function call. New RTL primitive __dbSort: * Buffer visible source records (FOR/WHILE/NEXT/RECORD/REST same as __dbCopy). * Multi-key stable insertion sort. Each key may carry `/D` for descending; ascending otherwise. /A and unknown suffixes fall through as ascending. Comparison delegates to the existing compareValues helper in sqlscan.go (numeric / string / NIL-aware). * Create destination DBF with the source's struct, append rows in sorted order, restore source selection. Parser cleanup: SORT removed from the IDENT-statement no-op switch. Gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 15:04:18 +09:00
CharlesKWON	e961660f61	feat(pp): COPY TO via std.ch + four PP completeness fixes `COPY TO <file> [FIELDS <list>] [FOR ...] [WHILE ...] [NEXT ...] [RECORD ...] [REST] [ALL]` reaches the parser as a plain function call to a new RTL primitive __dbCopy (rtlDbCopy in hbrtl/database.go). Implementation: project the field list (case-insensitive name match against the source's structure, full copy when omitted), dbCreate the target file with that struct, open it under a temp alias, walk the source under dbEval-style FOR/WHILE/NEXT/RECORD/REST bounds, and GetValue/Append/PutValue per record into the target. SDF / DELIMITED variants stay parser no-ops until those backends arrive. Wiring up COPY surfaced four longstanding gaps in the PP that had to be fixed for the rule to even reach the runtime: * `<(name)>` pattern marker was treated as a regular `<name>` with the parens baked into the captured key, so the matching result substitution `<(name)>` couldn't find it. parseOneMarker now strips the parens at parse time so capture key and result marker share the bare name. The smart-stringify result behavior is unchanged. * matchSegment (the optional-clause matcher) bailed on every non-Regular marker. `[FIELDS <fields,...>]` therefore failed to match at all and the fields list arrived empty in the result template. matchSegment now handles MarkerList with paren-balanced capture and segment+outer literal stop boundaries. * captureExpression only used the first literal in the pattern tail as a stop boundary. With std.ch's chain of optional clauses (`[TO <(f)>] [FIELDS ...] [FOR ...] [WHILE ...] ...`) the file-name marker was happy to gobble a trailing FOR clause when FIELDS was absent. It now stops at any of the remaining pattern literals. * `<(name)>` smart-stringify on a list-typed capture wrapped the whole comma-joined string in one set of quotes — `{ "a , b" }` — instead of `{ "a", "b" }`. 
New helper quoteListElements splits on top-level commas (paren / bracket / brace / string-balanced) and quotes each element. applyResult now consults the rule's marker table to know which captures came from `<name,...>`. Parser cleanup: COPY removed from the IDENT-statement no-op switch in both parseIdentStmt and parseExprStmt. Gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 15:00:18 +09:00
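The stop-boundary fix for captureExpression (stop at any remaining pattern literal, not just the first) can be illustrated with a token-level Go sketch. `captureUntil` is a hypothetical helper working on pre-split tokens; the real matcher works on the preprocessor's own token stream.

```go
package main

import (
	"fmt"
	"strings"
)

// captureUntil consumes tokens for one marker and stops at the first token
// that matches ANY of the remaining pattern literals (case-insensitive).
func captureUntil(tokens, stops []string) (captured, rest []string) {
	for i, tok := range tokens {
		for _, s := range stops {
			if strings.EqualFold(tok, s) {
				return tokens[:i], tokens[i:]
			}
		}
	}
	return tokens, nil
}

func main() {
	// Literals that follow `TO <(f)>` in the COPY rule.
	stops := []string{"FIELDS", "FOR", "WHILE", "NEXT", "RECORD", "REST", "ALL"}
	// FIELDS is absent, so the file-name capture must stop at FOR instead.
	file, rest := captureUntil(strings.Fields("out.dbf FOR x > 5"), stops)
	fmt.Println(file, rest)
}
```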
CharlesKWON	c2e7f7ea27	feat(pp): Phase B — COUNT / SUM / AVERAGE via std.ch Three xBase analytical commands that were silent no-ops in the parser now execute as Harbour-style PP rewrites: COUNT [TO <v>] [FOR <for>] [WHILE <while>] ... -> dbEval() SUM <x> TO <v> [FOR <for>] [WHILE <while>] ... -> dbEval() AVERAGE <x> TO <v> [FOR ...] -> __dbAverage() COUNT and SUM expand to a `<v> := 0 ; dbEval( {\|\| ... } )` pair matching harbour-core/include/std.ch verbatim. AVERAGE delegates to a new RTL function rtlDbAverage (sum + count + divide; returns 0 on empty match) — the chained-private-variable trick Harbour uses to keep AVERAGE inline doesn't translate cleanly through Five's PP. Wiring up these rules surfaced four PP issues that had to be fixed for the rewrite to even reach the parser: * Result template did not implement <{name}> blockify. So a rule body like `{\|\| x := x + <x> }, <{for}>` left the literal text `<{for}>` in the output. Added blockify substitution: captured -> `{\|\| <captured> }`, missing -> NIL. * findMarkerEnd did not recognise `{`/`}` so unreferenced blockify markers were not cleaned up either. Added `{`/`}` to its prefix/suffix sets. * Optional-clause matching had no view of the outer pattern, so a regular marker at the end of `[TO <v>]` would swallow the rest of the line — `COUNT TO n FOR x>5` captured `<v>` as "n FOR x>5". matchSegment now takes outerTail and stops at its first literal. * `#command` directives could not span multiple physical lines. A trailing `;` is harbour-core's line-continuation marker for std.ch and now joins the next line into the directive before parsing. Parser cleanup: COUNT, SUM, AVERAGE removed from the IDENT-statement no-op switch in parseIdentStmt + parseExprStmt. The remaining xBase verbs (COPY, SORT, TOTAL, JOIN, LIST, DISPLAY, LABEL, REPORT, ...) stay in the parser until their RTL backends arrive. Gates green: go test ./... 
: PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 14:11:20 +09:00
CharlesKWON	c4f85f494c	feat(pp): Phase A — preprocessor std.ch as single source of truth Introduce compiler/pp/std.ch with 19 #command rules so that ERASE, RENAME, DELETE FILE, CLOSE [<a>\|ALL\|DATABASES], COMMIT, UNLOCK, LOCATE/CONTINUE, REINDEX, PACK, ZAP, KEYBOARD, RUN, MENU TO, and CLEAR GETS reach the parser pre-rewritten as plain function calls. Embedded into the compiler binary via //go:embed so it auto-loads without an explicit #include in user code, exactly the way Harbour auto-loads its std.ch. This is a pure dispatch move, not a behavior change for the already-working forms: the same Five RTL functions get called. But it does fix three regressions that the parser was masking: * ERASE / RENAME / DELETE FILE used to be silent no-ops — the parser swallowed the entire line and returned NIL. They now actually delete/rename files (FErase / FRename). * CLOSE <alias> used to silently ignore the alias and close the current area. It now switches to the named area first (<a>->( DbCloseArea() )). * Two latent #command matcher bugs that surfaced while wiring std.ch up: - bare `CLOSE` would match rule `CLOSE ALL` because the tail of the pattern wasn't checked for unconsumed literals. - bare `CLOSE` would match rule `CLOSE <a>` because all unconsumed pattern markers were unconditionally treated as optional. They are only optional when nested inside `[...]`. Parser cleanup: parseIdentStmt + parseExprStmt no longer hardcode ERASE / RENAME / RUN / KEYBOARD / REINDEX / LOCATE / CONTINUE / COMMIT / CLOSE — the rewriter handles them. Other xBase verbs (COPY / SORT / COUNT / SUM / AVERAGE / TOTAL / JOIN / LIST / DISPLAY / LABEL / REPORT / DIR ...) still no-op in the parser because their RTL backends aren't implemented yet — once the backends land they move into std.ch the same way. Gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 12:03:30 +09:00
CharlesKWON	f4ed42556b	checkpoint: season-wide bug fix campaign + infra Cumulative season's silent-bug hunting (~62 fixes) across the FiveSql2 SQL engine, the Five compiler/runtime, and the hbrdd RDD layer. Saved as a single checkpoint before refactoring the parser to delegate xBase command translation to the preprocessor. Highlights: FiveSql2 engine (_FiveSql2/src/) - prefix-glob index attach -> explicit convention (<table>_pk.ntx, <table>_uq.ntx, <table>.cdx) — fixes silent multi-row INSERT row-drop - DROP/CREATE TABLE FErase chain extended (.cdx, .fsc, .fsv, .dbt, .fpt) - COUNT(DISTINCT col) parsed + aggregated via hSeen hash - UNION column-count mismatch returns SQL_ERR_GRAMMAR (was silent) - DISTINCT + ORDER BY hidden-col leak fixed (trim before DISTINCT) - Derived table FROM (SELECT...) + JOIN right-side derived - Self-FK CASCADE depth 2+ via SqlGetSingleColPK pre-collect - LAG/LEAD default arg uses SqlEvalRowExpr (handles -N const exprs) - DATE literal round-trip validation (Feb 29 non-leap rejected) - CREATE OR REPLACE VIEW; CREATE VIEW errors on already-exists - AlterTable type dispatcher comma-wrapped (1-char type "A" no longer matches CHARACTER) Compiler / runtime - gengo: HB_ -> FV_ prefix on emitted Go function names (Five identity) - gengo split: emit_block.go, emit_stmt.go, folding.go extracted - parser/stmtreg.go nudges - hbrt: debug TUI/CLI restructure (debugcmd, debugkey, termios_*), windows debug stubs collapsed - thread/vm/value/class/pcinterp tightening from panic traces RDD layer (hbrdd/) - dbf: null bitmap support (null.go + null_test.go), mmap split (mmap_posix.go / mmap_windows.go), byte-level numeric parse - ntx/cdx: windows mmap parity - workarea + mem RDD: cross-area state-bleed fixes RTL (hbrtl/) - errorlog rewrite with platform-specific FD (errorlog_fd_unix / errorlog_fd_other) - sqlscan, sqlhelpers, indexrtl, datetime extensions Gates green at checkpoint: - go test ./... 
: PASS - FiveSql2 SQL:1999 : 43/43 - Harbour compat : 56/56 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 09:26:25 +09:00
CharlesKWON	8a3f296e9a	perf(dbf): byte-level numeric parse + RecCount cache Two hot-path fixes for DBF reads surfaced by the bulk-bench profile. 1. parseNumericField decimal path — was 23% of flat CPU on BULK_CTE. The fast integer path (dec == 0) is already byte-level, but any N(w, d) field with d > 0 fell through to strconv.ParseFloat(string(raw[start:end]), 64) allocating per-row. A 10k-row CTE insert ran this 200k+ times. Replace with an inline integer+fraction parser using a small pow10 lookup table (covers 0..19 decimal places). Unexpected characters still fall back to strconv for correctness. Result: BULK_CTE_10k_20iter 187 → 83 ms (2.25x) BULK_SUBQ_10k_20iter 102 → 22 ms (4.6x) 2. DBFArea.RecCount in shared mode was doing Seek(0, 2) on every call. SqlScan calls it once per query for its result-array pre-allocation (~0.2 ms × 1000 queries = 0.2s of CPU on the bench). Cache the count per-area, keyed by a process-wide generation counter. Our own Append increments the cached recCount directly so the cache stays correct for single-process workloads (the common case). Callers that need cross-process freshness can call InvalidateRecCountCache() to bump the generation. SQL bench: modest 1-3 ms drops on B1/B2/B3/B6/B7. Index operations (NTX/CDX build, seek, skip) profiled separately and are already fast — 50k-row NTX build 23 ms, 10k seeks 7 ms, no hotspots. Left untouched. FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 23:38:54 +09:00
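The byte-level decimal parse described in 8a3f296e9a could look roughly like the sketch below. This is not the actual hbrdd code: the name `parseDecimalField`, the treatment of a blank field as zero, and the 19-digit limits on each part are assumptions made for illustration. The 20-entry `pow10` table mirrors the "0..19 decimal places" range the message mentions, and anything unexpected falls back to `strconv.ParseFloat`, as the message says.

```go
package main

import (
	"fmt"
	"strconv"
)

// pow10 covers 0..19 fractional digits, the range named in the commit message.
var pow10 = [20]float64{
	1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6, 1e7, 1e8, 1e9,
	1e10, 1e11, 1e12, 1e13, 1e14, 1e15, 1e16, 1e17, 1e18, 1e19,
}

// parseDecimalField (hypothetical name) parses a space-padded ASCII numeric
// field such as "  12.50" without allocating on the common path.
// Assumption: a blank field is treated as 0.
func parseDecimalField(raw []byte) (float64, error) {
	start, end := 0, len(raw)
	for start < end && raw[start] == ' ' {
		start++
	}
	for end > start && raw[end-1] == ' ' {
		end--
	}
	if start == end {
		return 0, nil
	}

	i := start
	neg := false
	if raw[i] == '-' || raw[i] == '+' {
		neg = raw[i] == '-'
		i++
	}

	var intPart, fracPart uint64
	intDigits, fracDigits := 0, 0
	for ; i < end && raw[i] >= '0' && raw[i] <= '9'; i++ {
		intPart = intPart*10 + uint64(raw[i]-'0')
		intDigits++
	}
	if i < end && raw[i] == '.' {
		for i++; i < end && raw[i] >= '0' && raw[i] <= '9'; i++ {
			fracPart = fracPart*10 + uint64(raw[i]-'0')
			fracDigits++
		}
	}

	// Unexpected characters, no digits at all, or more digits than uint64
	// and the lookup table can hold: fall back to strconv for correctness.
	if i != end || intDigits+fracDigits == 0 || intDigits > 19 || fracDigits > 19 {
		return strconv.ParseFloat(string(raw[start:end]), 64)
	}

	v := float64(intPart) + float64(fracPart)/pow10[fracDigits]
	if neg {
		v = -v
	}
	return v, nil
}

func main() {
	for _, s := range []string{"  12.50", "-0.25", "1e3", "abc"} {
		v, err := parseDecimalField([]byte(s))
		fmt.Printf("%q -> %v, err=%v\n", s, v, err)
	}
}
```

The fast path touches only the raw bytes; the `string(...)` conversion, and therefore the allocation, happens only on the fallback branch.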
CharlesKWON	325fe51656	fix(fivesql2): DML transaction + constraint ordering Three correctness bugs in the DML executor that the 4.7 audit surfaced: 1. RunInsert logged the transaction BEFORE dbAppend() and validation. LogRecord captured the PREVIOUS row's RecNo, and a CHECK/FK violation that rolled back via dbDelete() still left a spurious INSERT entry in the log pointing at the wrong record. Move LogRecord to after all field puts and all validators pass, so the log only records committed INSERTs at the correct RecNo. 2. RunUpdate (fallback path) skipped CHECK and FK validation entirely; only RunInsert validated. An UPDATE could violate the same constraints INSERT protects against. Add the same validator calls after FieldPut, with a captured aPrevVals snapshot so the in-memory record can roll back cleanly on failure. Gated by SqlLoadConstraints to skip the validator (and its recursive five_SQL) for tables without SQL-level metadata; tables created via plain dbCreate see no change. 3. RunDelete had no transaction logging at all: a BEGIN / DELETE / ROLLBACK cycle silently lost the row. Add LogRecord("DELETE") before dbDelete so undo can re-surface it. (A full FK-cascade check on delete would require parent→child scanning; deferred.) The fast-path SqlBulkUpdate branch still bypasses per-record validation by design (documented); it's gated by `! ::oTxn:IsActive()`, so txn-active queries always take the validated fallback. FiveSql2 43/43 (including SAVEPOINT + ROLLBACK TO and all four CHECK/FK tests), Harbour compat 56/56, Go test ALL PASS. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 23:24:14 +09:00
CharlesKWON	e368402682	chore: audit cleanup — remove orphan parser + dead TSqlIndex methods Opus 4.7 audit of the codebase surfaced several items that Opus 4.6 sessions left behind. This pass removes what's definitively dead and fixes one trivial defensive bug; the real logic bugs (transaction ordering, missing RunUpdate/RunDelete validation) come in a separate commit. Deletions: - `_FiveSql2/src/TSqlParser_orig.prg` (1173 lines) — superseded by `TSqlParser2.prg` (Pratt). Production never instantiates the old parser; the only callers were the comparison/benchmark test files also being removed. - `_FiveSql2/test/test_parser_cmp.prg` — compared orig vs Pratt AST, useless now that orig is gone. - `_FiveSql2/test/bench_parser.prg` — benched both, same reason. - `_FiveSql2/Makefile` `test_cmp:` and `bench:` targets referenced the removed files. - `TSqlIndex.prg` methods `ApplyScope`, `ClearScope`, `ApplySeek`, `IndexInfo`, `CreateTempIndex`, `DropTempIndex` — each declared in the class header and implemented (~165 lines total) but zero callers anywhere in `_FiveSql2/` or `hbrtl/`. Class declarations removed alongside the bodies. Small fixes: - `TSqlDDL.prg:179-180` stale comment claiming Five doesn't support `@byref` — false since commit `e95afad` (2026-04-13) wired @byref via RefCell. The same method uses @nPos correctly elsewhere. - `hbrt/class.go:tryBinaryOp` defensive nil-check on AsArray(). IsObject() checks the type tag; a corrupted Value with tag=Object but ptr=nil would crash on `.Class`. Correct construction paths never hit this, but the guard is cheap. Compat tests: FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 22:46:17 +09:00
CharlesKWON	e5843bdde4	docs: refresh Phase-C TODO (audit results + remaining edge cases) Update the 1.0-readiness document with: - 2026-04-18 compatibility audit results: 47/50 build rate (94%) vs previous 34/40. Lists every fix commit this session. - Four remaining low-priority edge cases from the audit (xcommand nested-comma args, u64 overflow, USE with ../ paths, legacy inline-C syntax); none block a realistic 1.0. - Revised Phase-C scope: user clarified contrib PRGs can be imported as-is so long as underlying RTL exists, so the work is "audit each contrib's low-level deps, fill gaps, copy .prg" rather than porting every function. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 18:32:45 +09:00
CharlesKWON	4a1bbdb1fe	feat(pp): optional-repeat [...] blocks (DEFAULT / UPDATE from common.ch) Harbour's `#xcommand DEFAULT <v1> TO <x1> [, <vn> TO <xn>] => ...` uses an optional, repeatable trailing `[...]` block to accept any number of `var TO default` pairs on a single line. Five's PP skipped bracket bodies during pattern matching and treated them as no-ops in result templates, so DEFAULT a TO 10, b TO 20, c TO 30 expanded (at best) the first pair and dropped the rest, and common.ch itself was documented as "not yet supported". Three concrete changes: 1. matchPattern now matches the `[...]` body repeatedly against remaining line tokens via a new matchSegment helper. Each successful iteration appends captures for the interior markers under the same name, joined with a \x01 sentinel. 2. matchSegment, when capturing the last marker in a body with no following literal, uses the body's opening literal (e.g. the `,` in `[, <vn> TO <xn>]`) as the iteration boundary. Otherwise captureExpression would greedily eat the rest of the line and collapse every remaining pair into one capture. 3. applyResult's new expandOptionalRepeat walks the result template for top-level `[...]` blocks. When a referenced marker is multi-captured it emits the body N times (substituting per-iter value); when it's single-captured it emits the body once; otherwise drops the block. A separate referencedMarkers scanner and an inMarker guard keep literal `[` / `]` inside PP markers (like `<.x.>`) from being mistaken for bracket delimiters. Side fix: ParseRule previously stripped every ` ;` as a Harbour line-continuation marker, but that also destroyed in-line PRG statement separators in result templates. Line joining is the preprocessor's job upstream, so keep semicolons intact here. common.ch now ships real DEFAULT and UPDATE #xcommands. Verified 1-, 2-, and 3-pair DEFAULT expansion plus `common.ch` inclusion from user code. FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 18:20:11 +09:00
CharlesKWON	b1024c5244	fix(gengo): hoist #pragma BEGINDUMP imports + wire HB_FUNC registration Two bugs blocked Five's own inline-Go feature: 1. Inline Go blocks placed mid-file couldn't carry an `import` list because Go rejects declarations before imports in the same file. examples/godump_demo.prg and friends (real Five demos) hit "syntax error: imports must appear before other declarations" during compile of the generated Go. hoistGoImports parses the raw dump body for `import (...)` blocks and single-form `import "path"` lines, registers each path into the generator's imports map, and returns the body with those directives stripped. The top-of-file import block then carries everything the dump needs. 2. HB_FUNC() calls inside the inline block's init() enqueue registrations into hbrt.dynamicFuncs, but the VM only promotes them to its symbol table when RegisterLibModules() is called. gengo's generated main() skipped that step, so dispatch on the inline-defined names panicked with "no function symbol for call". Emit vm.RegisterLibModules() after RegisterModule(symbols). Verified: examples/godump_demo.prg builds and runs; the inline GoUpper / GoFib / GoGCD / GoSplit / GoSquare / GoTypeOf functions all dispatch. Matches the feature's original design intent. FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 17:58:49 +09:00
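The import hoisting described in b1024c5244 can be sketched in a few lines. The real `hoistGoImports` registers paths into the generator's imports map; the simplified `hoistImports` below is a hypothetical stand-in that just returns them as a slice. It is line-based, assumes `import (` sits on its own line, and drops import aliases, all of which are simplifications for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// quotedPath returns the text between the first and last double quote on a
// line, or "" when the line carries no quoted import path.
func quotedPath(s string) string {
	i := strings.IndexByte(s, '"')
	j := strings.LastIndexByte(s, '"')
	if i < 0 || j <= i {
		return ""
	}
	return s[i+1 : j]
}

// hoistImports (hypothetical, simplified) strips `import (...)` blocks and
// single-form `import "path"` lines from an inline Go body and returns the
// remaining body plus the collected import paths.
func hoistImports(body string) (stripped string, paths []string) {
	var kept []string
	inBlock := false
	for _, line := range strings.Split(body, "\n") {
		t := strings.TrimSpace(line)
		switch {
		case inBlock:
			if t == ")" {
				inBlock = false
			} else if p := quotedPath(t); p != "" {
				paths = append(paths, p)
			}
		case t == "import (":
			inBlock = true
		case strings.HasPrefix(t, "import ") && quotedPath(t) != "":
			paths = append(paths, quotedPath(t))
		default:
			kept = append(kept, line) // ordinary code stays in the body
		}
	}
	return strings.Join(kept, "\n"), paths
}

func main() {
	body := "import (\n\t\"strings\"\n\t\"fmt\"\n)\nfunc GoUpper() {}\nimport \"os\"\n"
	stripped, paths := hoistImports(body)
	fmt.Printf("paths: %v\nbody: %q\n", paths, stripped)
}
```

With the directives removed from the body, a generator can emit every collected path in the single top-of-file import block, which is what Go's "imports before other declarations" rule requires.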
CharlesKWON	5514780b11	feat(pp): detect Harbour inline C in #pragma BEGINDUMP and fail fast Harbour's #pragma BEGINDUMP ... #pragma ENDDUMP blocks carry C source that the Harbour toolchain embeds verbatim. Five takes the same directive but targets Go — any `.prg` ported from Harbour that ships inline C gets its C shoveled into the Go codegen pipeline and fails with opaque errors like "invalid character U+0023 '#'" from the Go compiler, dozens of lines downstream of the actual cause. Detect the C shape at PP time and report a clear, actionable error: pp: file.prg:N: #pragma BEGINDUMP contains C code — Five accepts inline Go only. Port the block to Go (or use an RTL function), then wrap in #pragma BEGINDUMP ... #pragma ENDDUMP. looksLikeInlineC uses conservative signals that don't false-positive on legitimate inline Go (which calls `hbrt.HB_FUNC("NAME", fn)` with a package prefix and a quoted string, distinct from C's bare `HB_FUNC(NAME)` macro). Signals: - `#include <...>` / `#include "..."` — unambiguous C preprocessor - line-starting `HB_FUNC(` / `HB_FUNC_STATIC(` — C FFI macro - `typedef ` / `struct ` / `int main(` / `void main(` at line start main.go now aborts the build when PP returns errors (previously printed but continued — same behavior the parser already had for its own errors). Keeps build output short: one pp line + one summary line, no gengo noise. Verified: - harbour-core/tests/inline_c.prg → clean PP error, exit 1 - examples/godump_demo.prg (legitimate inline Go) → passes PP (hits a separate pre-existing gengo import-ordering bug, not related to this change) FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 17:53:44 +09:00
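The signals listed in 5514780b11 translate naturally into a prefix check per line. The sketch below reuses the name `looksLikeInlineC` from the message, but the body is a guess at the shape, not the actual implementation; in particular, trimming leading whitespace before testing "line start" is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// cLinePrefixes holds the line-start signals named in the commit message.
var cLinePrefixes = []string{
	"#include <", "#include \"", // C preprocessor
	"HB_FUNC(", "HB_FUNC_STATIC(", // bare C FFI macros
	"typedef ", "struct ", "int main(", "void main(",
}

// looksLikeInlineC reports whether any line of the dump body starts with one
// of the C signals. Inline Go calls hbrt.HB_FUNC("NAME", fn) with a package
// prefix, so it does not match the bare HB_FUNC( prefix.
func looksLikeInlineC(body string) bool {
	for _, line := range strings.Split(body, "\n") {
		trimmed := strings.TrimSpace(line) // assumption: indentation is ignored
		for _, prefix := range cLinePrefixes {
			if strings.HasPrefix(trimmed, prefix) {
				return true
			}
		}
	}
	return false
}

func main() {
	cBody := "#include \"hbapi.h\"\nHB_FUNC( MYFUNC )\n{\n   hb_retni( 1 );\n}"
	goBody := "func init() {\n\thbrt.HB_FUNC(\"GOUPPER\", goUpper)\n}"
	fmt.Println(looksLikeInlineC(cBody), looksLikeInlineC(goBody))
}
```

Checking prefixes only at line start is what keeps the Go form from tripping the detector: the same identifier appears in both languages, but only the C macro begins a line unqualified.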
CharlesKWON	85002df6b9	feat(parser+pp): USE with macros and paren-balanced PP capture Two related fixes for Harbour's data-driven `USE &cFile ALIAS &cAlias INDEX &cNdx` idiom — common in any app that dispatches table names at runtime. Parser (compiler/parser/parser.go parseUse): - `USE &cFile` / `USE &(expr)` previously triggered a skipToEndOfLine short-circuit, emitting an empty UseCmd (equivalent to bare USE = close current area). Now parseMacro runs and the MacroExpr becomes the File node, so codegen emits MacroPush + dbUseArea. - `ALIAS &cAlias` / `ALIAS &a.1` similarly dropped the macro result; now captures it into UseCmd.AliasExpr so codegen evaluates the alias at runtime. Both the IDENT-path ("ALIAS") and keyword-path (token.ALIAS) handlers fixed. PP (compiler/pp/command.go): - captureExpression and the MarkerList branch now paren-balance `(`/`[`/`{` so nested grouping inside a macro argument doesn't let an inner `)` terminate the capture. Example: _REGULAR_(&(a)) previously captured `&(a` (missing inner `)`) and left the outer `)` dangling, producing parse errors in the expanded output. - MarkerList capture still joins tokens with " " for raw `<z>` substitution — comma tokens stay in the stream, so `s(<z>)` re-emits them as argument separators and the list expands cleanly. Bench: harbour-core/tests/pp.prg 2 errors → 0 for the realistic `USE &macro` / `&(expr)` patterns. Remaining parse errors on line 70 are a pathological `_REGULAR_L` list that includes `&a. [2]` (space between macro's terminating dot and an array index) — the PP expands it correctly but Five's lexer refuses the expanded result. That form doesn't occur in real code. /tmp/test_use_macro.prg — all four patterns (`USE &f`, `USE &f ALIAS &f`, `USE &f ALIAS &f INDEX &i`, dot-terminated) now compile. FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 17:38:15 +09:00
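The paren-balancing fix in 85002df6b9 comes down to tracking nesting depth while scanning for the literal that ends a capture. The sketch below works on a pre-tokenized slice of strings; `captureBalanced` and its signature are hypothetical and are not the actual `captureExpression` from compiler/pp/command.go.

```go
package main

import (
	"fmt"
	"strings"
)

// captureBalanced (hypothetical helper) consumes tokens until it meets the
// stop literal at nesting depth 0. Openers (, [, { raise the depth and
// closers ), ], } lower it, so an inner ")" cannot end the capture early.
func captureBalanced(tokens []string, stop string) (captured, rest []string) {
	depth := 0
	for i, tok := range tokens {
		if depth == 0 && tok == stop {
			return tokens[:i], tokens[i:]
		}
		switch tok {
		case "(", "[", "{":
			depth++
		case ")", "]", "}":
			if depth > 0 {
				depth--
			}
		}
	}
	return tokens, nil // stop literal never seen at depth 0
}

func main() {
	// Tokens that follow "_REGULAR_" "(" in the line _REGULAR_(&(a)).
	tokens := []string{"&", "(", "a", ")", ")"}
	captured, rest := captureBalanced(tokens, ")")
	fmt.Println(strings.Join(captured, ""), rest)
}
```

For the example from the message, the capture comes back as `&(a)` and the outer `)` is left in `rest` for the pattern's closing literal, rather than the capture stopping at `&(a`.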


202 Commits