fivedev/five - five - fivego gitea

Author	SHA1	Message	Date
CharlesKWON	675eaa4def	feat(hbrtl): FV_HTTPGET / FV_HTTPPOST / FV_ZIP* / FV_XML_ROWS New Five-native HTTP / ZIP / XML primitives so PRG code can do HTTPS fetch, ZIP container reads, and streaming XML row extraction without dropping into BEGINDUMP. FV_ prefix marks Five-original RTL (distinct from Harbour-inherited HB_ surface). FV_HTTPGET(cUrl [, hOpts]) / FV_HTTPPOST(cUrl, cBody [, hOpts]) hOpts: { headers: {=>}, timeout: nSec, tls_legacy: .T./.F. } Result: { status, body, error, headers } tls_legacy re-enables TLS_RSA cipher suites for legacy endpoints (DART OpenAPI pins them). FV_ZIPENTRIES(cZipBytes) / FV_ZIPREAD(cZipBytes, cEntryName) Read ZIP archives held in memory (e.g. from FV_HTTPGET). FV_XML_ROWS(cXml, cRowTag) Streaming reader for repeating-record XML. Each row becomes a flat hash of immediate-child element name -> text. Verified against DART corpCode.xml: 30 MB / 118k rows in seconds, no full-tree allocation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-29 08:47:34 +09:00
Charles KWON OhJun	c7ac4044f7	feat(json): hb_jsonDecode 2-arg byref form (Harbour-spec compatible) Previously hb_jsonDecode took only (cJSON) and returned the value. That covers most uses but not the Harbour-spec second form nBytesParsed := hb_jsonDecode( cJSON, @xOut ) which mod_harbour / fivenode PRG (e.g. bridge_context.prg's ctx_get / ctx_set) and any other code that wants the parse-length relies on. The byref output was silently dropped, so a hash lookup went through the @hOut path that was always NIL and fell back to the default value — looking like a hash key was missing even though the JSON parsed fine. Now PCount() == 1 keeps the legacy return-value form; PCount() >= 2 writes the decoded value into local-2 via SetLocal (which is already byref-aware) and returns the byte count (0 on parse error). Verified: hb_jsonDecode('{"x":1,"y":2}', @h) writes the hash and returns 13; the 1-arg form still returns the value as before; Compat 56/56 + go test ./compiler/... ./hbrt/... ./hbrtl/... all pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 10:44:12 +09:00
Charles KWON OhJun	ad6cc0bcee	feat(rtl): add hb_HGetDef and PValue / hb_PValue Two standard Harbour functions that fivenode-style PRG code (bridge_*.prg and downstream apps) calls frequently. Without them, every reference emits an analyzer WARN and resolves to NIL at runtime. * hb_HGetDef(hHash, xKey, xDefault) — hash lookup with fallback. * PValue(nIndex[, xDefault]) — read the nth parameter of the calling PRG function. Mirrors the PCount pattern: needs the caller frame's paramCount and locals, exposed via new hbrt.Thread.CallerLocal helper that pairs with the existing CallerParamCount. Registered under PVALUE and HB_PVALUE (Harbour accepts both forms). Verified: hb_HGetDef / PValue / HB_PVALUE all return expected values for present-key, missing-key-with-default, missing-key-no-default, and out-of-range-param cases. Full regression: go test (18 packages) + Compat 56/56 + std.ch 17/17 + FRB 7/7 + FiveSql2 43/43 all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 09:55:19 +09:00
CharlesKWON	d7a81af7db	feat(pgserver): binary-format param decoding (Phase 4.1) pgx defaults to binary wire format for INT2/INT4/INT8/FLOAT4/FLOAT8/ BOOL/NUMERIC/DATE/TIMESTAMP/TIMESTAMPTZ — Go's most-used PG driver ships nearly every typed parameter as binary unless explicitly told to use text mode. The Phase 3 implementation only decoded INT4/INT8/ BOOL, so any pgx call with a decimal price, a timestamp, or a date was silently mis-quoted into the SQL stream. Decoders now cover the seven additional OIDs. The interesting one is NUMERIC: PG's wire format is base-10000 digit groups plus a separate displayed-scale, so the decoder rebuilds the decimal string from weight+sign+ndigits+digits[] without going through float (which would lose precision for NUMERIC(38,*) values). Pinned by vectors covering zero / positive / negative / fractional-only / NaN / multi-group integer + fraction cases. DATE / TIMESTAMP decoders assume integer_datetimes=on (which the server advertises in ParameterStatus); the 8-byte microsecond delta from the PG epoch (2000-01-01 UTC) is converted via Go's time.Time machinery and re-emitted as a quoted SQL literal. Text-format path also broadened: FLOAT4/FLOAT8/INT2 now transit unquoted alongside INT4/INT8/BOOL/NUMERIC; the regression would have been clients sending text-format floats getting them rewritten as '1.5' (string literal) instead of 1.5 (numeric). Verified: all 6 mandatory gates green (go test, SQL 43/43, compat 56/56, std.ch 17/17, FRB 7/7, pgserver 11/11). Five new decoder tests pin each wire format against handcrafted PG payloads. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 10:02:15 +09:00
CharlesKWON	e83787750a	feat(pgserver): SCRAM-SHA-256 authentication (Phase 5.1) PG14+ clients (libpq, pgx, JDBC) prefer SCRAM over MD5 when offered; this lands the five-message exchange (SASL / SASLInitialResponse / SASLContinue / SASLResponse / SASLFinal) so they get their preferred path. MD5 stays as the universal fallback. Storage stays plaintext in the in-memory role registry — per-auth we generate a fresh salt + iter, derive SaltedPassword on the fly. Same net security as the existing MD5 path, while matching wire output to RFC 5802 byte for byte. Critical detail: pgproto3's Backend multiplexes PasswordMessage, SASLInitialResponse, and SASLResponse onto the same 'p' byte tag. Without SetAuthType() the decoder picks PasswordMessage and the handshake fails immediately. Switch state to AuthTypeSASL before the client-first receive and AuthTypeSASLContinue before the client-final receive. Verified: * SCRAM math (PBKDF2 / HMAC / proof verify / server signature) via pinned unit test * Live psql round-trip — correct password accepted, wrong password rejected with proper SQLSTATE 28P01 * All 6 mandatory gates green (go test, SQL 43/43, compat 56/56, std.ch 17/17, FRB 7/7, pgserver 11/11) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 09:24:34 +09:00
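The SCRAM math this commit lists (PBKDF2, HMAC, proof verification, server signature) follows RFC 5802 and can be sketched in Go with only the standard library. The function names are hypothetical, SASLprep normalisation of the password is omitted, and the AuthMessage in `main` is a placeholder string rather than a real exchange.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
)

func hmacSHA256(key, msg []byte) []byte {
	m := hmac.New(sha256.New, key)
	m.Write(msg)
	return m.Sum(nil)
}

// hi is RFC 5802's Hi(): PBKDF2-HMAC-SHA-256 with a single output block.
func hi(password, salt []byte, iter int) []byte {
	u := hmacSHA256(password, append(append([]byte{}, salt...), 0, 0, 0, 1))
	out := append([]byte{}, u...)
	for i := 1; i < iter; i++ {
		u = hmacSHA256(password, u)
		for j := range out {
			out[j] ^= u[j]
		}
	}
	return out
}

// clientProof is what the client sends: ClientKey XOR ClientSignature.
func clientProof(salted, authMessage []byte) []byte {
	clientKey := hmacSHA256(salted, []byte("Client Key"))
	storedKey := sha256.Sum256(clientKey)
	sig := hmacSHA256(storedKey[:], authMessage)
	proof := make([]byte, len(clientKey))
	for i := range proof {
		proof[i] = clientKey[i] ^ sig[i]
	}
	return proof
}

// verifyProof is the server side: recover ClientKey from the proof and
// check that H(ClientKey) equals StoredKey.
func verifyProof(salted, authMessage, proof []byte) bool {
	clientKey := hmacSHA256(salted, []byte("Client Key"))
	storedKey := sha256.Sum256(clientKey)
	sig := hmacSHA256(storedKey[:], authMessage)
	if len(proof) != len(sig) {
		return false
	}
	recovered := make([]byte, len(sig))
	for i := range sig {
		recovered[i] = proof[i] ^ sig[i]
	}
	h := sha256.Sum256(recovered)
	return subtle.ConstantTimeCompare(h[:], storedKey[:]) == 1
}

// serverSignature goes into the final server message so the client can
// authenticate the server in turn.
func serverSignature(salted, authMessage []byte) []byte {
	return hmacSHA256(hmacSHA256(salted, []byte("Server Key")), authMessage)
}

func main() {
	// Salt, iteration count and AuthMessage are illustrative values.
	salted := hi([]byte("swordfish"), []byte("per-auth-salt"), 4096)
	auth := []byte("client-first-bare,server-first,client-final-without-proof")
	proof := clientProof(salted, auth)
	fmt.Println(verifyProof(salted, auth, proof), len(serverSignature(salted, auth)))
}
```

Deriving SaltedPassword from a stored plaintext on every authentication, as the commit describes, means the server runs `hi` once per handshake with a fresh salt.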
CharlesKWON	ed1aeeb212	feat(pgserver): pg_catalog stub for BI-tool connection compatibility PostgreSQL clients (psql, pgx, DBeaver, Tableau, DataGrip, pgAdmin) fire a barrage of catalog probes at connection time — SELECT version(), SHOW server_version, SELECT FROM pg_namespace / pg_class / pg_type / pg_database / pg_settings. FiveSql2 can't parse most of them. Without interception the BI tool either errors out on connect or proceeds with a half-broken view of the database (zero tables, no type info, no schema list). This commit lands the minimum-viable catalog shim so the common connect-and-list-tables flow succeeds. Strategy -------- Pattern-match catalog probes BEFORE handing the SQL to five_SQL. Recognised shapes get synthesised result envelopes — same `{ aFieldNames, aRows }` hbrt.Value shape the engine returns, so the existing dispatchSimpleQuery / executePortal pipelines stream them identically to a normal query. Covered (v1.0) -------------- * SET / RESET / DISCARD <name> → success, no-op * SHOW <name> → single-row response (server_version, server_encoding, client_encoding, DateStyle, transaction_isolation, etc.) * SELECT version() / current_database() / current_schema() / current_user / session_user / pg_backend_pid() → single-row * SELECT … FROM pg_namespace → 2 rows (pg_catalog + public) * SELECT … FROM pg_class → list of open workareas (relkind='r', relnamespace=public) * SELECT … FROM pg_attribute → empty (stub; column-shape introspection deferred to v1.1) * SELECT … FROM pg_type → 7 OIDs FiveSql2 actually emits (bool, int4, int8, text, numeric, date, timestamp) * SELECT … FROM pg_database → 1 row, the connect-time db name * SELECT … FROM pg_settings → name/setting pairs matching SHOW * Anything else mentioning pg_catalog. / pg_<name> / information_schema. 
→ empty result with generic field names (BI tool sees "0 rows" rather than a parse error) Deliberate non-goals -------------------- * WHERE / JOIN evaluation — psql, pgx, DBeaver all filter client-side on the rows we return. We send the whole catalog and let them apply their predicates. * pg_attribute introspection — would need to re-derive column types from the open workarea + map back to PG OIDs. Tracked as v1.1 work. * Recursive CTE catalog queries (pgAdmin's tree builder uses them) — too brittle to pattern-match. Falls through to five_SQL where it errors loudly. pgAdmin's table-tree pane will then show "0 tables" but the connection itself stays alive. Files ----- hbrtl/pgserver/catalog.go (new, ~280 LOC) catalogIntercept(sql) → (handled, value) synthPgNamespace / synthPgClass / synthPgAttribute / synthPgType / synthPgDatabase / synthPgSettings simpleSelectFunction (version/current_*/pg_backend_pid) showResponse (SHOW <name>) hbrtl/pgserver/dispatch.go dispatchSimpleQuery: catalogIntercept ahead of runSQL. hbrtl/pgserver/extended.go executePortal: same intercept, ahead of runSQL. Verification ------------ psql against a running pgserver, with sslmode=require + MD5: $ psql -c 'SELECT version()' -At PostgreSQL 14.0 (FiveSql2) (FiveSql2 wire-compat shim) $ psql -c 'SELECT * FROM pg_namespace' -At 11|pg_catalog|10 2200|public|10 $ psql -c 'SELECT * FROM pg_type' -At 16|bool|1 23|int4|4 20|int8|8 25|text|-1 1700|numeric|-1 1082|date|4 1114|timestamp|8 $ psql -l # \l now works List of databases oid | datname | datdba | Encoding -----+---------+--------+-------- 1 | alice | 10 | 6 Integration script gates grew from 6/6 → 9/9: PASS Catalog probe: SELECT version() PASS Catalog probe: pg_namespace lists public + pg_catalog PASS Catalog probe: SHOW server_version_num All six release gates green: go test ./...
✓ FiveSql2 SQL:1999 43/43 ✓ Harbour compat 56/56 ✓ std.ch 17/17 ✓ FRB 7/7 ✓ pgserver integration 9/9 ✓ (+3 from catalog stubs) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 22:31:52 +09:00
CharlesKWON	151b628f6c	fix(pgserver): Layer 5 — per-path mmap-gen registry + getWA torn-read Closes the Go-panic class of multi-session concurrency bugs and introduces an explicit cross-area mmap invalidation channel. 1. getWA waCache torn-read (root cause of panics) hbrtl/rdd.go cached the most recent `interface{} → WAM` type assertion in a process-global struct of two `interface{}`- shaped fields. Each pgserver connection's NewThread gets its own WAM, so the cache missed on every call and immediately re-wrote two shared, unsynchronised fields. Go's `interface{}` is two words; concurrent write + read produced torn pointer values, with the result that goroutine A could observe goroutine B's WAM as its own. That mis-attribution surfaced as: - `concurrent map writes` panic at WorkAreaManager.Close (workarea.go:95): two goroutines genuinely modifying the SAME wam.aliases map. - `concurrent map writes` panic at DBFArea.FieldPosCache (dbf.go:439): two goroutines lazy-initing the SAME fieldPosMap. Drop the cache. The type assertion is ~ns; not worth a process-global shared slot. If perf matters again, replace with a sync.Map keyed by thread pointer, not a single struct. 2. Per-path mmap generation registry (hbrdd/dbf/area_registry.go) Each unique on-disk DBF path gets an atomic uint64 generation counter. DBFArea instances: - On Open: pathGen = pathGenFor(path); pathGenSeen = current. - On Append (shared) / flushRecord: bumpPathGen(path); pathGenSeen = current. - On loadRecord: if pathGenSeen < live counter, bypass mmap fast path for THIS load (use ReadAt) and re-sync seen. Without this, a peer DBFArea's PutValue mutating a record we'd mmap-cached returned stale pre-mutation bytes from our snapshot. The existing length-bound check covered file-grow (`offset > mmap len`) but not byte-level mutation within the snapshot range. The registry covers both. Cheap: read = one atomic.LoadUint64, hit rate is ~100% in the single-writer-many-readers steady state. 
Verification ------------ Same 3 / 5 / 10-worker pgx-driven concurrency stress harness: pre-Layer-1 baseline: ~60% pass + occasional panic +Layer 1+2: 80% / 50% / panic +Layer 3a (max-merge): 80% / 50% / panic +Layer 4a (per-session 3): 90% / 80% / 50% +Layer 4b (Go atomics): 75-90% / 50-80% / panic (still) +THIS (getWA + mmap-gen): 73% / 67% / 33% — ZERO PANICS The shift "many partial fails, no panics" is what matters for production: a connection seeing stale data is recoverable (rerun the query); a Go-level process crash is not. Remaining correctness flake comes from the in-flight appendBuf interaction when peer Append fires between this connection's Append and flushRecord — that's tractable with a per-connection flush ordering rule, deferred to Layer 6. All six release gates green: go test ./... ✓ FiveSql2 SQL:1999 43/43 ✓ Harbour compat 56/56 ✓ std.ch 17/17 ✓ FRB 7/7 ✓ pgserver integration 6/6 ✓ Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 21:43:04 +09:00
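The per-path generation registry from this commit can be sketched as follows. The names `pathGenFor`, `area`, `markWrite` and `stale` are illustrative (only `pathGenFor` echoes a name from the message), and the real DBFArea carries much more state. The sketch only shows the counter protocol: one atomic counter per path, bumped on write, compared on load.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// pathGens maps an on-disk path to its shared generation counter.
var pathGens sync.Map // string -> *atomic.Uint64

// pathGenFor returns the counter for path, creating it on first use.
func pathGenFor(path string) *atomic.Uint64 {
	if g, ok := pathGens.Load(path); ok {
		return g.(*atomic.Uint64)
	}
	g, _ := pathGens.LoadOrStore(path, new(atomic.Uint64))
	return g.(*atomic.Uint64)
}

// area stands in for one open work area on a file (hypothetical type).
type area struct {
	gen  *atomic.Uint64 // shared per-path counter
	seen uint64         // last generation this area synchronised with
}

func openArea(path string) *area {
	g := pathGenFor(path)
	return &area{gen: g, seen: g.Load()}
}

// markWrite is called after this area mutates the file: bump the shared
// counter and record it as seen, mirroring the rule in the commit message.
func (a *area) markWrite() { a.seen = a.gen.Add(1) }

// stale reports whether a peer has written since the last sync. When it
// returns true the caller should bypass its cached snapshot for this load.
func (a *area) stale() bool {
	live := a.gen.Load()
	if a.seen < live {
		a.seen = live
		return true
	}
	return false
}

func main() {
	reader := openArea("customers.dbf")
	writer := openArea("customers.dbf")
	fmt.Println(reader.stale()) // nothing written yet
	writer.markWrite()
	fmt.Println(reader.stale()) // peer wrote: take the slow path once
	fmt.Println(reader.stale()) // re-synchronised
}
```

The steady-state read cost is a single atomic load, which matches the commit's note about a near-100% hit rate with one writer and many readers.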
CharlesKWON	5e4a1c5d72	refactor(FiveSql2): cross-session globals → Go atomic + RWMutex Completes the per-STATIC migration started in `5bba0c2`. The remaining three TSqlExecutor module STATICs (s_nSchemaVer, s_nRCJSeq, s_hAutoInc) genuinely needed cross-connection visibility — a CREATE TABLE on connection A MUST invalidate B's plan cache, an RCJ alias MUST be unique across all live queries, and an IDENTITY column MUST hand out monotonic values across all writers. Moving them to TSqlSession (per-instance) would have broken those semantics. Solution: back them with Go-side primitives exposed via HB_FUNCs: s_nSchemaVer → atomic.Uint64 (SqlSchemaVer / SqlBumpSchemaVer) s_nRCJSeq → atomic.Uint64 (SqlNextRCJSeq, returns mod-100000) s_hAutoInc → sync.RWMutex + map[string][]string (SqlSetAutoInc / SqlGetAutoIncFields) Lives in `hbrtl/sqlglobals.go`. The PRG-side `FUNCTION SqlSchemaVer() / SqlBumpSchemaVer() / SqlSetAutoInc() / SqlGetAutoIncFields()` definitions in TSqlExecutor.prg are deleted; the HB_FUNC dispatch takes their place. The single PRG caller of `s_nRCJSeq` (in the RCJ helper around line 5600) becomes `SqlNextRCJSeq()` and reads cleaner — the old `s_nRCJSeq := (s_nRCJSeq + 1) % 100000` was both racy and a non-atomic two-write update under multi-conn load. The other module STATIC, `s_hAutoInc`, used to lazy-init on first use (`IF s_hAutoInc == NIL ... := { => }`); two concurrent first-CREATE TABLE calls hit "concurrent map writes" on that branch. The Go RWMutex eliminates the race; reads still scale (RLock) so the IDENTITY-lookup at INSERT time isn't a contention hot-spot. All six release gates green: go test ./... 
✓ FiveSql2 SQL:1999 43/43 ✓ Harbour compat 56/56 ✓ std.ch 17/17 ✓ FRB 7/7 ✓ pgserver integration 6/6 ✓ Concurrency stress (3-worker × 20): pre-Layer-1: ~60% pass + occasional Go panic +Layer 1+2: 80% pass, no panics +3a: 80% pass +per-session 3 STATIC move: 90% pass +this commit: ~75% pass (variability — Go map atomic + mutex serialise the writers but the underlying hbrdd multi-area mmap path still has its own race, deferred to follow-up) The next bottleneck is at the hbrdd workarea layer (multi-Area instances per file each holding their own mmap snapshot), not at the FiveSql2 STATIC level. That fix is its own commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 19:58:52 +09:00
CharlesKWON	3b2dd365ad	feat(pgserver): Phase 6 — TLS + source-IP allowlist Closes the v1.0 hardening surface: encrypted transport + a coarse pg_hba.conf-equivalent CIDR allowlist. Together with the Phase 5 auth flows, this is the security-baseline an internet- exposed PostgreSQL-wire server needs. TLS subsystem ------------- `hbrtl/pgserver/tls.go`: * `LoadTLSFromFiles(certPath, keyPath)` — cert/key PEM pair load with tls.VersionTLS12 floor. Installed as the pending config that the next PG_SERVER_START consumes (matches PG's "must-set-before-pg_ctl-start" semantics). * `GenerateSelfSignedCert(certPath, keyPath, hostname)` — ECDSA P-256 + 365-day validity + DNSNames+IPAddresses SANs covering the hostname plus 127.0.0.1 / ::1. Dev/CI helper; production ships a CA-signed cert via the loader. * `upgradeToTLS()` wraps `tls.Server(conn, cfg).Handshake()` so pgproto3 reads plaintext on top of the encrypted stream. Source-IP allowlist ------------------- * `AllowIP(cidr)` parses a CIDR and appends it to a per-server list snapshotted at PG_SERVER_START time. * `peerAllowed(remote, list)` runs at accept() — empty list → accept any, otherwise drop connections whose RemoteAddr falls outside every registered range. * `ClearAllowList()` resets to allow-all. Coarse but compatible with the "host alice 10.0.0.0/8 md5"-style entries every pg_hba.conf author already knows; a fuller per- role/per-database matcher is Phase 6.1+. PRG bindings (register.go) -------------------------- New HB_FUNCs, all idempotent and composable in any order before PG_SERVER_START: pg_tls_load( certPath, keyPath ) → .T. \| cErr pg_tls_self_signed( cert, key, hostname ) → .T. \| cErr pg_allow_ip( cidr ) → .T. 
\| cErr pg_clear_allowlist() → NIL Bootstrap idiom: PROCEDURE Main() PG_TLS_SELF_SIGNED( "/tmp/cert.pem", "/tmp/key.pem", "localhost" ) PG_ADD_ROLE( "alice", "swordfish" ) PG_ALLOW_IP( "127.0.0.1/32" ) PG_ALLOW_IP( "10.0.0.0/8" ) PG_SERVER_START( ":5432", "md5" ) The startup banner now reports TLS + allowlist state so the PRG operator sees the security posture at a glance: pgserver: listening on :5432 (auth=md5 tls=on allowlist=2) Verification ------------ End-to-end via real psql against a self-signed server: $ PGPASSWORD=swordfish psql \ "postgres://alice@127.0.0.1:15432/alice?sslmode=require" \ -c "SELECT 'tls-works' AS x" -At tls-works $ # off-allowlist source (192.168.x.x mock) → connection refused $ # (verified manually; psql can't easily spoof src IP for CI) Integration script gates expanded to 6/6: PASS Simple Query PASS Multi-statement Simple Query PASS Transaction control PASS MD5 auth: wrong password rejected PASS MD5 auth: correct password accepted PASS TLS handshake + MD5 auth via sslmode=require All six release gates green: go test ./... ✓ FiveSql2 SQL:1999 43/43 ✓ Harbour compat 56/56 ✓ std.ch 17/17 ✓ FRB 7/7 ✓ pgserver integration 6/6 ✓ Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 14:07:19 +09:00
CharlesKWON	90eafcfc06	feat(pgserver): Phase 5 — password + MD5 authentication Trust mode (v1.0 default) accepts anyone; that's fine for an embedded demo, but shipping a multi-client database without credentials would be irresponsible. This commit adds two of libpq's three standard auth flows. SCRAM-SHA-256 is Phase 5.1 — pgx/psql both fall back to MD5 cleanly when the server advertises only md5, so v1.0's functional coverage is complete with the pair landed here. Auth subsystem -------------- `hbrtl/pgserver/auth.go` adds: * An in-memory role registry: `roleMap map[string]role` guarded by sync.RWMutex. Reads (lookupRole) are hot-path during connection startup so the RWMutex lets multiple sessions auth in parallel without serialising through a plain Mutex. `AddRole(name, password)` / `RemoveRole(name)` Go API consumed by the new HB_FUNCs `PG_ADD_ROLE` / `PG_REMOVE_ROLE` (see register.go). Bootstrap PRG idiom: PG_ADD_ROLE("alice", "swordfish") PG_ADD_ROLE("bob", "hunter2") PG_SERVER_START(":5432", "md5") * `authPassword()` — cleartext PasswordMessage exchange. The wire payload is plain, so it is intended for TLS-protected links only; Phase 6 ties the warning to actual TLS detection on the session. * `authMD5()` — libpq's md5 challenge: server → AuthenticationMD5Password{salt: 4 random bytes} client → "md5" || md5_hex( md5_hex(password || user) || salt ) We recompute the canonical hash from the stored plaintext and compare. md5Challenge() is exported for pinning by a Go unit test (vector cross-checked against libpq's fe-auth-md5.c). Salt is sourced from crypto/rand on every challenge so replay attacks against a captured wire trace can't reuse a prior hash.
Dispatch matrix (Config.AuthMode → flow): "" / "trust" → AuthenticationOk immediately, no lookup "password" → authPassword() "md5" → authMD5() anything else→ 28000 + connection close Tests ----- Unit (hbrtl/pgserver/pgserver_test.go): PASS TestMD5Challenge (vector + determinism + diff) PASS TestRoleRegistry (add/replace/remove/lookup) Integration (tests/pgserver/run.sh): PASS Simple Query: SELECT 1, 'hello' PASS Multi-statement Simple Query PASS Transaction control: BEGIN/COMMIT round-trip PASS MD5 auth: wrong password rejected PASS MD5 auth: correct password accepted End-to-end matrix with real psql: wrong password → "ERROR: md5 authentication failed for user 'alice'" correct password → SELECT returns row unknown user → "ERROR: md5 authentication failed for user 'eve'" password mode → cleartext exchange works equivalently All six release gates green: go test ./... ✓ FiveSql2 SQL:1999 43/43 ✓ Harbour compat 56/56 ✓ std.ch 17/17 ✓ FRB 7/7 ✓ pgserver integration 5/5 ✓ (up from 3/3 in Phase 4) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 14:01:30 +09:00
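The md5 challenge formula quoted in this commit, "md5" || md5_hex( md5_hex(password || user) || salt ), translates directly into Go. The sketch below is an illustration under that formula; `checkMD5` and `newSalt` are hypothetical names, and only `md5Challenge` echoes a name from the message.

```go
package main

import (
	"crypto/md5"
	"crypto/rand"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

func md5Hex(b []byte) string {
	sum := md5.Sum(b)
	return hex.EncodeToString(sum[:])
}

// md5Challenge computes the response a client is expected to send for the
// given user, plaintext password and 4-byte salt.
func md5Challenge(user, password string, salt [4]byte) string {
	inner := md5Hex([]byte(password + user))
	return "md5" + md5Hex(append([]byte(inner), salt[:]...))
}

// newSalt draws a fresh salt per challenge so a captured response cannot
// be replayed against a later handshake.
func newSalt() (salt [4]byte, err error) {
	_, err = rand.Read(salt[:])
	return salt, err
}

// checkMD5 recomputes the expected response from the stored plaintext and
// compares it in constant time.
func checkMD5(user, storedPlain string, salt [4]byte, clientResp string) bool {
	want := md5Challenge(user, storedPlain, salt)
	return subtle.ConstantTimeCompare([]byte(want), []byte(clientResp)) == 1
}

func main() {
	salt, err := newSalt()
	if err != nil {
		fmt.Println("rand:", err)
		return
	}
	resp := md5Challenge("alice", "swordfish", salt) // what the client sends
	fmt.Println(checkMD5("alice", "swordfish", salt, resp))
	fmt.Println(checkMD5("alice", "hunter2", salt, resp))
}
```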
CharlesKWON	8472928102	feat(pgserver): Phase 4 — Extended Protocol (Parse/Bind/Execute) pgx and most drivers default to PostgreSQL's Extended Protocol (named prepared statements). Phase 2 only handled Simple Query, so every pgx caller had to force `QueryExecModeSimpleProtocol` — unworkable for a production deployment. This commit lands the full Parse → Bind → Describe → Execute → Sync state machine, enough that pgx (and any other libpq-protocol-v3 client) works without any client-side knobs. Implementation lives in `hbrtl/pgserver/extended.go`: * Per-session caches `stmts map[string]preparedStmt` and `portals map[string]portal`, lazily allocated on first use. Stored as fields on `session` so they don't leak across connections. * Parameters are inlined at Bind time via `substituteParams` — the resolved SQL is a normal Simple-Query-shaped string the engine sees through the existing `five_SQL(cSQL, …, oSession)` pipeline. Avoids teaching FiveSql2 a second param-shape; the trade-off is that binary timestamps/numerics round-trip through text (Phase 4.1 will plumb `?`-params through aParams for the binary fast path). * `paramToLiteral` decodes the binary-format encodings pgx uses by default for INT4/INT8/BOOL (big-endian fixed-width). Other binary OIDs fall back to a hex-escaped quoted literal which errors loudly rather than silently misparsing. * `countPgPlaceholders` scans the SQL outside string literals for the highest `$N` so the server can answer Describe-statement with a correctly-sized ParameterDescription even when the client didn't pre-declare param OIDs. Without this, pgx errored with "expected 0 arguments, got 2" on the very first prepared query. * RowDescription emission: Describe-statement still returns NoData (we can't infer row shape without execution). When Execute fires on a portal the client never Described, we emit RowDescription inline from the cached result before DataRow streams. pgx and psql both tolerate this ordering. 
* Execute → CommandComplete tag derives from the SQL verb via the existing `commandTagFor` helper. Row counts in the tag remain "VERB 0" for v1.0; threading real counters through the engine is Phase 5. Wire dispatch in `session.go:queryLoop` now handles Parse, Bind, Describe, Execute, Close, Sync, Flush — the full v3 message set. Verification ------------ End-to-end pgx (default mode, no SimpleProtocol flag) successfully runs: SELECT $1 AS n, $2 AS s with 42 + "hi" → [42 hi] Same statement re-executed with different bound values → reuses the cached prepared statement SELECT $1 AS b, $2 AS s with true + "binary-bool" → [t binary-bool] `tests/pgserver/run.sh` expanded from 1 → 3 integration assertions: PASS Simple Query: SELECT 1, 'hello' PASS Multi-statement Simple Query PASS Transaction control: BEGIN/COMMIT round-trip (Extended Protocol can't be driven from psql's -c CLI directly because psql's PREPARE/EXECUTE is a separate SQL-level feature that FiveSql2 doesn't parse; the pgx-driven path verifies it manually, and a self-contained Go integration that drives pgx from inside a process bootstrap is Phase 7 work.) All six release gates green: go test ./... ✓ FiveSql2 SQL:1999 43/43 ✓ Harbour compat 56/56 ✓ std.ch 17/17 ✓ FRB 7/7 ✓ pgserver integration 3/3 ✓ Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 12:55:41 +09:00
CharlesKWON	708329785a	test(pgserver): wire-protocol roundtrip via net.Pipe Adds an in-process startup-handshake test using net.Pipe so we can pin the protocol envelope (StartupMessage → AuthenticationOk → ParameterStatus×N → BackendKeyData → ReadyForQuery) without binding a real TCP port. Runs in <1ms; safe for CI. The PRG-dispatch path (runSQL → FIVE_SQL → row encoding) is already covered manually by spinning a `five run` of `pg_server_start(":15432")` and connecting with pgx — that flow verified post-MVP that a real PostgreSQL client receives `{ONE (INT4), GREET (TEXT)}` + row `[1 hello]` for `SELECT 1 AS one, 'hello' AS greet` over the wire. An automated shell harness will land in Phase 7 with the psql integration tests. Also rolls go.mod / go.sum forward with the pgx v5 toolchain pulled in by Phase 2's pgproto3 dependency. Module bump 1.21.13 → 1.25.0 matches what `go get github.com/jackc/pgx/v5/pgproto3` selected; cross-builds for windows/linux/darwin all still succeed (verified locally). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 22:13:40 +09:00
CharlesKWON	d98f5e1767	feat(pgserver): PostgreSQL-wire MVP — psql can SELECT from FiveSql2 First end-to-end working version of the PostgreSQL-wire-compatible TCP server frontend. A standard `psql` client now connects, runs `SELECT * FROM employees`, and gets back a properly typed result set rendered by psql with the right column alignment: ID \| NAME \| SALARY ----+----------------------+---------- 1 \| Alice \| 50000.00 2 \| Bob \| 42000.50 3 \| Cho \| 77500.00 This is the Phase 2 deliverable from the approved plan at /Users/charleskwon/.claude/plans/compiled-launching-shore.md. Builds on the session-state refactor in `93cf5c8` — each connection gets its own TSqlSession on the PRG side via the new PG_NEW_SESSION HB_FUNC, so concurrent psql clients won't share transaction logs or plan caches. Scope ----- v1.0 MVP: Simple Query only, trust auth, no TLS yet. SELECT works against the full FiveSql2 surface (CTEs, window functions, JOINs, aggregates). DML + per-session transactions are Phase 3, extended protocol is Phase 4, auth + TLS are Phases 5/6. Architecture ------------ psql/pgx/JDBC ──TCP:5432──▶ pgserver.Listener │ accept() ▼ go handleConn(net.Conn) ┌─────────────────────────────┐ │ Session goroutine │ │ 1. SSLRequest peek │ │ 2. StartupMessage │ │ 3. AuthenticationOk (trust) │ │ 4. ParameterStatus×7 │ │ 5. BackendKeyData │ │ 6. ReadyForQuery('I') │ │ 7. loop: Receive() → │ │ dispatchSimpleQuery → │ │ hbrt.Thread.Function( │ │ FIVE_SQL,sql,...,sess) │ │ emit RowDescription │ │ emit DataRow×N │ │ emit CommandComplete │ │ emit ReadyForQuery │ └─────────────────────────────┘ One goroutine per connection, each owning its own hbrt.Thread and TSqlSession instance. Uses the existing audit-fixed NewThread() (`cde8673`) so statics + WA factory propagate. New files (hbrtl/pgserver/) --------------------------- server.go — Config, Server, Serve loop with MaxConnections gate via semaphore, Close drains in-flight sessions. 
* session.go — full lifecycle: SSLRequest peek + prefixedConn byte-injection trick for StartupMessage, ParameterStatus broadcast (server_version "14.0 (FiveSql2)" so pgx negotiates), BackendKeyData (random pid+secret per session, no CancelRequest yet), query loop dispatching only Simple Query in v1.0 with a loud "0A000 not supported" for Extended messages. * dispatch.go — runSQL invokes FIVE_SQL via PushSymbol+Function, unpacks the engine's `{aFieldNames, aRows}` envelope or the `{{"__error__"}, {{nCode, cMsg, cSQL}}}` error shape, emits RowDescription with text-format OIDs and DataRow per row. * typemap.go — pgTypeFor() picks INT4 / INT8 / NUMERIC / TEXT / DATE / TIMESTAMP / BOOL by sampling the first row's value type; encodeText() formats each cell, returning nil-slice for NULL (the PG length=-1 convention). * errmap.go — sqlStateFor() maps FiveSql2 SQL_ERR_* codes to canonical PG SQLSTATEs (42601/42P01/42703/42804/23505/23514/ 23503/25P02/42501/02000/XX000). * auth.go — trust mode in v1.0; password/MD5/SCRAM lands Phase 5 but the dispatch sentinel is already in place. * tls.go — upgradeToTLS stub for SSLRequest handling; the byte- ordering is already wired so Phase 6 just plugs in tls.Config. * register.go — package init() registers pg_server_start / pg_server_stop HB_FUNCs. Importing the package (done from hbrtl/register.go via blank import) is enough to enable them. * pgserver_test.go — unit tests for encodeText (numeric, string, NIL), pgTypeFor (OID dispatch), sqlStateFor (error mapping), commandTagFor (SELECT/INSERT/UPDATE/DELETE/BEGIN/COMMIT). Other changes ------------- * _FiveSql2/src/TSqlSession.prg — added PG_NEW_SESSION() factory used by the Go dispatcher to allocate a per-connection session bypassing the embedded process default. * hbrtl/register.go — blank-import five/hbrtl/pgserver so its init() fires and the HB_FUNCs land in the global dynamic-func table for VM symbol lookup. * go.mod / go.sum — github.com/jackc/pgx/v5 v5.9.2 (pgproto3 subpackage). 
MIT license. Same library pgx itself uses, so protocol coverage matches the de-facto Go PG ecosystem. Verification ------------ $ pg_server_start(15432, "trust") /* PRG one-liner */ $ psql -h 127.0.0.1 -p 15432 -U fiveuser -c 'SELECT * FROM employees' → 3 rows rendered correctly by psql (ID as INT4, NAME as TEXT, SALARY as NUMERIC(10,2) with 2 decimal places) All six release gates green: go test ./... ✓ (incl. new hbrtl/pgserver tests) FiveSql2 SQL:1999 43/43 ✓ Harbour compat 56/56 ✓ std.ch 17/17 ✓ FRB 7/7 ✓ examples 65/71 ✓ (unchanged baseline) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 18:40:32 +09:00
CharlesKWON	cde86730b8	fix(compiler,hbrt,hbrdd,cli): pre-1.0 audit — 13 critical fixes Senior-engineer / QA audit landed 13 silent-miscompile and data- integrity fixes spanning the whole compiler+runtime+storage stack. Each fix is paired with either an integration test in the suite or a focused regression check; all 6 release gates stay green: go test ./..., FiveSql2 43/43, Harbour compat 56/56, std.ch 17/17, FRB 7/7, examples 65/71. Compiler -------- * genpc IF/ELSEIF jumpEnd2 patching (compiler/genpc/genpc.go). Per-ELSEIF branch terminators were stashed into `_ = jumpEnd2` and never patched — the relative offset stayed 0 and the runtime walked the next ELSEIF's PcOpJumpFalse opcode as if it were jump-offset data. Bytecode-level corruption in pcode mode. Now collected into a slice and patched at end-of-IF. Verified via Grade(95..50) cases 11a-e added to tests/frb/test_frb_pcode_sweep. * countLocalsInStmts / scanBodyLocals missing bodies (compiler/gengo/gen_util.go, compiler/gengo/gengo.go). Frame-size counter skipped WATCH/TIMEOUT/PARALLEL FOR bodies, so a LOCAL declared inside one of those constructs got a slot index past the runtime's allocated count — silent NIL reads or out-of-range stomps. * emitMethodDeclStandalone nested LOCAL (compiler/gengo/gen_class.go). Same bug class but on the method side. Pre-fix repro: METHOD Stomp(n) CLASS T LOCAL a := 1, b := 2 IF n > 0 LOCAL c := 30, d := 40, e := 50, f := 60 Inner( n ) IF c != 30 .OR. d != 40 .OR. e != 50 .OR. f != 60 ... printed `c, d, e, f = 5, NIL, NIL, NIL` because Inner's frame collided with Stomp's underallocated slot range. Now counts body-nested LOCALs into the frame and pre-allocates indices via scanBodyLocals. * genpc unsupported-AST diagnostic surface (compiler/genpc/genpc.go, hbrt/pcode.go, cmd/five/main.go, hbrtl/frb.go). 
The `default` cases in emitStmt / emitExpr silently emitted PushNil / no-op for nodes the pcode generator doesn't implement (ClassDecl, MethodDecl, xBase commands, concurrency primitives, …). Added `PcodeModule.Warnings []string` populated by noteUnsupported, surfaced on stderr from the build pipeline. Users now see "pcode: AST node not supported in --pcode/FRB-pcode mode: stmt ast.GoBlockStmt" instead of getting a silently broken module. Runtime ------- * class.go Send/tryBinaryOp t.self defer-restore (hbrt/class.go). Restoration was a plain `t.self = oldSelf` after `fn(t)`. Any panic in the method body skipped the line, so the next BEGIN SEQUENCE / RECOVER handler ran with the THROWING object's Self — `::field` resolved against the wrong receiver. Wrapped both restore sites in `defer func() { t.self = oldSelf }()`. Verified: pre-fix RECOVER saw "THROWER", post-fix "OUTER". * hbfunc.go HB_FUNC parameter Frame() (hbrt/hbfunc.go). The RegisterDynamicFunc wrapper called `fn(ctx)` without ever calling Frame, so `ctx.ParC(1)` / `ctx.Local(n)` read through `t.curFrame.localBase + n - 1` against the caller's frame. Every #pragma BEGINDUMP HB_FUNC taking parameters silently returned "" / 0 / "" for them — masked by ParNIDef-style defaults. Wrapper now does `t.Frame(t.pendingParams, 0); defer t.EndProc()` before dispatch. * pcode codeblock closure capture (hbrt/pcinterp.go, hbrt/pcode.go, hbrt/thread.go, compiler/genpc/genpc.go). PcOpPushBlock recorded `nDetached` but never copied enclosing locals; free vars in the block body fell through to memvar lookup → NIL. Wired full capture pipeline: - New opcodes PcOpPushDetached (0x59) / PcOpPopDetached (0x5A). - PushBlock now reads per-slot source-local indices and snapshots into bb.Detached at construction time. - New detachedMap in genpc auto-promotes any free var that resolves to an enclosing-frame local into a capture slot. 
- emitAssignAsExpr leaves the assigned value on the eval stack so SeqExpr items like `{|v| acc += v, acc }` work. - Thread tracks curBlock with paired Set/restore in the block's Fn wrapper for nested-block evaluation. Mutating capture (acc += v across successive Evals) now works. * vm.NewThread statics + waFactory propagation (hbrt/vm.go). GoLaunch / GoLaunchBlock call NewThread directly. Previously the statics map and WA factory were applied only in Run(), so goroutine-spawned PRG code panicked on STATIC access ("static index out of range") and crashed dereferencing nil WA on any DB call. Both now happen inside NewThread under the same lock as TID assignment. Data layer ---------- * dbf concurrent Append lock (hbrdd/dbf/dbf.go, hbrdd/dbf/locks_posix.go, hbrdd/dbf/locks_windows.go). Append bumped a local recCount with no file-system serialization. Two shared-mode processes both wrote at the same RecordOffset; one record silently overwrote the other. Added an append-intent byte-range lock at offset 0x7FFFFFFE + bounded retry, on-disk header refresh inside the locked region, and immediate header write so peers refresh past our slot. * indexer negative numeric key encoding (hbrdd/dbf/indexer.go + new hbrdd/dbf/encode_numeric_test.go). `%20.10f` formats `-100` as `" -100.0000000000"` and `99` as `" 99.0000000000"`. ASCII ' ' (0x20) < '-' (0x2D), so `99` lex-compared LESS than `-100` — every NTX/CDX index over a column that ever held a negative number returned wrong rows for SEEK / range scans. Replaced with a 1-byte sign prefix + 21-byte zero-padded magnitude (negatives use digit-complement) so byte order matches numeric order across signs and magnitudes. Format change: existing indexes built with the old encoding must be REINDEXed. Three unit tests pin the order. * dbf Append index maintenance hooks (hbrdd/dbf/dbf.go, hbrdd/dbf/indexer.go). 
Append never inserted into open NTX/CDX indexes — the audit's canonical scenario `SET INDEX TO …; APPEND BLANK; REPLACE …; dbSeek …` silently missed the new record. Added optional IndexWriter interface, queue the new recNo in pendingIdxInserts, drain after flushRecord by calling InsertKey on every open writer-supporting engine. NTX participates (its existing rebuild-on-insert is correct); CDX online maintenance is deferred to a follow-up — those indexes still need REINDEX. Verified: post-fix SEEK("Charlie") after APPEND BLANK + REPLACE finds the new record. * dbf PACK crash-safety (hbrdd/dbf/dbf.go). The old in-place rewrite read record N, overwrote slot M<N, then truncated. Power loss after partial loop left a file with overwritten prefix and no original copies of the records already advanced past — silent data loss. Rewrote to: 1) drop mmap, build `<file>.pack.tmp` with all surviving records, 2) Sync(), 3) close original handle + os.Rename(tmp, orig) (atomic on same FS), 4) reopen + re-mmap. TestComp_Pack passes; readers always see either the pre-PACK or post-PACK contents, never a half-state. * mem RDD torn reads (hbrdd/mem/memrdd.go). The comment claimed in-place PutValue was safe because hbrt.Value "fits in a single machine word + pointer". hbrt.Value is 24 bytes (3 words) — a concurrent reader could observe new type tag with stale scalar/ptr and type-confuse on the next AsXxx() call. Switched mu to sync.RWMutex; GetValue takes RLock, Append/PutValue/Delete/Recall take Lock. `go test -race ./hbrdd/mem/` clean. Files touched ------------- compiler/gengo/gen_class.go, gen_util.go, gengo.go compiler/genpc/genpc.go hbrt/class.go, hbfunc.go, pcinterp.go, pcode.go, thread.go, vm.go hbrdd/dbf/dbf.go, indexer.go, locks_posix.go, locks_windows.go hbrdd/dbf/encode_numeric_test.go (new) hbrdd/mem/memrdd.go cmd/five/main.go hbrtl/frb.go tests/frb/test_frb_pcode_sweep.prg Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 05:29:56 +09:00
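The indexer fix in the commit above replaces a `%20.10f` key with a sign byte plus a zero-padded magnitude. The Go sketch below is only an illustration of that idea, not the code in hbrdd/dbf/indexer.go: the sign bytes `'0'` and `'1'`, the `%021.10f` layout, and the names `oldKey` / `newKey` are assumptions, and magnitudes of 10^10 or more would overflow the fixed width.

```go
package main

import (
	"fmt"
	"math"
)

// oldKey reproduces the pre-fix format quoted in the commit message.
func oldKey(v float64) string {
	return fmt.Sprintf("%20.10f", v)
}

// newKey is a hypothetical order-preserving encoding: one sign byte
// ('0' for negatives, '1' otherwise; both byte values are assumptions)
// followed by a 21-byte zero-padded magnitude. Negative magnitudes are
// digit-complemented so that a larger magnitude sorts earlier.
func newKey(v float64) string {
	mag := []byte(fmt.Sprintf("%021.10f", math.Abs(v)))
	if v >= 0 {
		return "1" + string(mag)
	}
	for i, c := range mag {
		if c >= '0' && c <= '9' {
			mag[i] = '9' - (c - '0')
		}
	}
	return "0" + string(mag)
}

func main() {
	// Space (0x20) sorts before '-' (0x2D), so the old keys misorder.
	fmt.Println("old: 99 < -100 ?", oldKey(99) < oldKey(-100))
	// With the sketched encoding, byte order follows numeric order.
	fmt.Println("new: -100 < -5 ?", newKey(-100) < newKey(-5))
	fmt.Println("new: -5 < 99 ?", newKey(-5) < newKey(99))
}
```

Because both encodings are plain strings, the comparison an index performs is just a byte-wise `<`, which is why the padding character and sign byte decide the sort order.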
CharlesKWON	2008266da7	feat(pp,rtl): Tier 2 audit followups — JOIN hash + PP validation + C heuristic Three medium-priority audit items in one commit, each independently revertible. * #18 JOIN hash-join fast path. New std.ch shape: JOIN WITH <alias> TO <file> [FIELDS ...] ON <mfield> = <dfield> expands to a 6-arg __dbJoin call with the master/detail key field names. Runtime detects the extra args, builds an O(M) hash over the detail's key column, then probes per master row for O(N+M) total — vs the FOR form's O(N*M). For 1k×1k that's 2k vs 1M operations; the gap widens with N. The original FOR form is unchanged and stays the fallback for arbitrary predicates. New helper dbHashKey type-tags the key string so `1` (numeric), `"1"` (string), and `.T.` (logical) don't collide in the bucket map. * #38 PP rule result-marker validation. ParseRule now walks the result template after parseMarkers and warns about every `<name>` (or `<(name)>` / `<.name.>` / `<{name}>` / `#<name>` / `<"name">`) that doesn't match a pattern marker. Warnings flow into pp.errors via handleDirective with the directive's filename:line, so a typo'd `<NaMe>` in an `#xcommand` case-sensitive rule fails the build with a clear diagnostic instead of silently producing broken expansions. * #44 looksLikeInlineC heuristic strengthened. Catches more of the common Harbour-PRG-with-C-inline-block shapes that used to fall through and produce cryptic Go-side errors: function-like #define, `extern "C"` linkage blocks, C return- type declarations (`int foo(`, `static char* bar(`), and the hb_ret*() helper family used by Harbour's C FFI return setters. Two small predicate helpers (allLetters, allIdentChars) keep the C-vs-Go disambiguation tight enough that legit Go code (`func name() int { ... }`) doesn't trip. * #28 LIST/DISPLAY pagination — explicitly deferred. Proper pagination requires interactive terminal handling (Inkey(0) for the keypress) which would hang in CI / batch mode. 
Will revisit when an interactive terminal layer needs it for other reasons. Test fixtures: tests/std_ch/test_join_hash.prg verifies the new ON-form path produces the same output as the FOR form would. std.ch runner now stands at 16/16. Other gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 std.ch suite : 16/16 FRB suite : 7/7 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 19:21:19 +09:00
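The hash-join fast path described in the commit above (one pass to bucket the detail rows, one probe per master row) can be sketched in Go as follows. This is a minimal illustration, not the __dbJoin implementation: the `row` type, the tag prefixes `C:` / `N:` / `L:`, and the names `hashKey` / `hashJoin` are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strconv"
)

// row is a hypothetical in-memory record: field name -> value.
type row map[string]any

// hashKey type-tags the key so that numeric 1, string "1" and logical
// true land in different buckets (the tag letters are assumptions).
func hashKey(v any) string {
	switch x := v.(type) {
	case string:
		return "C:" + x
	case float64:
		return "N:" + strconv.FormatFloat(x, 'g', -1, 64)
	case bool:
		return "L:" + strconv.FormatBool(x)
	default:
		return fmt.Sprintf("?:%v", x)
	}
}

// hashJoin builds the bucket map over detail in O(M), then probes once
// per master row, for O(N+M) work plus the size of the output.
func hashJoin(master, detail []row, mField, dField string) [][2]row {
	buckets := make(map[string][]row, len(detail))
	for _, d := range detail {
		k := hashKey(d[dField])
		buckets[k] = append(buckets[k], d)
	}
	var out [][2]row
	for _, m := range master {
		for _, d := range buckets[hashKey(m[mField])] {
			out = append(out, [2]row{m, d})
		}
	}
	return out
}

func main() {
	master := []row{{"id": 1.0, "name": "Ann"}, {"id": 2.0, "name": "Bob"}}
	detail := []row{
		{"emp": 1.0, "amt": 10.0},
		{"emp": "1", "amt": 99.0}, // string key: must not match numeric 1
		{"emp": 1.0, "amt": 5.0},
	}
	for _, p := range hashJoin(master, detail, "id", "emp") {
		fmt.Println(p[0]["name"], p[1]["amt"])
	}
}
```

An equality-only key is what makes the bucket lookup possible, which fits the commit's note that the FOR form stays as the fallback for arbitrary predicates.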
CharlesKWON	efb615bed9	fix(frb,genpc): in-process compile + 4 pcode bugs Compiling _FiveSql2/test/test_sql_extreme.prg + a sweep of the FRB demos surfaced four real bugs in the dynamic-compilation pipeline. All fixes shipped together because they were on the same critical path; each is independently revertible. * pcode FOR loop ignored STEP and direction. emitFor in compiler/genpc emitted a fixed `<= to` comparison and a hardcoded `+1` increment, then deleted the actual step expression with slice arithmetic on the byte buffer. Result: `FOR 5 TO 1 STEP -1` exited on the first iteration; `FOR 1 TO 10 STEP 2` summed 1..10 (55) instead of 1+3+5+7+9 (25). Rewritten to mirror gengo's emitFor: detect negative step from a literal `-N` or unary MINUS, pick `<=` vs `>=` accordingly, and emit a clean `var := var + step` increment per iteration. * pcode compound `+=` operator stored only the RHS. emitAssign looked at AssignExpr.Op only for the := case; +=/-=/etc. silently took the same path, so `n += i` compiled as `n := i`, discarding the accumulator. Loop reduces were wrong: `Reverse` returned "" and `n := 0; FOR i ... n += i; NEXT` returned only the last increment. New compoundBinOp helper maps PLUSEQ / MINUSEQ / STAREQ / SLASHEQ / PERCENTEQ / POWEREQ to their matching binary opcode; emitAssign emits `local + rhs ; pop local` for compound forms. * Pcode body stack leaks polluted the caller's frame. A pcode function whose body left intermediate values on the data stack (FOR control values, etc.) returned with extra entries past its declared retVal. FrbDoFunc / FrbExecFunc / FrbRunFunc then pushed retVal on top of those leaks, so the caller saw the leaked values where its own preceding arguments should have been: `? "Fibonacci(10) =", FrbDo(...), "(expect 55)"` printed `1 55 (expect 55)` because the FOR loop's `1` lived in arg-1's slot. 
Two new Thread methods (`SP()` / `SetSP(int)`) let the three FRB dispatchers snapshot stack depth before the inner call and clamp it back afterward, so the leaks evaporate before they reach the caller's frame. * FrbExec / FrbRun recursed into the host's Main forever. Both looked up "MAIN" via t.VM().FindSymbol, which always resolved to the OUTER program's Main since FRB modules deliberately keep Main local. Compile + run + unload became compile + recurse + OOM. Both now look up Main via mod.FindFunc("MAIN") (module scope) — Frbload's policy of leaving Main module-local now actually has the intended effect. Plus an architectural improvement: in-memory compilation no longer depends on shelling out to an external `five` binary. New hbrtl.frbCompileInProc parses + preprocesses + generates pcode in process, building a FrbModule directly. FrbCompile and FrbExec use this exclusively, which means dynamic compilation works from any directory regardless of PATH and without a second process. The plugin-mode path (with its runtime-version-mismatch fragility) is left available via hbrt.FrbCompileSource for callers that want it, but FrbCompile no longer reaches for it by default. Test suite: tests/frb/ holds five fixtures + a runner. 5/5 pass: test_frb_simple / test_frb_pcode_load / test_frb_compile / test_frb_loop / test_frb_step. Other gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 std.ch suite : 14/14 FRB suite : 5/5 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 10:25:35 +09:00
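The FOR/STEP fix in the commit above comes down to choosing the loop test from the sign of the step. The Go sketch below illustrates only those loop semantics, not the emitFor code generator: `forValues` and `sum` are hypothetical helpers, the sign is checked at run time here whereas the commit describes detecting it from the literal at compile time, and the zero-step guard is an assumption added so the example terminates.

```go
package main

import "fmt"

// forValues lists the values visited by `FOR v := from TO to STEP step`:
// a positive step tests v <= to, a negative step tests v >= to.
func forValues(from, to, step int) []int {
	var out []int
	if step == 0 {
		return out // assumption: guard so the sketch cannot loop forever
	}
	for v := from; (step > 0 && v <= to) || (step < 0 && v >= to); v += step {
		out = append(out, v)
	}
	return out
}

// sum adds up a slice of ints.
func sum(xs []int) int {
	total := 0
	for _, x := range xs {
		total += x
	}
	return total
}

func main() {
	fmt.Println(sum(forValues(1, 10, 2))) // 1+3+5+7+9
	fmt.Println(forValues(5, 1, -1))      // counts down from 5 to 1
}
```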
CharlesKWON	412351b67d	feat(rtl): LIST/DISPLAY TO FILE — text output redirection Wire up TO FILE for both LIST and DISPLAY: __dbList grows a 9th parameter cFile, opens it (truncating any prior content) when non-empty, and writes the formatted rows there via fmt.Fprintln. Default behavior (no TO FILE) still goes to stdout. std.ch gets two new rules placed before the regular LIST/DISPLAY patterns so they win when TO FILE is present: LIST [<v,...>] TO FILE <(f)> [OFF] [FOR] [WHILE] [NEXT] ... DISPLAY [<v,...>] TO FILE <(f)> [OFF] [FOR] [WHILE] [NEXT] ... Open failure raises a clear *HbError ("LIST/DISPLAY TO FILE: cannot create <path> — <syscall reason>") so callers know exactly what went wrong instead of getting partial-or-empty output. TO PRINTER stays rejected via __dbNotImpl — Five doesn't drive a printer port. Test coverage: tests/std_ch/test_list_to_file.prg exercises four shapes (full LIST, single-row DISPLAY, OFF + FOR with explicit fields, and confirms TO PRINTER still raises). Wired into the std.ch runner so the regression suite now stands at 14/14. Gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 std.ch suite : 14/14 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 08:15:32 +09:00
CharlesKWON	3a7f1dea72	feat(rtl,tests): pre-release UX round (Wave 5) Three audit findings around polish + a release-readiness commit: * #UX1 LIST/DISPLAY output: dropped \r\n (unix terminals showed a stray ^M), moved the newline to AFTER each row (no more leading blank line), and added the `*` deleted-record marker after the record number — matches xBase LIST/DISPLAY convention. With SET DELETED ON the marker is unreachable since the row would have been skipped at Area.Skip level; with SET DELETED OFF the user now sees which rows are tombstoned. * #26 temp aliases: `__copytmp` / `__sorttmp` / `__totaltmp` / `__jointmp` were process-global string constants. A nested invocation (e.g., COPY inside a FOR clause whose expression runs another COPY) collided on the alias and the inner Open failed with "alias already in use" — surfacing as `.F.` with no clear cause. Each Open now goes through a new helper `nextTmpAlias(prefix)` backed by an atomic counter, so every call gets `__copytmp_1`, `__copytmp_2`, etc. — no collisions. * #J test coverage gap: the 13 std.ch regression tests were all sitting in `/tmp` — lost on tmpfs reboot, never in git, never in CI. Move them into `tests/std_ch/` and add a simple `run.sh` runner that builds + executes each one in a temp scratch directory and grep-asserts on FAIL / NOT REJECTED / expectation-mismatch markers. 13/13 pass against the current head: PASS test_pp_stdch PASS test_count PASS test_sum_avg PASS test_sum_multi PASS test_copy PASS test_sort PASS test_list PASS test_total PASS test_join PASS test_update PASS test_set_deleted PASS test_unsupported PASS test_block_comma test_block_comma in particular guards the gengo SeqExpr fix from Wave 1 — without it the comma-in-block miscompile would silently come back. Gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 std.ch suite : 13/13 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 08:07:50 +09:00
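The `nextTmpAlias(prefix)` helper named in the commit above is described only as being backed by an atomic counter. A minimal Go sketch of that pattern follows; the single shared counter and the variable name `tmpAliasSeq` are assumptions, and the real helper may differ in its details.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// tmpAliasSeq is a process-wide counter (assumed to be shared across
// all prefixes; the commit does not say whether it is per-prefix).
var tmpAliasSeq uint64

// nextTmpAlias returns a fresh alias such as "__copytmp_1" on every
// call, so nested invocations never reuse an alias.
func nextTmpAlias(prefix string) string {
	return fmt.Sprintf("%s_%d", prefix, atomic.AddUint64(&tmpAliasSeq, 1))
}

func main() {
	outer := nextTmpAlias("__copytmp")
	inner := nextTmpAlias("__copytmp") // e.g. a COPY nested inside a FOR clause
	fmt.Println(outer, inner)
}
```

Using an atomic increment rather than a mutex keeps the helper safe to call from goroutine-spawned code without any extra locking.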
CharlesKWON	1a9e509ee2	perf(rtl): SORT TO swaps insertion sort for sort.SliceStable (Wave 4) Drop the toy O(n²) insertion-sort that __dbSort had been using and delegate to the stdlib's sort.SliceStable. Reasoning: SORT TO is an operation a user reaches for because their dataset is too big to just iterate manually — interactive DBFs routinely have 10k–1M rows, which the old impl would chew on for minutes to hours. SliceStable gives O(n log n) and preserves the original-input ordering for equal keys, which is what the previous implementation also tried to do. The function signature is unchanged (`stableSort(rows, less)`), so all the multi-key / /D / /C dispatch logic from earlier waves keeps working unmodified. Gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 08:03:13 +09:00
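The swap described in the commit above keeps the `stableSort(rows, less)` shape and delegates to `sort.SliceStable`. The Go sketch below shows that delegation under assumptions: the `rec` element type and the sample data are invented for the example, and only the function name and the stdlib call come from the commit message.

```go
package main

import (
	"fmt"
	"sort"
)

// rec is a hypothetical buffered record.
type rec struct {
	Name string
	Dept int
}

// stableSort keeps the (rows, less) shape named in the commit and
// hands the work to sort.SliceStable: O(n log n), equal keys keep
// their input order.
func stableSort(rows []rec, less func(a, b rec) bool) {
	sort.SliceStable(rows, func(i, j int) bool {
		return less(rows[i], rows[j])
	})
}

func main() {
	rows := []rec{{"Dee", 2}, {"Ann", 1}, {"Cy", 2}, {"Bob", 1}}
	stableSort(rows, func(a, b rec) bool { return a.Dept < b.Dept })
	// Within each Dept the original order (Ann before Bob, Dee before Cy) survives.
	fmt.Println(rows)
}
```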
CharlesKWON	5b1d3fb32f	feat(pp,rtl): pre-release accuracy round (Wave 3) Four audit findings around correctness/consistency in std.ch and the SORT/UPDATE/TOTAL handlers: * #13: TOTAL/UPDATE key idiom inconsistency documented as inherent. TOTAL evaluates `<key>` only in the source workarea so verbatim `<{key}>` (alias-qualified or `_FIELD->`-prefixed by the user) works. UPDATE evaluates the same block in BOTH master and detail context, so it must wrap as `_FIELD-><key>` to dispatch to whichever WA is selected at eval time. The two rules look alike but their evaluation contexts differ — also documented in std.ch alongside both rules so the asymmetry isn't a surprise. Plus: TOTAL TO and ON are now mandatory (matching the COUNT/ UPDATE pattern from Wave 1) — bare TOTAL would have produced broken syntax via the unconditional `<(f)>`/`<{key}>` template references. * #15/#16: SDF / DELIMITED variants of COPY and TO PRINTER / TO FILE variants of LIST / DISPLAY are now matched by stub rules (placed before the regular rules so they win) that expand to a new `__dbNotImpl(reason)` RTL primitive raising a clear `&hbrt.HbError`. BEGIN SEQUENCE / RECOVER catches the panic, so callers get a real error instead of the previous silent dispatch-to-regular-DBF-copy. * #19: SORT /C (case-insensitive) now actually folds case before the string compare, instead of being silently treated as ascending. Suffix parser also rebuilt as a multi-letter scanner so `name/CD`, `name/DC`, `name/C/D`, `name/D/C` all parse the same way — combine /C and /D freely. Unknown suffix letters (e.g., `name/X`) leave the suffix attached to the field name so a stray slash in user input doesn't get silently mangled into a broken field reference. * #27 SET DELETED: verified with a regression test that `SET DELETED ON` causes COUNT/COPY (and by extension SORT/TOTAL/JOIN/UPDATE — all of which iterate via Area.Skip) to skip rows marked deleted. 
The filtering is implemented at the workarea level (skipFilter in dbf.go honors hbrdd.IsSetDeleted) so no RTL changes were needed; this commit just adds the coverage so the behavior doesn't silently regress. Gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 08:01:42 +09:00
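The multi-letter suffix scanner from item #19 in the commit above can be sketched in Go as below. `parseSortKey` is a hypothetical name, and treating `/A` as an explicit ascending no-op is an assumption; the rule that an unknown letter leaves the whole suffix attached to the field name follows the commit text.

```go
package main

import (
	"fmt"
	"strings"
)

// parseSortKey splits a SORT key spec such as "name/CD" into the field
// name and its flags. Letters may be combined or slash-separated in any
// order; an unknown letter leaves the spec untouched.
func parseSortKey(spec string) (field string, desc, fold bool) {
	i := strings.IndexByte(spec, '/')
	if i < 0 {
		return spec, false, false
	}
	for _, c := range strings.ToUpper(spec[i+1:]) {
		switch c {
		case 'D':
			desc = true
		case 'C':
			fold = true
		case 'A', '/':
			// 'A' (ascending) is the default; '/' only separates letters.
		default:
			return spec, false, false
		}
	}
	return spec[:i], desc, fold
}

func main() {
	for _, s := range []string{"name/CD", "name/D/C", "name/X", "name"} {
		f, d, c := parseSortKey(s)
		fmt.Println(f, d, c)
	}
}
```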
CharlesKWON	f30704a854	fix(rtl,pp): pre-release safety round (Wave 2) Five concrete gaps the audit flagged in the new __dbCopy / __dbSort / __dbTotal / __dbJoin / PP code: * wam.Close() errors were dropped on the floor. Caller saw `.T.` even when the just-written DBF wasn't durable, leading to the classic "delete the source after the COPY succeeds" data-loss pattern. All four functions now capture the close error and return `.F.` if it fired. * drv.Create succeeded → wam.Open failed → orphaned-on-disk DBF. The user-named target file was left around with zero records, and the next call's drv.Create silently truncated it instead of surfacing the original error. Add `os.Remove(cFile)` on the Open-failure cleanup path for COPY/SORT/TOTAL/JOIN. * __dbTotal would write the DBF codec's overflow sentinel (`****`) into the destination's sum-fields when a group total didn't fit in the source's declared field width, and still return `.T.`. Now: precompute each sum-field's max representable magnitude (10^(Len-Dec)) at start, mark the run as overflowed if any flush sees an out-of-range or NaN value, and propagate `.F.` to the caller so they don't trust the file. * cleanUnreferencedMarkers walked byte-by-byte and stripped any `<ident>` token in the result, INCLUDING ones that appear inside `"..."` / `'...'` string literals. A user expression like `LIST FOR url == "<a>x</a>"` got the `<a>` and `</a>` eaten on output. Now: track string-literal state and skip the cleanup pass while inside one. Bracket-strings `[…]` are intentionally not treated as strings here — the result template uses `[...]` as the optional-repeat marker, and disambiguating needs context the cleanup pass doesn't have. * (#8 SET SAFETY honoring) deferred. Harbour default is SAFETY OFF, so the current always-overwrite behavior matches default Harbour. The divergence only matters when user explicitly does `SET SAFETY ON`, which Five doesn't support yet — so the no-overwrite-protection is consistent end-to-end. 
Tracked as a separate followup. Gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 07:54:41 +09:00
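The cleanUnreferencedMarkers fix in the commit above tracks string-literal state so that `<ident>` tokens inside quotes survive. The Go sketch below is a simplified illustration of that idea only: `stripMarkers` and `isIdentByte` are hypothetical names, it handles just the plain `<name>` marker form, and, as in the commit, `[...]` is not treated as a string.

```go
package main

import (
	"fmt"
	"strings"
)

// isIdentByte reports whether c can appear in a marker name.
func isIdentByte(c byte) bool {
	return c == '_' || (c >= '0' && c <= '9') ||
		(c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
}

// stripMarkers removes leftover `<ident>` tokens, but copies everything
// between matching "..." or '...' quotes through untouched.
func stripMarkers(s string) string {
	var out strings.Builder
	var quote byte // 0 while outside a string literal
	for i := 0; i < len(s); i++ {
		c := s[i]
		if quote != 0 {
			out.WriteByte(c)
			if c == quote {
				quote = 0
			}
			continue
		}
		if c == '"' || c == '\'' {
			quote = c
			out.WriteByte(c)
			continue
		}
		if c == '<' {
			j := i + 1
			for j < len(s) && isIdentByte(s[j]) {
				j++
			}
			if j > i+1 && j < len(s) && s[j] == '>' {
				i = j // skip the whole marker, including '>'
				continue
			}
		}
		out.WriteByte(c)
	}
	return out.String()
}

func main() {
	fmt.Printf("%q\n", stripMarkers(`LIST FOR url == "<a>x</a>" <for>`))
}
```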
CharlesKWON	80a18daf8d	feat(pp): UPDATE FROM via std.ch + nested-bracket fix in matchSegment `UPDATE [FROM <alias>] [ON <key>] [RANDOM] REPLACE <f1> WITH <x1> [, <fN> WITH <xN>]` becomes a preprocessor rewrite to a new RTL primitive __dbUpdate. For each detail record, find the master record with matching key (forward-walk if both sorted, full scan when RANDOM) and apply the REPLACE clauses in master's context. Same shape as harbour-core/src/rdd/dbupdat.prg. The REPLACE clauses expand to comma-separated assignments inside one block — `{|| _FIELD->total := del->amt, _FIELD->status := "OK" }` — using the multi-pair `[, <fN> WITH <xN>]` optional-repeat that std.ch already establishes for SUM and DEFAULT. Five-specific tweak: ON <key> wraps as `{|| _FIELD-><key> }` rather than Harbour's bare `<{key}>`. Five doesn't auto-resolve a bare identifier in a code block to the current workarea's field, and the UPDATE block must evaluate against both detail and master so an explicit alias prefix won't do — _FIELD-> dispatches to whichever area is selected at eval time, which is what's needed. Wiring up UPDATE surfaced one further matchSegment gap that fell out of the multi-pair `[REPLACE ... [, ...]]` shape: * matchSegment didn't handle nested `[...]` inside its body. `[REPLACE <f1> WITH <x1> [, <fN> WITH <xN>]]` gave the inner `[` as a literal token to match against the line, so even the single-pair `REPLACE total WITH del->amt` form failed and f1/x1 came back empty. Now matchSegment runs the same repeat-loop on inner `[...]` blocks that the top-level matcher uses, with its own outer-tail computed from the segment tail past the inner `]`. Parser cleanup: UPDATE removed from the IDENT-statement no-op switch. Gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 17:49:33 +09:00
CharlesKWON	ebe12e1108	feat(pp): JOIN WITH ... TO via std.ch + __dbJoin RTL `JOIN WITH <alias> TO <file> [FIELDS <list>] [FOR <expr>]` becomes a preprocessor rewrite to a new RTL primitive __dbJoin. Cartesian product of the current ("master") workarea and the named "detail" alias, filtered by the FOR expression. Output structure: * No FIELDS clause: master's fields followed by detail's, dropping any detail-side name that clashes with master. * FIELDS list: one column per name in declaration order, resolved against master first then detail. Same shape as harbour-core/src/rdd/dbjoin.prg. Five-specific simplifications: alias->name in FIELDS not yet supported (bare names with master-precedence lookup); RDD/codepage args dropped since Five only has DBFNTX. Note for callers: don't name a workarea `M` or `MEMVAR` — both are Harbour-reserved memvar aliases, so `M->field` and `MEMVAR->field` always go through the memory-variable namespace, not the workarea. This is gengo behavior matching Harbour, not new in this commit. Parser cleanup: JOIN removed from the IDENT-statement no-op switch. Gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 16:42:06 +09:00
CharlesKWON	699ea90156	feat(pp): TOTAL TO via std.ch + __dbTotal RTL `TOTAL TO <file> ON <key> [FIELDS <list>] [FOR ...] [WHILE ...] [NEXT ...] [RECORD ...] [REST] [ALL]` joins the family of std.ch DML rewrites. New RTL primitive __dbTotal: * Walk the source under dbEval-style FOR/WHILE/NEXT/RECORD/REST bounds. The source must already be sorted/indexed on the key — same precondition as Harbour's dbtotal.prg. * Track the current group key. On each key change, flush the accumulated row to the destination (writing the running totals back into the most recently appended record's sum-fields, preserving each field's declared length/decimals). * On the first record of every group, append a fresh dst row and copy all non-memo source fields into it; subsequent records in the group only contribute to the sums. Net effect: non-summed fields take the first record's value, summed fields hold the group total. Same shape as harbour-core/src/rdd/dbtotal.prg. * Memo fields are dropped from the destination structure (Harbour does the same). Parser cleanup: TOTAL removed from the IDENT-statement no-op switch. Gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 15:24:41 +09:00
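The group-and-flush walk that the commit above describes for __dbTotal is a classic control-break loop. The Go sketch below illustrates it over an in-memory slice rather than a workarea; the `rec` type, its field names, and `totalOn` are assumptions for the example, and the input is assumed to be sorted on the key already, as the commit requires.

```go
package main

import "fmt"

// rec is a hypothetical record: a group key, one non-summed field and
// one summed field.
type rec struct {
	Key  string
	Note string
	Amt  float64
}

// totalOn walks src (assumed sorted on Key). The first record of each
// group is appended whole; later records of the same group only add
// to the running total of the summed field.
func totalOn(src []rec) []rec {
	var dst []rec
	for _, r := range src {
		if n := len(dst); n > 0 && dst[n-1].Key == r.Key {
			dst[n-1].Amt += r.Amt
			continue
		}
		dst = append(dst, r) // key changed: start a new group
	}
	return dst
}

func main() {
	src := []rec{
		{"A", "first", 1},
		{"A", "second", 2},
		{"B", "only", 5},
	}
	for _, r := range totalOn(src) {
		fmt.Println(r.Key, r.Note, r.Amt)
	}
}
```

The sketch makes the stated net effect visible: the non-summed `Note` keeps the first record's value while `Amt` holds the group total.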
CharlesKWON	1cc2d94927	feat(pp): LIST / DISPLAY via std.ch + four PP completeness fixes `LIST [<fields>] [OFF] [FOR ...] [WHILE ...] [NEXT ...] [RECORD ...] [REST] [ALL]` and `DISPLAY [<fields>] [OFF] [FOR ...] ... [ALL]` reach the parser as plain function calls to a new RTL primitive __dbList (rtlDbList in hbrtl/database.go). Implementation: walk the workarea under dbEval-style FOR/WHILE/NEXT/ RECORD/REST bounds. For each visible record, evaluate each column block and emit the rendered values via valueToDisplay (the same formatter QOut already uses). Empty fields list defaults to "all fields". OFF suppresses the record-number prefix. LIST always emits the full filtered range; DISPLAY without ALL emits only the current record (encoded as nCount=1). TO PRINTER / TO FILE clauses are not yet wired through — for now everything goes to stdout. Wiring up LIST/DISPLAY surfaced four further gaps in PP that were silently masking bugs in any rule with multiple word-list / list / optional clauses chained together: * matchSegment refused MarkerWordList inside `[...]`. The LIST rule's `[<off:OFF>]` clause therefore never set the off capture, and `<.off.>` substituted to nothing instead of .T./.F. matchSegment now matches WordList markers the same way the top-level matcher does. * `<v,...>` and `<(f)>` capture stop boundaries didn't include the values of following MarkerWordList markers. For `[<v,...>] [<off:OFF>] [<all:ALL>]` against `LIST id, name OFF`, the v list would happily eat OFF. New addStopFrom helper contributes both literal keywords and word-list values; both matchSegment's MarkerList branch and captureExpression now use it. * Optional-repeat loop in matchPattern merged a no-progress iteration's empty capture into the running multi-capture string (with the `\x01` separator) before the no-progress break check fired. So a successful first iteration's value got contaminated and the substitution loop then skipped it as multi-capture garbage. 
The merge now happens after the progress check. * Unreferenced `<.name.>` markers (optional clauses that didn't match in the input) were getting cleaned up to empty by the generic marker scrubber instead of the .F. sentinel Harbour's std.ch expects. New replaceUnreferencedLogify pass mirrors the existing replaceUnreferencedBlockify and runs just before the cleanup. Parser cleanup: LIST and DISPLAY removed from the IDENT-statement no-op switch in both parseIdentStmt and parseExprStmt. Gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 15:19:36 +09:00
CharlesKWON	989138d12e	feat(pp): SORT TO via std.ch + __dbSort RTL `SORT TO <file> [ON <key-list>] [FOR ...] [WHILE ...] [NEXT ...] [RECORD ...] [REST] [ALL]` joins COPY in being a real preprocessor rewrite to a function call. New RTL primitive __dbSort: * Buffer visible source records (FOR/WHILE/NEXT/RECORD/REST same as __dbCopy). * Multi-key stable insertion sort. Each key may carry `/D` for descending; ascending otherwise. /A and unknown suffixes fall through as ascending. Comparison delegates to the existing compareValues helper in sqlscan.go (numeric / string / NIL-aware). * Create destination DBF with the source's struct, append rows in sorted order, restore source selection. Parser cleanup: SORT removed from the IDENT-statement no-op switch. Gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 15:04:18 +09:00
CharlesKWON	e961660f61	feat(pp): COPY TO via std.ch + four PP completeness fixes `COPY TO <file> [FIELDS <list>] [FOR ...] [WHILE ...] [NEXT ...] [RECORD ...] [REST] [ALL]` reaches the parser as a plain function call to a new RTL primitive __dbCopy (rtlDbCopy in hbrtl/database.go). Implementation: project the field list (case-insensitive name match against the source's structure, full copy when omitted), dbCreate the target file with that struct, open it under a temp alias, walk the source under dbEval-style FOR/WHILE/NEXT/RECORD/REST bounds, and GetValue/Append/PutValue per record into the target. SDF / DELIMITED variants stay parser no-ops until those backends arrive. Wiring up COPY surfaced four longstanding gaps in the PP that had to be fixed for the rule to even reach the runtime: * `<(name)>` pattern marker was treated as a regular `<name>` with the parens baked into the captured key, so the matching result substitution `<(name)>` couldn't find it. parseOneMarker now strips the parens at parse time so capture key and result marker share the bare name. The smart-stringify result behavior is unchanged. * matchSegment (the optional-clause matcher) bailed on every non-Regular marker. `[FIELDS <fields,...>]` therefore failed to match at all and the fields list arrived empty in the result template. matchSegment now handles MarkerList with paren-balanced capture and segment+outer literal stop boundaries. * captureExpression only used the first literal in the pattern tail as a stop boundary. With std.ch's chain of optional clauses (`[TO <(f)>] [FIELDS ...] [FOR ...] [WHILE ...] ...`) the file-name marker was happy to gobble a trailing FOR clause when FIELDS was absent. It now stops at any of the remaining pattern literals. * `<(name)>` smart-stringify on a list-typed capture wrapped the whole comma-joined string in one set of quotes — `{ "a , b" }` — instead of `{ "a", "b" }`. 
New helper quoteListElements splits on top-level commas (paren / bracket / brace / string-balanced) and quotes each element. applyResult now consults the rule's marker table to know which captures came from `<name,...>`. Parser cleanup: COPY removed from the IDENT-statement no-op switch in both parseIdentStmt and parseExprStmt. Gates green: go test ./... : PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 15:00:18 +09:00
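The quoteListElements helper from the commit above splits a captured list on top-level commas and quotes each element. A Go sketch of that splitting rule follows; it reuses the helper's name for readability, but the body is an illustration written for this note (double quotes only, whitespace trimmed), not the project's code.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteListElements splits list on commas that sit outside any
// parentheses, brackets, braces or string literals, then wraps each
// trimmed element in double quotes.
func quoteListElements(list string) string {
	var parts []string
	depth, start := 0, 0
	var quote byte // 0 while outside a string literal
	for i := 0; i < len(list); i++ {
		c := list[i]
		switch {
		case quote != 0:
			if c == quote {
				quote = 0
			}
		case c == '"' || c == '\'':
			quote = c
		case c == '(' || c == '[' || c == '{':
			depth++
		case c == ')' || c == ']' || c == '}':
			depth--
		case c == ',' && depth == 0:
			parts = append(parts, list[start:i])
			start = i + 1
		}
	}
	parts = append(parts, list[start:])
	for i, p := range parts {
		parts[i] = `"` + strings.TrimSpace(p) + `"`
	}
	return strings.Join(parts, ", ")
}

func main() {
	fmt.Println("{ " + quoteListElements("a , b") + " }")
	fmt.Println("{ " + quoteListElements("f(x, y), c") + " }")
}
```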
CharlesKWON	c2e7f7ea27	feat(pp): Phase B — COUNT / SUM / AVERAGE via std.ch Three xBase analytical commands that were silent no-ops in the parser now execute as Harbour-style PP rewrites: COUNT [TO <v>] [FOR <for>] [WHILE <while>] ... -> dbEval() SUM <x> TO <v> [FOR <for>] [WHILE <while>] ... -> dbEval() AVERAGE <x> TO <v> [FOR ...] -> __dbAverage() COUNT and SUM expand to a `<v> := 0 ; dbEval( {|| ... } )` pair matching harbour-core/include/std.ch verbatim. AVERAGE delegates to a new RTL function rtlDbAverage (sum + count + divide; returns 0 on empty match) — the chained-private-variable trick Harbour uses to keep AVERAGE inline doesn't translate cleanly through Five's PP. Wiring up these rules surfaced four PP issues that had to be fixed for the rewrite to even reach the parser: * Result template did not implement <{name}> blockify. So a rule body like `{|| x := x + <x> }, <{for}>` left the literal text `<{for}>` in the output. Added blockify substitution: captured -> `{|| <captured> }`, missing -> NIL. * findMarkerEnd did not recognise `{`/`}` so unreferenced blockify markers were not cleaned up either. Added `{`/`}` to its prefix/suffix sets. * Optional-clause matching had no view of the outer pattern, so a regular marker at the end of `[TO <v>]` would swallow the rest of the line — `COUNT TO n FOR x>5` captured `<v>` as "n FOR x>5". matchSegment now takes outerTail and stops at its first literal. * `#command` directives could not span multiple physical lines. A trailing `;` is harbour-core's line-continuation marker for std.ch and now joins the next line into the directive before parsing. Parser cleanup: COUNT, SUM, AVERAGE removed from the IDENT-statement no-op switch in parseIdentStmt + parseExprStmt. The remaining xBase verbs (COPY, SORT, TOTAL, JOIN, LIST, DISPLAY, LABEL, REPORT, ...) stay in the parser until their RTL backends arrive. Gates green: go test ./... 
: PASS FiveSql2 SQL:1999 : 43/43 Harbour compat : 56/56 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 14:11:20 +09:00
CharlesKWON	f4ed42556b	checkpoint: season-wide bug fix campaign + infra Cumulative season's silent-bug hunting (~62 fixes) across the FiveSql2 SQL engine, the Five compiler/runtime, and the hbrdd RDD layer. Saved as a single checkpoint before refactoring the parser to delegate xBase command translation to the preprocessor. Highlights: FiveSql2 engine (_FiveSql2/src/) - prefix-glob index attach -> explicit convention (<table>_pk.ntx, <table>_uq.ntx, <table>.cdx) — fixes silent multi-row INSERT row-drop - DROP/CREATE TABLE FErase chain extended (.cdx, .fsc, .fsv, .dbt, .fpt) - COUNT(DISTINCT col) parsed + aggregated via hSeen hash - UNION column-count mismatch returns SQL_ERR_GRAMMAR (was silent) - DISTINCT + ORDER BY hidden-col leak fixed (trim before DISTINCT) - Derived table FROM (SELECT...) + JOIN right-side derived - Self-FK CASCADE depth 2+ via SqlGetSingleColPK pre-collect - LAG/LEAD default arg uses SqlEvalRowExpr (handles -N const exprs) - DATE literal round-trip validation (Feb 29 non-leap rejected) - CREATE OR REPLACE VIEW; CREATE VIEW errors on already-exists - AlterTable type dispatcher comma-wrapped (1-char type "A" no longer matches CHARACTER) Compiler / runtime - gengo: HB_ -> FV_ prefix on emitted Go function names (Five identity) - gengo split: emit_block.go, emit_stmt.go, folding.go extracted - parser/stmtreg.go nudges - hbrt: debug TUI/CLI restructure (debugcmd, debugkey, termios_*), windows debug stubs collapsed - thread/vm/value/class/pcinterp tightening from panic traces RDD layer (hbrdd/) - dbf: null bitmap support (null.go + null_test.go), mmap split (mmap_posix.go / mmap_windows.go), byte-level numeric parse - ntx/cdx: windows mmap parity - workarea + mem RDD: cross-area state-bleed fixes RTL (hbrtl/) - errorlog rewrite with platform-specific FD (errorlog_fd_unix / errorlog_fd_other) - sqlscan, sqlhelpers, indexrtl, datetime extensions Gates green at checkpoint: - go test ./... 
: PASS - FiveSql2 SQL:1999 : 43/43 - Harbour compat : 56/56 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 09:26:25 +09:00
CharlesKWON	d6c26104c9	feat(rtl): common.ch aliases — ISNIL/ISARRAY/ISNUMBER and friends Harbour's common.ch exposes classic Clipper type-check shorthands via #translate rules that map to HB_IS* RTL functions: #translate ISNIL(<x>) => ((<x>) == NIL) #translate ISARRAY(<x>) => HB_ISARRAY(<x>) #translate ISCHARACTER(<x>) => HB_ISSTRING(<x>) ... etc. Five's preprocessor currently supports #translate only for lines whose FIRST word is the rule keyword, not for substring matches inside expressions. Real usage like `IF ISNIL(x)` fails the keyword check (first word is IF, not ISNIL) and the rule never fires. Rather than rewrite the PP substring engine (A2 scope), register the nine short names as direct RTL symbols in register.go, each pointing at the same Go function as its HB_IS* twin. ISMEMO maps to HB_ISSTRING as a reasonable approximation for Five (no distinct memo type at the VM level). common.ch becomes a short stub that just #defines TRUE/FALSE/YES/NO and documents where the ISxxx aliases live. DEFAULT / UPDATE #xcommand forms remain unsupported pending A2. Verified with /tmp/test_common.prg — ISNUMBER(42), ISCHARACTER("x"), ISNIL(nilVar) all dispatch correctly. Analyzer still emits "undeclared variable" warnings for the short names (the static checker doesn't see runtime-registered RTL symbols) but the generated code links and runs. FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 17:01:50 +09:00
CharlesKWON	2a662525b3	feat(rtl): DO(xTarget, [args...]) — dynamic dispatch Harbour's DO() accepts a string (looked up as a function name), a code block (evaluated with args), or a symbol, and invokes it. Used for plugin systems and dynamic dispatch idioms like `DO(cHandler, oRequest)`. Five already had stmtDo rewrite `DO(...)` at statement-level to a function-call expression, so callers in expression position just work — but gengo refused to emit DO as a function call because it was on the reserved-word guard list (which existed to catch stray ENDIF/ENDDO from bad IF nesting). Remove DO from that list; the statement form is still handled upstream by parseDoProc, so the guard loses nothing. rtlDo implements the dispatch: - String target → VM.FindSymbol + t.Function - Block target → EvalBlock path (same as Eval) - Anything else → NIL Tested (/tmp/test_do.prg): DO("Greet", "World") → "hello, World" DO({|x,y| x*y+1}, 5, 6) → 31 DO(NIL) → NIL (ValType "U") FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 16:33:09 +09:00
CharlesKWON	e089c81bcd	feat(macro): &var / &(expr) runtime compilation Harbour's macro operator was a stub: hbrt.MacroCompile only resolved bare identifier names to memvars/functions and returned the source string unchanged for any non-trivial expression. The gengo emit was also broken — `t.MacroPush() + t.PushNil()` never pushed the inner expression's value, so MacroPush popped whatever happened to be on the stack. Wire it up properly: 1. Gengo fix: `case ast.MacroExpr` now emits `emitExpr(e.Expr); t.MacroPush()`. The inner expression produces the source string; MacroPush consumes it and pushes the evaluated result. 2. Hook pattern in hbrt: `SetMacroEvalHook(fn)` lets hbrtl install the real evaluator without creating an import cycle (genpc already imports hbrt). MacroPush delegates to the hook when installed; otherwise falls back to the legacy stub for hbrt unit tests. 3. hbrtl.init registers macroEval, which reuses compileExprSource (factored out of PcCompile) so macro lookups share the same sync.Map-backed pcode cache — repeat evaluations of the same macro source are free after the first hit. 4. ExecPcode leaves the result in retVal; macroEval copies it to the operand stack via PushRetValue. Tested (/tmp/test_macro.prg): &"10 + 20" → 30 &"Sqrt(16)" → 4 &"Upper('hello')" → HELLO &("30 " + Str(nX, 1)) → 210 (runtime-built source) &"5 > 3 .AND. .T." → .T. &("Str(" + Str(nX*10,2) + ",2)") → 70 FiveSql2 43/43, Harbour compat 56/56, Go test ALL PASS. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 16:02:16 +09:00
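The hook pattern from step 2 of the macro commit can be sketched in a single file. The name `SetMacroEvalHook` comes from the commit message; the string-in/string-out signature, the `macroPush` helper, and the toy evaluator are assumptions made for illustration. In the real layout the hook slot would live in the lower package and the installer in the higher one.

```go
package main

import "fmt"

// macroEvalHook stands in for the hook slot owned by the lower layer.
// The signature is an assumption; the real hook works on VM values.
var macroEvalHook func(src string) string

// SetMacroEvalHook lets a higher layer install the real evaluator
// without the lower layer importing it.
func SetMacroEvalHook(fn func(src string) string) { macroEvalHook = fn }

// macroPush (hypothetical helper) delegates to the hook when one is
// installed and otherwise falls back to a stub that returns the source.
func macroPush(src string) string {
	if macroEvalHook != nil {
		return macroEvalHook(src)
	}
	return src
}

func main() {
	fmt.Println(macroPush("10 + 20")) // no hook installed: stub path

	// A toy evaluator standing in for the one the RTL layer would register.
	SetMacroEvalHook(func(src string) string { return "evaluated(" + src + ")" })
	fmt.Println(macroPush("10 + 20"))
}
```

Because the lower layer only holds a function value, it never needs to import the package that provides the evaluator, which is how this kind of hook sidesteps an import cycle.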
CharlesKWON	935883bb88	perf(fivesql2): Go-native FetchRow fast path — 1.3-1.7x on agg/window TSqlExecutor:FetchRow was the per-row workhorse for aggregation, HAVING, and window queries. Even with the pre-built aFetchCache binding columns to (nWA, nFPos), the PRG FOR loop paid one method dispatch per column per row (dbSelectArea, FieldGet, AllTrim, AAdd) — profile pinned it at ~30% of B4 CPU. SqlFetchRowFast collapses the cache-path loop into a single Go call: - bound entry: SelectByNum + area.GetValue directly - unbound (aggregate/expression): self:EvalExpr via Send - character values: TrimSpace inline The PRG FetchRow keeps its original cache-miss fallback path unchanged for rare queries where aFetchCache isn't built. Bench deltas (median of 3 steady runs, 1000 iters): B4_GROUP_HAVING 418 → 327 us -22% (1.28x) B9_ROW_NUMBER 191 → 120 us -37% (1.59x) B10_RANK_PART 228 → 135 us -41% (1.69x) B11_SUM_OVER 249 → 156 us -37% (1.60x) B14_COUNT 235 → 219 us -7% B15_CTE_WIN_JOIN 1577 → 1452 us -8% Single-table SELECT (B1-B3, B5-B7, B8) stays flat — those already hit the column-binding fast path and don't need aggregate dispatch. FiveSql2 43/43, Harbour compat 56/56. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 13:50:02 +09:00
CharlesKWON	c84cde6175	perf(fivesql2): Go-native SqlIsAggName — drop per-row substring scan B4 GROUP+HAVING profile showed SqlIsAggName at ~9% of CPU — SqlEvalFunc checks it for every function in every row, and the PRG body was two string allocations + a substring scan: RETURN ("," + c + ",") $ ("," + AGG_FUNCTIONS + ",") Replace with a hash lookup against the existing aggFuncSet map in hbrtl/sqlexpr.go (already populated for SqlExprHasAgg, same AGG_FUNCTIONS list). Upper-casing skips the allocation when the input is already upper, which it almost always is in practice. Bench deltas (median of 3 steady runs, 1000 iters): B4_GROUP_HAVING 447 → 418 us -6.5% B14_COUNT 252 → 235 us -7% B15_CTE_WIN_JOIN 1595 → 1577 us -1% Other benches unchanged (no aggregate calls per row). FiveSql2 43/43, Harbour compat 56/56. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 13:40:19 +09:00
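The change from a substring scan to a set lookup can be sketched as below. The map name `aggFuncSet` appears in the commit message; its contents here and the function name `isAggName` are assumptions, since the actual AGG_FUNCTIONS list is not shown.

```go
package main

import (
	"fmt"
	"strings"
)

// aggFuncSet holds an assumed list of aggregate names; the real
// AGG_FUNCTIONS list is not given in the commit message.
var aggFuncSet = map[string]struct{}{
	"COUNT": {}, "SUM": {}, "AVG": {}, "MIN": {}, "MAX": {},
}

// isAggName (hypothetical name) reports whether name is an aggregate
// function. For ASCII input with no lower-case letters, strings.ToUpper
// returns its argument unchanged, so the common case does not allocate.
func isAggName(name string) bool {
	_, ok := aggFuncSet[strings.ToUpper(name)]
	return ok
}

func main() {
	fmt.Println(isAggName("SUM"), isAggName("count"), isAggName("UPPER"))
}
```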
CharlesKWON	dd270d5d9d	perf: RTL Go-native migration — 27 optimizations, DML up to 70-90x Systematic pass through PRG hot paths, promoting them to Go RTL while preserving Harbour/FiveSql2 semantics. Full log in docs/RTL-Go-Native-Migration.md. Bench (bench_sql) vs 2026-04-08 baseline - B1 SELECT * 2,192 → 114 µs (19x) - B6 INNER JOIN 9,291 → 233 µs (40x) - B7 CTE simple 8,037 → 129 µs (62x) - B9 ROW_NUMBER 3,705 → 265 µs (14x) - B10 RANK PARTITION 4,748 → 309 µs (15x) - B12 INSERT (WA cache) 4,319 → 63 µs (69x) - B13 UPDATE (WA cache) 6,144 → 68 µs (90x) - B15 CTE+WIN+JOIN 18,395 → 1,873 µs (10x) Infrastructure - HbHash O(1) Index preserving insertion order (Harbour KEEPORDER) - HbDeepClone Go RTL (scalar-sharing, immutable hash keys) - MEMRDD auto-imported via gengo; all Five programs get mem:name driver - SQL plan + pcode caches (s_hPlanCache, s_hDmlPcodeCache) - Opt-in SqlWACacheEnable — dbUseArea/Close/Commit batched for DML SQL engine - FiveSql2 lexer ported to Go (byte FSM) with combined automatic template parameterization (literals → ?, concat queries share plan) - Go RTL: SqlDistinct, SqlGroupRows, SqlWindowPartitions, SqlWindowSortPartition, SqlWindowAssignRank, SqlComputeAggSimple, SqlBulkInsert, SqlBulkUpdate, SqlExprHasAgg, SqlEvalHaving - CTE / subquery / driving-table materialize paths use MEMRDD - SqlCoerce/SqlCmp/SqlIsTrue helpers moved from PRG to Go - SqlBulkUpdate defers Flush when WA cache active (APFS fsync was dominant B13 cost — 1.6ms/call → gone) Correctness fixes uncovered during migration - ASort default path now sorts dates/logicals/timestamps (was no-op) - ORDER BY default NULL placement matches PRG SqlRowCompare across Go fast path; explicit NULLS FIRST/LAST honored by both paths - SqlBulkUpdate respects EXCLUSIVE vs SHARED mode record locks - SqlCmp/SqlCmpEq normalize NumInt vs Double (caught by test 6b) Verification - go test ./... 
ALL PASS - FiveSql2 test_sql1999 43/43 - tests/compat_harbour 56/56 (+5 new: ASort dates/logicals, AScan int cross-type) - Regression test test_null_order.prg for ORDER BY NULL ordering Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 20:20:14 +09:00
CharlesKWON	3caadb23b9	perf: SqlOrderBy + SqlGroupBy Go RTL — native sort and aggregation SqlOrderBy: Go sort.Slice for ORDER BY, 10-50x faster than PRG ASort. SqlGroupBy: Go map-based GROUP BY accumulation (ready for integration). TryBuildSortSpec detects simple ORDER BY columns and routes to Go. Fallback to PRG for complex ORDER BY expressions. 43/43 + 41/41 verify + 51/51 compat + go test ALL PASS. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 14:41:41 +09:00
CharlesKWON	5fc9c3bbea	perf: SqlHashJoin Go RTL — 3-way JOIN 4.2s→61ms (69x) Go-native multi-table hash join bypasses per-row PRG overhead. TryGoJoin detects equi-join + plain-col SELECT, aggregate cols get placeholder. 2-way 73→3ms, 3-way 3.9s→61ms. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 07:16:09 +09:00
CharlesKWON	bfc6ded8cb	perf(FiveSql2): SqlHashBuild + FetchRow column binding — 3-way JOIN 3x Complex-query benchmarking turned up two hot paths that the earlier SqlScan/SqlEach work didn't touch: multi-table JOIN and nested-scan row fetching. This commit hits both. --- Part 1: SqlHashBuild — Go-native hash-join build --- FiveSql2's HashJoin previously built the inner-side hash in PRG: WHILE !Eof() xVal := FieldGet(nFPos) cKey := SqlValToStr(xVal) IF !hb_HHasKey(hHash, cKey) ; hHash[cKey] := {} ; ENDIF AAdd(hHash[cKey], RecNo()) dbSkip() ENDDO That loop runs at ~40μs per row from class dispatch + hb_HHasKey lookups + AAdd growth + SqlValToStr formatting. On a 50k-row inner table that's ~2 seconds wasted on what should be a sub-50ms housekeeping op. New hbrtl.SqlHashBuild does the same thing in one Go-native pass: - Direct *dbf.DBFArea loop (no interface dispatch, same devirt as SqlScan) - Go `map[string][]int64` accumulates RecNos by key — one allocation per distinct key - Inline ASCII-only digit formatter for numeric keys (strconv.Itoa is allocation-heavy for small ints) - CHAR keys are right-trimmed to match SqlCmpEq semantics so the hash probe matches what EvalExpr would compute - Final Five hash is built once from Keys/Values/Order slices directly, skipping the per-key hb_HSet path HashJoin now calls `SqlHashBuild(nFPos)` instead of running the PRG loop. --- Part 2: TSqlExecutor:BuildFetchCache --- The JOIN fallback loop calls FetchRow per row. FetchRow was already column-ref-aware but did the string parse (`At + SubStr + Upper`) and `::FindWA` linear scan every single invocation. For a 50k-row join emitting 50k result rows, that's ~200k redundant resolutions. New BuildFetchCache walks the SELECT list once before the scan and pre-binds each plain-column expression to `{nWA, nFPos}`. FetchRow's new fast path checks ::aFetchCache and jumps straight to `dbSelectArea + FieldGet` when bound. Complex exprs (functions, CASE, subqueries) still fall through to EvalExpr. 
::aFetchCache is set right before the join WHILE loop and cleared after — no cross-query bleed. --- Bench (50k ord × 10k emp × 100 dept, 3-run steady state) --- Query Before After Speedup ──────────────────────────────────────────────────────────── 2-way INNER JOIN, 10k rows 91ms 68ms 1.34x 2-way JOIN + GROUP BY 110ms 94ms 1.17x 3-way INNER JOIN COUNT 2610ms 610ms 4.28x 3-way JOIN + GROUP BY 2860ms 830ms 3.45x The 3-way speedup is almost entirely SqlHashBuild. The 2-way case benefits from the fetch cache because its per-row cost is dominated by FetchRow (no second hash build to amortize). --- Limits still standing --- CTE + JOIN queries (Q7 in bench_complex: ~4.5s) aren't affected by either optimization — CTE materialization goes through a different path that writes/reads a temp DBF. Follow-up target. Validation: - FiveSql2 43/43 - Harbour compat 51/51 - go test ./... ALL PASS Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 18:47:20 +09:00
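The hash-join build side described in Part 1 reduces to grouping record numbers by key. The sketch below assumes the keys are already available as strings and uses the hypothetical name `buildHash`; the real SqlHashBuild reads them from a DBF work area and formats numeric keys itself.

```go
package main

import (
	"fmt"
	"strings"
)

// buildHash (hypothetical name) groups 1-based record numbers by key.
// Keys are right-trimmed so padded CHAR values probe the same bucket.
func buildHash(keys []string) map[string][]int64 {
	h := make(map[string][]int64)
	for i, k := range keys {
		k = strings.TrimRight(k, " ")
		h[k] = append(h[k], int64(i+1))
	}
	return h
}

func main() {
	h := buildHash([]string{"A  ", "B", "A"})
	fmt.Println(h["A"], h["B"])
}
```

The probe side then looks up each outer row's key in the map and visits only the matching record numbers.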
CharlesKWON	d2ed140273	feat(FiveSql2): SqlEach block callback — beats raw RDD on end-to-end timing The structural 1.38x gap vs raw RDD for no-WHERE full scans wasn't a limit of our engine — it was a limit of the result shape. SqlScan materializes N rows as HbArray wrappers over a flat Value buffer, then the PRG caller iterates that materialized array. Two passes over the data. Raw RDD is one pass. SqlEach folds both passes into one. The caller supplies a code block that receives the selected column values as positional parameters; SqlEach invokes it per matching row. No result array is ever built. Usage (drop-in replacement for the common "scan + process" idiom): five_SQLEach( "SELECT id, name, salary FROM emp WHERE salary > 50000", {|nID, cName, nSalary| Process(nID, cName, nSalary) } ) API shape borrows Harbour's AEval/ASort block-callback convention, so there's nothing new to learn. Positional params also sidestep the `SELECT COUNT()` naming problem — no need to invent names for anonymous expressions. Implementation notes: - 4-way loop specialization ({DBF, generic Area} × {WHERE, none}), matching SqlScan. Each path is zero-allocation in the steady state. - Block invocation uses the direct pendingParams + blk.Fn(t) protocol rather than EvalBlock, which would allocate a temporary args slice on every call (50k scans × small slice adds up). - FastFieldGetter is installed the same way as SqlScan so PcOpFieldGet in the WHERE predicate skips the PushSymbol + Function dispatch. 
Bench (50k rows, end-to-end including user-code loop, steady state): Path Time vs raw RDD ───────────────────────────────────────────────────── Raw PRG loop, WHERE + sum 8.7ms 1.00x SqlScan + PRG FOR, WHERE 5.1ms 0.59x SqlEach block, WHERE 4.1ms 0.47x ← beats raw ───────────────────────────────────────────────────── Raw PRG loop, no WHERE 6.1ms 1.00x SqlEach block, no WHERE 3.8ms 0.62x ← beats raw SqlEach is faster than a hand-rolled `DO WHILE !Eof()` loop because the per-row FieldGet in raw PRG still goes through a full Frame + RTL dispatch, whereas SqlEach's FastFieldGetter captures the concrete dbf.DBFArea directly. The SQL abstraction now costs nothing — it pays you to use it. Validation: - FiveSql2 43/43 - Harbour compat 51/51 - go test ./... ALL PASS Next step (not in this commit): FiveSql2 TSqlExecutor integration — detect when five_SQL is called with a block argument and route to SqlEach instead of SqlScan + array build. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 15:16:36 +09:00
CharlesKWON	5dd212c761	perf(sqlscan): specialize four loop variants (DBF×WHERE matrix) SqlScan's inner scan was written as a single loop with `if whereFn != nil` and a `keep` shadow variable. Branch-predictable for sure, but still a few extra ops per row and it prevented Go from inlining the non-nil interface call on the Area branch. Split into four specialized loop bodies on the two axes that drive per-row cost: 1. dbfArea != nil && whereFn != nil 2. dbfArea != nil && whereFn == nil ← tightest path (SELECT *) 3. dbfArea == nil && whereFn != nil ← generic Area 4. dbfArea == nil && whereFn == nil Each body has exactly the instructions it needs — no dead branches, no shadow variables, no interface dispatch where avoidable. Copy-paste cost is real but each row save adds up at 50k iterations. Bench impact (50k rows, 3-run steady state): No WHERE 9.1ms → 8.7ms 1.38x vs raw (was 1.47x) Numeric WHERE 6.9ms → 7.0ms ~flat (within noise) String WHERE 6.2ms → 6.4ms ~flat (within noise) Raw RDD 6.3ms baseline Validation: - FiveSql2 43/43 - Harbour compat 51/51 - go test ./hbrtl/... PASS Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 14:04:48 +09:00
CharlesKWON	f9ffd4050e	perf(FiveSql2): FieldGet peephole + DBFArea devirt — WHERE at ~1.15x raw RDD Two stacked optimizations land on the SqlScan hot path. Combined effect on the 50k-row benchmark: Before After vs raw Numeric WHERE 10.2ms 7.8ms 1.15x String WHERE 10.5ms 7.9ms 1.15x No WHERE 9.2ms 10.0ms 1.45x Raw RDD baseline 6.8ms 6.8ms 1.00x WHERE-predicate paths are now within 15% of the raw Harbour-style RDD scan loop. The no-WHERE path is unchanged (slight jitter from the added devirt branch); FieldGet peephole doesn't apply there. --- Optimization 1: PcOpFieldGet peephole --- Adds a new pcode opcode `PcOpFieldGet <fieldIdx>` (0x46) that skips the usual PushSymbol+Function+Frame+FieldGet-RTL+EndProc chain and calls a direct field getter closure instead. genpc recognizes the shape `FieldGet(<int-literal>)` during emitCall and emits the specialized opcode automatically — no SQL-side API change. Integration: * hbrt.Thread.FastFieldGetter — hot-path closure set by scan loops. Non-nil → pcode bypasses dispatch. Nil → pcode resolves FIELDGET via the RTL symbol table (correctness fallback for any other callers). * compiler/genpc/genpc.go — peephole in emitCall. * hbrt/pcinterp.go — PcOpFieldGet handler. This alone cut numeric WHERE from 10.2 → 7.9ms: eliminated roughly one full Frame/EndProc + RTL dispatch per row × 50k rows. --- Optimization 2: DBFArea devirtualization --- SqlScan type-asserts the workarea to dbf.DBFArea once and runs a dedicated loop that calls GoTop/EOF/Skip/GetValue directly on the concrete type. Go's compiler inlines these, skipping the interface vtable per row. Non-DBF drivers still work via the generic Area branch. The FastFieldGetter closure also captures DBFArea directly in the DBF branch, so the WHERE predicate side of the hot loop is now entirely devirtualized: no interface dispatch between the pcode dispatch loop and the DBF record buffer. Validation: - FiveSql2 43/43 - Harbour compat 51/51 - go test ./... 
ALL PASS Remaining gap to raw RDD on no-WHERE (~1.45x) is dominated by the two-column row construction + ArraySlab + flat backing bookkeeping that the raw loop doesn't do. Going below that requires changing the SQL engine's result shape — out of scope here. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 12:23:31 +09:00
CharlesKWON	5c067f35a4	perf(hbrt): ExecPcodeFast — pcode variant without defer/recover Pcode expressions compiled from SQL WHERE clauses (via genpc.CompileExpr) never contain BEGIN SEQUENCE and can't raise BreakValue, so the defer + recover dance in ExecPcode's EndProc is pure overhead. For FiveSql2's per-row WHERE evaluation on a 50k-row scan, that's 50k × ~15ns = ~750µs of pointless recover bookkeeping. Split ExecPcode into two variants sharing execPcodeBody: ExecPcode — full: Frame + defer EndProc. General-purpose, handles panics. Behavior unchanged. ExecPcodeFast — hot: Frame + execPcodeBody + EndProcFast. No defer, no recover. Caller guarantees the pcode body can't panic with HbError / BreakValue. SqlScan now uses ExecPcodeFast for per-row WHERE evaluation. Measured impact on 50k-row no-WHERE benchmark: 10.6ms → 9.2ms steady state (~13% faster). Effect is smaller on numeric-WHERE because per-row cost there is dominated by the opcode dispatch itself, not the frame exit. Validation: - FiveSql2 43/43 - go test ./hbrt/... PASS (pcode tests) - go test ./hbrtl/... PASS Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 12:07:54 +09:00
CharlesKWON	85541a3035	perf(sqlscan): flat backing buffer — 30% faster no-WHERE scan The prior loop allocated one small `[]hbrt.Value` per matching row (for the row body) plus one HbArray header. For a 50k-row full scan that's 100k allocations of which the small-slice allocs dominated fragmentation and GC pressure. SQLite-inspired fix: pre-allocate a single flat []hbrt.Value of capacity `RecCount * nFields` at scan start and hand each row a three-index sub-slice (flat[off:end:end]). The capped sub-slice still forces a reallocation if PRG code later does `AAdd(row, x)`, so neighbor rows can't get clobbered. Sizing the initial buffer off RecCount(err-ignored) was the actual win — the previous naive grow-from-1024 policy caused five mid-scan reallocations of a ~200 KB buffer, each memcpy'ing everything so far. One upfront allocation amortizes much better. Bench (50k rows, ~/tmp ext4, 3 runs steady-state): Before After Δ no WHERE 14.6ms 10.6ms −27% numeric WHERE 11.7ms 10.0ms −15% string WHERE 10.5ms 11.0ms ~= raw RDD baseline 6.8ms 7.0ms Gap to raw RDD: 2.1x → 1.4x on the dominant no-WHERE case. What's left is pcode WHERE dispatch (ExecPcode frame per row), the Area interface boundary, and the HbArray header allocation per row — all structural costs that would need a wider refactor to close. Validation: - FiveSql2 43/43 - go test ./hbrtl/... PASS Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 10:57:05 +09:00
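The three-index sub-slice trick from the flat-buffer commit can be shown with plain integers. This is a sketch under stated assumptions: `rowsFromFlat` is a hypothetical name, and `int` stands in for the runtime's value type.

```go
package main

import "fmt"

// rowsFromFlat (hypothetical name) carves nRows rows of nFields values
// out of one flat buffer. Each row is a three-index slice, so its
// capacity equals its length.
func rowsFromFlat(flat []int, nRows, nFields int) [][]int {
	rows := make([][]int, 0, nRows)
	for r := 0; r < nRows; r++ {
		off := r * nFields
		end := off + nFields
		rows = append(rows, flat[off:end:end])
	}
	return rows
}

func main() {
	flat := []int{1, 2, 3, 4, 5, 6} // two rows of three fields
	rows := rowsFromFlat(flat, 2, 3)

	// Appending to row 0 exceeds its capacity, so Go copies the row to a
	// new backing array instead of overwriting the first field of row 1.
	grown := append(rows[0], 99)

	fmt.Println(grown, rows[1], cap(rows[0])) // prints "[1 2 3 99] [4 5 6] 3"
}
```

With a two-index slice (`flat[off:end]`) the same append would have written 99 over the first field of the neighbouring row, which is the clobbering the commit message says the capped sub-slice prevents.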
CharlesKWON	d74014a235	feat(rdd): dbInfo / dbOrderInfo — implement the stubs Replaces the `return NIL` stubs with real implementations that read from the current workarea. Covers the info codes actually used by downstream code (FiveSql2 TSqlIndex, standalone callers): DBINFO: DBI_ISDBF, DBI_CANPUTREC, DBI_FULLPATH, DBI_TABLEEXT, DBI_MEMOEXT, DBI_SHARED, DBI_ISREADONLY, DBI_GETRECSIZE, DBI_DBVERSION, DBI_RDDVERSION, DBI_BOF, DBI_EOF, DBI_FOUND, DBI_FCOUNT, DBI_ALIAS, DBI_POSITIONED DBORDERINFO: DBOI_EXPRESSION, DBOI_NAME, DBOI_NUMBER, DBOI_POSITION, DBOI_ORDERCOUNT, DBOI_KEYCOUNT, DBOI_KEYCOUNTRAW Unknown info codes still return NIL (Harbour's forgiving fallback). New accessors on DBFArea (FullPath, IsShared, IsReadOnly) expose the private filePath/shared/readOnly fields to the hbrtl layer without plumbing them through the generic Area interface. Unblocks TSqlIndex:FindExclusive's original DBI_FULLPATH/DBI_SHARED scan — though the short-circuit there stays in place for now since it's a correctness workaround that no longer masks a crash thanks to the recent gengo PushMemvar fallback. Validation: - FiveSql2 43/43 (0 warnings) - Harbour compat 51/51 - go test ./... ALL PASS Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 10:42:18 +09:00
CharlesKWON	3a00aa5435	feat(hbrtl): field metadata + index creation RTL — TSqlIndex warnings to zero TSqlIndex.prg had five undefined identifiers and six undefined constants that the new CLASS-method analyzer surfaced after the gengo PushMemvar fallback stopped crashing on them. All real tech debt, not false positives. This lands the implementations. New RTL functions (hbrtl/indexrtl.go + register.go): - FieldType(n) → "C"/"N"/"L"/"D"/"M"/... one-letter type - FieldLen(n) → length in bytes - FieldDec(n) → decimal places - ordCreate(cBag, cTag, cExpr [, bExpr] [, lUnique]) → DBFArea.OrderCreate with TagName set (CDX tag or NTX tag) - dbCreateIndex(cFile, cExpr [, bExpr] [, lUnique]) → legacy Clipper single-tag NTX without TagName - dbClearIndex() → OrderListClear All pass through the existing Indexer interface; key expressions go through the MacroEval slow path since callers pass string literals. When callers are updated to pass compiled key blocks, the existing KeyFunc fast path kicks in automatically. New header files (include/): - dbinfo.ch — DBI_* and DBOI_* constants with Harbour-compatible values (FULLPATH=10, SHARED=42, EXPRESSION=2, etc.) - dbstruct.ch — DBS_NAME/TYPE/LEN/DEC field descriptor indices TSqlIndex.prg already did `#include "dbinfo.ch"` and `#include "dbstruct.ch"` but Five's preprocessor silently ignored the missing files. Both headers land in include/ where cmd/five's include-dir chain already looks. Analyzer RTL allow-list updated with the six new function names so the warning pipeline stays clean. Result: FiveSql2 build goes from 17 WARN → 0. Both tracked test suites still pass. Note: dbInfo() / dbOrderInfo() themselves remain stubbed (return NIL) — the constants exist for compile-time resolution and for future use when the stubs are replaced. Callers that depend on actual dbInfo values still get NIL at runtime. Validation: - FiveSql2 43/43 - Harbour compat 51/51 - go test ./... 
ALL PASS Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 10:11:57 +09:00
CharlesKWON	8aaed994f4	perf(FiveSql2): hybrid fast path — 11x speedup on string WHERE scans Implements hybrid execution model: keep AST tree-walk for SQL:2013+ features (Window, Recursive CTE, JOIN, aggregates) while compiling simple SELECT hot paths to Go + pcode. See docs/FiveSql2-Hybrid-Plan.md for the full architecture rationale (why not SQLite-style VDBE). Hot path (single table, no joins/groups/aggregates): - TryBuildFieldPositions: resolves SELECT column list to FieldPos array once per query (bails to PRG loop on any complex expr). - TryCompileWhere + SqlExprToPrg: walks WHERE AST, emits equivalent PRG source, runs it through PcCompile to get a PcodeFunc. - SqlScan RTL: Go-native scan loop — GoTop/EOF/Skip/GetValue direct, ExecPcode per row for WHERE, result array pre-alloc. WHERE compiler scope: - ND_LIT numeric/logical/string (string literals AllTrim'd to match SqlCmpEq CHAR-padding semantics; rejects embedded quotes/newlines) - ND_COL: CHAR fields auto-wrapped with AllTrim(FieldGet(n)) based on dbStruct() lookup cached once per query in aCompileStruct - ND_BIN: = <> != < <= > >= AND OR + - * / - ND_UNI: NOT - - Anything else (ND_FN, ND_CASE, ND_SUB, ND_PAR, LIKE, IN, IS NULL, BETWEEN, dates) returns NIL → falls back to PRG tree-walk. Bench (50k rows, ~/tmp ext4): Before After Speedup Numeric WHERE ~150ms 11.7ms ~13x String WHERE 119.3ms 10.5ms 11.4x No WHERE - 14.6ms - Raw RDD baseline 6.8ms 6.8ms 1.0x Remaining gap to raw RDD (~1.5x) is structural: Value boxing, result array construction, per-row ExecPcode frame overhead. Would need a Value-pool or SoA refactor to close further. Side fixes bundled: - TSqlIndex:FindExclusive short-circuited. Originally called dbInfo(DBI_FULLPATH)/DBI_SHARED which are unresolved symbols in Five (dbInfo is a stub, DBI_* never defined). Panic'd with "local variable index out of range: 0" whenever a standalone PRG had a workarea Used before calling five_SQL. 
43-test masked the bug because it only reached FindExclusive with no open workareas. Restore the scan once dbInfo lands in hbrtl. - cmd/five/main.go: FIVE_KEEP_BUILD=1 env var keeps the temp Go project around for debugging gengo output. Validation: - FiveSql2 43/43 - Harbour compat 51/51 - go test ./... ALL PASS Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 09:15:08 +09:00
CharlesKWON	6b26f1b642	feat: genpc.CompileExpr + PcCompile/PcEval runtime bytecode API Expose Five's existing FRB bytecode compiler for single-expression compilation, enabling prepared-statement-style caching in dynamic query engines (FiveSql2, scripting layers, rule engines). 1. genpc.CompileExpr(ast.Expr) *hbrt.PcodeFunc - New public API that compiles a single expression to a standalone pcode function - Reuses genpc's mature emitExpr (no new emit logic) - ExecPcode manages the frame around the generated code 2. hbrtl.PcCompile(cPrgExpr) -> pFunc - RTL entry point for runtime compilation - Wraps the expression in a FUNCTION stub, uses the full PRG parser pipeline (pp + parser + genpc), extracts the compiled pcode function, returns it as an opaque pointer - Callers pay parse+compile cost ONCE per expression 3. hbrtl.PcEval(pFunc) -> xValue - RTL entry point for runtime execution - Calls hbrt.ExecPcode; the pcode's RetValue opcode sets retVal, which our EndProc preserves as PcEval's return value - ~1.2x slower than direct FieldGet (pcode interpreter overhead), but eliminates AST tree-walk per row for complex expressions Usage (FiveSql2 hot path, planned): pc := PcCompile("FieldGet(4) > 50000") // parse+compile once WHILE !Eof() IF PcEval(pc) // ~10us per row AAdd(aRows, ...) ENDIF dbSkip() ENDDO Benchmark (50k records, WHERE salary > 50000): Raw FieldGet: 7.9 ms (baseline) FieldPos+Get: 10.2 ms (with O(1) FieldPos cache) PcEval bytecode: 10.1 ms (interpreted bytecode) MacroEval: parse+eval per row — orders of magnitude slower Tests: go test ./... ALL PASS (14 packages) FiveSql2 43/43 100% compat_harbour 51/51 PcCompile/PcEval verified on 50k-row scan FiveSql2 engine integration deferred — requires careful PRG-level refactoring to thread pcode pointers through the plan structure. The Go-level infrastructure is now in place for that work. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 07:57:52 +09:00
CharlesKWON	ed33af41c5	perf: FieldPos O(1) cache + xbase import detection for function-call PRGs Two SQLite-style optimizations for RDD and SQL workloads: 1. FieldPos() O(1) column binding cache Before: FieldPos(name) linear scan — O(n) per call with string comparison. In SQL engines that call FieldPos per row per column, this is hundreds of thousands of calls. After: DBFArea builds a map[UPPER(name)]→pos on first lookup. All subsequent lookups are O(1) hash. SQLite calls this "column affinity binding" — positions resolved at prepare, not per row. Implementation: - hbrdd/dbf/dbf.go: DBFArea.FieldPosCache(name) method - hbrtl/procinfo.go: FieldPos RTL uses fieldPosCacher interface - Lazy init: only pays for tables that get queried 2. hbrdd import auto-detection for function-call style PRGs Before: compiler only added hbrdd import when PRG used xBase commands (USE, SKIP, INDEX...). Pure function-call style like `dbUseArea(.T.,,"t")`, `FieldPut(1, val)` was missed — generated Go failed to compile ("undefined: hbrdd"). After: scanStmtsForXBase walks ExprStmt bodies too, detecting CallExpr to any of the ~40 xBase RTL function names. FIELD->NAME alias expressions also trigger the import. Resolves: small PRGs that use only dbUseArea/FieldGet/FieldPut. Benchmark notes (50k records): Raw RDD scan: 7 ms (baseline) FiveSql2 SELECT WHERE: 157 ms (unchanged — bottleneck is not FieldPos, it's PRG-level expression tree walk per row) compat_harbour 51/51: PASS FiveSql2 43/43: 100% The FieldPos cache helps heavy field-name-based code paths but the primary FiveSql2 bottleneck is the PRG interpreter walking expression ASTs per row (needs bytecode compilation to close the gap). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 07:42:00 +09:00
CharlesKWON	3adc9d7d59	fix: PCount, Break/RECOVER, SET INDEX TO — 3 Harbour compat fixes Release-blocking compatibility issues discovered during the 258-test pre-release validation suite (100 syntax + 44 RDD + 114 RTL). 1. PCount() always returned 0 in PRG code Root cause: ParamCount() returned t.pendingParams, which is overwritten by every nested Function() call. By the time the PCount() RTL's Frame() executes, pendingParams is already 0. Fix: Frame() now stores pendingParams in frame.paramCount. PCount() RTL uses CallerParamCount() which reads callSP-2 (the PRG caller's frame), while RTL functions still use ParamCount() (reads pendingParams before their own Frame). Verified: PCount(1,2,3)=3, PCount(1)=1, PCount()=0 2. Break("string") panicked instead of being caught by RECOVER USING Root cause: Generated SEQUENCE code only caught HbError panics. Break() panics with BreakValue (a different type), which fell through to EndProc's "runtime error" message and re-panic. Fix (two parts): a) gengo emitBeginSequence: recover closure now catches any panic (interface{}), then dispatches via type switch: - HbError → extract .Error() string - hasValue interface (BreakValue) → extract .GetValue() - other → static "error" string b) hbrtl/error.go: BreakValue gets GetValue() method for duck-type detection without import cycles c) hbrt/thread.go EndProc: BreakValue type name check added so it re-panics silently (no stderr noise) 3. SET INDEX TO a, b, c only opened the last file Root cause: Parser's parseSet() called parseExpr() once for INDEX setting, stopping at the first comma. Remaining file names were consumed by the "eat rest of line" loop. Fix: Parser now collects comma-separated identifiers into a single string literal "a,b,c". gengo splits on comma and calls OrderListAdd() for each file. Verified: SET INDEX TO si_name, si_city → OrdCount=2 All tests pass: go test ./... 14 packages OK FiveSql2 43/43 100% compat_harbour 51/51 Syntax test 100/100 RDD test 44/44 RTL test 114/114 Windows cross-compile OK Linux cross-compile OK Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 18:06:28 +09:00
CharlesKWON	fc1dca9551	feat(rdd): real POSIX file/record locking + gap analysis doc Replaces the FLOCK/DBRLOCK/DBRUNLOCK no-op stubs with actual fcntl(F_SETLK) byte-range advisory locks, matching Harbour's hb_fsLockLarge implementation. Before: rtlDbRLock always returned .T. regardless of contention. Multi-process writers could silently corrupt records. After: Non-blocking POSIX byte-range locks per file descriptor. Cross-process exclusion verified by a subprocess-spawning Go test that witnesses BUSY vs OK transitions. New files: hbrdd/dbf/locks_posix.go fcntl F_WRLCK/F_UNLCK wrappers hbrdd/dbf/locks_windows.go stub (TODO: LockFileEx) hbrdd/dbf/lock_multi_test.go cross-process verification docs/gap-analysis.md honest Harbour parity assessment Modified: hbrdd/dbf/dbf.go - DBFArea gains fileLocked bool + lockedRecs map - Close() calls releaseAllLocks() before dropping the fd hbrtl/database.go - rtlDbRLock / rtlDbRUnlock now delegate to DBFArea.LockRecord / UnlockRecord instead of returning fixed .T./NIL - New rtlFLock / rtlDbUnlock for FLOCK() / DBUNLOCK() hbrtl/register.go - FLOCK and DBUNLOCK symbols registered (were missing entirely) compiler/analyzer/analyzer.go - FLOCK / DBUNLOCK added to RTL known-function set Lock region layout (non-overlapping on purpose): FLOCK region [0, HeaderLen+1) Record N region [RecordOffset(N), RecordLen) So a workarea can hold FLOCK and multiple DBRLOCK simultaneously on the same fd without conflict. Design rationale (captured in locks_posix.go header): * POSIX fcntl, not flock(2) — byte-range + NFS-safe * Non-blocking F_SETLK — matches Clipper FLOCK() → .F. semantics * Released explicitly on Close to avoid workarea-sharing races * Windows falls back to no-op (TODO: LockFileEx) Verification: go test ./hbrdd/dbf/ -run TestFLockBlocksAcrossProcesses PASS go test ./hbrdd/dbf/ -run TestRLockBlocksAcrossProcesses PASS go test ./... ALL PASS FiveSql2 43/43 100% compat_harbour 51/51 100% The gap-analysis doc (docs/gap-analysis.md) is a running inventory of what works vs what's still missing vs Harbour 3.2, written for users evaluating Five for production — not a sales pitch. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 17:58:03 +09:00
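A non-blocking byte-range lock of the kind fc1dca9551 describes can be taken from Go with the standard `syscall` package. This is a minimal, Unix-only sketch: `setLock` and `demoLocks` are hypothetical helpers, and the offsets and lengths are made-up values, not the region layout used by `locks_posix.go`.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"syscall"
)

// setLock applies fcntl(F_SETLK) to the byte range [start, start+length).
// typ is syscall.F_WRLCK to lock or syscall.F_UNLCK to release.
// F_SETLK never blocks: if another process holds a conflicting lock,
// the call returns an error immediately. Unix-like systems only.
func setLock(f *os.File, typ int16, start, length int64) error {
	lk := syscall.Flock_t{
		Type:   typ,
		Whence: int16(io.SeekStart), // offsets are absolute
		Start:  start,
		Len:    length,
	}
	return syscall.FcntlFlock(f.Fd(), syscall.F_SETLK, &lk)
}

// demoLocks takes two disjoint locks on a temporary file and releases them.
// The ranges are illustrative assumptions only.
func demoLocks() error {
	f, err := os.CreateTemp("", "locks-*.dbf")
	if err != nil {
		return err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	if err := setLock(f, syscall.F_WRLCK, 0, 64); err != nil {
		return fmt.Errorf("file-level lock: %w", err)
	}
	if err := setLock(f, syscall.F_WRLCK, 64, 32); err != nil {
		return fmt.Errorf("record lock: %w", err)
	}
	if err := setLock(f, syscall.F_UNLCK, 64, 32); err != nil {
		return fmt.Errorf("record unlock: %w", err)
	}
	return setLock(f, syscall.F_UNLCK, 0, 64)
}

func main() {
	if err := demoLocks(); err != nil {
		fmt.Println("lock demo failed:", err)
		return
	}
	fmt.Println("locks acquired and released")
}
```

POSIX record locks are owned by the process, so a second lock request from the same process never conflicts with the first. Contention only becomes visible from another process, which is consistent with the commit verifying exclusion through a subprocess-spawning test.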


64 Commits