Commit Graph

5 Commits

Author SHA1 Message Date
ed1aeeb212 feat(pgserver): pg_catalog stub for BI-tool connection compatibility
PostgreSQL clients (psql, pgx, DBeaver, Tableau, DataGrip,
pgAdmin) fire a barrage of catalog probes at connection time —
SELECT version(), SHOW server_version, SELECT FROM pg_namespace
/ pg_class / pg_type / pg_database / pg_settings. FiveSql2 can't
parse most of them. Without interception the BI tool either
errors out on connect or proceeds with a half-broken view of
the database (zero tables, no type info, no schema list). This
commit lands the minimum-viable catalog shim so the common
connect-and-list-tables flow succeeds.

Strategy
--------

Pattern-match catalog probes BEFORE handing the SQL to five_SQL.
Recognised shapes get synthesised result envelopes — same
`{ aFieldNames, aRows }` hbrt.Value shape the engine returns,
so the existing dispatchSimpleQuery / executePortal pipelines
stream them identically to a normal query.

Covered (v1.0)
--------------

  * SET / RESET / DISCARD <name>           → success, no-op
  * SHOW <name>                            → single-row response
                                             (server_version, server_encoding,
                                              client_encoding, DateStyle,
                                              transaction_isolation, etc.)
  * SELECT version() / current_database() / current_schema() /
    current_user / session_user / pg_backend_pid()  → single-row
  * SELECT … FROM pg_namespace             → 2 rows (pg_catalog + public)
  * SELECT … FROM pg_class                 → list of open workareas
                                             (relkind='r', relnamespace=public)
  * SELECT … FROM pg_attribute             → empty (stub; column-shape
                                             introspection deferred to v1.1)
  * SELECT … FROM pg_type                  → 7 OIDs FiveSql2 actually emits
                                             (bool, int4, int8, text, numeric,
                                              date, timestamp)
  * SELECT … FROM pg_database              → 1 row, the connect-time db name
  * SELECT … FROM pg_settings              → name/setting pairs matching SHOW
  * Anything else mentioning pg_catalog. / pg_<name> / information_schema.
    → empty result with generic field names (BI tool sees "0 rows" rather
    than a parse error)

Deliberate non-goals
--------------------

  * WHERE / JOIN evaluation — psql, pgx, DBeaver all filter
    client-side on the rows we return. We send the whole
    catalog and let them apply their predicates.
  * pg_attribute introspection — would need to re-derive
    column types from the open workarea + map back to PG OIDs.
    Tracked as v1.1 work.
  * Recursive CTE catalog queries (pgAdmin's tree builder uses
    them) — too brittle to pattern-match. Falls through to
    five_SQL where it errors loudly. pgAdmin's table-tree pane
    will then show "0 tables" but the connection itself stays
    alive.

Files
-----

  hbrtl/pgserver/catalog.go  (new, ~280 LOC)
    catalogIntercept(sql) → (handled, value)
    synthPgNamespace / synthPgClass / synthPgAttribute /
    synthPgType / synthPgDatabase / synthPgSettings
    simpleSelectFunction (version/current_*/pg_backend_pid)
    showResponse (SHOW <name>)

  hbrtl/pgserver/dispatch.go
    dispatchSimpleQuery: catalogIntercept ahead of runSQL.

  hbrtl/pgserver/extended.go
    executePortal: same intercept, ahead of runSQL.

Verification
------------

psql against a running pgserver, with sslmode=require + MD5:

  $ psql -c 'SELECT version()' -At
  PostgreSQL 14.0 (FiveSql2) (FiveSql2 wire-compat shim)

  $ psql -c 'SELECT * FROM pg_namespace' -At
  11|pg_catalog|10
  2200|public|10

  $ psql -c 'SELECT * FROM pg_type' -At
  16|bool|1
  23|int4|4
  20|int8|8
  25|text|-1
  1700|numeric|-1
  1082|date|4
  1114|timestamp|8

  $ psql -l    # \\l now works
          데이터베이스 목록
   oid | datname | datdba | 인코딩
  -----+---------+--------+--------
     1 | alice   |     10 |      6

Integration script gates grew from 6/6 → 9/9:
  PASS  Catalog probe: SELECT version()
  PASS  Catalog probe: pg_namespace lists public + pg_catalog
  PASS  Catalog probe: SHOW server_version_num

All six release gates green:
  go test ./...               ✓
  FiveSql2 SQL:1999 43/43     ✓
  Harbour compat 56/56        ✓
  std.ch 17/17                ✓
  FRB 7/7                     ✓
  pgserver integration 9/9    ✓ (+3 from catalog stubs)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 22:31:52 +09:00
3b2dd365ad feat(pgserver): Phase 6 — TLS + source-IP allowlist
Closes the v1.0 hardening surface: encrypted transport + a
coarse pg_hba.conf-equivalent CIDR allowlist. Together with the
Phase 5 auth flows, this is the security-baseline an internet-
exposed PostgreSQL-wire server needs.

TLS subsystem
-------------

`hbrtl/pgserver/tls.go`:

* `LoadTLSFromFiles(certPath, keyPath)` — cert/key PEM pair load
  with tls.VersionTLS12 floor. Installed as the *pending* config
  that the next PG_SERVER_START consumes (matches PG's
  "must-set-before-pg_ctl-start" semantics).

* `GenerateSelfSignedCert(certPath, keyPath, hostname)` — ECDSA
  P-256 + 365-day validity + DNSNames+IPAddresses SANs covering
  the hostname plus 127.0.0.1 / ::1. Dev/CI helper; production
  ships a CA-signed cert via the loader.

* `upgradeToTLS()` wraps `tls.Server(conn, cfg).Handshake()` so
  pgproto3 reads plaintext on top of the encrypted stream.

Source-IP allowlist
-------------------

* `AllowIP(cidr)` parses a CIDR and appends it to a per-server
  list snapshotted at PG_SERVER_START time.
* `peerAllowed(remote, list)` runs at accept() — empty list →
  accept any, otherwise drop connections whose RemoteAddr falls
  outside every registered range.
* `ClearAllowList()` resets to allow-all.

Coarse but compatible with the "host alice 10.0.0.0/8 md5"-style
entries every pg_hba.conf author already knows; a fuller per-
role/per-database matcher is Phase 6.1+.

PRG bindings (register.go)
--------------------------

New HB_FUNCs, all idempotent and composable in any order before
PG_SERVER_START:

  pg_tls_load( certPath, keyPath )           → .T. | cErr
  pg_tls_self_signed( cert, key, hostname )  → .T. | cErr
  pg_allow_ip( cidr )                        → .T. | cErr
  pg_clear_allowlist()                       → NIL

Bootstrap idiom:

  PROCEDURE Main()
     PG_TLS_SELF_SIGNED( "/tmp/cert.pem", "/tmp/key.pem", "localhost" )
     PG_ADD_ROLE( "alice", "swordfish" )
     PG_ALLOW_IP( "127.0.0.1/32" )
     PG_ALLOW_IP( "10.0.0.0/8" )
     PG_SERVER_START( ":5432", "md5" )

The startup banner now reports TLS + allowlist state so the PRG
operator sees the security posture at a glance:

  pgserver: listening on :5432 (auth=md5 tls=on allowlist=2)

Verification
------------

End-to-end via real psql against a self-signed server:

  $ PGPASSWORD=swordfish psql \
        "postgres://alice@127.0.0.1:15432/alice?sslmode=require" \
        -c "SELECT 'tls-works' AS x" -At
  tls-works

  $ # off-allowlist source (192.168.x.x mock) → connection refused
  $ # (verified manually; psql can't easily spoof src IP for CI)

Integration script gates expanded to 6/6:
  PASS  Simple Query
  PASS  Multi-statement Simple Query
  PASS  Transaction control
  PASS  MD5 auth: wrong password rejected
  PASS  MD5 auth: correct password accepted
  PASS  TLS handshake + MD5 auth via sslmode=require

All six release gates green:
  go test ./...               ✓
  FiveSql2 SQL:1999 43/43     ✓
  Harbour compat 56/56        ✓
  std.ch 17/17                ✓
  FRB 7/7                     ✓
  pgserver integration 6/6    ✓

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 14:07:19 +09:00
90eafcfc06 feat(pgserver): Phase 5 — password + MD5 authentication
Trust mode (v1.0 default) accepts anyone; that's fine for embedded
demo but unshipping a multi-client database without credentials
would be irresponsible. This commit adds two of libpq's three
standard auth flows. SCRAM-SHA-256 is Phase 5.1 — pgx/psql both
fall back to MD5 cleanly when the server advertises only md5, so
v1.0's functional coverage is complete with the pair landed here.

Auth subsystem
--------------

`hbrtl/pgserver/auth.go` adds:

* An in-memory role registry: `roleMap map[string]*role` guarded by
  sync.RWMutex. Reads (lookupRole) are hot-path during connection
  startup so the RWMutex lets multiple sessions auth in parallel
  without serialising through a plain Mutex.

* `AddRole(name, password)` / `RemoveRole(name)` Go API consumed
  by the new HB_FUNCs `PG_ADD_ROLE` / `PG_REMOVE_ROLE` (see
  register.go). Bootstrap PRG idiom:

      PG_ADD_ROLE("alice", "swordfish")
      PG_ADD_ROLE("bob",   "hunter2")
      PG_SERVER_START(":5432", "md5")

* `authPassword()` — cleartext PasswordMessage exchange. The wire
  payload is plain so intended for TLS-protected links only;
  Phase 6 ties the warning to actual TLS detection on the session.

* `authMD5()` — libpq's md5 challenge:

      server → AuthenticationMD5Password{salt: 4 random bytes}
      client → "md5" || md5_hex( md5_hex(password || user) || salt )

  We recompute the canonical hash from the stored plaintext and
  compare. md5Challenge() is exported for pinning by a Go unit
  test (vector cross-checked against libpq's fe-auth-md5.c).

Salt is sourced from crypto/rand on every challenge so replay
attacks against a captured wire trace can't reuse a prior hash.

Dispatch matrix (Config.AuthMode → flow):
  "" / "trust" → AuthenticationOk immediately, no lookup
  "password"   → authPassword()
  "md5"        → authMD5()
  anything else→ 28000 + connection close

Tests
-----

Unit (hbrtl/pgserver/pgserver_test.go):
  PASS  TestMD5Challenge           (vector + determinism + diff)
  PASS  TestRoleRegistry           (add/replace/remove/lookup)

Integration (tests/pgserver/run.sh):
  PASS  Simple Query: SELECT 1, 'hello'
  PASS  Multi-statement Simple Query
  PASS  Transaction control: BEGIN/COMMIT round-trip
  PASS  MD5 auth: wrong password rejected
  PASS  MD5 auth: correct password accepted

End-to-end matrix with real psql:
  wrong password   → "ERROR: md5 authentication failed for user 'alice'"
  correct password → SELECT returns row
  unknown user     → "ERROR: md5 authentication failed for user 'eve'"
  password mode    → cleartext exchange works equivalently

All six release gates green:
  go test ./...               ✓
  FiveSql2 SQL:1999 43/43     ✓
  Harbour compat 56/56        ✓
  std.ch 17/17                ✓
  FRB 7/7                     ✓
  pgserver integration 5/5    ✓ (up from 3/3 in Phase 4)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 14:01:30 +09:00
8472928102 feat(pgserver): Phase 4 — Extended Protocol (Parse/Bind/Execute)
pgx and most drivers default to PostgreSQL's Extended Protocol
(named prepared statements). Phase 2 only handled Simple Query,
so every pgx caller had to force `QueryExecModeSimpleProtocol` —
unworkable for a production deployment. This commit lands the
full Parse → Bind → Describe → Execute → Sync state machine,
enough that pgx (and any other libpq-protocol-v3 client) works
without any client-side knobs.

Implementation lives in `hbrtl/pgserver/extended.go`:

* Per-session caches `stmts map[string]*preparedStmt` and
  `portals map[string]*portal`, lazily allocated on first use.
  Stored as fields on `session` so they don't leak across
  connections.

* Parameters are inlined at Bind time via `substituteParams` —
  the resolved SQL is a normal Simple-Query-shaped string the
  engine sees through the existing `five_SQL(cSQL, …, oSession)`
  pipeline. Avoids teaching FiveSql2 a second param-shape; the
  trade-off is that binary timestamps/numerics round-trip through
  text (Phase 4.1 will plumb `?`-params through aParams for the
  binary fast path).

* `paramToLiteral` decodes the binary-format encodings pgx uses
  by default for INT4/INT8/BOOL (big-endian fixed-width). Other
  binary OIDs fall back to a hex-escaped quoted literal which
  errors loudly rather than silently misparsing.

* `countPgPlaceholders` scans the SQL outside string literals for
  the highest `$N` so the server can answer Describe-statement
  with a correctly-sized ParameterDescription even when the
  client didn't pre-declare param OIDs. Without this, pgx errored
  with "expected 0 arguments, got 2" on the very first prepared
  query.

* RowDescription emission: Describe-statement still returns NoData
  (we can't infer row shape without execution). When Execute fires
  on a portal the client never Described, we emit RowDescription
  inline from the cached result before DataRow streams. pgx and
  psql both tolerate this ordering.

* Execute → CommandComplete tag derives from the SQL verb via the
  existing `commandTagFor` helper. Row counts in the tag remain
  "VERB 0" for v1.0; threading real counters through the engine
  is Phase 5.

Wire dispatch in `session.go:queryLoop` now handles Parse, Bind,
Describe, Execute, Close, Sync, Flush — the full v3 message set.

Verification
------------

End-to-end pgx (default mode, no SimpleProtocol flag) successfully
runs:
  SELECT $1 AS n, $2 AS s with 42 + "hi" → [42 hi]
  Same statement re-executed with different bound values → reuses
    the cached prepared statement
  SELECT $1 AS b, $2 AS s with true + "binary-bool" → [t binary-bool]

`tests/pgserver/run.sh` expanded from 1 → 3 integration assertions:

  PASS  Simple Query: SELECT 1, 'hello'
  PASS  Multi-statement Simple Query
  PASS  Transaction control: BEGIN/COMMIT round-trip

(Extended Protocol can't be driven from psql's -c CLI directly
because psql's PREPARE/EXECUTE is a separate SQL-level feature
that FiveSql2 doesn't parse; the pgx-driven path verifies it
manually, and a self-contained Go integration that drives pgx
from inside a process bootstrap is Phase 7 work.)

All six release gates green:
  go test ./...                       ✓
  FiveSql2 SQL:1999 43/43             ✓
  Harbour compat 56/56                ✓
  std.ch 17/17                        ✓
  FRB 7/7                             ✓
  pgserver integration 3/3            ✓

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 12:55:41 +09:00
a5567648e9 test(pgserver): Phase 3 — DML + transaction integration harness
Adds tests/pgserver/run.sh, the integration gate for the wire
layer. Builds a minimal bootstrap PRG that opens nothing and just
calls PG_SERVER_START on an ephemeral port, then drives psql with
a Simple Query to confirm the end-to-end pipeline (TCP accept →
startup handshake → Query → five_SQL → RowDescription + DataRow
→ ReadyForQuery) still works after every change.

Phase 3 verified scope (driven via a separate pgx harness during
development):

  * CREATE TABLE / INSERT / UPDATE / DELETE over Simple Query
  * BEGIN / COMMIT / ROLLBACK from the wire
  * Two-connection cross-visibility on a shared DBF
  * Per-session ROLLBACK leaves the *other* connection's data
    intact — the Phase 1 STATIC → TSqlSession refactor is what
    makes this hold; pre-refactor, both connections would have
    shared one s_aTxnLog and A's ROLLBACK would have collapsed
    B's COMMIT.

Known limitation captured in the script header (deferred to
Phase 7 follow-up):

  * ≥3 concurrent connections doing in-flight INSERT/SELECT in
    their own transactions occasionally race at the hbrdd
    workarea layer — surfaces as one worker's just-inserted row
    missing from its own SELECT. 2-way concurrent + N-way serial
    are both reliable. Root cause is multi-thread workarea
    arbitration during dbUseArea/dbAppend, which the pre-1.0
    audit flagged as Top-Risk #2 ("WorkArea collision under
    multi-session"). Tracking for a dedicated fix.

Gate count now reads:
  go test ./...                       ✓
  FiveSql2 SQL:1999 43/43             ✓
  Harbour compat 56/56                ✓
  std.ch 17/17                        ✓
  FRB 7/7                             ✓
  examples 65/71                      ✓ (unchanged baseline)
  pgserver integration 1/1            ✓ (new)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 07:25:13 +09:00