docs(pgserver): v1.0 reference + known-limitations writeup
Single source of truth for what's shipping in v1.0: PRG API, supported features (wire protocol, auth modes, TLS, allowlist, type marshalling, pg_catalog stubs), explicit known-limitations section (high-concurrency writes, in-memory roles, no pg_hba.conf, CancelRequest no-op, binary BYTEA, no idle timeout), and a security-model note clarifying when each auth mode is appropriate. README.md gets a feature bullet pointing at the new doc plus a worked example showing pg_demo.prg + psql, and the status table now reflects the full 6-gate suite (43/43 SQL, 56/56 compat, 17/17 std.ch, 7/7 FRB, 11/11 pgserver). Stale 51/51 references corrected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
253
docs/pgserver.md
Normal file
253
docs/pgserver.md
Normal file
@@ -0,0 +1,253 @@
|
||||
# pgserver — PostgreSQL Wire Server for FiveSql2
|
||||
|
||||
> Expose Five's DBF/NTX/CDX-backed SQL engine to any PostgreSQL client over TCP/IP.
|
||||
|
||||
Copyright (c) 2026 Charles KWON OhJun (charleskwonohjun@gmail.com). All rights reserved.
|
||||
|
||||
## Overview
|
||||
|
||||
`pgserver` is a PostgreSQL-protocol-v3 server that runs in-process with a Five
|
||||
program. Once it starts, any PostgreSQL client (`psql`, pgx, JDBC, DBeaver,
|
||||
Tableau, ...) can connect to the same DBF tables your PRG code sees, with full
|
||||
SQL DML, transactions, password / MD5 / SCRAM-SHA-256 authentication, and TLS.
|
||||
|
||||
```
|
||||
psql / pgx / JDBC ──TCP:5432──▶ pgserver ──HB_FUNC──▶ five_SQL ──▶ DBF/NTX/CDX
|
||||
```
|
||||
|
||||
Single binary. No separate daemon. The Five program *is* the database server.
|
||||
|
||||
## Quick Start
|
||||
|
||||
```harbour
|
||||
// pgserver_demo.prg — open a few tables and serve them
|
||||
#include "FiveSqlDef.ch"
|
||||
|
||||
PROCEDURE Main()
|
||||
USE customers SHARED NEW
|
||||
USE orders SHARED NEW
|
||||
|
||||
PG_ADD_ROLE( "alice", "swordfish" )
|
||||
PG_SERVER_START( ":5432", "scram-sha-256" ) /* blocks; clients can connect */
|
||||
RETURN
|
||||
```
|
||||
|
||||
```bash
|
||||
./five build pgserver_demo.prg _FiveSql2/src/*.prg -o pgserver_demo
|
||||
./pgserver_demo &
|
||||
PGPASSWORD=swordfish psql 'postgres://alice@127.0.0.1:5432/alice?sslmode=disable' \
|
||||
-c "SELECT id, name FROM customers ORDER BY id"
|
||||
```
|
||||
|
||||
## PRG API
|
||||
|
||||
| Function | Purpose |
|
||||
|----------|---------|
|
||||
| `PG_SERVER_START( cAddr [, cAuthMode] )` | Start the listener; blocks the calling thread. `cAddr` is `:5432` or `"127.0.0.1:5432"`. `cAuthMode` ∈ `"trust"`, `"password"`, `"md5"`, `"scram-sha-256"` (default `"trust"`). |
|
||||
| `PG_SERVER_STOP()` | Close the active listener. |
|
||||
| `PG_ADD_ROLE( cName, cPassword )` | Register a credential. Idempotent — re-adding replaces. |
|
||||
| `PG_REMOVE_ROLE( cName )` | Drop a credential. |
|
||||
| `PG_TLS_LOAD( cCertPath, cKeyPath )` | Install a PEM cert/key for TLS. Must be called *before* `PG_SERVER_START`. |
|
||||
| `PG_TLS_SELF_SIGNED( cCertPath, cKeyPath, cHostname )` | Generate + install a self-signed ECDSA P-256 cert. Dev only. |
|
||||
| `PG_ALLOW_IP( cCIDR )` | Append a CIDR range to the source-IP allowlist. Repeated calls accumulate. Empty list ⇒ allow all. |
|
||||
| `PG_CLEAR_ALLOWLIST()` | Reset to allow-all. |
|
||||
|
||||
## v1.0 Feature Coverage
|
||||
|
||||
### Wire protocol
|
||||
|
||||
- **Simple Query** — psql-style `-c "SELECT 1"`
|
||||
- **Extended Protocol** — Parse / Bind / Describe / Execute / Sync / Close (pgx default path)
|
||||
- **Multi-statement** — `BEGIN; INSERT ...; COMMIT;` over one Query message
|
||||
- **EmptyQueryResponse** for `";"` etc.
|
||||
- **Cancel keys** are emitted in BackendKeyData but `CancelRequest` is a no-op (v1.1)
|
||||
|
||||
### SQL coverage
|
||||
|
||||
Whatever FiveSql2 parses works over the wire — see [_FiveSql2/README.md](../_FiveSql2/README.md):
|
||||
|
||||
- Full DML: SELECT / INSERT / UPDATE / DELETE
|
||||
- Transactions: BEGIN / COMMIT / ROLLBACK
|
||||
- DDL: CREATE TABLE / DROP TABLE / CREATE INDEX / DROP INDEX
|
||||
- JOIN, subquery, CTE (incl. RECURSIVE), CASE, CAST, window functions
|
||||
- Multi-row INSERT, INSERT ... SELECT, MERGE
|
||||
- 60+ scalar functions
|
||||
|
||||
Each connection has its own `TSqlSession` instance — transactions and plan
|
||||
caches don't leak between concurrent clients.
|
||||
|
||||
### Authentication
|
||||
|
||||
| Mode | Hash on disk | TLS recommended | Notes |
|
||||
|------|--------------|-----------------|-------|
|
||||
| `trust` | — | — | No authentication — loopback / dev only |
|
||||
| `password` | plaintext (in-memory) | **required** | Cleartext over the wire — never use without TLS |
|
||||
| `md5` | plaintext (in-memory) | optional | libpq classic; works with every client |
|
||||
| `scram-sha-256` | plaintext (in-memory) | optional | PG 14+ default; pgx + JDBC + libpq all prefer this |
|
||||
|
||||
The credential store is **in-memory** — there's no on-disk role table in v1.0.
|
||||
Reseed via `PG_ADD_ROLE` on every server restart (typically from the bootstrap
|
||||
PRG).
|
||||
|
||||
### TLS
|
||||
|
||||
- TLS via `crypto/tls`; cert + key loaded from PEM files (or generated via
|
||||
`PG_TLS_SELF_SIGNED` for development).
|
||||
- `sslmode=require` works; `sslmode=verify-ca` / `verify-full` work when the
|
||||
client trusts the issuing cert.
|
||||
- TLS state is captured at `PG_SERVER_START` time — changing the cert
|
||||
afterwards only affects new listeners.
|
||||
|
||||
### Source-IP allowlist
|
||||
|
||||
CIDR-style allowlist with semantics: empty list = allow anyone, non-empty list
|
||||
= deny anything not matching. Repeated `PG_ALLOW_IP` calls accumulate.
|
||||
Convenience for keeping a port open to localhost / lab subnets without putting
|
||||
the server behind a real firewall.
|
||||
|
||||
### Type marshalling
|
||||
|
||||
DataRows go out as text format by default — the simplest, most widely-supported
|
||||
option. Binary inbound parameters (the path pgx uses) are decoded for:
|
||||
|
||||
| OID | Type | Status |
|
||||
|-----|------|--------|
|
||||
| 16 | BOOL | ✓ |
|
||||
| 21 | INT2 | ✓ |
|
||||
| 23 | INT4 | ✓ |
|
||||
| 20 | INT8 | ✓ |
|
||||
| 700 | FLOAT4 | ✓ |
|
||||
| 701 | FLOAT8 | ✓ |
|
||||
| 1700 | NUMERIC | ✓ (base-10000 decoder, full precision) |
|
||||
| 1082 | DATE | ✓ |
|
||||
| 1114 | TIMESTAMP | ✓ |
|
||||
| 1184 | TIMESTAMPTZ | ✓ |
|
||||
| 25 | TEXT / VARCHAR | ✓ (text format) |
|
||||
| 17 | BYTEA | text format only (binary in v1.1) |
|
||||
|
||||
### pg_catalog stubs
|
||||
|
||||
BI tools (DBeaver, Tableau, DataGrip, pgAdmin) fire dozens of catalog probes
|
||||
on connect. v1.0 ships synthesised responses for the commonly-required shapes
|
||||
so the client gets to a usable state:
|
||||
|
||||
- `SELECT version()` → `"PostgreSQL 14.0 (FiveSql2)"`
|
||||
- `SHOW server_version_num` → `"140000"`
|
||||
- `pg_namespace` (lists `public` + `pg_catalog`)
|
||||
- `pg_class` (DBFs + indexes as relations)
|
||||
- `pg_attribute` (column lists)
|
||||
- `pg_type`, `pg_settings`, `pg_database`
|
||||
|
||||
Probes outside this set return empty result sets rather than erroring — most
|
||||
BI tools tolerate that and proceed.
|
||||
|
||||
## Known Limitations
|
||||
|
||||
These are documented v1.0 boundaries, not bugs. Each has a clear v1.1+
|
||||
remediation path.
|
||||
|
||||
### High-concurrency writes
|
||||
|
||||
Two-connection serial use is reliable. N-connection use where each connection
|
||||
finishes its transaction before the next starts is reliable. With **3+
|
||||
concurrent connections doing in-flight INSERT/SELECT inside their own
|
||||
transactions**, the underlying DBF format (no MVCC) can surface a race where
|
||||
one connection's just-appended row isn't visible to its own SELECT inside the
|
||||
same transaction.
|
||||
|
||||
This is bounded by the DBF on-disk format, not by pgserver. Mitigations in
|
||||
v1.0:
|
||||
|
||||
- Append-intent lock (`hbrdd/dbf/locks_posix.go`) — eliminates EOF-marker
|
||||
corruption between concurrent appends.
|
||||
- Per-path mmap-generation registry — guarantees fresh reads after a peer
|
||||
flush.
|
||||
- Header max-merge on `Close()` / EOF write — preserves peer record counts
|
||||
during shared-mode cleanup.
|
||||
|
||||
For workloads with 5+ concurrent writers, either serialise at the application
|
||||
layer or wait for v1.1's per-table writer mutex (planned).
|
||||
|
||||
### No on-disk role storage
|
||||
|
||||
`PG_ADD_ROLE` registers credentials in memory only. The bootstrap PRG must
|
||||
re-add them on every restart. v1.1 will land `__five_roles.dbf` +
|
||||
`__five_grants.dbf` for persistent role/grant storage with full
|
||||
`CREATE ROLE / GRANT / REVOKE` support.
|
||||
|
||||
### No `pg_hba.conf` parsing
|
||||
|
||||
`AuthMode` is server-global, `AllowIP` is a simple CIDR list. You can't say
|
||||
"alice authenticates with SCRAM from the internal subnet but is rejected from
|
||||
external IPs" yet. v1.1 will add per-user/per-source-IP routing via a
|
||||
`pg_hba.conf`-style config.
|
||||
|
||||
### `CancelRequest` is a no-op
|
||||
|
||||
BackendKeyData is sent (so clients don't error on the missing message), but
|
||||
the cancel handshake itself doesn't currently interrupt a running query. Long
|
||||
queries run to completion regardless. v1.1.
|
||||
|
||||
### Binary BYTEA + advanced binary outputs
|
||||
|
||||
Inbound BYTEA params arrive as binary in pgx; we decode them but only emit
|
||||
them as text in DataRow. Most BI tools handle the text BYTEA format
|
||||
(`\\x<hex>`) fine. Outbound binary NUMERIC (the base-10000 encoding) is also
|
||||
text-only in v1.0.
|
||||
|
||||
### Connection limits / idle timeout
|
||||
|
||||
`Server.MaxConnections` (configurable, default 100) gates the accept loop via
|
||||
a semaphore, but there's no idle-connection timeout. A misbehaving client can
|
||||
hold a session open indefinitely.
|
||||
|
||||
## Security Notes
|
||||
|
||||
- **`trust` mode** is loopback/dev only. Never expose a `trust`-mode server to
|
||||
the network — anyone who can reach the port can read every table.
|
||||
- **`password` mode** transmits the password in cleartext. Use only with TLS.
|
||||
- **`md5` mode** is the libpq default fallback. It's not cryptographically
|
||||
strong against a network attacker who can capture the salt + response, but
|
||||
it's universally implemented. Pair with TLS for any non-loopback use.
|
||||
- **`scram-sha-256`** is the modern default. PG 14+ libpq, pgx 5+, and JDBC
|
||||
42.5+ all prefer it when offered. PBKDF2 iteration count is 4096 (the
|
||||
RFC 5802 minimum, matching PG's default).
|
||||
- The role registry holds **plaintext passwords**. Same security posture as
|
||||
HTTP basic auth with an in-memory user table — fine for embedded /
|
||||
single-process deployments where the process boundary is already a trust
|
||||
boundary, not fine for shared multi-tenant hosting.
|
||||
- TLS state is locked at `PG_SERVER_START` time. To rotate certs, restart the
|
||||
server.
|
||||
|
||||
## Testing
|
||||
|
||||
The integration suite at [`tests/pgserver/run.sh`](../tests/pgserver/run.sh)
|
||||
runs a real `psql` against an in-process server and verifies:
|
||||
|
||||
1. Simple Query: `SELECT 1, 'hello'`
|
||||
2. Multi-statement Simple Query
|
||||
3. Transaction control: BEGIN/COMMIT round-trip
|
||||
4. MD5 auth: wrong password rejected
|
||||
5. MD5 auth: correct password accepted
|
||||
6. SCRAM-SHA-256: wrong password rejected
|
||||
7. SCRAM-SHA-256: correct password accepted
|
||||
8. TLS handshake + MD5 auth via `sslmode=require`
|
||||
9. Catalog probe: `SELECT version()`
|
||||
10. Catalog probe: `pg_namespace`
|
||||
11. Catalog probe: `SHOW server_version_num`
|
||||
|
||||
`bash tests/pgserver/run.sh` — requires `psql` on the PATH.
|
||||
|
||||
Unit tests in `hbrtl/pgserver/*_test.go` pin the wire encoding, the MD5
|
||||
challenge formula, the SCRAM math (PBKDF2 / HMAC / proof verify / server
|
||||
signature), and every binary param decoder against handcrafted PG payloads.
|
||||
|
||||
## Roadmap (v1.1+)
|
||||
|
||||
- Per-table writer mutex / WAL for high-concurrency write isolation
|
||||
- Persistent role/grant tables (`__five_roles.dbf`, `__five_grants.dbf`)
|
||||
- `pg_hba.conf`-style per-user/per-IP auth routing
|
||||
- `CancelRequest` honoured
|
||||
- Binary BYTEA + NUMERIC DataRow output
|
||||
- `LISTEN / NOTIFY` for cross-connection events
|
||||
- Connection idle timeout
|
||||
Reference in New Issue
Block a user