docs(rag): add security idioms & gotchas (06-security.md)

Capture the hardening patterns from the solmade audit so future Five work
reuses them: authorize on resolved function name (not URL path), CSPRNG
session tokens stored as hashes, argon2id with legacy-verify + upgrade,
login rate-limit + timing-safe dummy hash, bluemonday HTML sanitize vs
EscHtml, security headers + nonce CSP, upload allowlist (no SVG), bind-all
SQL. Theme: thin Go RTL over an ecosystem crypto lib. INDEX/README updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
CharlesKWON
2026-06-15 15:49:49 +09:00
parent 80131b1225
commit f26911177f
3 changed files with 123 additions and 0 deletions

rag/06-security.md (new file, 121 lines added)

@@ -0,0 +1,121 @@
---
doc: five-security
title: Security idioms & gotchas for Five web apps
keywords: [security, auth, session, csprng, crypto/rand, argon2, password hash, bluemonday, xss, sanitize, csp, security headers, rate limit, role gate, authorization, cookie, upload, sql injection]
summary: Hardening patterns and traps for a Five web app, grounded in the solmade codebase. Covers authz gating, session tokens, password hashing, XSS sanitization, headers/CSP, login rate-limiting, uploads — and the "thin Go RTL over an ecosystem crypto lib" pattern.
---
# Five web security — idioms & gotchas
Grounded in `solmade`. The recurring theme: reach for a **thin Go RTL wrapping a
battle-tested ecosystem library** (crypto/rand, bluemonday, argon2id) rather than
hand-rolling crypto in PRG.
## 1. GOTCHA — authorize on the RESOLVED function name, not the URL path
The router maps `-` and `/` both to `_` (`/api/admin/x`, `/api/admin-x`, `/api/admin_x`
all reach `ADMIN_X__MAIN`). A role gate that matches the path prefix `"/api/admin/"`
is bypassed by the hyphen/underscore variants → privilege escalation.
```five
// WRONG: SubStr(cPath,1,11) == "/api/admin/" ← misses /api/admin-x , /api/admin_x
// RIGHT: gate on the resolved function name
cFunc := PathToFunc( cPath )
IF Left( cFunc, 6 ) == "ADMIN_"
RETURN cRole == "superadmin" .OR. cRole == "operator"
ENDIF
```
Anon allowlists (login/logout/health) are safe as *exact* matches: a path variant that
misses the allowlist simply stays behind the auth check, so they fail closed.
(solmade: `app/auth/auth_middleware.prg` RoleAllows.)
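The bypass is easy to reproduce outside the router. The Go sketch below is an illustration only: `pathToFunc` is a hypothetical stand-in for `PathToFunc` (it assumes the router drops `/api/`, maps `-` and `/` to `_`, upper-cases, and appends `__MAIN`), not the actual solmade implementation.

```go
package main

import "strings"

// pathToFunc is a hypothetical stand-in for the router's PathToFunc.
// Assumed behaviour: drop "/api/", map '-' and '/' to '_', upper-case,
// append "__MAIN".
func pathToFunc(urlPath string) string {
	name := strings.TrimPrefix(urlPath, "/api/")
	name = strings.NewReplacer("-", "_", "/", "_").Replace(name)
	return strings.ToUpper(name) + "__MAIN"
}

// prefixGate is the WRONG check: it only sees the literal URL prefix.
func prefixGate(urlPath string) bool {
	return strings.HasPrefix(urlPath, "/api/admin/")
}

// funcGate is the RIGHT check: it tests the resolved function name,
// so every spelling that reaches an ADMIN_ function is gated.
func funcGate(urlPath string) bool {
	return strings.HasPrefix(pathToFunc(urlPath), "ADMIN_")
}
```

All three spellings resolve to the same function, but only `funcGate` flags all of them as admin routes; `prefixGate` lets `/api/admin-x` and `/api/admin_x` through.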
## 2. Session tokens — CSPRNG, and store the HASH
`hb_RandomInt` is Mersenne Twister (non-crypto), so its tokens are predictable and
sessions can be hijacked. Generate tokens with a `crypto/rand` RTL instead, and store
`SHA256(token)` rather than the raw token, so a DB leak yields nothing reusable; only
the cookie holds the raw value.
```five
// hbrtl_ext/secrand: SEC_RANDHEX(nBytes) via crypto/rand
FUNCTION SESSION_TOKEN() ; RETURN SEC_RANDHEX( 32 ) // 64-hex
FUNCTION SESSION_TOKEN_HASH( c ) ; RETURN hb_SHA256( hb_CStr( c ) )
// login : INSERT ... token = SESSION_TOKEN_HASH(cToken); Set-Cookie = raw cToken
// verify: WHERE s.token = SESSION_TOKEN_HASH(cCookie)
// logout: DELETE WHERE token = SESSION_TOKEN_HASH(cCookie)
```
Cookie flags: `HttpOnly; SameSite=Lax; Max-Age=…` and `Secure` when
`x-forwarded-proto == https`. (`hb_SHA256` == `shasum -a 256`, lowercase hex.)
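On the Go side, the RTL behind `SEC_RANDHEX` can be very small. This is a minimal sketch using only the standard library; the function names `randHex` and `tokenHash` are assumptions for illustration, not the names used in `hbrtl_ext/secrand`.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
)

// randHex returns nBytes of CSPRNG output as lowercase hex
// (so the string is 2*nBytes characters long).
func randHex(nBytes int) (string, error) {
	buf := make([]byte, nBytes)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

// tokenHash returns the lowercase-hex SHA-256 of the raw token.
// This is the value to store in the sessions table; the raw token
// goes only into the cookie.
func tokenHash(token string) string {
	sum := sha256.Sum256([]byte(token))
	return hex.EncodeToString(sum[:])
}
```

Because the token already carries 256 bits of entropy, a plain unsalted SHA-256 is enough here; the slow, salted hashing is reserved for passwords.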
## 3. Passwords — argon2id, with legacy verify + upgrade-on-login
Salted SHA-256 stretch is GPU-crackable (not memory-hard). Hash with argon2id
(`alexedwards/argon2id` via RTL). `PASSWD_VERIFY` detects the scheme so legacy
`$sha256s$` rows still log in; on success, transparently re-hash to argon2id.
```five
FUNCTION PASSWD_HASH( c ) // try SEC_ARGON2_HASH; fall back to legacy if empty
FUNCTION PASSWD_VERIFY( c, cEnc ) // Left(cEnc,9)=="$argon2id" → SEC_ARGON2_VERIFY else legacy
FUNCTION PASSWD_NEEDS_REHASH( cEnc ) ; RETURN Left(hb_CStr(cEnc),9) != "$argon2id"
// on login success: IF PASSWD_NEEDS_REHASH(stored) → UPDATE users SET password_hash=PASSWD_HASH(pw)
```
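The verify-then-upgrade flow can be sketched independently of the hashing library. In the Go sketch below the actual verifiers and the hasher are passed in as functions, so nothing about `alexedwards/argon2id` is assumed; `verifyAndUpgrade` and `needsRehash` are hypothetical names for illustration.

```go
package main

import "strings"

// verifyFunc checks a plaintext password against an encoded hash.
type verifyFunc func(password, encoded string) bool

// needsRehash reports whether the stored hash is not yet argon2id.
func needsRehash(encoded string) bool {
	return !strings.HasPrefix(encoded, "$argon2id")
}

// verifyAndUpgrade picks the verifier by scheme prefix. On success with a
// legacy hash it also returns a fresh hash that the caller should UPDATE
// into the user row; newHash is empty when no upgrade is needed.
func verifyAndUpgrade(password, encoded string, argon2, legacy verifyFunc,
	hash func(string) string) (ok bool, newHash string) {
	verify := legacy
	if !needsRehash(encoded) {
		verify = argon2
	}
	if !verify(password, encoded) {
		return false, ""
	}
	if needsRehash(encoded) {
		return true, hash(password)
	}
	return true, ""
}
```

The upgrade can only happen at login, because that is the one moment the server holds the plaintext password.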
## 4. Login — rate limit + timing-safe unknown-user path
- Rate limit per IP: a `login_attempts` table; count failures in the last 15 min; ≥20 → 429.
Clear an IP's failures on success.
- Timing: an unknown email must NOT return before doing hash work, or response time leaks
which emails exist. Run a dummy hash on the not-found branch.
```five
IF aRows == NIL .OR. Len(aRows) == 0
PASSWD_VERIFY_DUMMY( cPass ) // burns argon2 work → constant-ish timing
RecordLogin( nPG, cIP, cEmail, .f. )
RETURN API_ERR( 401, "invalid credentials" )
ENDIF
```
Use the SAME error text/status for "no such user" and "wrong password".
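The counting rule from the rate-limit bullet can be sketched as follows. This Go version keeps failures in memory purely for illustration (the source uses a `login_attempts` table), is not safe for concurrent use, and all names are hypothetical; timestamps are plain Unix seconds so the window arithmetic stays visible.

```go
package main

const (
	windowSeconds = 15 * 60 // look-back window: the last 15 minutes
	maxFailures   = 20      // at or above this count, answer 429
)

// loginLimiter holds per-IP failure times in Unix seconds.
type loginLimiter struct {
	failures map[string][]int64
}

func newLoginLimiter() *loginLimiter {
	return &loginLimiter{failures: make(map[string][]int64)}
}

// recordFailure notes one failed login for ip at time now.
func (l *loginLimiter) recordFailure(ip string, now int64) {
	l.failures[ip] = append(l.failures[ip], now)
}

// blocked reports whether ip has maxFailures or more failures
// inside the window that ends at now.
func (l *loginLimiter) blocked(ip string, now int64) bool {
	recent := 0
	for _, t := range l.failures[ip] {
		if now-t < windowSeconds {
			recent++
		}
	}
	return recent >= maxFailures
}

// clear drops an IP's failures after a successful login.
func (l *loginLimiter) clear(ip string) { delete(l.failures, ip) }
```

Check `blocked` before doing any password work, so a throttled client cannot use the endpoint to burn argon2 CPU.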
## 5. XSS — sanitize user HTML; escape plain-text contexts
A CMS stores rich text (Editor.js: `<b><i><a>`). You cannot blanket-escape it (breaks
formatting) and you must not concat it raw into HTML (stored XSS). Sanitize with an
allowlist (`bluemonday` via RTL); escape only genuinely plain-text slots.
```five
// hbrtl_ext/sanitize: HTML_SANITIZE(html) via bluemonday.UGCPolicy()
cOut += '<p>' + HTML_SANITIZE( cUserText ) + '</p>' // rich text
cHtml += '<title>' + EscHtml( cPlainTitle ) + '</title>' // plain text
cImg += 'src="' + EscAttr( cUrl ) + '"' // attribute
```
`HTML_SANITIZE` strips `<script>`, `on*=` handlers, and `javascript:` URLs; it keeps `<b>`, links, and lists.
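For the plain-text slots, the escaping side needs nothing beyond the Go standard library. The sketch below shows only that half (the sanitizing half is bluemonday, which is third-party and not shown); `titleTag` and `srcAttr` are hypothetical helpers, not the `EscHtml`/`EscAttr` implementations.

```go
package main

import "html"

// titleTag renders a plain-text title. Any markup in the input is
// neutralised, which is exactly why this must not be used on rich text.
func titleTag(plain string) string {
	return "<title>" + html.EscapeString(plain) + "</title>"
}

// srcAttr renders a double-quoted src attribute. Quotes in the input are
// escaped so the value cannot close the attribute and add a handler.
func srcAttr(url string) string {
	return `src="` + html.EscapeString(url) + `"`
}
```

Escaping does not validate the URL scheme, so a slot that accepts user-supplied links still needs a scheme check or the sanitizer.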
## 6. Response headers + CSP
Set on every response: `X-Content-Type-Options: nosniff`, `X-Frame-Options: SAMEORIGIN`,
`Referrer-Policy: strict-origin-when-cross-origin`. For a page that renders user content,
add a **per-request nonce CSP** so injected scripts can't run while your own data blocks
(JSON-LD) still do:
```five
cNonce := SEC_RANDHEX( 16 )
hHdrs[ "Content-Security-Policy" ] := ;
"default-src 'self'; img-src 'self' https: data:; " + ;
"style-src 'self' 'unsafe-inline' https://fonts.googleapis.com; " + ;
"font-src 'self' https://fonts.gstatic.com; " + ;
"script-src 'nonce-" + cNonce + "'; object-src 'none'; base-uri 'none'"
// emit JSON-LD as: <script type="application/ld+json" nonce="<cNonce>">
```
A global CSP often breaks editor CDNs/inline styles — scope it to the rendered page.
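If the headers are assembled in a Go RTL rather than in PRG, the same policy could look like the sketch below. It mirrors the `hHdrs` hash from the example above with a plain map; `newNonce` and `securityHeaders` are hypothetical names.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
)

// newNonce returns 16 CSPRNG bytes as 32 hex characters.
// Generate a fresh one for every response.
func newNonce() (string, error) {
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

// securityHeaders returns the baseline headers plus the nonce CSP
// for a page that renders user content.
func securityHeaders(nonce string) map[string]string {
	csp := "default-src 'self'; img-src 'self' https: data:; " +
		"style-src 'self' 'unsafe-inline' https://fonts.googleapis.com; " +
		"font-src 'self' https://fonts.gstatic.com; " +
		"script-src 'nonce-" + nonce + "'; object-src 'none'; base-uri 'none'"
	return map[string]string{
		"X-Content-Type-Options":  "nosniff",
		"X-Frame-Options":         "SAMEORIGIN",
		"Referrer-Policy":         "strict-origin-when-cross-origin",
		"Content-Security-Policy": csp,
	}
}
```

The nonce in the header and the `nonce` attribute in the page must come from the same call, so generate it once per request and pass it to both the header builder and the renderer.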
## 7. Uploads
Use an extension **allowlist** (unknown extensions become `.bin`), and **exclude `.svg`**
(SVG can carry script and would XSS if served inline). Strip path traversal from
filenames (`..`, `/`, `\`, NUL) and prefix the stored name with a server-generated id so
names aren't fully attacker-controlled.
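A minimal Go sketch of those filename rules follows. The allowlist contents, the `file` fallback, and the `<id>_<name>` layout are assumptions for illustration, not the solmade implementation.

```go
package main

import (
	"path"
	"strings"
)

// allowedExt is an illustrative allowlist; ".svg" is deliberately absent.
var allowedExt = map[string]bool{
	".jpg": true, ".jpeg": true, ".png": true,
	".gif": true, ".webp": true, ".pdf": true,
}

// safeUploadName builds the stored name as "<serverID>_<base><ext>".
func safeUploadName(serverID, original string) string {
	// Treat backslashes as separators and drop NUL bytes,
	// then keep only the last path element.
	name := strings.NewReplacer("\\", "/", "\x00", "").Replace(original)
	name = name[strings.LastIndex(name, "/")+1:]

	rawExt := path.Ext(name)
	base := strings.TrimSuffix(name, rawExt)
	// Remove leftover ".." runs and stray leading/trailing dots or spaces.
	base = strings.Trim(strings.ReplaceAll(base, "..", ""), ". ")
	if base == "" {
		base = "file"
	}

	ext := strings.ToLower(rawExt)
	if !allowedExt[ext] {
		ext = ".bin" // unknown or excluded extensions are neutralised
	}
	return serverID + "_" + base + ext
}
```

For example, `safeUploadName("42", "../../etc/passwd")` yields `42_passwd.bin`, and an `.svg` upload is stored with a `.bin` extension.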
## 8. SQL — always bind, never concat user input
`PG_QUERY(nPG, "… WHERE x=$1", { val })`. Bind every user value as `$1/$2…`. Only ever
concatenate *validated-allowlist* identifiers (table/column names), never raw input.
(See [[five-idioms]] for the Postgres patterns and the strings-not-ints column gotcha.)
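The identifier rule is the part that binding cannot cover, since placeholders only stand in for values. A minimal Go sketch, with a hypothetical `posts` table and column names chosen only for illustration:

```go
package main

import "errors"

// sortable is the allowlist of column names that may be concatenated.
var sortable = map[string]bool{"created_at": true, "title": true}

// listQuery returns SQL in which the user value stays a $1 placeholder
// and only an allowlisted column name is concatenated into the text.
func listQuery(sortCol string) (string, error) {
	if !sortable[sortCol] {
		return "", errors.New("sort column not allowed")
	}
	return "SELECT id, title FROM posts WHERE author = $1 ORDER BY " + sortCol, nil
}
```

The author value is then passed separately as the bound parameter, in the same way as the `{ val }` array in the `PG_QUERY` call above.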
Related: [[five-idioms]], [[five-gotchas]]


@@ -9,6 +9,7 @@ Route a query to the right doc(s). Each row: file · when to retrieve · keyword
| `03-rtl-catalog.md` | "what function does X" — string/array/hash/json/date/regex/charset/math/crypto builtins | rtl, builtin, Len, SubStr, Left, Right, At, Upper, AllTrim, PadL, PadR, StrTran, Chr, Asc, Val, Str, hb_NToS, hb_CStr, AAdd, AScan, AEval, hb_HGetDef, hb_HHasKey, hb_jsonDecode, hb_jsonEncode, ValType, HB_ISHASH, regex, HB_GETCHARSET, date, hb_ATokens |
| `04-idioms.md` | building an endpoint, DB access, async/queue work, calling the LLM, building/deploying | idioms, http, endpoint, routing, AP_BODY, AP_GETPAIRS, AP_JSONRESPONSE, ctx_set, ctx_get, LABDB_GET_PG, PG_QUERY, PG_EXEC, PG_LAST_ERROR, RETURNING, CREATE TABLE IF NOT EXISTS, text_tasks, FOR UPDATE SKIP LOCKED, job queue, LLM_CHAT, fnode, build.sh, launchctl |
| `05-gotchas.md` | debugging "why doesn't this work", or BEFORE editing string funcs / charset / SQL / LLM | gotcha, trap, intrinsic, gengo, charset, utf8, string escape, Chr, pgrtl string columns, Val, hb_CStr, model local, ResolveLlmModel, two runtimes, fnode, analyzer warning, CWD module resolution |
| `06-security.md` | adding auth/login, sessions, password hashing, file uploads, or rendering user content into HTML | security, auth, authorization, role gate, session token, csprng, crypto/rand, argon2, password hash, xss, bluemonday, sanitize, csp, security headers, rate limit, cookie, upload, sql injection |
## Quick routing heuristics


@@ -20,6 +20,7 @@ of the gap; the accumulating **gotchas** file closes the semantic long tail.
| `03-rtl-catalog.md` | Runtime-library functions (strings, array, hash, JSON, date, regex, charset, …) |
| `04-idioms.md` | Web/worker patterns: HTTP endpoint, routing, Postgres, job queue, LLM, build/deploy |
| `05-gotchas.md` | Non-obvious traps + fixes (the highest-signal file) |
| `06-security.md` | Web security patterns: authz, sessions, password hashing, XSS, CSP, uploads |
| `INDEX.md` | Retrieval manifest (doc → keywords + one-line) |
Every file has YAML frontmatter (`doc`, `title`, `keywords`, `summary`) for ranking.