---
doc: five-gotchas
title: Five gotchas & non-obvious traps
keywords: [gotcha, trap, pitfall, intrinsic, gengo, charset, utf8, string escape, Chr, pgrtl string columns, Val, model local, analyzer warning, fnode, runtime]
summary: The non-obvious semantic traps that pure grammar knowledge will NOT prevent. Each entry is a real mistake observed in practice plus the fix. This is the long-tail corpus that makes Five RAG actually work.
---

# Five gotchas (read before writing/debugging Five)

These are discovered-the-hard-way facts. Grammar docs won't save you here.

## 1. String functions are inlined intrinsics; editing `hbrtl` alone does nothing

The compiler (`compiler/gengo/gengo.go`) **inlines** `LEN, CHR, ASC, SUBSTR, LEFT, RIGHT, AT, PADR, PADL` directly as Go. They do **not** dispatch through the `hbrtl` registry at runtime. So changing the registered `hbrtl` function has **no effect** on these calls.

- To change their runtime behavior you must edit the gengo intrinsic cases. They now emit calls to charset-aware helpers `hbrtl.StrLen/StrChr/StrAsc/StrSubStr/StrLeft/StrRight/StrAt/StrPadR/StrPadL` (in `hbrtl/charset.go`).
- Functions used as code blocks / passed around DO hit the registry, so keep the registry impl and the intrinsic in agreement.

## 2. Strings are UTF-8 (runes) by default; legacy charset is opt-in

`LEN("한글")` is `2` (runes), not bytes. `CHR(9650)` is `▲`. This is the default.

- Select a legacy charset with `HB_SETCHARSET("CP949")` / `HB_CDPSELECT("CP949")`; byte/charset semantics then apply. `HB_GETCHARSET()` reads the active one.
- Initial charset comes from env `FIVE_CHARSET` (or `HB_CODEPAGE`); default `UTF8`.
- Convert across charsets with `HB_TRANSLATE(cStr, cFrom, cTo)`.

## 3. String literals do NOT process escapes (single OR double quotes)

`"a\nb"` is the literal characters `a \ n b`, **not** a newline. Same for `'...'`.

- For a newline use `Chr(10)` (and `Chr(13)` for CR); build control chars explicitly.
- To embed a quote: wrap in the *other* quote. A string containing `"` → use `'...'`; a string containing `'` → use `"..."`. (No backslash-escaping exists.)
- Watch out building SQL/format strings: e.g. a literal `T` separator inside a double-quoted SQL fragment can clash; concatenate instead: `... || 'T' || ...`.

## 4. Postgres columns come back as STRINGS

`PG_QUERY` (pgrtl) returns rows as hashes whose values are **all strings**, even for `INTEGER`/`NUMERIC` columns. `Int("100")` semantics will bite you.

- Convert: `Val( hb_CStr( row["id"] ) )` for numbers.
- Bind params as strings too: `{ hb_NToS( nId ) }`.

## 5. `ctx_get("auth_user_id")` is a string

Auth context values are strings. `nUser := Val( ctx_get( "auth_user_id", "0" ) )`.

## 6. LLM `model = "local"` is rejected by mlx/llama servers

OpenAI-compatible local servers (mlx_lm, llama.cpp) 404 on unknown model names. The app's `ResolveLlmModel` queries `/v1/models` and substitutes the actually-loaded id. If you call an LLM endpoint directly, never send `model:"local"`; resolve the real id first.

## 7. Two runtimes: build with the right one

`solmade` builds with the **`fnode`** toolchain in `fivenode_go`, NOT the `five` CLI in `fivedev/five`. They are separate runtimes with separate RTL behavior. Historic example: `fivenode_go`'s `Chr()` double-encoded multibyte values (corruption), which is what prompted implementing proper UTF-8 in `fivedev/five`. Don't assume behavior carries over.

## 8. Run the `five` CLI from inside `fivedev/five`

Module resolution depends on CWD. Building/running `five` from elsewhere can pick up the wrong `replace` directive (e.g. resolving `five =>` to an unrelated repo). Always `cd /Users/charleskwon/fivenode/fivedev/five` first, e.g. `go build -o /tmp/five ./cmd/five && /tmp/five run x.prg`.

## 9. Analyzer "undeclared variable" warnings for RTL functions are harmless

The static analyzer warns `undeclared variable 'HB_FOO'` for RTL functions it doesn't know about; they still resolve at runtime via the registry. To silence, add the name to the known-function set in `compiler/analyzer/analyzer.go` (e.g. `HB_GETCHARSET` etc. were added there). A warning is not an error.

## 10. STYLE: no inline `;` multi-statements (banned)

Five aims to be easy for a human to verify by eye. Do **not** pack multiple statements onto one line with `;`: `IF nPG < 0 ; RETURN NIL ; ENDIF`, `IF Empty(x) ; x:="y" ; ENDIF` are banned because they hurt visual review. Always expand:

```five
IF nPG < 0
   RETURN NIL
ENDIF
```

(The *trailing* `;` for line continuation, which joins a long string/SQL/arg list across lines, is a different, allowed feature. The ban is only on `;` as a statement separator.)

## 11. Density is a double-edged sword when debugging

One line doing a lot means one line failing does a lot. When a dense statement misbehaves, expand it (split the chained `hb_*`/`PG_*`/`LLM_CHAT` calls into temporaries) to localize the fault before reasoning about it.

---

> Maintenance discipline: when you hit a NEW non-obvious trap, add it here. Pure-grammar
> RAG closes ~80% of the gap; this accumulating gotcha list closes the rest.
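
Illustration for gotcha 2: the rune-versus-byte distinction can be checked in plain Go, the language the compiler emits. This is only a minimal sketch using the standard library. It is not the `hbrtl` implementation, and `runeLen`, `byteLen`, and `chrUTF8` are hypothetical names invented for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeLen counts Unicode code points, the unit the UTF-8 default counts in.
// Hypothetical helper; the real runtime helper may differ.
func runeLen(s string) int {
	return utf8.RuneCountInString(s)
}

// byteLen counts the raw bytes of the UTF-8 encoding.
// Hypothetical helper for comparison only.
func byteLen(s string) int {
	return len(s)
}

// chrUTF8 turns a code point into its UTF-8 encoded string.
// Hypothetical helper for illustration.
func chrUTF8(code int) string {
	return string(rune(code))
}

func main() {
	s := "한글"
	fmt.Println(runeLen(s))    // 2: two Hangul syllables
	fmt.Println(byteLen(s))    // 6: each syllable is 3 bytes in UTF-8
	fmt.Println(chrUTF8(9650)) // ▲ (U+25B2)
}
```

The gap between 2 and 6 is the size of the surprise when code written for rune semantics runs under byte semantics, or the reverse.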