# Five RAG: knowledge corpus for LLM agents writing Five

A compact, retrieval-ready knowledge base that lets an LLM read and write **Five** (xBase/Harbour → Go) code correctly without prior training on it. This is the practical form of "give the model the grammar via RAG": grammar + RTL surface + real idioms + the long-tail gotchas.

## Why this exists

Five is token-dense, so the corpus needed to *teach* a model is small and cheap to inject: a dense language is cheaper to RAG than a verbose one. Grammar/RTL retrieval closes most of the gap; the accumulating **gotchas** file closes the semantic long tail.

## Contents

| File | What it covers |
|------|----------------|
| `01-overview.md` | What Five is, design priorities, the two runtimes, compile model |
| `02-syntax.md` | Declarations, literals, operators, control flow, code blocks |
| `03-rtl-catalog.md` | Runtime-library functions (strings, array, hash, JSON, date, regex, charset, …) |
| `04-idioms.md` | Web/worker patterns: HTTP endpoint, routing, Postgres, job queue, LLM, build/deploy |
| `05-gotchas.md` | Non-obvious traps + fixes (the highest-signal file) |
| `06-security.md` | Web security patterns: authz, sessions, password hashing, XSS, CSP, uploads |
| `INDEX.md` | Retrieval manifest (doc → keywords + one-line summary) |

Every file has YAML frontmatter (`doc`, `title`, `keywords`, `summary`) for ranking.

## How to consume

- **Direct context injection (simplest):** for a small/medium task, paste the relevant doc(s). For broad work, `01`+`02`+`05` fit easily; pull `03`/`04` sections as needed.
- **Keyword retrieval (built-in):** run `./search.sh` with your query terms. It is a dependency-free ripgrep/grep ranker over the corpus (frontmatter `keywords` weighted ×3 + body) that prints ranked docs with the matching `##` section headers. No index to build. For example, `./search.sh session token csprng` → `06-security.md §2`. `INDEX.md` is the hand-curated routing table; a bluge/embeddings index can ingest the same `.md` files.
- **Embedding RAG:** chunk by `##` headers (each section is self-contained). Frontmatter `summary` makes a good chunk preamble.

Suggested system-prompt pointer: *"When writing Five (.prg) code, consult the Five RAG at `fivedev/five/rag/`, especially `05-gotchas.md`, and prefer patterns from `04-idioms.md`."*

## Maintenance

- Keep `03-rtl-catalog.md` honest against `hbrtl/register.go` (names are authoritative; rare signatures may drift).
- **Append every new trap to `05-gotchas.md`.** That file is the compounding asset.
- Grammar truth: `compiler/{lexer,parser,ast}`. Idiom truth: the `solmade` app.
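
## Sketch: chunking and scoring

The chunking and weighting described under "How to consume" can be sketched in Python. This is a minimal illustration, not the code in `search.sh`: the function names (`split_frontmatter`, `chunk_by_h2`, `score`) are hypothetical, the frontmatter parser assumes flat `key: value` lines, and the scorer counts each query term at most once per field, which is an assumption about how the ×3 weighting is applied.

```python
FENCE = "`" * 3  # marker that opens/closes a fenced code block


def split_frontmatter(text):
    """Split a doc into (metadata, body).

    Assumes frontmatter is a leading '---' block of flat 'key: value'
    lines; nested YAML would need a real YAML parser instead.
    """
    meta = {}
    if not text.startswith("---\n"):
        return meta, text
    end = text.find("\n---", 3)  # newline before the closing '---'
    if end == -1:
        return meta, text
    for line in text[4:end].splitlines():
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    after = text.find("\n", end + 1)  # end of the closing '---' line
    body = text[after + 1:] if after != -1 else ""
    return meta, body


def chunk_by_h2(text):
    """Return one chunk per '## ' section, prefixed with the summary.

    Text before the first '## ' header (title, intro) is skipped in
    this sketch. '## ' lines inside fenced code are not split points.
    """
    meta, body = split_frontmatter(text)
    summary = meta.get("summary", "")
    chunks, heading, lines, in_fence = [], None, [], False

    def flush():
        # Emit the section collected so far, if any.
        if heading is not None:
            section = "\n".join(lines).strip()
            chunk = f"{summary}\n\n## {heading}\n{section}".strip()
            chunks.append({"heading": heading, "text": chunk})

    for line in body.splitlines():
        if line.startswith(FENCE):
            in_fence = not in_fence
        if line.startswith("## ") and not in_fence:
            flush()
            heading, lines = line[3:].strip(), []
        elif heading is not None:
            lines.append(line)
    flush()
    return chunks


def score(terms, keywords, body):
    """Weight a keyword-field hit 3x and a body hit 1x, per query term."""
    kw, bd = keywords.lower(), body.lower()
    return sum(3 * (t.lower() in kw) + (t.lower() in bd) for t in terms)
```

To use it, read one corpus file into a string, pass it to `chunk_by_h2`, and embed each chunk's `text`. For keyword ranking, call `score` with the query terms, the file's `keywords` value, and either the whole body or a single chunk, then sort descending.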