Five RAG — knowledge corpus for LLM agents writing Five
A compact, retrieval-ready knowledge base that lets an LLM read and write Five (xBase/Harbour → Go) code correctly without prior training on it. This is the practical form of "give the model the grammar via RAG": grammar + RTL surface + real idioms + the long-tail gotchas.
Why this exists
Five is token-dense, so the corpus needed to teach a model is small and cheap to inject — a dense language is cheaper to RAG than a verbose one. Grammar/RTL retrieval closes most of the gap; the accumulating gotchas file closes the semantic long tail.
Contents
| File | What it covers |
|---|---|
| `01-overview.md` | What Five is, design priorities, the two runtimes, compile model |
| `02-syntax.md` | Declarations, literals, operators, control flow, code blocks |
| `03-rtl-catalog.md` | Runtime-library functions (strings, array, hash, JSON, date, regex, charset, …) |
| `04-idioms.md` | Web/worker patterns: HTTP endpoint, routing, Postgres, job queue, LLM, build/deploy |
| `05-gotchas.md` | Non-obvious traps + fixes (the highest-signal file) |
| `06-security.md` | Web security patterns: authz, sessions, password hashing, XSS, CSP, uploads |
| `INDEX.md` | Retrieval manifest (doc → keywords + one-line) |
Every file has YAML frontmatter (doc, title, keywords, summary) for ranking.
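As a rough illustration of how the frontmatter can feed ranking, here is a minimal Python sketch. It is not the actual `search.sh` logic: it assumes flat `key: value` frontmatter with `keywords` on a single line, and it borrows the "keywords count three times as much as body hits" weighting described for the built-in ranker. The function names `parse_frontmatter` and `score` are hypothetical.

```python
def parse_frontmatter(text):
    """Split a document into (frontmatter dict, body).

    Assumption: frontmatter is flat `key: value` lines between two `---`
    delimiters. Nested or multi-line YAML would need a real YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return meta, "\n".join(lines[i + 1:])
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    # No closing delimiter found: treat the file as having no frontmatter.
    return {}, text


def score(text, terms, keyword_weight=3):
    """Count case-insensitive substring hits; keyword hits are weighted."""
    meta, body = parse_frontmatter(text)
    keywords = meta.get("keywords", "").lower()
    body = body.lower()
    total = 0
    for term in terms:
        t = term.lower()
        total += keyword_weight * keywords.count(t) + body.count(t)
    return total
```

Substring counting is deliberately crude (for example, `session` also matches `sessions`); a real ranker would likely tokenize first.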
How to consume
- Direct context injection (simplest): for a small/medium task, paste the relevant doc(s). For broad work, `01`+`02`+`05` fit easily; pull `03`/`04` sections as needed.
- Keyword retrieval (built-in): run `./search.sh <terms>`, a dependency-free ripgrep/grep ranker over the corpus (frontmatter `keywords` weighted ×3 + body) that prints ranked docs with the matching `##` section headers. No index to build. E.g. `./search.sh session token csprng` → `06-security.md §2`.
- bluge full-text index (KWONDoc): this corpus is indexed into KWONDoc's bluge index (`~/.kwondoc/search-index`, category `five-rag`) so `bluge_search` finds it. Re-index after edits: `cd ~/kwondoc && go run ./cmd/ragindex <abs path to rag> five-rag` (online upsert, keyed by file path; does not wipe other docs).
- `INDEX.md` is the hand-curated routing table; an embeddings index can ingest the same `.md`.
- Embedding RAG: chunk by `##` headers (each section is self-contained). Frontmatter `summary` makes a good chunk preamble.
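The embedding option above can be sketched as follows. This is a hedged example, not part of the corpus tooling: `chunk_by_h2` is a hypothetical helper that splits a doc body on `## ` headers and prefixes each chunk with the frontmatter `summary`. It assumes text before the first `## ` header can be dropped, and that a `## ` line inside a fenced code block is not a header.

```python
FENCE = "`" * 3  # markdown code-fence marker


def chunk_by_h2(body, summary=""):
    """Return one chunk per `## ` section, each prefixed with `summary`."""
    chunks = []
    current = None      # lines of the section being collected
    in_fence = False    # True while inside a fenced code block
    for line in body.splitlines():
        if line.startswith(FENCE):
            in_fence = not in_fence
        if line.startswith("## ") and not in_fence:
            if current is not None:
                chunks.append(current)
            current = [line]
        elif current is not None:
            current.append(line)
    if current is not None:
        chunks.append(current)
    preamble = f"{summary}\n\n" if summary else ""
    return [preamble + "\n".join(c).strip() for c in chunks]
```

Deeper headers (`###` and below) stay inside their parent `##` chunk, which keeps each section self-contained.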
Suggested system-prompt pointer: "When writing Five (`.prg`) code, consult the Five RAG at `fivedev/five/rag/`, especially `05-gotchas.md`, and prefer patterns from `04-idioms.md`."
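For direct context injection, the pointer and the broad-work docs can be assembled into one system prompt. The sketch below is illustrative only: `build_system_prompt`, the `--- name ---` separators, and the choice to skip missing files silently are all assumptions, not part of the corpus tooling.

```python
from pathlib import Path

POINTER = (
    "When writing Five (.prg) code, consult the Five RAG at fivedev/five/rag/, "
    "especially 05-gotchas.md, and prefer patterns from 04-idioms.md."
)

# The three docs suggested for broad work.
BROAD_DOCS = ["01-overview.md", "02-syntax.md", "05-gotchas.md"]


def build_system_prompt(rag_dir, docs=BROAD_DOCS):
    """Concatenate the pointer and the selected docs into one prompt string."""
    parts = [POINTER]
    for name in docs:
        path = Path(rag_dir) / name
        if not path.is_file():
            continue  # assumption: missing docs are skipped, not an error
        text = path.read_text(encoding="utf-8").strip()
        parts.append(f"--- {name} ---\n{text}")
    return "\n\n".join(parts)
```

Usage would look like `build_system_prompt("fivedev/five/rag")`, with `03`/`04` sections appended separately when a task needs them.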
Maintenance
- Keep `03-rtl-catalog.md` honest against `hbrtl/register.go` (names are authoritative; rare signatures may drift).
- Append every new trap to `05-gotchas.md`. That file is the compounding asset.
- Grammar truth: `compiler/{lexer,parser,ast}`. Idiom truth: the `solmade` app.