A retrieval-ready knowledge base so an LLM can read/write Five without prior training: overview, syntax, full RTL catalog (from hbrtl/register.go), web/worker idioms (from the solmade app), and a long-tail gotchas file. Every doc has keyword/summary frontmatter; INDEX.md is the routing manifest. Grounded by parallel source exploration; RTL names spot-checked against register.go. The gotchas file is the compounding asset — append new traps. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
45 lines
2.2 KiB
Markdown
# Five RAG — knowledge corpus for LLM agents writing Five

A compact, retrieval-ready knowledge base that lets an LLM read and write **Five**
(xBase/Harbour → Go) code correctly without prior training on it. This is the practical
form of "give the model the grammar via RAG": grammar + RTL surface + real idioms +
the long-tail gotchas.

## Why this exists

Five is token-dense, so the corpus needed to *teach* a model is small and cheap to inject;
a dense language is cheaper to RAG than a verbose one. Grammar/RTL retrieval closes most
of the gap; the accumulating **gotchas** file closes the semantic long tail.

## Contents

| File | What it covers |
|------|----------------|
| `01-overview.md` | What Five is, design priorities, the two runtimes, compile model |
| `02-syntax.md` | Declarations, literals, operators, control flow, code blocks |
| `03-rtl-catalog.md` | Runtime-library functions (strings, array, hash, JSON, date, regex, charset, …) |
| `04-idioms.md` | Web/worker patterns: HTTP endpoint, routing, Postgres, job queue, LLM, build/deploy |
| `05-gotchas.md` | Non-obvious traps + fixes (the highest-signal file) |
| `INDEX.md` | Retrieval manifest (doc → keywords + one-line) |

Every file has YAML frontmatter (`doc`, `title`, `keywords`, `summary`) for ranking.

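As a rough illustration of how a retriever might read that frontmatter, here is a minimal Go sketch. It assumes the block is delimited by `---` lines, that each field is a flat `key: value` pair, and that `keywords` is a comma-separated list (optionally in brackets); none of these layout details are confirmed here, and the sample values are hypothetical. It is a line-based reader, not a general YAML parser.

```go
package main

import (
	"fmt"
	"strings"
)

// Frontmatter holds the four fields named in this README.
type Frontmatter struct {
	Doc      string
	Title    string
	Keywords []string
	Summary  string
}

// parseFrontmatter reads a leading block delimited by "---" lines and returns
// the parsed fields, the remaining body, and whether a complete block was found.
// Assumption: flat "key: value" lines only; keywords are comma-separated.
func parseFrontmatter(src string) (Frontmatter, string, bool) {
	var fm Frontmatter
	lines := strings.Split(src, "\n")
	if strings.TrimSpace(lines[0]) != "---" {
		return Frontmatter{}, src, false
	}
	for i := 1; i < len(lines); i++ {
		line := strings.TrimSpace(lines[i])
		if line == "---" {
			// Closing delimiter: everything after it is the document body.
			return fm, strings.Join(lines[i+1:], "\n"), true
		}
		key, val, found := strings.Cut(line, ":")
		if !found {
			continue
		}
		val = strings.TrimSpace(val)
		switch strings.TrimSpace(key) {
		case "doc":
			fm.Doc = val
		case "title":
			fm.Title = val
		case "summary":
			fm.Summary = val
		case "keywords":
			for _, k := range strings.Split(strings.Trim(val, "[]"), ",") {
				if k = strings.TrimSpace(k); k != "" {
					fm.Keywords = append(fm.Keywords, k)
				}
			}
		}
	}
	// No closing delimiter: treat the input as having no frontmatter.
	return Frontmatter{}, src, false
}

func main() {
	// Hypothetical sample; the real field values in the corpus may differ.
	sample := "---\ndoc: 05-gotchas\ntitle: Gotchas\nkeywords: [trap, pitfall, fix]\nsummary: Non-obvious traps and fixes.\n---\n## First trap\n"
	fm, body, ok := parseFrontmatter(sample)
	fmt.Println(ok, fm.Doc, fm.Keywords, len(body))
}
```

If the real files use multi-line YAML lists or quoted values, a proper YAML library would be the safer choice.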
## How to consume

- **Direct context injection (simplest):** for a small/medium task, paste the relevant
  doc(s). For broad work, `01`+`02`+`05` fit easily; pull `03`/`04` sections as needed.
- **Keyword retrieval (e.g. bluge / ripgrep):** index the `.md` files; rank on the
  frontmatter `keywords` + body. `INDEX.md` is a hand-curated routing table.
- **Embedding RAG:** chunk by `##` headers (each section is self-contained). Frontmatter
  `summary` makes a good chunk preamble.

Suggested system-prompt pointer: *"When writing Five (.prg) code, consult the Five RAG at
`fivedev/five/rag/`, especially `05-gotchas.md`, and prefer patterns from `04-idioms.md`."*

## Maintenance

- Keep `03-rtl-catalog.md` honest against `hbrtl/register.go` (names are authoritative;
  rare signatures may drift).
- **Append every new trap to `05-gotchas.md`.** That file is the compounding asset.
- Grammar truth: `compiler/{lexer,parser,ast}`. Idiom truth: the `solmade` app.