five/rag/README.md
CharlesKWON cf370564f3 docs(rag): Five knowledge corpus for LLM agents
A retrieval-ready knowledge base so an LLM can read/write Five without
prior training: overview, syntax, full RTL catalog (from hbrtl/register.go),
web/worker idioms (from the solmade app), and a long-tail gotchas file.
Every doc has keyword/summary frontmatter; INDEX.md is the routing manifest.

Grounded by parallel source exploration; RTL names spot-checked against
register.go. The gotchas file is the compounding asset — append new traps.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 13:54:03 +09:00

# Five RAG — knowledge corpus for LLM agents writing Five

A compact, retrieval-ready knowledge base that lets an LLM read and write **Five**
(xBase/Harbour → Go) code correctly without prior training on it. This is the practical
form of "give the model the grammar via RAG": grammar, RTL surface, real idioms, and
the long-tail gotchas.

## Why this exists

Five is token-dense, so the corpus needed to *teach* a model is small and cheap to inject:
a dense language costs fewer context tokens to cover than a verbose one. Grammar/RTL
retrieval closes most of the gap; the accumulating **gotchas** file closes the semantic
long tail.

## Contents
| File | What it covers |
|------|----------------|
| `01-overview.md` | What Five is, design priorities, the two runtimes, compile model |
| `02-syntax.md` | Declarations, literals, operators, control flow, code blocks |
| `03-rtl-catalog.md` | Runtime-library functions (strings, array, hash, JSON, date, regex, charset, …) |
| `04-idioms.md` | Web/worker patterns: HTTP endpoint, routing, Postgres, job queue, LLM, build/deploy |
| `05-gotchas.md` | Non-obvious traps + fixes (the highest-signal file) |
| `INDEX.md` | Retrieval manifest (doc → keywords + one-line) |

Every file has YAML frontmatter (`doc`, `title`, `keywords`, `summary`) for ranking.

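
As a rough illustration of how the frontmatter can drive ranking, here is a minimal stdlib-only sketch. It assumes the frontmatter is a flat block of `key: value` lines between `---` markers and that `keywords` is either an inline `[a, b]` list or a comma-separated string; the corpus may use other YAML forms, in which case a real YAML parser is the safer choice. The function names and the keyword weight of 3 are illustrative, not part of the corpus.

```python
import re


def parse_frontmatter(text):
    """Split a doc into (metadata dict, body).

    Assumption: flat `key: value` lines between two `---` markers.
    Nested YAML is not handled by this sketch.
    """
    meta = {}
    if not text.startswith("---\n"):
        return meta, text
    end = text.find("\n---", 3)
    if end == -1:
        return meta, text
    for line in text[4:end].splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        value = value.strip()
        # Inline list form: keywords: [a, b, c]
        if value.startswith("[") and value.endswith("]"):
            value = [v.strip().strip("'\"") for v in value[1:-1].split(",") if v.strip()]
        meta[key.strip()] = value
    return meta, text[end + 4:].lstrip("\n")


def tokens(s):
    """Lower-cased word tokens of a string."""
    return set(re.findall(r"\w+", s.lower()))


def score(query, meta, body, keyword_weight=3):
    """Count query terms found in `keywords` (weighted) and in the body."""
    keywords = meta.get("keywords", [])
    if isinstance(keywords, str):
        keywords = keywords.split(",")
    kw_tokens = set()
    for kw in keywords:
        kw_tokens |= tokens(kw)
    terms = tokens(query)
    return keyword_weight * len(terms & kw_tokens) + len(terms & tokens(body))
```

Scoring every `.md` file this way and sorting by score gives a crude keyword ranker; a proper index would replace the set intersection with term-frequency scoring.
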
## How to consume
- **Direct context injection (simplest):** for a small/medium task, paste the relevant
doc(s). For broad work, `01`+`02`+`05` fit easily; pull `03`/`04` sections as needed.
- **Keyword retrieval (e.g. bluge / ripgrep):** index the `.md` files; rank on the
frontmatter `keywords` + body. `INDEX.md` is a hand-curated routing table.
- **Embedding RAG:** chunk by `##` headers (each section is self-contained). Frontmatter
`summary` makes a good chunk preamble.
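
The embedding option above can be sketched as follows: split a document body at its `##` headers and prepend the frontmatter `summary` to each chunk. This is a minimal sketch, not the corpus's own tooling; `chunk_by_h2` is a hypothetical helper, and it assumes the frontmatter has already been stripped from `body`. Text before the first `##` header becomes its own chunk.

```python
def chunk_by_h2(body, summary=""):
    """Return one chunk per `## ` section, each prefixed with `summary`.

    Lines inside fenced code blocks are never treated as headers.
    """
    chunks = []
    current = []
    in_fence = False
    for line in body.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        # A new `## ` header outside a fence closes the previous chunk.
        if line.startswith("## ") and not in_fence and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    chunks = [c for c in chunks if c]
    if summary:
        chunks = [f"{summary}\n\n{c}" for c in chunks]
    return chunks
```

Each returned string can be passed to whatever embedding model is in use; keeping the summary as a preamble gives every chunk the document-level context it would otherwise lose.
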

Suggested system-prompt pointer: *"When writing Five (.prg) code, consult the Five RAG at
`fivedev/five/rag/`, especially `05-gotchas.md`, and prefer patterns from `04-idioms.md`."*

## Maintenance
- Keep `03-rtl-catalog.md` honest against `hbrtl/register.go` (names are authoritative;
rare signatures may drift).
- **Append every new trap to `05-gotchas.md`.** That file is the compounding asset.
- Source of truth for grammar: `compiler/{lexer,parser,ast}`. Source of truth for idioms:
the `solmade` app.
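
The catalog check in the first bullet could be automated along these lines. This is a sketch under loud assumptions: the layout of `hbrtl/register.go` and of the catalog is not described here, so both regular expressions are hypothetical. One assumes each RTL function is registered with its name as a double-quoted upper-case string literal; the other assumes the catalog writes function names in backticks, optionally followed by an argument list. Adjust both to the real files before trusting the output.

```python
import re

# Hypothetical: names appear in register.go as "UPPER_CASE" string literals.
REGISTER_NAME = re.compile(r'"([A-Z_][A-Z0-9_]*)"')
# Hypothetical: the catalog writes names as `Name` or `Name(args)`.
CATALOG_NAME = re.compile(r"`([A-Za-z_][A-Za-z0-9_]*)\s*(?:\([^`]*\))?`")


def catalog_drift(register_src, catalog_md):
    """Compare names case-insensitively.

    Returns (missing, stale): names registered but not documented,
    and names documented but not registered.
    """
    registered = {name.upper() for name in REGISTER_NAME.findall(register_src)}
    documented = {name.upper() for name in CATALOG_NAME.findall(catalog_md)}
    return sorted(registered - documented), sorted(documented - registered)
```

Reading both files and calling `catalog_drift(...)` yields two lists to review by hand. Any backticked word in the catalog that is not a function name will show up as a false "stale" entry, so treat the result as a prompt for inspection rather than a verdict.
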