From 59d7e490b4c95b2483111f6ff19920cb072808a1 Mon Sep 17 00:00:00 2001
From: CharlesKWON
Date: Mon, 15 Jun 2026 16:26:54 +0900
Subject: [PATCH] docs(rag): quality-gate idiom + dependency-free search.sh
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- 04-idioms: document the lint.sh + smoke_test.sh gates and their wiring
  (build.sh gate, pre-commit hook, deploy-time smoke).
- search.sh: ripgrep/grep keyword ranker over the corpus (keywords ×3 +
  body); prints ranked docs + matching section headers, which makes the
  RAG searchable with no index to build. README updated.
- Note: KWONDoc bluge MCP/CLI was unavailable here (MCP not connected;
  CLI license-gated), so search.sh delivers the "searchable" goal now; a
  bluge/embeddings index can ingest the same .md files later.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 rag/04-idioms.md | 15 +++++++++++++++
 rag/README.md    |  7 +++++--
 rag/search.sh    | 45 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 65 insertions(+), 2 deletions(-)
 create mode 100755 rag/search.sh

diff --git a/rag/04-idioms.md b/rag/04-idioms.md
index 8ce59d8..7093bd0 100644
--- a/rag/04-idioms.md
+++ b/rag/04-idioms.md
@@ -180,6 +180,21 @@ Web and worker are **separate binaries** built from explicit file lists.
 - `--module ` sets the temp module path (must sit under the app's module so RTL
   packages can import the app's internal packages).
 
+### Quality gates (lint + smoke)
+
+Two committed gates keep a dynamically-typed PRG codebase honest (solmade):
+
+- **`lint.sh`**: static checks. Bans inline `;` statement separators
+  (`IF x ; y ; ENDIF` hurts visual review; trailing-`;` continuation is fine)
+  and flags user input concatenated into SQL instead of `$1` binds. Exit 1 on
+  any ERROR. Fast, no server/DB needed.
+- **`smoke_test.sh`**: runtime endpoint contract test (below).
+
+Wiring: `build.sh` runs `./lint.sh` first and aborts the build on violation; a
+`.githooks/pre-commit` runs it on every commit (enable once with
+`git config core.hooksPath .githooks`; bypass with `git commit --no-verify`).
+Run `smoke_test.sh` at deploy time (it needs the server up), not in the hook.
+
 ### Safety net: endpoint smoke test
 
 Dynamic typing has no compile-time guarantee, so a smoke test is the practical safety net
diff --git a/rag/README.md b/rag/README.md
index 2f5c004..511fa0f 100644
--- a/rag/README.md
+++ b/rag/README.md
@@ -29,8 +29,11 @@ Every file has YAML frontmatter (`doc`, `title`, `keywords`, `summary`) for rank
 
 - **Direct context injection (simplest):** for a small/medium task, paste the relevant
   doc(s). For broad work, `01`+`02`+`05` fit easily; pull `03`/`04` sections as needed.
-- **Keyword retrieval (e.g. bluge / ripgrep):** index the `.md` files; rank on the
-  frontmatter `keywords` + body. `INDEX.md` is a hand-curated routing table.
+- **Keyword retrieval (built-in):** run `./search.sh `, a dependency-free
+  ripgrep/grep ranker over the corpus (frontmatter `keywords` weighted ×3 + body),
+  printing ranked docs with the matching `##` section headers. No index to build.
+  e.g. `./search.sh session token csprng` → `06-security.md §2`. `INDEX.md` is the
+  hand-curated routing table; a bluge/embeddings index can ingest the same `.md` files.
 - **Embedding RAG:** chunk by `##` headers (each section is self-contained). Frontmatter
   `summary` makes a good chunk preamble.
 
diff --git a/rag/search.sh b/rag/search.sh
new file mode 100755
index 0000000..857f172
--- /dev/null
+++ b/rag/search.sh
@@ -0,0 +1,45 @@
+#!/usr/bin/env bash
+# rag/search.sh "": keyword search over the Five RAG corpus.
+#
+# Dependency-free search for when no bluge/embedding search layer exists. Scores each
+# document by frontmatter `keywords:` hits (weight 3) plus body match count, ranks them,
+# and shows the matching section headers. The output can be pasted into context as-is.
+#
+#   ./search.sh session token csprng
+#   ./search.sh xss sanitize
+set -u
+RAGDIR="$(cd "$(dirname "$0")" && pwd)"
+if [ $# -eq 0 ]; then
+  echo "usage: ./search.sh "
+  exit 1
+fi
+
+tmp="$(mktemp)"
+for f in "$RAGDIR"/0*.md; do
+  score=0
+  kwline="$(awk '/^keywords:/{print; exit}' "$f")"
+  for term in "$@"; do
+    kw=$(printf '%s' "$kwline" | grep -io "$term" | grep -c .)
+    bd=$(grep -io "$term" "$f" | grep -c .)
+    score=$(( score + kw*3 + bd ))
+  done
+  [ "$score" -gt 0 ] && printf '%d\t%s\n' "$score" "$f" >> "$tmp"
+done
+
+if [ ! -s "$tmp" ]; then
+  echo "(no matches)"
+  rm -f "$tmp"
+  exit 0
+fi
+
+# Join the search terms into an OR regex (used to show matching section headers)
+pat="$(printf '%s|' "$@" | sed 's/|$//')"
+
+sort -t"$(printf '\t')" -k1 -rn "$tmp" | while IFS="$(printf '\t')" read -r s f; do
+  title="$(awk -F': ' '/^title:/{print $2; exit}' "$f")"
+  echo "■ [$s] $(basename "$f") — $title"
+  # Top 3 matching section headers (any heading level)
+  grep -nE "^#{1,6} " "$f" | grep -iE "$pat" | head -3 | sed 's/^/ §/'
+  echo
+done
+rm -f "$tmp"
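
Review note: the scoring rule in `search.sh` (keyword-line hits ×3 plus hits anywhere in the file) can be checked in isolation with a small sketch like the one below. `score_doc` is a hypothetical helper written for this note, not part of the patch, and the demo file contents are made up. It assumes the same `grep -io` counting as the script, with `|| true` added so a zero count does not abort a caller running under `set -e`.

```shell
#!/usr/bin/env bash
# Sketch only: score_doc is a hypothetical helper, not part of the patch.
# It mirrors the patch's rule: each hit on the frontmatter `keywords:` line
# adds 3, and each hit anywhere in the file (frontmatter included) adds 1.
score_doc() {
  local file="$1"; shift
  local score=0 kwline kw bd term
  kwline="$(awk '/^keywords:/{print; exit}' "$file")"
  for term in "$@"; do
    # grep -o prints one match per line; grep -c . counts those lines.
    kw=$(printf '%s' "$kwline" | grep -io -- "$term" | grep -c . || true)
    bd=$(grep -io -- "$term" "$file" | grep -c . || true)
    score=$(( score + kw*3 + bd ))
  done
  printf '%d\n' "$score"
}

# Demo on a throwaway file (contents invented for illustration).
demo="$(mktemp)"
printf '%s\n' '---' 'title: Security' 'keywords: session, token, csprng' '---' \
  '## Session tokens' 'Generate each token from a CSPRNG.' > "$demo"
score_doc "$demo" session token csprng   # prints 16
rm -f "$demo"
```

Because the body count scans the whole file, a hit on the `keywords:` line is counted there as well, so a keyword match contributes 4 in practice rather than 3. That may be intended; if not, the body count could skip the frontmatter.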