Three medium-priority audit items in one commit, each independently
revertible.
* **#18 JOIN hash-join fast path.** New std.ch shape:
JOIN WITH <alias> TO <file> [FIELDS ...] ON <mfield> = <dfield>
expands to a 6-arg __dbJoin call with the master/detail key
field names. Runtime detects the extra args, builds an O(M)
hash over the detail's key column, then probes per master row
for O(N+M) total — vs the FOR form's O(N*M). For 1k×1k that's
roughly 2k hash operations vs 1M predicate evaluations; the gap
widens as either side grows. The original FOR
form is unchanged and stays the fallback for arbitrary
predicates. New helper dbHashKey type-tags the key string so
`1` (numeric), `"1"` (string), and `.T.` (logical) don't
collide in the bucket map.
* **#38 PP rule result-marker validation.** ParseRule now walks
the result template after parseMarkers and warns about every
`<name>` (or `<(name)>` / `<.name.>` / `<{name}>` / `#<name>`
/ `<"name">`) that doesn't match a pattern marker. Warnings
flow into pp.errors via handleDirective with the directive's
filename:line, so a typo'd `<NaMe>` in a case-sensitive
`#xcommand` rule fails the build with a clear diagnostic
instead of silently producing broken expansions.
* **#44 looksLikeInlineC heuristic strengthened.** Catches more
of the common Harbour-PRG-with-C-inline-block shapes that
used to fall through and produce cryptic Go-side errors:
function-like #define, `extern "C"` linkage blocks, C return-
type declarations (`int foo(`, `static char* bar(`), and the
hb_ret*() helper family used by Harbour's C FFI return
setters. Two small predicate helpers (allLetters,
allIdentChars) keep the C-vs-Go disambiguation tight enough
that legit Go code (`func name() int { ... }`) doesn't trip.
* **#28 LIST/DISPLAY pagination** — explicitly deferred. Proper
pagination requires interactive terminal handling (Inkey(0)
for the keypress) which would hang in CI / batch mode. Will
revisit when an interactive terminal layer needs it for
other reasons.
Test fixtures: tests/std_ch/test_join_hash.prg verifies the new
ON-form path produces the same output as the FOR form would.
std.ch runner now stands at 16/16.
Other gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
std.ch suite : 16/16
FRB suite : 7/7
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
216 lines
10 KiB
Plaintext
/*
 * std.ch — Five standard preprocessor rules
 *
 * Equivalent to harbour-core/include/std.ch. Translates xBase legacy
 * commands into function calls so the parser does not have to know
 * about them. Auto-loaded by compiler/pp at startup.
 *
 * Phase A: only rules whose backend RTL function already exists in
 * Five. Rules whose backend is not yet implemented (LABEL, REPORT,
 * DIR) are deliberately NOT included here — the parser still handles
 * them as silent no-ops until their RTL backend lands.
 *
 * Copyright (c) 2026 Charles KWON OhJun (charleskwonohjun@gmail.com)
 * All rights reserved.
 */

/* --- file system --- */
#command ERASE <(f)> => FErase(<(f)>)
#command DELETE FILE <(f)> => FErase(<(f)>)
#command RENAME <(s)> TO <(d)> => FRename(<(s)>, <(d)>)

/* --- workarea lifecycle ---
   Order matters: literal-keyword forms first, then bare CLOSE,
   then the alias-form last so it doesn't shadow the others. */
#command CLOSE ALL => DbCloseAll()
#command CLOSE DATABASES => DbCloseAll()
#command CLOSE => DbCloseArea()
#command CLOSE <a> => <a>->( DbCloseArea() )

/* --- record state --- */
#command COMMIT => DbCommit()
#command UNLOCK ALL => DbUnlock()
#command UNLOCK => DbRUnlock()

/* --- record search --- */
#command LOCATE [FOR <for>] [WHILE <while>] ;
            [NEXT <next>] [RECORD <rec>] [<rest:REST>] [ALL] => ;
         __dbLocate(<{for}>, <{while}>, <next>, <rec>, <.rest.>)

#command CONTINUE => __dbContinue()

/* --- analytical ---
   COUNT and SUM expand to plain dbEval calls; AVERAGE goes through
   the __dbAverage helper and stays single-value. Multi-expression
   AVERAGE (`AVERAGE x, y TO ax, ay`) uses optional-repeat syntax in
   Harbour and can be added here once a real test exercises the more
   elaborate form. */
/* COUNT/SUM/AVERAGE require TO <var> — without it the rewrite
   would produce naked assignment with no LHS. Match Harbour
   std.ch which also makes TO non-optional. */
#command COUNT TO <v> [FOR <for>] [WHILE <while>] ;
            [NEXT <next>] [RECORD <rec>] [<rest:REST>] [ALL] => ;
         <v> := 0 ; dbEval( {|| <v> := <v> + 1 }, ;
            <{for}>, <{while}>, <next>, <rec>, <.rest.> )

/* SUM accepts multiple paired expressions/destinations:
   `SUM x, y, z TO sx, sy, sz`. The optional `[, <xN>]` and
   `[, <vN>]` repeats are matched pairwise; the result template's
   chained `<v1> :=[ <vN> :=] 0` and comma-list inside the dbEval
   block expand once per extra pair. Single-pair usage is unchanged.
   AVERAGE below still takes a single expression/destination pair. */
#command SUM <x1> [, <xN>] TO <v1> [, <vN>] ;
            [FOR <for>] [WHILE <while>] [NEXT <next>] ;
            [RECORD <rec>] [<rest:REST>] [ALL] => ;
         <v1> :=[ <vN> :=] 0 ; ;
         dbEval( {|| <v1> := <v1> + <x1>[, <vN> := <vN> + <xN>] }, ;
            <{for}>, <{while}>, <next>, <rec>, <.rest.> )

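/* Illustrative sketch of the paired SUM expansion. Assumptions: the
   field and variable names are hypothetical, and Five's PP is taken
   to follow Harbour's result-marker conventions (an absent <{x}> or
   plain marker emits nothing, an absent <.x.> emits .F.). Fields are
   written with _FIELD-> because bare identifiers don't auto-bind to
   fields under Five.

      SUM _FIELD->qty, _FIELD->price TO nQty, nPrice FOR _FIELD->region == "EU"

   would expand to roughly:

      nQty := nPrice := 0
      dbEval( {|| nQty := nQty + _FIELD->qty, nPrice := nPrice + _FIELD->price }, ;
              {|| _FIELD->region == "EU" }, , , , .F. )
*/
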
#command AVERAGE <x> TO <v> ;
            [FOR <for>] [WHILE <while>] [NEXT <next>] ;
            [RECORD <rec>] [<rest:REST>] [ALL] => ;
         <v> := __dbAverage( <{x}>, ;
            <{for}>, <{while}>, <next>, <rec>, <.rest.> )

/* --- bulk record export ---
   COPY TO copies visible records of the current workarea into a fresh
   DBF. FIELDS/FOR/WHILE/NEXT/RECORD/REST work as in Harbour. SDF and
   DELIMITED variants are not implemented; the matching rules below
   raise a clear runtime error so callers don't quietly get a regular
   DBF copy when they asked for an SDF dump. Order matters: the SDF /
   DELIMITED rules must come before the regular COPY rule. */
#command COPY [TO <(f)>] [FIELDS <fields,...>] SDF [<*tail*>] => ;
         __dbNotImpl("COPY TO ... SDF")
#command COPY [TO <(f)>] [FIELDS <fields,...>] DELIMITED [<*tail*>] => ;
         __dbNotImpl("COPY TO ... DELIMITED")

#command COPY [TO <(f)>] [FIELDS <fields,...>] ;
            [FOR <for>] [WHILE <while>] [NEXT <next>] ;
            [RECORD <rec>] [<rest:REST>] [ALL] => ;
         __dbCopy( <(f)>, { <(fields)> }, ;
            <{for}>, <{while}>, <next>, <rec>, <.rest.> )

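/* Illustrative sketch of how the COPY rules above are expected to
   dispatch. File and field names are hypothetical; the expansions
   assume Harbour-style result markers and are approximate.

      COPY TO backup FIELDS name, city FOR _FIELD->active NEXT 100

   matches the regular rule and would expand to roughly:

      __dbCopy( "backup", { "name", "city" }, ;
                {|| _FIELD->active }, , 100, , .F. )

   whereas

      COPY TO dump SDF

   is caught by the SDF rule first and becomes:

      __dbNotImpl("COPY TO ... SDF")
*/
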
/* SORT TO copies the visible records into a fresh DBF in key order.
   Each key in `<fields>` may carry `/D` for descending; default is
   ascending. */
#command SORT [TO <(f)>] [ON <fields,...>] ;
            [FOR <for>] [WHILE <while>] [NEXT <next>] ;
            [RECORD <rec>] [<rest:REST>] [ALL] => ;
         __dbSort( <(f)>, { <(fields)> }, ;
            <{for}>, <{while}>, <next>, <rec>, <.rest.> )

/* --- console output ---
   LIST emits every record matching the filter; DISPLAY without ALL
   shows just the current record. Both share __dbList — lAll
   distinguishes them. TO FILE redirects to a freshly-truncated text
   file; TO PRINTER is rejected at PP-time (Five doesn't drive a
   printer port). Order matters: more specific rules first. */
#command LIST [<v,...>] TO PRINTER [<*tail*>] => ;
         __dbNotImpl("LIST ... TO PRINTER")
#command DISPLAY [<v,...>] TO PRINTER [<*tail*>] => ;
         __dbNotImpl("DISPLAY ... TO PRINTER")

#command LIST [<v,...>] TO FILE <(f)> [<off:OFF>] ;
            [FOR <for>] [WHILE <while>] [NEXT <next>] ;
            [RECORD <rec>] [<rest:REST>] [ALL] => ;
         __dbList( <.off.>, { <{v}> }, .T., ;
            <{for}>, <{while}>, <next>, <rec>, <.rest.>, <(f)> )

#command DISPLAY [<v,...>] TO FILE <(f)> [<off:OFF>] ;
            [FOR <for>] [WHILE <while>] [NEXT <next>] ;
            [RECORD <rec>] [<rest:REST>] [<all:ALL>] => ;
         __dbList( <.off.>, { <{v}> }, <.all.>, ;
            <{for}>, <{while}>, <next>, <rec>, <.rest.>, <(f)> )

#command LIST [<v,...>] [<off:OFF>] ;
            [FOR <for>] [WHILE <while>] [NEXT <next>] ;
            [RECORD <rec>] [<rest:REST>] [ALL] => ;
         __dbList( <.off.>, { <{v}> }, .T., ;
            <{for}>, <{while}>, <next>, <rec>, <.rest.> )

#command DISPLAY [<v,...>] [<off:OFF>] ;
            [FOR <for>] [WHILE <while>] [NEXT <next>] ;
            [RECORD <rec>] [<rest:REST>] [<all:ALL>] => ;
         __dbList( <.off.>, { <{v}> }, <.all.>, ;
            <{for}>, <{while}>, <next>, <rec>, <.rest.> )

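/* Illustrative sketch of the LIST / DISPLAY difference described
   above (hypothetical field names; approximate expansions assuming
   Harbour-style result markers). LIST hard-codes lAll to .T., while
   DISPLAY passes .F. unless ALL is written.

      LIST _FIELD->name, _FIELD->city OFF FOR _FIELD->active

   would expand to roughly:

      __dbList( .T., { {|| _FIELD->name }, {|| _FIELD->city } }, .T., ;
                {|| _FIELD->active }, , , , .F. )

   and

      DISPLAY _FIELD->name

   to roughly:

      __dbList( .F., { {|| _FIELD->name } }, .F., , , , , .F. )
*/
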
/* TOTAL TO writes one record per consecutive run of equal key values
   from the source. Numeric fields named in FIELDS are summed; every
   other (non-memo) field takes the first record's value. The source
   must already be sorted/indexed on the key for the grouping to
   produce one row per distinct value.

   Note on key syntax — TOTAL evaluates `<key>` only in the source
   workarea, so `<{key}>` (verbatim blockify) is enough; user can
   write `ON src->dept` (alias-qualified) or `ON _FIELD->dept`
   (current-area). UPDATE FROM evaluates the key block in BOTH
   master and detail context and therefore needs `_FIELD->`-wrapped
   bare keys instead — the two rules look superficially similar but
   their evaluation contexts differ. */
#command TOTAL TO <(f)> ON <key> [FIELDS <fields,...>] ;
            [FOR <for>] [WHILE <while>] [NEXT <next>] ;
            [RECORD <rec>] [<rest:REST>] [ALL] => ;
         __dbTotal( <(f)>, <{key}>, { <(fields)> }, ;
            <{for}>, <{while}>, <next>, <rec>, <.rest.> )

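/* Illustrative sketch of a TOTAL call (hypothetical file and field
   names; approximate expansion assuming Harbour-style result
   markers). The key is written with an explicit _FIELD-> qualifier
   because the rule blockifies it verbatim.

      TOTAL TO by_dept ON _FIELD->dept FIELDS salary

   would expand to roughly:

      __dbTotal( "by_dept", {|| _FIELD->dept }, { "salary" }, ;
                 , , , , .F. )
*/
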
/* JOIN merges the current ("master") workarea with the named
   detail alias into a fresh DBF, emitting one output row per
   master/detail pair where FOR evaluates true.

   The ON form takes the equality-key field names directly and
   activates a hash-join fast path: build a hash over the detail's
   key column once, then probe per master row — O(N+M) total instead
   of the FOR form's O(N*M) nested-loop. Use it whenever the
   join predicate is a simple `master.k = detail.k` equality. The
   FOR form remains available for arbitrary predicates. Order
   matters: the ON rule is more specific so it wins.

   Note for callers: ON expects bare field names, not expressions.
   Five doesn't auto-resolve bare identifiers to fields, but std.ch
   passes them as quoted strings via <(mfield)> / <(dfield)> so the
   PP captures the field names verbatim — runtime-side __dbJoin
   does its own field lookup. */
#command JOIN WITH <(alias)> TO <(f)> [FIELDS <fields,...>] ;
            ON <mfield> = <dfield> => ;
         __dbJoin( <(alias)>, <(f)>, { <(fields)> }, NIL, ;
            <(mfield)>, <(dfield)> )

#command JOIN [WITH <(alias)>] [TO <(f)>] [FIELDS <fields,...>] ;
            [FOR <for>] => ;
         __dbJoin( <(alias)>, <(f)>, { <(fields)> }, <{for}> )

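/* Illustrative sketch of the two JOIN forms (hypothetical aliases,
   file and field names; approximate expansions assuming
   Harbour-style result markers). An equality join written with ON,

      JOIN WITH cust TO merged FIELDS name, total ON cust_id = id

   matches the first rule and would expand to roughly the 6-argument
   call that enables the hash-join path:

      __dbJoin( "cust", "merged", { "name", "total" }, NIL, ;
                "cust_id", "id" )

   while a non-equality predicate has to use FOR,

      JOIN WITH cust TO merged FIELDS name, total ;
           FOR orders->total > cust->credit

   and falls through to the 4-argument nested-loop call:

      __dbJoin( "cust", "merged", { "name", "total" }, ;
                {|| orders->total > cust->credit } )
*/
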
/* UPDATE FROM walks the named detail alias and applies the
   REPLACE ... WITH ... clauses to the matching master record.
   Both areas should be sorted on the key for the default forward-
   walk; pass RANDOM to scan master from top for each detail key.

   Note 1: ON <key> is wrapped as `_FIELD-><key>` rather than the bare
   `<{key}>` Harbour uses, because the same block must evaluate
   against both master and detail. Bare identifiers don't auto-bind
   to fields under Five — `_FIELD->` makes the dispatch explicit.

   Note 2: FROM/ON/REPLACE are all required (Harbour technically allows
   them in any order but every real call site provides all three). The
   former optional brackets allowed compile-clean garbage like a bare
   `UPDATE` to expand to a broken-syntax call. Keep them mandatory. */
#command UPDATE FROM <(alias)> ON <key> [<rand:RANDOM>] ;
            REPLACE <f1> WITH <x1> [, <fN> WITH <xN>] => ;
         __dbUpdate( <(alias)>, {|| _FIELD-><key> }, <.rand.>, ;
            {|| _FIELD-><f1> := <x1>[, _FIELD-><fN> := <xN>] } )

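/* Illustrative sketch of an UPDATE FROM call (hypothetical alias and
   field names; approximate expansion assuming Harbour-style result
   markers). The key and the REPLACE targets are bare field names,
   which the rule wraps in _FIELD-> itself; the WITH expression is
   passed through as written, so it is alias-qualified here.

      UPDATE FROM delta ON acct_no ;
             REPLACE balance WITH delta->amount, touched WITH .T.

   would expand to roughly:

      __dbUpdate( "delta", {|| _FIELD->acct_no }, .F., ;
                  {|| _FIELD->balance := delta->amount, _FIELD->touched := .T. } )
*/
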
/* --- bulk maintenance --- */
#command REINDEX => DbReindex()
#command PACK => DbPack()
#command ZAP => DbZap()

/* --- input / shell --- */
#command KEYBOARD <text> => Keyboard(<text>)
#command RUN <*cmd*> => hb_Run(<(cmd)>)

/* --- legacy GET system ---
   MENU TO is intentionally absent: it requires the @ PROMPT statement
   companion which Five doesn't implement. Adding the rule would let
   user code compile and then panic at runtime on the missing
   __MenuTo() symbol. Keep the parser's silent no-op for MENU TO until
   @ PROMPT lands. */
#command CLEAR GETS => GetList := {}