Four audit findings around correctness/consistency in std.ch and the
SORT/UPDATE/TOTAL handlers:
* #13: TOTAL/UPDATE key idiom inconsistency documented as inherent.
TOTAL evaluates `<key>` only in the source workarea so verbatim
`<{key}>` (alias-qualified or `_FIELD->`-prefixed by the user)
works. UPDATE evaluates the same block in BOTH master and detail
context, so it must wrap as `_FIELD-><key>` to dispatch to
whichever WA is selected at eval time. The two rules look alike
but their evaluation contexts differ — also documented in
std.ch alongside both rules so the asymmetry isn't a surprise.
Plus: TOTAL TO and ON are now mandatory (matching the COUNT/
UPDATE pattern from Wave 1) — bare TOTAL would have produced
broken syntax via the unconditional `<(f)>`/`<{key}>` template
references.
* #15/#16: SDF / DELIMITED variants of COPY and TO PRINTER /
TO FILE variants of LIST / DISPLAY are now matched by stub
rules (placed *before* the regular rules so they win) that
expand to a new `__dbNotImpl(reason)` RTL primitive raising a
clear `&hbrt.HbError`. BEGIN SEQUENCE / RECOVER catches the
panic, so callers get a real error instead of the previous
silent dispatch-to-regular-DBF-copy.
* #19: SORT /C (case-insensitive) now actually folds case before
the string compare, instead of being silently treated as
ascending. Suffix parser also rebuilt as a multi-letter scanner
so `name/CD`, `name/DC`, `name/C/D`, `name/D/C` all parse the
same way — combine /C and /D freely. Unknown suffix letters
(e.g., `name/X`) leave the suffix attached to the field name
so a stray slash in user input doesn't get silently mangled
into a broken field reference.
* #27: SET DELETED. A regression test now verifies that
`SET DELETED ON` causes COUNT/COPY (and by extension
SORT/TOTAL/JOIN/UPDATE — all of which iterate via Area.Skip)
to skip rows marked deleted. The filtering is implemented at
the workarea level (skipFilter in dbf.go honors hbrdd.IsSetDeleted)
so no RTL changes were needed; this commit just adds the
coverage so the behavior doesn't silently regress.
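Illustrative sketch of the #19 suffix rules, not the code in this
commit: `sortKey` and `parseSortKey` are hypothetical names, and the
all-or-nothing handling of an unknown letter is an assumption about
how "leave the suffix attached" plays out.

```go
package main

import (
	"fmt"
	"strings"
)

// sortKey is a hypothetical parsed form of one SORT ON key.
type sortKey struct {
	Field      string
	Descending bool
	FoldCase   bool
}

// parseSortKey splits "name/CD"-style specs into a field name and flags.
// Every "/" group may hold several letters (A, C, D in any order).
// If any group is empty or holds an unknown letter, the whole spec is
// returned untouched as the field name.
func parseSortKey(spec string) sortKey {
	spec = strings.TrimSpace(spec)
	parts := strings.Split(spec, "/")
	key := sortKey{Field: parts[0]}
	for _, group := range parts[1:] {
		if group == "" {
			return sortKey{Field: spec}
		}
		for _, r := range strings.ToUpper(group) {
			switch r {
			case 'A':
				key.Descending = false
			case 'C':
				key.FoldCase = true
			case 'D':
				key.Descending = true
			default:
				return sortKey{Field: spec}
			}
		}
	}
	return key
}

func main() {
	for _, spec := range []string{"name/CD", "name/DC", "name/C/D", "name/D/C", "name/X"} {
		fmt.Printf("%-9s -> %+v\n", spec, parseSortKey(spec))
	}
}
```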
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Five concrete gaps the audit flagged in the new __dbCopy / __dbSort /
__dbTotal / __dbJoin / PP code:
* wam.Close() errors were dropped on the floor. Caller saw `.T.`
even when the just-written DBF wasn't durable, leading to the
classic "delete the source after the COPY succeeds" data-loss
pattern. All four functions now capture the close error and
return `.F.` if the close failed.
* drv.Create succeeded → wam.Open failed → orphaned-on-disk DBF.
The user-named target file was left around with zero records,
and the next call's drv.Create silently truncated it instead of
surfacing the original error. Add `os.Remove(cFile)` on the
Open-failure cleanup path for COPY/SORT/TOTAL/JOIN.
* __dbTotal would write the DBF codec's overflow sentinel
(`*****`) into the destination's sum-fields when a group total
didn't fit in the source's declared field width, and still
return `.T.`. Now: precompute each sum-field's max representable
magnitude (10^(Len-Dec)) at start, mark the run as overflowed if
any flush sees an out-of-range or NaN value, and propagate
`.F.` to the caller so they don't trust the file.
* cleanUnreferencedMarkers walked byte-by-byte and stripped any
`<ident>` token in the result, INCLUDING ones that appear
inside `"..."` / `'...'` string literals. A user expression
like `LIST FOR url == "<a>x</a>"` got the `<a>` and `</a>`
eaten on output. Now: track string-literal state and skip the
cleanup pass while inside one. Bracket-strings `[…]` are
intentionally not treated as strings here — the result template
uses `[...]` as the optional-repeat marker, and disambiguating
needs context the cleanup pass doesn't have.
* (#8 SET SAFETY honoring) deferred. Harbour default is SAFETY
OFF, so the current always-overwrite behavior matches default
Harbour. The divergence only matters when the user explicitly
issues `SET SAFETY ON`, which Five doesn't support yet, so the
lack of overwrite protection is consistent end-to-end. Tracked
as a separate followup.
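Illustrative sketch of the string-aware marker cleanup described
above, not the actual cleanUnreferencedMarkers: `stripMarkers` is a
hypothetical name, and it only recognises plain `<ident>` markers
(no `<{x}>`, `<(x)>` or `<.x.>` forms). As in the commit, `[...]` is
not treated as a string.

```go
package main

import "fmt"

func isIdentByte(c byte) bool {
	return c == '_' || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')
}

// stripMarkers removes <ident> tokens, but leaves anything inside
// "..." or '...' string literals alone.
func stripMarkers(s string) string {
	out := make([]byte, 0, len(s))
	var quote byte // 0 while outside a string literal
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case quote != 0:
			if c == quote {
				quote = 0 // closing quote ends the literal
			}
		case c == '"' || c == '\'':
			quote = c
		case c == '<':
			j := i + 1
			for j < len(s) && isIdentByte(s[j]) {
				j++
			}
			if j > i+1 && j < len(s) && s[j] == '>' {
				i = j // drop the marker; the loop's i++ steps past '>'
				continue
			}
		}
		out = append(out, c)
	}
	return string(out)
}

func main() {
	fmt.Printf("%q\n", stripMarkers(`LIST FOR url == "<a>x</a>" <for>`))
}
```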
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`UPDATE [FROM <alias>] [ON <key>] [RANDOM] REPLACE <f1> WITH <x1>
[, <fN> WITH <xN>]` becomes a preprocessor rewrite to a new RTL
primitive __dbUpdate. For each detail record, find the master
record with matching key (forward-walk if both sorted, full scan
when RANDOM) and apply the REPLACE clauses in master's context.
Same shape as harbour-core/src/rdd/dbupdat.prg. The REPLACE clauses
expand to comma-separated assignments inside one block —
`{|| _FIELD->total := del->amt, _FIELD->status := "OK" }` — using
the multi-pair `[, <fN> WITH <xN>]` optional-repeat that std.ch
already establishes for SUM and DEFAULT.
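Illustrative sketch of the matching strategy described above, over
in-memory slices rather than workareas. All names (`masterRow`,
`detailRow`, `update`) are hypothetical; the real __dbUpdate
evaluates key and REPLACE blocks against DBF areas.

```go
package main

import "fmt"

type masterRow struct {
	Key   string
	Total float64
}

type detailRow struct {
	Key string
	Amt float64
}

// update applies each detail row to the master row with the same key.
// With random == false both slices are assumed sorted ascending by key,
// so a single forward-moving cursor over master is enough. With
// random == true master is scanned from the top for every detail row.
func update(master []masterRow, details []detailRow, random bool, apply func(m *masterRow, d detailRow)) {
	cursor := 0
	for _, d := range details {
		if random {
			for i := range master {
				if master[i].Key == d.Key {
					apply(&master[i], d)
					break
				}
			}
			continue
		}
		for cursor < len(master) && master[cursor].Key < d.Key {
			cursor++
		}
		if cursor < len(master) && master[cursor].Key == d.Key {
			apply(&master[cursor], d)
		}
	}
}

func main() {
	master := []masterRow{{Key: "A"}, {Key: "B"}, {Key: "C"}}
	details := []detailRow{{"A", 1}, {"A", 2}, {"C", 5}}
	update(master, details, false, func(m *masterRow, d detailRow) { m.Total += d.Amt })
	fmt.Println(master)
}
```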
Five-specific tweak: ON <key> wraps as `{|| _FIELD-><key> }` rather
than Harbour's bare `<{key}>`. Five doesn't auto-resolve a bare
identifier in a code block to the current workarea's field, and the
UPDATE block must evaluate against both detail and master so an
explicit alias prefix won't do — _FIELD-> dispatches to whichever
area is selected at eval time, which is what's needed.
Wiring up UPDATE surfaced one further matchSegment gap that fell
out of the multi-pair `[REPLACE ... [, ...]]` shape:
* matchSegment didn't handle nested `[...]` inside its body.
`[REPLACE <f1> WITH <x1> [, <fN> WITH <xN>]]` gave the inner
`[` as a literal token to match against the line, so even the
single-pair `REPLACE total WITH del->amt` form failed and f1/x1
came back empty. Now matchSegment runs the same repeat-loop on
inner `[...]` blocks that the top-level matcher uses, with its
own outer-tail computed from the segment tail past the inner
`]`.
Parser cleanup: UPDATE removed from the IDENT-statement no-op switch.
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`JOIN WITH <alias> TO <file> [FIELDS <list>] [FOR <expr>]` becomes a
preprocessor rewrite to a new RTL primitive __dbJoin. Cartesian
product of the current ("master") workarea and the named "detail"
alias, filtered by the FOR expression.
Output structure:
* No FIELDS clause: master's fields followed by detail's, dropping
any detail-side name that clashes with master.
* FIELDS list: one column per name in declaration order, resolved
against master first then detail.
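Illustrative sketch of the output-structure rules above, not the
__dbJoin code. `column` and `joinColumns` are hypothetical names;
case-insensitive name matching and silently skipping unknown FIELDS
names are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// column records which side of the join supplies a destination field.
type column struct {
	Name       string
	FromMaster bool
}

func has(names []string, name string) bool {
	for _, n := range names {
		if strings.EqualFold(n, name) {
			return true
		}
	}
	return false
}

// joinColumns builds the destination column list.
// Empty fields: master's fields, then detail's minus name clashes.
// Non-empty fields: one column per name, master checked before detail.
func joinColumns(master, detail, fields []string) []column {
	var out []column
	if len(fields) == 0 {
		for _, n := range master {
			out = append(out, column{n, true})
		}
		for _, n := range detail {
			if !has(master, n) {
				out = append(out, column{n, false})
			}
		}
		return out
	}
	for _, n := range fields {
		switch {
		case has(master, n):
			out = append(out, column{n, true})
		case has(detail, n):
			out = append(out, column{n, false})
		}
	}
	return out
}

func main() {
	master := []string{"ID", "NAME"}
	detail := []string{"ID", "AMT"}
	fmt.Println(joinColumns(master, detail, nil))
	fmt.Println(joinColumns(master, detail, []string{"AMT", "ID"}))
}
```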
Same shape as harbour-core/src/rdd/dbjoin.prg. Five-specific
simplifications: alias->name in FIELDS not yet supported (bare
names with master-precedence lookup); RDD/codepage args dropped
since Five only has DBFNTX.
Note for callers: don't name a workarea `M` or `MEMVAR` — both are
Harbour-reserved memvar aliases, so `M->field` and `MEMVAR->field`
always go through the memory-variable namespace, not the workarea.
This is gengo behavior matching Harbour, not new in this commit.
Parser cleanup: JOIN removed from the IDENT-statement no-op switch.
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`TOTAL TO <file> ON <key> [FIELDS <list>] [FOR ...] [WHILE ...]
[NEXT ...] [RECORD ...] [REST] [ALL]` joins the family of std.ch
DML rewrites. New RTL primitive __dbTotal:
* Walk the source under dbEval-style FOR/WHILE/NEXT/RECORD/REST
bounds. The source must already be sorted/indexed on the key —
same precondition as Harbour's dbtotal.prg.
* Track the current group key. On each key change, flush the
accumulated row to the destination (writing the running totals
back into the most recently appended record's sum-fields,
preserving each field's declared length/decimals).
* On the *first* record of every group, append a fresh dst row
and copy all non-memo source fields into it; subsequent records
in the group only contribute to the sums. Net effect: non-summed
fields take the first record's value, summed fields hold the
group total. Same shape as harbour-core/src/rdd/dbtotal.prg.
* Memo fields are dropped from the destination structure (Harbour
does the same).
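Illustrative sketch of the grouping behaviour above on an in-memory
slice. `rec` and `total` are hypothetical names; the real __dbTotal
works on workareas, honours FOR/WHILE/NEXT bounds, and writes sums
back on flush rather than accumulating in place.

```go
package main

import "fmt"

type rec struct {
	Key   string  // group key (source must already be ordered on it)
	Label string  // a non-summed field
	Amt   float64 // a summed field
}

// total collapses consecutive records with the same key into one row.
// The first record of a group supplies the non-summed fields; later
// records in the group only add to the sum.
func total(src []rec) []rec {
	var dst []rec
	for _, r := range src {
		if n := len(dst); n > 0 && dst[n-1].Key == r.Key {
			dst[n-1].Amt += r.Amt
			continue
		}
		dst = append(dst, r)
	}
	return dst
}

func main() {
	src := []rec{
		{"A", "first-a", 1},
		{"A", "second-a", 2},
		{"B", "first-b", 10},
	}
	for _, r := range total(src) {
		fmt.Printf("%s %s %.0f\n", r.Key, r.Label, r.Amt)
	}
}
```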
Parser cleanup: TOTAL removed from the IDENT-statement no-op switch.
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`LIST [<fields>] [OFF] [FOR ...] [WHILE ...] [NEXT ...] [RECORD ...]
[REST] [ALL]` and `DISPLAY [<fields>] [OFF] [FOR ...] ... [ALL]`
reach the parser as plain function calls to a new RTL primitive
__dbList (rtlDbList in hbrtl/database.go).
Implementation: walk the workarea under dbEval-style FOR/WHILE/NEXT/
RECORD/REST bounds. For each visible record, evaluate each column
block and emit the rendered values via valueToDisplay (the same
formatter QOut already uses). Empty fields list defaults to
"all fields". OFF suppresses the record-number prefix.
LIST always emits the full filtered range; DISPLAY without ALL emits
only the current record (encoded as nCount=1). TO PRINTER / TO FILE
clauses are not yet wired through — for now everything goes to
stdout.
Wiring up LIST/DISPLAY surfaced four further gaps in PP that were
silently masking bugs in any rule with multiple word-list / list /
optional clauses chained together:
* matchSegment refused MarkerWordList inside `[...]`. The LIST
rule's `[<off:OFF>]` clause therefore never set the off
capture, and `<.off.>` substituted to nothing instead of .T./.F.
matchSegment now matches WordList markers the same way the
top-level matcher does.
* `<v,...>` and `<(f)>` capture stop boundaries didn't include the
values of following MarkerWordList markers. For
`[<v,...>] [<off:OFF>] [<all:ALL>]` against `LIST id, name OFF`,
the v list would happily eat OFF. New addStopFrom helper
contributes both literal keywords and word-list values; both
matchSegment's MarkerList branch and captureExpression now use
it.
* Optional-repeat loop in matchPattern merged a no-progress
iteration's empty capture into the running multi-capture string
(with the `\x01` separator) before the no-progress break check
fired. So a successful first iteration's value got contaminated
and the substitution loop then skipped it as multi-capture
garbage. The merge now happens after the progress check.
* Unreferenced `<.name.>` markers (optional clauses that didn't
match in the input) were getting cleaned up to empty by the
generic marker scrubber instead of the .F. sentinel Harbour's
std.ch expects. New replaceUnreferencedLogify pass mirrors the
existing replaceUnreferencedBlockify and runs just before the
cleanup.
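Illustrative sketch of the stop-boundary idea behind addStopFrom,
not the PP code itself. `marker`, `stopWords` and `captureList` are
hypothetical names, and the input is assumed to be pre-tokenised.

```go
package main

import (
	"fmt"
	"strings"
)

// marker is a simplified pattern element from the tail of a rule.
type marker struct {
	Literal string   // non-empty for a literal keyword such as FOR
	Words   []string // non-empty for a word-list marker such as <off:OFF>
}

// stopWords collects every token that must end a greedy capture:
// literal keywords and the allowed values of word-list markers.
func stopWords(tail []marker) map[string]bool {
	stops := map[string]bool{}
	for _, m := range tail {
		if m.Literal != "" {
			stops[strings.ToUpper(m.Literal)] = true
		}
		for _, w := range m.Words {
			stops[strings.ToUpper(w)] = true
		}
	}
	return stops
}

// captureList consumes tokens until the first stop word.
func captureList(tokens []string, stops map[string]bool) (captured, rest []string) {
	for i, tok := range tokens {
		if stops[strings.ToUpper(tok)] {
			return tokens[:i], tokens[i:]
		}
	}
	return tokens, nil
}

func main() {
	tail := []marker{{Words: []string{"OFF"}}, {Words: []string{"ALL"}}, {Literal: "FOR"}}
	fields, rest := captureList([]string{"id", ",", "name", "OFF"}, stopWords(tail))
	fmt.Println(fields, rest)
}
```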
Parser cleanup: LIST and DISPLAY removed from the IDENT-statement
no-op switch in both parseIdentStmt and parseExprStmt.
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`SORT TO <file> [ON <key-list>] [FOR ...] [WHILE ...] [NEXT ...]
[RECORD ...] [REST] [ALL]` joins COPY in being a real preprocessor
rewrite to a function call. New RTL primitive __dbSort:
* Buffer visible source records (FOR/WHILE/NEXT/RECORD/REST same
as __dbCopy).
* Multi-key stable insertion sort. Each key may carry `/D` for
descending; ascending otherwise. /A and unknown suffixes fall
through as ascending. Comparison delegates to the existing
compareValues helper in sqlscan.go (numeric / string / NIL-aware).
* Create destination DBF with the source's struct, append rows in
sorted order, restore source selection.
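Illustrative sketch of a multi-key stable insertion sort with
per-key descending flags. `record`, `key`, `less` and
`insertionSort` are hypothetical names, and plain string comparison
stands in for the compareValues helper.

```go
package main

import (
	"fmt"
	"strings"
)

type record map[string]string

type key struct {
	Field string
	Desc  bool
}

// less reports whether a must sort strictly before b under keys.
func less(a, b record, keys []key) bool {
	for _, k := range keys {
		c := strings.Compare(a[k.Field], b[k.Field])
		if c == 0 {
			continue // tie on this key, look at the next one
		}
		if k.Desc {
			return c > 0
		}
		return c < 0
	}
	return false
}

// insertionSort is stable: an element only moves past neighbours it is
// strictly less than, so equal records keep their source order.
func insertionSort(rows []record, keys []key) {
	for i := 1; i < len(rows); i++ {
		cur := rows[i]
		j := i
		for j > 0 && less(cur, rows[j-1], keys) {
			rows[j] = rows[j-1]
			j--
		}
		rows[j] = cur
	}
}

func main() {
	rows := []record{
		{"city": "B", "name": "x"},
		{"city": "A", "name": "y"},
		{"city": "A", "name": "z"},
		{"city": "B", "name": "w"},
	}
	insertionSort(rows, []key{{Field: "city"}, {Field: "name", Desc: true}})
	for _, r := range rows {
		fmt.Println(r["city"], r["name"])
	}
}
```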
Parser cleanup: SORT removed from the IDENT-statement no-op switch.
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`COPY TO <file> [FIELDS <list>] [FOR ...] [WHILE ...] [NEXT ...]
[RECORD ...] [REST] [ALL]` reaches the parser as a plain function
call to a new RTL primitive __dbCopy (rtlDbCopy in hbrtl/database.go).
Implementation: project the field list (case-insensitive name match
against the source's structure, full copy when omitted), dbCreate the
target file with that struct, open it under a temp alias, walk the
source under dbEval-style FOR/WHILE/NEXT/RECORD/REST bounds, and
GetValue/Append/PutValue per record into the target. SDF / DELIMITED
variants stay parser no-ops until those backends arrive.
Wiring up COPY surfaced four longstanding gaps in the PP that had to
be fixed for the rule to even reach the runtime:
* `<(name)>` *pattern* marker was treated as a regular `<name>`
with the parens baked into the captured key, so the matching
result substitution `<(name)>` couldn't find it. parseOneMarker
now strips the parens at parse time so capture key and result
marker share the bare name. The smart-stringify result behavior
is unchanged.
* matchSegment (the optional-clause matcher) bailed on every
non-Regular marker. `[FIELDS <fields,...>]` therefore failed to
match at all and the fields list arrived empty in the result
template. matchSegment now handles MarkerList with paren-balanced
capture and segment+outer literal stop boundaries.
* captureExpression only used the first literal in the pattern
tail as a stop boundary. With std.ch's chain of optional
clauses (`[TO <(f)>] [FIELDS ...] [FOR ...] [WHILE ...] ...`)
the file-name marker was happy to gobble a trailing FOR clause
when FIELDS was absent. It now stops at *any* of the remaining
pattern literals.
* `<(name)>` smart-stringify on a list-typed capture wrapped the
whole comma-joined string in one set of quotes — `{ "a , b" }` —
instead of `{ "a", "b" }`. New helper quoteListElements splits on
top-level commas (paren / bracket / brace / string-balanced) and
quotes each element. applyResult now consults the rule's marker
table to know which captures came from `<name,...>`.
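Illustrative sketch of per-element quoting with a top-level comma
split. It reuses the helper name quoteListElements for readability,
but the body is a hypothetical stand-in, not the code in this
commit; `splitTopLevel` is likewise an assumed name.

```go
package main

import (
	"fmt"
	"strings"
)

// splitTopLevel splits s on commas that are not nested inside
// parentheses, brackets, braces, or string literals.
func splitTopLevel(s string) []string {
	var parts []string
	depth, start := 0, 0
	var quote byte // 0 while outside a string literal
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case quote != 0:
			if c == quote {
				quote = 0
			}
		case c == '"' || c == '\'':
			quote = c
		case c == '(' || c == '[' || c == '{':
			depth++
		case c == ')' || c == ']' || c == '}':
			depth--
		case c == ',' && depth == 0:
			parts = append(parts, strings.TrimSpace(s[start:i]))
			start = i + 1
		}
	}
	return append(parts, strings.TrimSpace(s[start:]))
}

// quoteListElements wraps every top-level element in double quotes.
func quoteListElements(s string) string {
	parts := splitTopLevel(s)
	for i, p := range parts {
		parts[i] = `"` + p + `"`
	}
	return strings.Join(parts, ", ")
}

func main() {
	fmt.Println("{ " + quoteListElements("a , b") + " }")
	fmt.Println("{ " + quoteListElements("f(x, y), z") + " }")
}
```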
Parser cleanup: COPY removed from the IDENT-statement no-op switch in
both parseIdentStmt and parseExprStmt.
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three xBase analytical commands that were silent no-ops in the
parser now execute as Harbour-style PP rewrites:
COUNT [TO <v>] [FOR <for>] [WHILE <while>] ... -> dbEval()
SUM <x> TO <v> [FOR <for>] [WHILE <while>] ... -> dbEval()
AVERAGE <x> TO <v> [FOR ...] -> __dbAverage()
COUNT and SUM expand to a `<v> := 0 ; dbEval( {|| ... } )` pair
matching harbour-core/include/std.ch verbatim. AVERAGE delegates to
a new RTL function rtlDbAverage (sum + count + divide; returns 0 on
empty match) — the chained-private-variable trick Harbour uses to
keep AVERAGE inline doesn't translate cleanly through Five's PP.
Wiring up these rules surfaced four PP issues that had to be fixed
for the rewrite to even reach the parser:
* Result template did not implement <{name}> blockify. So a rule
body like `{|| x := x + <x> }, <{for}>` left the literal text
`<{for}>` in the output. Added blockify substitution: captured
-> `{|| <captured> }`, missing -> NIL.
* findMarkerEnd did not recognise `{`/`}` so unreferenced
blockify markers were not cleaned up either. Added `{`/`}` to
its prefix/suffix sets.
* Optional-clause matching had no view of the outer pattern, so a
regular marker at the end of `[TO <v>]` would swallow the rest
of the line — `COUNT TO n FOR x>5` captured `<v>` as
"n FOR x>5". matchSegment now takes outerTail and stops at its
first literal.
* `#command` directives could not span multiple physical lines.
A trailing `;` is harbour-core's line-continuation marker for
std.ch and now joins the next line into the directive before
parsing.
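Illustrative sketch of the blockify substitution rule (captured ->
`{|| <captured> }`, missing -> NIL). `blockify` and the regexp are
hypothetical; the real PP walks the result template with its own
marker scanner rather than a regular expression.

```go
package main

import (
	"fmt"
	"regexp"
)

// blockMarker matches result markers of the form <{name}>.
var blockMarker = regexp.MustCompile(`<\{(\w+)\}>`)

// blockify replaces every <{name}> with a code block wrapping the
// captured text, or with NIL when nothing was captured for name.
func blockify(template string, captures map[string]string) string {
	return blockMarker.ReplaceAllStringFunc(template, func(m string) string {
		name := m[2 : len(m)-2] // strip the leading "<{" and trailing "}>"
		if v, ok := captures[name]; ok && v != "" {
			return "{|| " + v + " }"
		}
		return "NIL"
	})
}

func main() {
	tpl := "dbEval( {|| n := n + 1 }, <{for}>, <{while}> )"
	fmt.Println(blockify(tpl, map[string]string{"for": "x>5"}))
}
```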
Parser cleanup: COUNT, SUM, AVERAGE removed from the IDENT-statement
no-op switch in parseIdentStmt + parseExprStmt. The remaining xBase
verbs (COPY, SORT, TOTAL, JOIN, LIST, DISPLAY, LABEL, REPORT, ...)
stay in the parser until their RTL backends arrive.
Gates green:
go test ./... : PASS
FiveSql2 SQL:1999 : 43/43
Harbour compat : 56/56
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the FLOCK/DBRLOCK/DBRUNLOCK no-op stubs with actual
fcntl(F_SETLK) byte-range advisory locks, matching Harbour's
hb_fsLockLarge implementation.
Before: rtlDbRLock always returned .T. regardless of contention.
Multi-process writers could silently corrupt records.
After: Non-blocking POSIX byte-range locks per file descriptor.
Cross-process exclusion verified by a subprocess-spawning
Go test that witnesses BUSY vs OK transitions.
New files:
hbrdd/dbf/locks_posix.go fcntl F_WRLCK/F_UNLCK wrappers
hbrdd/dbf/locks_windows.go stub (TODO: LockFileEx)
hbrdd/dbf/lock_multi_test.go cross-process verification
docs/gap-analysis.md honest Harbour parity assessment
Modified:
hbrdd/dbf/dbf.go
- DBFArea gains fileLocked bool + lockedRecs map
- Close() calls releaseAllLocks() before dropping the fd
hbrtl/database.go
- rtlDbRLock / rtlDbRUnlock now delegate to DBFArea.LockRecord /
UnlockRecord instead of returning fixed .T./NIL
- New rtlFLock / rtlDbUnlock for FLOCK() / DBUNLOCK()
hbrtl/register.go
- FLOCK and DBUNLOCK symbols registered (were missing entirely)
compiler/analyzer/analyzer.go
- FLOCK / DBUNLOCK added to RTL known-function set
Lock region layout (non-overlapping on purpose):
FLOCK region [0, HeaderLen+1)
Record N region [RecordOffset(N), RecordOffset(N)+RecordLen)
So a workarea can hold FLOCK and multiple DBRLOCK simultaneously
on the same fd without conflict.
Design rationale (captured in locks_posix.go header):
* POSIX fcntl, not flock(2) — byte-range + NFS-safe
* Non-blocking F_SETLK — matches Clipper FLOCK() → .F. semantics
* Released explicitly on Close to avoid workarea-sharing races
* Windows falls back to no-op (TODO: LockFileEx)
Verification:
go test ./hbrdd/dbf/ -run TestFLockBlocksAcrossProcesses PASS
go test ./hbrdd/dbf/ -run TestRLockBlocksAcrossProcesses PASS
go test ./... ALL PASS
FiveSql2 43/43 100%
compat_harbour 51/51 100%
The gap-analysis doc (docs/gap-analysis.md) is a running inventory
of what works and what's still missing relative to Harbour 3.2,
written for users evaluating Five for production, not as a sales pitch.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. SOFTSEEK: use idx.CurRecNo() for positioning (was checking recNo > 0)
- SEEK with SET SOFTSEEK ON now positions at next higher key
- SEEK command reads SET SOFTSEEK at runtime (was compile-time only)
- rtlDbSeek defaults to GetSetSoftSeek() when no explicit param
2. SET DELETED ON + INDEX: SkipIndexed skips deleted records
- GoTopIndexed: skip deleted record at top position
- SkipIndexed: inner loop continues past deleted records
3. Compound key (CITY+NAME): field name TrimSpace before lookup
- evalKeyExprInner: TrimSpace on fieldName after FIELD-> strip
- Fixed "CITY " != "CITY" mismatch from + operator splitting
4. SET INDEX TO filename: treated as string, not variable
- gengo uses exprToString for SET INDEX TO (was emitExpr)
- Prevents identifier being resolved as local variable
5. hasXBaseCommands: recursive scan into nested blocks
- BEGIN SEQUENCE, IF, FOR, DO WHILE, SWITCH bodies now scanned
- Fixes missing hbrdd import for DB commands inside blocks
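Illustrative sketch of the SOFTSEEK positioning rule from item 1,
over a sorted slice of keys. `seek` is a hypothetical name; it uses
exact key equality only and ignores partial-key matching, SET
DELETED, and the index structures the real code walks.

```go
package main

import (
	"fmt"
	"sort"
)

// seek looks up target in sorted keys.
// Exact hit: returns its position and found == true.
// Miss with soft == true: returns the position of the next higher key.
// Miss otherwise, or no higher key: returns -1 (end of file).
func seek(keys []string, target string, soft bool) (pos int, found bool) {
	i := sort.SearchStrings(keys, target) // first index with keys[i] >= target
	if i < len(keys) && keys[i] == target {
		return i, true
	}
	if soft && i < len(keys) {
		return i, false
	}
	return -1, false
}

func main() {
	keys := []string{"ADAMS", "BAKER", "CLARK"}
	fmt.Println(seek(keys, "BAKER", false))
	fmt.Println(seek(keys, "BROWN", true))
	fmt.Println(seek(keys, "BROWN", false))
}
```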
Thorough test: 77 items (14 sections) covering exact/partial/soft seek,
SET DELETED, duplicate keys, numeric keys, compound keys, empty/single
table, state consistency, order switching, full traversal — all identical.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- skipFilter: skip deleted records in GoTop/GoBottom/Skip when SET DELETED ON
- hbrdd.IsSetDeleted callback: avoids circular import hbrdd→hbrtl
- Parser: capture ON/OFF for boolean SET commands (DELETED, EXACT, SOFTSEEK, etc.)
- Parser: capture TO expr for SET DATE/DECIMALS/EPOCH
- Gengo: emit proper t.Do() calls for 11 SET toggles + 3 value SETs
- stmtSet: was stub (skipToEOL), now calls parseSet()
- RTL: register 11 SET toggle functions (SETDELETED, SETEXACT, etc.)
- RTL: DBLOCATE/DBCONTINUE for sequential search
- RTL: DBSETFILTER/DBCLEARFILTER/DBFILTER
- PadL/PadR: support 3rd param fill character
- Area interface: added SetFound, SetLocate, LocateBlock, filter methods
- MemRDD: implements new Area interface methods
- Comprehensive PRG test: test_search.prg (7 test suites all pass)
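Illustrative sketch of the skipFilter idea on an in-memory table.
`record` and `nextVisible` are hypothetical names; the real filter
lives in the workarea and reads the flag through the
hbrdd.IsSetDeleted callback.

```go
package main

import "fmt"

type record struct {
	Name    string
	Deleted bool
}

// nextVisible returns the index of the first record at or after from
// that should be shown, or len(recs) when none is left. Deleted
// records are only skipped while setDeleted is true.
func nextVisible(recs []record, from int, setDeleted bool) int {
	i := from
	for i < len(recs) && setDeleted && recs[i].Deleted {
		i++
	}
	return i
}

func main() {
	recs := []record{{"a", true}, {"b", true}, {"c", false}}
	fmt.Println(nextVisible(recs, 0, true))
	fmt.Println(nextVisible(recs, 0, false))
}
```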
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>