CharlesKWON 64b7cf6676 perf(FiveSql2): compound-AND equi-join picks up hash path — CTE+JOIN 22x
FiveSql2's HashJoin only recognized bare equi-terms (xOnCond[1]=ND_BIN,
xOnCond[2]="="), so a compound ON predicate like

    ON e.dept_id = t.dept_id AND e.salary = t.max_sal

fell through to the nested-loop ELSE branch:

    dbSelectArea(nInnerWA)
    dbGoTop()
    WHILE !Eof()
        IF SqlIsTrue(EvalExpr(xOnCond))
            JoinRecurse(...)
        ENDIF
        dbSkip()
    ENDDO

That costs O(inner) predicate evaluations per outer row, so
O(outer × inner) overall, and it re-evaluates the full AND tree on
every probe. Query Q7 in the complex benchmark (CTE top_emp joined back
to emp on a compound key) took 4.6 seconds for 100 inner × 10k outer rows.
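
To make that cost concrete, here is a rough Python sketch (an illustration only, not the Harbour code) that counts predicate evaluations for a nested-loop join. The `emp` and `top_emp` rows are hypothetical data shaped like the benchmark sizes quoted above.

```python
def nested_loop_join(outer, inner, on_cond):
    """Scan every inner row for every outer row, counting predicate calls."""
    evals = 0
    matches = []
    for o in outer:
        for i in inner:
            evals += 1
            if on_cond(o, i):
                matches.append((o, i))
    return matches, evals


# Hypothetical data: 10k emp rows (outer), 100 top_emp rows (inner).
emp = [{"dept_id": n % 100, "salary": 1000 + n} for n in range(10_000)]
top_emp = [{"dept_id": d, "max_sal": 1000 + 9_900 + d} for d in range(100)]

matches, evals = nested_loop_join(
    emp,
    top_emp,
    lambda e, t: e["dept_id"] == t["dept_id"] and e["salary"] == t["max_sal"],
)
print(len(matches), evals)
```

With this data the loop evaluates the predicate one million times to produce 100 joined rows.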

Fix has two pieces:

1. **Probe-term extraction in JoinRecurse**: when xOnCond is an AND,
   walk the left-associative chain looking for the first equi-term
   (`a.x = b.x`). Use that as the hash-probe key, drive the normal
   hash-join code path through it.

2. **Post-filter in HashJoin**: after a hash match, if the *original*
   xOnCond was compound, re-evaluate the full predicate with
   EvalExpr to drop matches that satisfied the hash key but not the
   rest of the AND (e.g. same dept but different salary). Bare equi-
   joins still skip the re-eval — the hash match is conclusive.
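
The two pieces above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the FiveSql2 implementation: the tuple-based AST shape `("BIN", op, left, right)` / `("COL", alias, name)` and the names `find_probe_term`, `eval_expr`, and `hash_join` are hypothetical stand-ins for the Harbour structures.

```python
from collections import defaultdict

# Assumed AST shape: ("BIN", op, left, right) and ("COL", alias, name).


def find_probe_term(cond):
    """Return the first cross-table equi-term in an AND chain, else None."""
    if cond[0] != "BIN":
        return None
    op, left, right = cond[1], cond[2], cond[3]
    if op == "=" and left[0] == "COL" and right[0] == "COL" and left[1] != right[1]:
        return cond
    if op == "AND":
        return find_probe_term(left) or find_probe_term(right)
    return None


def eval_expr(node, rows):
    """Evaluate a predicate against a {alias: row} mapping."""
    if node[0] == "COL":
        return rows[node[1]][node[2]]
    op, left, right = node[1], eval_expr(node[2], rows), eval_expr(node[3], rows)
    if op == "=":
        return left == right
    if op == "AND":
        return bool(left and right)
    raise ValueError(f"unsupported operator: {op}")


def hash_join(outer, outer_alias, inner, inner_alias, on_cond):
    probe = find_probe_term(on_cond)
    if probe is None:
        # No usable equi-term: fall back to the generic nested loop.
        return [(o, i) for o in outer for i in inner
                if eval_expr(on_cond, {outer_alias: o, inner_alias: i})]
    # Map each alias in the probe term to the column it contributes.
    cols = {probe[2][1]: probe[2][2], probe[3][1]: probe[3][2]}
    buckets = defaultdict(list)
    for i in inner:
        buckets[i[cols[inner_alias]]].append(i)
    compound = probe is not on_cond  # bare equi-joins skip the re-check
    result = []
    for o in outer:
        for i in buckets.get(o[cols[outer_alias]], ()):
            if not compound or eval_expr(on_cond, {outer_alias: o, inner_alias: i}):
                result.append((o, i))
    return result


ON = ("BIN", "AND",
      ("BIN", "=", ("COL", "e", "dept_id"), ("COL", "t", "dept_id")),
      ("BIN", "=", ("COL", "e", "salary"), ("COL", "t", "max_sal")))
emp = [{"dept_id": 1, "salary": 50}, {"dept_id": 1, "salary": 90},
       {"dept_id": 2, "salary": 70}]
top = [{"dept_id": 1, "max_sal": 90}, {"dept_id": 2, "max_sal": 70}]
rows = hash_join(emp, "e", top, "t", ON)
```

In this toy data the first `emp` row hashes into the same bucket as department 1 but is dropped by the post-filter because its salary differs, which is the "same dept but different salary" case described above.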

Bench (10k × 100 × compound ON predicate):

  Query                          Before     After    Speedup
  ─────────────────────────────────────────────────────────
  Q7 CTE + JOIN compound ON      4573ms     209ms    21.9x

The existing bare equi-join case still works (all 43 tests unchanged),
as does the 3-way JOIN case (no regression). The generic nested loop is
now used only when no probe term can be extracted at all.

Validation:
  - FiveSql2 43/43
  - Harbour compat 51/51
  - go test ./... ALL PASS
  - Q7 result: 100 rows (correct)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 20:31:27 +09:00

FiveSql2 — SQL Engine for Harbour DBF/NTX/CDX

Pratt parser + SQL:1992-2023 full standard support
Supports both NTX (Clipper) and CDX (FoxPro/ADS) indexes

Architecture

five_SQL("SELECT ...")
   │
   ├── TSqlLexer        Tokenizer
   ├── TSqlParser2      Pratt parser (data-driven operators)
   ├── TSqlExecutor     Query executor (Volcano model)
   │     ├── TSqlAlias  Central alias manager (no collisions)
   │     ├── TSqlIndex  NTX/CDX index optimization (auto-detect)
   │     ├── TSqlAgg    GROUP BY / aggregation
   │     ├── TSqlSort   ORDER BY / DISTINCT
   │     ├── TSqlDDL    CREATE/DROP/ALTER TABLE/INDEX
   │     └── TSqlTxn    BEGIN/COMMIT/ROLLBACK
   ├── TSqlExpr         AST nodes + expression evaluation
   └── TSqlFunc         60+ scalar functions
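
The diagram describes the executor as following the Volcano model, in which each operator pulls rows from its child on demand. A minimal Python sketch of that general idea, using generators, might look like this. The operator names (`scan`, `select`, `project`, `limit`) are illustrative and are not FiveSql2 classes.

```python
def scan(rows):
    """Leaf operator: yield stored rows one at a time."""
    for row in rows:
        yield row


def select(child, predicate):
    """Pass through only the rows that satisfy the predicate."""
    for row in child:
        if predicate(row):
            yield row


def project(child, columns):
    """Keep only the requested columns of each row."""
    for row in child:
        yield {c: row[c] for c in columns}


def limit(child, n):
    """Stop pulling from the child once n rows have been produced."""
    if n <= 0:
        return
    for count, row in enumerate(child, start=1):
        yield row
        if count >= n:
            return


# Hypothetical table contents.
emp = [
    {"name": "Ann", "salary": 1200},
    {"name": "Bob", "salary": 1800},
    {"name": "Cho", "salary": 2100},
    {"name": "Dee", "salary": 2500},
]

# Roughly: SELECT name FROM emp WHERE salary > 1500 FETCH FIRST 2 ROWS ONLY
plan = limit(project(select(scan(emp), lambda r: r["salary"] > 1500), ["name"]), 2)
result = list(plan)
```

Because every operator is lazy, `limit` stops the whole pipeline after two rows and the last `emp` row is never scanned.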

Build & Test

export PATH="/path/to/harbour-core/bin/linux/gcc:$PATH"
export HB_INSTALL_PREFIX="/path/to/harbour-core"

make          # Build all tests
make test     # Run all 157 tests
make bench    # Parser benchmark
make clean    # Clean

SQL Standard Coverage

Standard   Features                                              Tests
────────────────────────────────────────────────────────────────────────
SQL:1992   SELECT, JOIN, GROUP BY, HAVING, Subquery, CASE, CAST  43
SQL:1999   CTE, Recursive CTE, Window Functions, MERGE           10
SQL:2003   SIMILAR TO, GROUPING SETS, LATERAL, Window frames     64
SQL:2008   FETCH/OFFSET, FOR UPDATE, Extended MERGE              (incl.)
SQL:2016   JSON functions, LISTAGG                               (incl.)
SQL:2023   ANY_VALUE, GREATEST/LEAST, BOOL_AND/OR                (incl.)
Challenge  LeetCode-level complex queries                        15
Extreme    Production analytics stress tests                     15

Adding New Operators

Edit TSqlParser2.prg, method InitInfixTables():

::hInfixTT[ TK_MYOP ] := { "<=>", 40, 41, ND_BIN }

One line. No structural changes needed.
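
To illustrate why a data-driven Pratt parser needs only a table entry for a new operator, here is a small Python sketch. It assumes the two numbers in the entry above (40 and 41) are left and right binding powers, which the source does not state explicitly. The `INFIX` table, its precedence values, and the `tokenize`/`parse` helpers are hypothetical and do not mirror TSqlParser2.prg.

```python
import re

# Hypothetical infix table: operator -> (left_bp, right_bp, node_kind).
INFIX = {
    "OR": (10, 11, "BIN"),
    "AND": (20, 21, "BIN"),
    "=": (40, 41, "BIN"),
    "+": (50, 51, "BIN"),
    "*": (60, 61, "BIN"),
}


def tokenize(text):
    """Split into operators, identifiers, numbers, and parentheses."""
    return re.findall(r"<=>|[A-Za-z_][\w.]*|\d+|[=+*()]", text)


def parse(tokens, min_bp=0):
    """Pratt loop: operator behaviour comes entirely from INFIX."""
    tok = tokens.pop(0)
    if tok == "(":
        left = parse(tokens, 0)
        tokens.pop(0)  # consume ")"
    else:
        left = tok
    while tokens:
        op = tokens[0].upper()
        entry = INFIX.get(op)
        if entry is None or entry[0] < min_bp:
            break
        tokens.pop(0)
        _, right_bp, kind = entry
        right = parse(tokens, right_bp)
        left = (kind, op, left, right)
    return left


# Registering a new operator is a single table entry; parse() is unchanged.
INFIX["<=>"] = (40, 41, "BIN")

tree = parse(tokenize("a <=> b AND c = d"))
```

Here `tree` comes out as `("BIN", "AND", ("BIN", "<=>", "a", "b"), ("BIN", "=", "c", "d"))`: the new operator binds tighter than `AND` purely because of the numbers in its table entry.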

Copyright (c) 2025-2026 Charles KWON (Charles KWON OhJun)
Email: charleskwonohjun@gmail.com
All rights reserved.