perf(FiveSql2): EXISTS → LIMIT 1 early exit, subquery identity via AScan
Extreme subquery stress bench (12 patterns spanning scalar-in-SELECT,
nested correlation, EXISTS, NOT IN, derived tables, self-joins, and
mixed combinations) exposed three weaknesses in the post-ROLLUP state:
1. EXISTS / NOT EXISTS evaluated the full subquery result per outer
row, even though it only needs to know whether any row matches.
2. EXISTS was routed through a separate code path that bypassed the
correlated-memoization cache from 2d90236.
3. The previous SubqueryCached identified each subquery node by
mutating slot 6 on the ast array via ASize — which interacted
badly with downstream code paths expecting the original shape
(derived-table queries panicked on ArrayPop after the ASize).
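Item 1 can be pictured with a small Python model. This is a hypothetical sketch, not the Harbour executor: `scan`, its `limit` parameter, and the toy `orders` data are invented names standing in for the subquery scan and a row cap. It counts how many rows are visited when an existence check materializes every match versus stopping at the first one.

```python
def scan(rows, predicate, limit=None):
    """Return (matches, rows_visited); stop once `limit` matches are found."""
    matches, visited = [], 0
    for row in rows:
        visited += 1
        if predicate(row):
            matches.append(row)
            if limit is not None and len(matches) >= limit:
                break  # existence is already decided, skip the rest
    return matches, visited


# Toy data (assumption): 5000 orders spread over 50 employee ids.
orders = [{"emp_id": i % 50} for i in range(5000)]
is_emp_7 = lambda r: r["emp_id"] == 7
is_emp_99 = lambda r: r["emp_id"] == 99  # no such employee in the data

full, visited_full = scan(orders, is_emp_7)
first, visited_capped = scan(orders, is_emp_7, limit=1)
none, visited_none = scan(orders, is_emp_99, limit=1)
```

When no row matches, the cap cannot help: the scan still has to reach the end of the table to confirm emptiness.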
Fixes:
* EXISTS / NOT EXISTS now route through SubqueryCached the same way
ND_SUB in WHERE does, so correlated EXISTS predicates memoize on
outer free-variable values when the cardinality is low.
* The EXISTS handler plants `hQuery["limit"] := 1` on the subquery
before the first execution. EXISTS doesn't care about the rest
of the result rows, so capping the scan at one row saves the
full-scan cost in the common case.
* A new early-termination branch in RunSelect's scan loop exits
the `WHILE !Eof()` as soon as aRows reaches the row cap (nLimit,
or nTop when no LIMIT is set), guarded by the same "no ORDER BY /
GROUP BY / agg / DISTINCT" precondition (those need the full
input). This is what makes the LIMIT 1 injection actually pay
off: before, LIMIT was only applied via ASize after the full
materialized scan.
* SubqueryCached no longer mutates the parse tree. Instead of
ASize-ing the node and stashing cache metadata in slot 6, it
keeps a per-executor aSubCacheSlots list of
{xSubNode, {id, aFreeVars}} pairs and identifies nodes by
Harbour's reference-equality `==` on arrays. Lookup is O(n) in the
number of distinct subqueries in the query, which is roughly 4 or
fewer for realistic queries, so the linear scan costs next to
nothing. Fixes the derived-table ArrayPop panic.
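The side-table idea in the last bullet can be sketched in Python, where `is` plays the role of Harbour's array `==`. This is an illustrative model only: the `SubqueryCache` class, `slot_for`, and the `free` callback are hypothetical names, not part of FiveSql2.

```python
class SubqueryCache:
    """Side table mapping AST nodes (by identity) to cache metadata."""

    def __init__(self):
        self.slots = []  # list of (node, (id, free_vars)) pairs
        self.seq = 0     # monotonic subquery id

    def slot_for(self, node, collect_free_vars):
        # Linear search by identity, not structural equality.
        for existing, meta in self.slots:
            if existing is node:
                return meta
        # First sighting: assign an id and analyze free variables once.
        self.seq += 1
        meta = (self.seq, collect_free_vars(node))
        self.slots.append((node, meta))
        return meta


cache = SubqueryCache()
node_a = ["ND_SUB", {"tables": ["ord"]}]
node_b = ["ND_SUB", {"tables": ["ord"]}]  # equal contents, different object
free = lambda node: ["e.id"]              # stand-in for free-variable analysis

meta_a = cache.slot_for(node_a, free)
meta_b = cache.slot_for(node_b, free)
meta_a_again = cache.slot_for(node_a, free)
```

The nodes themselves keep their original length, which is the property the derived-table path relies on.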
Bench impact (emp=500, prod=100, ord=5k; subquery hell):
  Pattern                          Before   After   Δ
  ──────────────────────────────────────────────────────
  H3   Correlated EXISTS           13.3s    10.0s   1.3x
  H7   Scalar-in-SELECT + JOIN     362ms    2ms     181x
  H8   NOT EXISTS self-join        1.8s     900ms   2.0x
  H11  Scalar + EXISTS + derived   13.7s    3.2s    4.3x
  (H1, H2, H5, H6, H9, H10, H12 unchanged at 3–72ms)
H7's 181x is the scalar-in-SELECT-list memoization payoff — each
dept's revenue subquery used to run 100 times (once per SALES emp),
now runs once per distinct dept.
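A rough Python model of that payoff, under stated assumptions: 100 outer rows spread over 5 departments (the department count is invented for illustration), with `run_outer` and `revenue` as hypothetical stand-ins for the outer scan and the scalar subquery.

```python
def run_outer(outer_rows, subquery, memoize):
    """Evaluate `subquery` per outer row, optionally caching by dept key."""
    cache, calls, out = {}, 0, []
    for row in outer_rows:
        key = row["dept"]
        if memoize and key in cache:
            value = cache[key]
        else:
            value = subquery(key)  # the expensive part
            calls += 1
            cache[key] = value
        out.append((row["id"], value))
    return out, calls


emps = [{"id": i, "dept": i % 5} for i in range(100)]
revenue = lambda dept: dept * 1000  # placeholder for the real subquery

plain, plain_calls = run_outer(emps, revenue, memoize=False)
memo, memo_calls = run_outer(emps, revenue, memoize=True)
```

Both runs produce the same rows; only the number of subquery executions differs (one per outer row versus one per distinct key).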
H3's 1.3x is the best we can do without semi-join lift: the 500 outer
rows carry 500 unique correlation keys, so all 500 lookups are cache
misses, and the 375 rows whose correlation finds no match must scan
the full ord table to confirm emptiness. Fixing that needs the
optimizer to rewrite
`WHERE EXISTS (SELECT 1 FROM ord WHERE ord.emp_id = e.id AND ...)`
into `WHERE e.id IN (SELECT DISTINCT emp_id FROM ord WHERE ...)`,
which is a real query-rewrite feature left for a follow-up.
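Why the rewrite would help can be seen in a toy Python model. Everything here is an assumption for illustration: the function names are hypothetical, the extra `AND ...` predicate is ignored, and the order distribution is chosen so that 375 of the 500 employees have no orders, matching the figure above.

```python
def exists_per_row(emps, orders):
    """Correlated EXISTS: one early-exit scan of orders per outer row."""
    visited, result = 0, []
    for e in emps:
        for o in orders:
            visited += 1
            if o["emp_id"] == e["id"]:
                result.append(e["id"])
                break  # LIMIT 1: stop at the first match
    return result, visited


def semi_join(emps, orders):
    """IN (SELECT DISTINCT emp_id ...): one scan builds the key set."""
    keys = {o["emp_id"] for o in orders}
    visited = len(orders)
    return [e["id"] for e in emps if e["id"] in keys], visited


emps = [{"id": i} for i in range(500)]
orders = [{"emp_id": i % 125} for i in range(5000)]  # ids 125..499 never match

slow, slow_visited = exists_per_row(emps, orders)
fast, fast_visited = semi_join(emps, orders)
```

In this model the per-row form is dominated by the non-matching rows, each of which walks all 5000 orders, while the set-based form touches the orders table once.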
Validation:
- FiveSql2 43/43
- Harbour compat 51/51
- go test ./... ALL PASS
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -36,6 +36,7 @@ CLASS TSqlExecutor
    DATA bRowBlock /* optional code block — receives SELECT cols as params */
    DATA aFetchCache /* pre-bound {nWA, nFPos} per SELECT expression, or NIL */
    DATA hSubCorrCache INIT { => } /* per-outer-key subquery result cache */
+   DATA aSubCacheSlots INIT {} /* list of {xSubNode, {id, aFreeVars}} */
    DATA nSubCacheSeq INIT 0 /* monotonic ID for subqueries */
 
    CLASSDATA hSubCache INIT { => } SHARED
@@ -570,20 +571,33 @@ METHOD EvalExpr( xNode ) CLASS TSqlExecutor
       RETURN NIL
 
    CASE xNode[ 1 ] == ND_FN
-      /* EXISTS must be handled before argument evaluation */
-      IF xNode[ 2 ] == "EXISTS" .AND. Len( xNode[ 3 ] ) > 0 .AND. ;
+      /* EXISTS and NOT EXISTS — we only need to know whether the
+       * subquery returns at least one row, not compute the full
+       * result. Force a LIMIT 1 into the subquery's hQuery so the
+       * inner scan short-circuits on the first match. Then route
+       * through SubqueryCached so correlated EXISTS still memoizes
+       * on free-variable values (helps when correlation is low
+       * cardinality; no-op when every outer row is unique). */
+      IF ( xNode[ 2 ] == "EXISTS" .OR. xNode[ 2 ] == "NOT EXISTS" ) .AND. ;
+         Len( xNode[ 3 ] ) > 0 .AND. ;
          xNode[ 3 ][ 1 ] != NIL .AND. ValType( xNode[ 3 ][ 1 ] ) == "A" .AND. ;
          xNode[ 3 ][ 1 ][ 1 ] == ND_SUB .AND. xNode[ 3 ][ 1 ][ 2 ] != NIL
-         nSavedWA := Select()
-         ::PushOuter()
-         aSubResult := TSqlExecutor():New( xNode[ 3 ][ 1 ][ 2 ], ::aParams ):Run()
-         ::PopOuter()
-         dbSelectArea( nSavedWA )
+         /* Install LIMIT 1 on the subquery hQuery. EXISTS only cares
+          * about the existence of a match, so the subquery scan can
+          * stop at the first row — the scan loop in RunSelect honours
+          * hQuery["limit"] as an early-termination target. */
+         IF ValType( xNode[ 3 ][ 1 ][ 2 ] ) == "H"
+            xNode[ 3 ][ 1 ][ 2 ][ "limit" ] := 1
+         ENDIF
+         aSubResult := ::SubqueryCached( xNode[ 3 ][ 1 ] )
          IF ValType( aSubResult ) == "A" .AND. Len( aSubResult ) >= 2 .AND. ;
             ValType( aSubResult[ 2 ] ) == "A"
+            IF xNode[ 2 ] == "NOT EXISTS"
+               RETURN Len( aSubResult[ 2 ] ) == 0
+            ENDIF
             RETURN Len( aSubResult[ 2 ] ) > 0
          ENDIF
-         RETURN .F.
+         RETURN iif( xNode[ 2 ] == "NOT EXISTS", .T., .F. )
       ENDIF
 
       /* Evaluate arguments */
@@ -1069,6 +1083,7 @@ METHOD RunSelect() CLASS TSqlExecutor
    LOCAL hJoinHash
    LOCAL lIndexUsed, aTmp
    LOCAL aFP, pcW, aGoRows
+   LOCAL nEarlyLimit
 
    aCols := ::hQuery[ "columns" ]
    ::aTables := ::hQuery[ "tables" ]
@@ -1340,6 +1355,20 @@ METHOD RunSelect() CLASS TSqlExecutor
        * join recursion. Huge win for multi-table scans. */
       ::aFetchCache := ::BuildFetchCache( aResultExprs )
       dbSelectArea( nWA )
+      /* Early-termination LIMIT: when the query has a plain
+       * LIMIT / TOP and no ORDER BY, GROUP BY, aggregates,
+       * or DISTINCT, we can stop scanning as soon as aRows
+       * reaches the cap. Huge win for `EXISTS` which plants
+       * an implicit LIMIT 1 into the subquery's hQuery. */
+      nEarlyLimit := 0
+      IF ( ValType( nLimit ) == "N" .AND. nLimit > 0 ) .OR. ;
+         ( ValType( nTop ) == "N" .AND. nTop > 0 )
+         IF Len( aOrderBy ) == 0 .AND. Len( aGroupBy ) == 0 .AND. ;
+            ! ::oAgg:HasAgg( aCols ) .AND. ! lDistinct
+            nEarlyLimit := iif( ValType( nLimit ) == "N" .AND. nLimit > 0, ;
+                                nLimit, nTop )
+         ENDIF
+      ENDIF
       WHILE ! Eof()
          IF Len( aJoins ) > 0
             ::JoinRecurse( aJoins, 1, xWhere, aResultExprs, @aRows, hJoinHash )
@@ -1350,6 +1379,9 @@ METHOD RunSelect() CLASS TSqlExecutor
                AAdd( aRows, aRow )
             ENDIF
          ENDIF
+         IF nEarlyLimit > 0 .AND. Len( aRows ) >= nEarlyLimit
+            EXIT
+         ENDIF
          dbSelectArea( nWA )
          dbSkip()
       ENDDO
@@ -1568,7 +1600,7 @@ RETURN lHadMatch
 METHOD SubqueryCached( xSubNode ) CLASS TSqlExecutor
 
    LOCAL hQ, aFreeVars, cCacheKey, aResult, nSavedWA, oSub
-   LOCAL i, xVal, nId
+   LOCAL i, xVal, nId, nSlot, aSlot
 
    IF xSubNode == NIL .OR. ValType( xSubNode ) != "A" .OR. Len( xSubNode ) < 2
       RETURN NIL
@@ -1578,17 +1610,26 @@ METHOD SubqueryCached( xSubNode ) CLASS TSqlExecutor
       RETURN NIL
    ENDIF
 
-   /* First call for this subquery: assign ID + analyze free variables */
-   IF Len( xSubNode ) < 6 .OR. xSubNode[ 6 ] == NIL
+   /* Identify this subquery: linear-search the slots list for a prior
+    * entry that references the SAME AST node (array `==` is reference
+    * compare in Harbour). Most queries have only a handful of sub-
+    * queries so the scan is trivial. Avoids mutating the parse tree. */
+   nSlot := 0
+   FOR i := 1 TO Len( ::aSubCacheSlots )
+      IF ::aSubCacheSlots[ i ][ 1 ] == xSubNode
+         nSlot := i
+         EXIT
+      ENDIF
+   NEXT
+   IF nSlot == 0
       ::nSubCacheSeq++
       aFreeVars := ::CollectFreeVars( hQ )
-      IF Len( xSubNode ) < 6
-         ASize( xSubNode, 6 )
-      ENDIF
-      xSubNode[ 6 ] := { ::nSubCacheSeq, aFreeVars }
+      AAdd( ::aSubCacheSlots, { xSubNode, { ::nSubCacheSeq, aFreeVars } } )
+      nSlot := Len( ::aSubCacheSlots )
    ENDIF
-   nId := xSubNode[ 6 ][ 1 ]
-   aFreeVars := xSubNode[ 6 ][ 2 ]
+   aSlot := ::aSubCacheSlots[ nSlot ][ 2 ]
+   nId := aSlot[ 1 ]
+   aFreeVars := aSlot[ 2 ]
 
    /* Build cache key from current values of free variables via
     * Resolve(), which walks the outer context stack. */