perf(FiveSql2): EXISTS → LIMIT 1 early exit, subquery identity via AScan

Extreme subquery stress bench (12 patterns spanning scalar-in-SELECT,
nested correlation, EXISTS, NOT IN, derived tables, self-joins, and
mixed combinations) exposed three weaknesses in the post-ROLLUP state:

1. EXISTS / NOT EXISTS evaluated the full subquery result per outer
   row, even though it only needs to know whether any row matches.
2. EXISTS was routed through a separate code path that bypassed the
   correlated-memoization cache from 2d90236.
3. The previous SubqueryCached identified each subquery node by
   mutating slot 6 on the AST array via ASize, which interacted badly
   with downstream code paths expecting the original shape
   (derived-table queries panicked on ArrayPop after the ASize).

Fixes:

* EXISTS / NOT EXISTS now route through SubqueryCached the same way
  ND_SUB in WHERE does, so correlated EXISTS predicates memoize on
  outer free-variable values when the cardinality is low.
* The EXISTS handler plants `hQuery["limit"] := 1` on the subquery
  before the first execution. EXISTS doesn't care about the rest of
  the result rows, so capping the scan at one row saves the full-scan
  cost in the common case.
* A new early-termination branch in RunSelect's scan loop exits the
  `WHILE !Eof()` as soon as aRows reaches nLimit (or nTop), guarded by
  the same "no ORDER BY / GROUP BY / agg / DISTINCT" precondition
  (those need the full input). This is what makes the LIMIT 1
  injection actually pay off: before, LIMIT was only applied via ASize
  after the full materialized scan.
* SubqueryCached no longer mutates the parse tree. Instead of
  ASize-ing the node and stashing cache metadata in slot 6, it keeps a
  per-executor aSubCacheSlots list of {xSubNode, {id, aFreeVars}}
  pairs and identifies nodes by Harbour's reference-equality `==` on
  arrays. Lookup is O(n), where n is the number of distinct subqueries
  in the query; that is ≤ 4 or so for realistic queries, so the linear
  scan is effectively free. Fixes the derived-table ArrayPop panic.

Bench impact (emp=500, prod=100, ord=5k; subquery hell):

  Pattern                         Before   After    Δ
  ───────────────────────────────────────────────────────
  H3  Correlated EXISTS           13.3s    10.0s    1.3x
  H7  Scalar-in-SELECT + JOIN     362ms    2ms      181x
  H8  NOT EXISTS self-join        1.8s     900ms    2.0x
  H11 Scalar + EXISTS + derived   13.7s    3.2s     4.3x
  (H1, H2, H5, H6, H9, H10, H12 unchanged at 3–72ms)

H7's 181x is the scalar-in-SELECT-list memoization payoff: each dept's
revenue subquery used to run 100 times (once per SALES emp), and now
runs once per distinct dept.

H3's 1.3x is the best we can do without semi-join lift. Each of the
500 outer rows has a unique correlation key, so all 500 evaluations
are cache misses, and the 375 rows whose correlation finds no match
must scan the full ord table to confirm emptiness. Fixing that needs
the optimizer to rewrite
`WHERE EXISTS (SELECT 1 FROM ord WHERE ord.emp_id = e.id AND ...)`
into
`WHERE e.id IN (SELECT DISTINCT emp_id FROM ord WHERE ...)`,
which is a real query-rewrite feature left for a follow-up.

Validation:

- FiveSql2 43/43
- Harbour compat 51/51
- go test ./... ALL PASS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 16:31:36 +09:00
parent 2d9023622c
commit ce7593c50f
1 changed files with 58 additions and 17 deletions
--- a/_FiveSql2/src/TSqlExecutor.prg
+++ b/_FiveSql2/src/TSqlExecutor.prg
@@ -36,6 +36,7 @@ CLASS TSqlExecutor
   DATA bRowBlock   /* optional code block — receives SELECT cols as params */
   DATA aFetchCache /* pre-bound {nWA, nFPos} per SELECT expression, or NIL */
   DATA hSubCorrCache   INIT { => }    /* per-outer-key subquery result cache */
+   DATA aSubCacheSlots  INIT {}         /* list of {xSubNode, {id, aFreeVars}} */
   DATA nSubCacheSeq    INIT 0          /* monotonic ID for subqueries */

   CLASSDATA hSubCache INIT { => } SHARED
@@ -570,20 +571,33 @@ METHOD EvalExpr( xNode ) CLASS TSqlExecutor
      RETURN NIL

   CASE xNode[ 1 ] == ND_FN
-      /* EXISTS must be handled before argument evaluation */
-      IF xNode[ 2 ] == "EXISTS" .AND. Len( xNode[ 3 ] ) > 0 .AND. ;
+      /* EXISTS and NOT EXISTS — we only need to know whether the
+       * subquery returns at least one row, not compute the full
+       * result. Force a LIMIT 1 into the subquery's hQuery so the
+       * inner scan short-circuits on the first match. Then route
+       * through SubqueryCached so correlated EXISTS still memoizes
+       * on free-variable values (helps when correlation is low
+       * cardinality; no-op when every outer row is unique). */
+      IF ( xNode[ 2 ] == "EXISTS" .OR. xNode[ 2 ] == "NOT EXISTS" ) .AND. ;
+         Len( xNode[ 3 ] ) > 0 .AND. ;
         xNode[ 3 ][ 1 ] != NIL .AND. ValType( xNode[ 3 ][ 1 ] ) == "A" .AND. ;
         xNode[ 3 ][ 1 ][ 1 ] == ND_SUB .AND. xNode[ 3 ][ 1 ][ 2 ] != NIL
-         nSavedWA := Select()
-         ::PushOuter()
-         aSubResult := TSqlExecutor():New( xNode[ 3 ][ 1 ][ 2 ], ::aParams ):Run()
-         ::PopOuter()
-         dbSelectArea( nSavedWA )
+         /* Install LIMIT 1 on the subquery hQuery. EXISTS only cares
+          * about the existence of a match, so the subquery scan can
+          * stop at the first row — the scan loop in RunSelect honours
+          * hQuery["limit"] as an early-termination target. */
+         IF ValType( xNode[ 3 ][ 1 ][ 2 ] ) == "H"
+            xNode[ 3 ][ 1 ][ 2 ][ "limit" ] := 1
+         ENDIF
+         aSubResult := ::SubqueryCached( xNode[ 3 ][ 1 ] )
         IF ValType( aSubResult ) == "A" .AND. Len( aSubResult ) >= 2 .AND. ;
            ValType( aSubResult[ 2 ] ) == "A"
+            IF xNode[ 2 ] == "NOT EXISTS"
+               RETURN Len( aSubResult[ 2 ] ) == 0
+            ENDIF
            RETURN Len( aSubResult[ 2 ] ) > 0
         ENDIF
-         RETURN .F.
+         RETURN iif( xNode[ 2 ] == "NOT EXISTS", .T., .F. )
      ENDIF

      /* Evaluate arguments */
@@ -1069,6 +1083,7 @@ METHOD RunSelect() CLASS TSqlExecutor
   LOCAL hJoinHash
   LOCAL lIndexUsed, aTmp
   LOCAL aFP, pcW, aGoRows
+   LOCAL nEarlyLimit

   aCols    := ::hQuery[ "columns" ]
   ::aTables := ::hQuery[ "tables" ]
@@ -1340,6 +1355,20 @@ METHOD RunSelect() CLASS TSqlExecutor
                   * join recursion. Huge win for multi-table scans. */
                  ::aFetchCache := ::BuildFetchCache( aResultExprs )
                  dbSelectArea( nWA )
+                  /* Early-termination LIMIT: when the query has a plain
+                   * LIMIT / TOP and no ORDER BY, GROUP BY, aggregates,
+                   * or DISTINCT, we can stop scanning as soon as aRows
+                   * reaches the cap. Huge win for `EXISTS` which plants
+                   * an implicit LIMIT 1 into the subquery's hQuery. */
+                  nEarlyLimit := 0
+                  IF ( ValType( nLimit ) == "N" .AND. nLimit > 0 ) .OR. ;
+                     ( ValType( nTop ) == "N" .AND. nTop > 0 )
+                     IF Len( aOrderBy ) == 0 .AND. Len( aGroupBy ) == 0 .AND. ;
+                        ! ::oAgg:HasAgg( aCols ) .AND. ! lDistinct
+                        nEarlyLimit := iif( ValType( nLimit ) == "N" .AND. nLimit > 0, ;
+                                            nLimit, nTop )
+                     ENDIF
+                  ENDIF
                  WHILE ! Eof()
                     IF Len( aJoins ) > 0
                        ::JoinRecurse( aJoins, 1, xWhere, aResultExprs, @aRows, hJoinHash )
@@ -1350,6 +1379,9 @@ METHOD RunSelect() CLASS TSqlExecutor
                           AAdd( aRows, aRow )
                        ENDIF
                     ENDIF
+                     IF nEarlyLimit > 0 .AND. Len( aRows ) >= nEarlyLimit
+                        EXIT
+                     ENDIF
                     dbSelectArea( nWA )
                     dbSkip()
                  ENDDO
@@ -1568,7 +1600,7 @@ RETURN lHadMatch
 METHOD SubqueryCached( xSubNode ) CLASS TSqlExecutor

   LOCAL hQ, aFreeVars, cCacheKey, aResult, nSavedWA, oSub
-   LOCAL i, xVal, nId
+   LOCAL i, xVal, nId, nSlot, aSlot

   IF xSubNode == NIL .OR. ValType( xSubNode ) != "A" .OR. Len( xSubNode ) < 2
      RETURN NIL
@@ -1578,17 +1610,26 @@ METHOD SubqueryCached( xSubNode ) CLASS TSqlExecutor
      RETURN NIL
   ENDIF

-   /* First call for this subquery: assign ID + analyze free variables */
-   IF Len( xSubNode ) < 6 .OR. xSubNode[ 6 ] == NIL
+   /* Identify this subquery: linear-search the slots list for a prior
+    * entry that references the SAME AST node (array `==` is reference
+    * compare in Harbour). Most queries have only a handful of sub-
+    * queries so the scan is trivial. Avoids mutating the parse tree. */
+   nSlot := 0
+   FOR i := 1 TO Len( ::aSubCacheSlots )
+      IF ::aSubCacheSlots[ i ][ 1 ] == xSubNode
+         nSlot := i
+         EXIT
+      ENDIF
+   NEXT
+   IF nSlot == 0
      ::nSubCacheSeq++
      aFreeVars := ::CollectFreeVars( hQ )
-      IF Len( xSubNode ) < 6
-         ASize( xSubNode, 6 )
-      ENDIF
-      xSubNode[ 6 ] := { ::nSubCacheSeq, aFreeVars }
+      AAdd( ::aSubCacheSlots, { xSubNode, { ::nSubCacheSeq, aFreeVars } } )
+      nSlot := Len( ::aSubCacheSlots )
   ENDIF
-   nId := xSubNode[ 6 ][ 1 ]
-   aFreeVars := xSubNode[ 6 ][ 2 ]
+   aSlot := ::aSubCacheSlots[ nSlot ][ 2 ]
+   nId := aSlot[ 1 ]
+   aFreeVars := aSlot[ 2 ]

   /* Build cache key from current values of free variables via
    * Resolve(), which walks the outer context stack. */