Tuesday, September 20, 2022

No Execution Plan Survives Contact with the Optimizer Untransformed

One of the benefits of attending Oracle conferences is that by listening and talking to other people I get a different perspective on things. Sometimes, something gives me an idea or reminds me of the importance of something that I don't use often enough. I was talking with Neil Chandler about SQL Query Transformation. We came up with a variation of a well known quote:
It isn't completely accurate.  Not every query gets transformed, but it occurs commonly, it made a good title, and you are reading this blog!
During SQL parse, the optimizer can transform a SQL query into another SQL query that is functionally identical but that results in an execution plan with a lower cost (and therefore should execute more quickly). Sometimes, multiple transformations can be applied to a single statement. 
The Oracle documentation describes various forms of transformation. You can see in the execution plan that something has happened, but you can't see the transformed SQL statement directly. However, it can be obtained from the optimizer trace that can be enabled by setting event 10053.

Demonstration 

I am going to take a simple SQL query
set pages 99 lines 200 autotrace off
alter session set tracefile_identifier='no_unnest';
alter session set events '10053 trace name context forever, level 1';
select emplid, name, effdt, last_name
from ps_names x
where x.last_name = 'Smith'
and x.name_type = 'PRI'
and x.effdt = (
  SELECT /*+NO_UNNEST*/ MAX(x1.effdt)
  FROM ps_names x1
  WHERE x1.emplid = x.emplid
  AND x1.name_type = x.name_type
  AND x1.effdt <= SYSDATE)
/
alter session set events '10053 trace name context off';
@spooltrc
  • For the second execution, an UNNEST hint is used to force the optimizer to unnest the sub-query.
alter session set tracefile_identifier='unnest';
alter session set events '10053 trace name context forever, level 1';
select emplid, name, effdt, last_name
from ps_names x
where x.last_name = 'Smith'
and x.name_type = 'PRI'
and x.effdt = (
  SELECT /*+UNNEST*/ MAX(x1.effdt)
  FROM ps_names x1
  WHERE x1.emplid = x.emplid
  AND x1.name_type = x.name_type
  AND x1.effdt <= SYSDATE)
/
alter session set events '10053 trace name context off';
@spooltrc
This is the execution plan from the first trace file for the statement with the NO_UNNEST hint. The select query blocks are simply numbered sequentially and thus are called SEL$1 and SEL$2. SEL$2 is the sub-query that references PS_NAMES with the row source alias X1. No query transformation has occurred.
-------------------------------------------------------+-----------------------------------+
| Id  | Operation                            | Name    | Rows  | Bytes | Cost  | Time      |
-------------------------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT                     |         |       |       |   122 |           |
| 1   |  TABLE ACCESS BY INDEX ROWID BATCHED | PS_NAMES|     1 |    44 |   120 |  00:00:02 |
| 2   |   INDEX SKIP SCAN                    | PS_NAMES|    11 |       |   112 |  00:00:02 |
| 3   |    SORT AGGREGATE                    |         |     1 |    21 |       |           |
| 4   |     FIRST ROW                        |         |     1 |    21 |     2 |  00:00:01 |
| 5   |      INDEX RANGE SCAN (MIN/MAX)      | PS_NAMES|     1 |    21 |     2 |  00:00:01 |
-------------------------------------------------------+-----------------------------------+
Query Block Name / Object Alias (identified by operation id):
------------------------------------------------------------
 1 - SEL$1                / "X"@"SEL$1"
 2 - SEL$1                / "X"@"SEL$1"
 3 - SEL$2
 5 - SEL$2                / "X1"@"SEL$2"
------------------------------------------------------------
Predicate Information:
----------------------
1 - filter("X"."LAST_NAME"='Smith')
2 - access("X"."NAME_TYPE"='PRI')
2 - filter(("X"."NAME_TYPE"='PRI' AND "X"."EFFDT"=))
5 - access("X1"."EMPLID"=:B1 AND "X1"."NAME_TYPE"=:B2 AND "X1"."EFFDT"<=SYSDATE@!)
Now, let's look at the optimizer trace file for the statement with the UNNEST hint. First, we can see the statement as submitted with its SQL_ID.
Trace file /opt/oracle/psft/db/oracle-server/diag/rdbms/cdbhcm/CDBHCM/trace/CDBHCM_ora_21909_unnest.trc
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.7.0.0.0
…
----- Current SQL Statement for this session (sql_id=7r3mwa86fma5t) -----
select emplid, name, effdt, last_name
from ps_names x
where x.last_name = 'Smith'
and x.name_type = 'PRI'
and x.effdt = (
  SELECT /*+UNNEST*/ MAX(x1.effdt)
  FROM ps_names x1
  WHERE x1.emplid = x.emplid
  AND x1.name_type = x.name_type
  AND x1.effdt <= SYSDATE)
…
Later in the trace, we can see the fully expanded SQL statement preceded by the 'UNPARSED QUERY IS' message. 
  • All the SQL language keywords have been forced into upper case.
  • All the object and column names have been made upper case to match the objects.
  • Every column and table is double-quoted which makes them case sensitive.    
  • The columns all have row source aliases. 
  • The row sources (tables in this case) are fully qualified.
  • Only the literal 'Smith' is in mixed case.
Various unparsed queries may appear in the trace as the optimizer tries and costs different transformations. These are not nicely formatted, the expanded statements are just a long string of text.   The first one is the expanded form of the untransformed statement.
Stmt: ******* UNPARSED QUERY IS *******
SELECT "X"."EMPLID" "EMPLID","X"."NAME" "NAME","X"."EFFDT" "EFFDT","X"."LAST_NAME" "LAST_NAME" FROM "SYSADM"."PS_NAMES" 
"X" WHERE "X"."LAST_NAME"='Smith' AND "X"."NAME_TYPE"='PRI' AND "X"."EFFDT"= (SELECT /*+ UNNEST */ MAX("X1"."EFFDT") 
"MAX(X1.EFFDT)" FROM "SYSADM"."PS_NAMES" "X1" WHERE "X1"."EMPLID"="X"."EMPLID" AND "X1"."NAME_TYPE"="X"."NAME_TYPE" AND
"X1"."EFFDT"<=SYSDATE@!)
Here the sub-query has been transformed into an in-line view.  I have reformatted it to make it easier to read.
CVM:   Merging complex view SEL$683B0107 (#2) into SEL$C772B8D1 (#1).
qbcp:******* UNPARSED QUERY IS *******
SELECT "X"."EMPLID" "EMPLID","X"."NAME" "NAME","X"."EFFDT" "EFFDT","X"."LAST_NAME" "LAST_NAME" 
FROM  (SELECT /*+ UNNEST */ MAX("X1"."EFFDT") "MAX(X1.EFFDT)","X1"."EMPLID" "ITEM_0","X1"."NAME_TYPE" "ITEM_1" 
       FROM   "SYSADM"."PS_NAMES" "X1" 
       WHERE  "X1"."EFFDT"<=SYSDATE@! 
       GROUP BY "X1"."EMPLID","X1"."NAME_TYPE") "VW_SQ_1"
      ,"SYSADM"."PS_NAMES" "X" 
WHERE "X"."LAST_NAME"='Smith' 
AND   "X"."NAME_TYPE"='PRI' 
AND   "X"."EFFDT"="VW_SQ_1"."MAX(X1.EFFDT)" 
AND   "VW_SQ_1"."ITEM_0"="X"."EMPLID" 
AND   "VW_SQ_1"."ITEM_1"="X"."NAME_TYPE"
This is the final form of the statement that was executed and that produced the execution plan.  The in-line view has been merged into the parent query.  There will only be a final query section if any transformations have occurred. Again, I have reformatted it to make it easier to read.  
Final query after transformations:******* UNPARSED QUERY IS *******
SELECT /*+ UNNEST */ "X"."EMPLID" "EMPLID","X"."NAME" "NAME","X"."EFFDT" "EFFDT",'Smith' "LAST_NAME" 
FROM  "SYSADM"."PS_NAMES" "X1"
     ,"SYSADM"."PS_NAMES" "X" 
WHERE "X"."LAST_NAME"='Smith' 
AND   "X"."NAME_TYPE"='PRI' 
AND   "X1"."EMPLID"="X"."EMPLID" 
AND   "X1"."NAME_TYPE"="X"."NAME_TYPE" 
AND   "X1"."EFFDT"<=SYSDATE@! 
AND   "X1"."NAME_TYPE"='PRI' 
GROUP BY "X1"."NAME_TYPE","X".ROWID,"X"."EFFDT","X"."NAME","X"."EMPLID" 
HAVING "X"."EFFDT"=MAX("X1"."EFFDT")
…
  • PS_NAMES X1 has been moved from the subquery into the main from clause. Instead of a correlated subquery, we now have a two-table join. 
  • The query is grouped by the ROWID on row source X and the other selected columns. 
  • Instead of joining the tables on NAME_TYPE, the literal criterion has been duplicated in X1 
  • A having clause is used to join X.EFFDT to the maximum value of X1.EFFDT. 
  • Instead of selecting LAST_NAME from X, the literal value in the predicate has been put in the select clause. 
If we look at the execution plan for the unnested statement we can see X and X1 are now in query block SEL$841DDE77 that has been unnested and merged.
…
----- Explain Plan Dump -----
…
----------------------------------------+-----------------------------------+
| Id  | Operation             | Name    | Rows  | Bytes | Cost  | Time      |
----------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT      |         |       |       |   139 |           |
| 1   |  FILTER               |         |       |       |       |           |
| 2   |   SORT GROUP BY       |         |     1 |    77 |   139 |  00:00:02 |
| 3   |    NESTED LOOPS       |         |     3 |   231 |   138 |  00:00:02 |
| 4   |     TABLE ACCESS FULL | PS_NAMES|     2 |   112 |   136 |  00:00:02 |
| 5   |     INDEX RANGE SCAN  | PS_NAMES|     1 |    21 |     1 |  00:00:01 |
----------------------------------------+-----------------------------------+
Query Block Name / Object Alias (identified by operation id):
------------------------------------------------------------
 1 - SEL$841DDE77
 4 - SEL$841DDE77         / "X"@"SEL$1"
 5 - SEL$841DDE77         / "X1"@"SEL$2"
------------------------------------------------------------
Predicate Information:
----------------------
1 - filter("EFFDT"=MAX("X1"."EFFDT"))
4 - filter(("X"."LAST_NAME"='Smith' AND "X"."NAME_TYPE"='PRI'))
5 - access("X1"."EMPLID"="X"."EMPLID" AND "X1"."NAME_TYPE"='PRI' AND "X1"."EFFDT"<=SYSDATE@!)
…
The new query block name is a hash value based on the names of other blocks.  The presence of such a block name is an indication of query transformation occurring.  The query block name is stable and it is referenced in the outline of hints. 
"A question that we could ask about the incomprehensible query block names that Oracle generates is: 'are they deterministic?' – is it possible for the same query to give you the same plan while generating different query block names on different versions of Oracle (or different days of the week). The answer is (or should be) no; when Oracle generates a query block name (after supplying the initial defaults of sel$1, sel$2 etc.) it applies a hashing function to the query block names that have gone INTO a transformation to generate the name that it will use for the block that comes OUT of the transformation." - Jonathan Lewis: Query Blocks and Inline Views 
As Jonathan points out "the 'Outline Data' section of the report tells us that query block" in my example SEL$841DDE77 "is an 'outline_leaf', in other words, it is a 'final' query block that has actually been subject to independent optimization". We can also see other query block names referenced in OUTLINE hints.
  Outline Data:
  /*+
    BEGIN_OUTLINE_DATA
…
      OUTLINE_LEAF(@"SEL$841DDE77")
      MERGE(@"SEL$683B0107" >"SEL$C772B8D1")
      OUTLINE(@"SEL$C772B8D1")
      UNNEST(@"SEL$2")
      OUTLINE(@"SEL$683B0107")
      OUTLINE(@"SEL$7511BFD2")
      OUTLINE(@"SEL$2")
      OUTLINE(@"SEL$1")
      FULL(@"SEL$841DDE77" "X"@"SEL$1")
      INDEX(@"SEL$841DDE77" "X1"@"SEL$2" ("PS_NAMES"."EMPLID" "PS_NAMES"."NAME_TYPE" "PS_NAMES"."EFFDT"))
      LEADING(@"SEL$841DDE77" "X"@"SEL$1" "X1"@"SEL$2")
      USE_NL(@"SEL$841DDE77" "X1"@"SEL$2")
    END_OUTLINE_DATA
  */
We can see these query block names being generated in the trace as a number of transformations are applied with some description of the transformation.
Registered qb: SEL$683B0107 0xfc6e3030 (SUBQ INTO VIEW FOR COMPLEX UNNEST SEL$2)
Registered qb: SEL$7511BFD2 0xfc6c5c68 (VIEW ADDED SEL$1)
Registered qb: SEL$C772B8D1 0xfc6c5c68 (SUBQUERY UNNEST SEL$7511BFD2; SEL$2)
Registered qb: SEL$841DDE77 0xfc6d91e0 (VIEW MERGE SEL$C772B8D1; SEL$683B0107; SEL$C772B8D1)

No comments :