Tuesday, September 20, 2022

No Execution Plan Survives Contact with the Optimizer Untransformed

One of the benefits of attending Oracle conferences is that by listening and talking to other people I get a different perspective on things. Sometimes, something gives me an idea or reminds me of the importance of something that I don't use often enough. I was talking with Neil Chandler about SQL Query Transformation. We came up with a variation of a well known quote:
It isn't completely accurate.  Not every query gets transformed, but it occurs commonly, it made a good title, and you are reading this blog!
During SQL parse, the optimizer can transform a SQL query into another SQL query that is functionally identical but that results in an execution plan with a lower cost (and therefore should execute more quickly). Sometimes, multiple transformations can be applied to a single statement. 
The Oracle documentation describes various forms of transformation. You can see in the execution plan that something has happened, but you can't see the transformed SQL statement directly. However, it can be obtained from the optimizer trace that can be enabled by setting event 10053.

Demonstration 

I am going to take a simple SQL query
set pages 99 lines 200 autotrace off
alter session set tracefile_identifier='no_unnest';
alter session set events '10053 trace name context forever, level 1';
select emplid, name, effdt, last_name
from ps_names x
where x.last_name = 'Smith'
and x.name_type = 'PRI'
and x.effdt = (
  SELECT /*+NO_UNNEST*/ MAX(x1.effdt)
  FROM ps_names x1
  WHERE x1.emplid = x.emplid
  AND x1.name_type = x.name_type
  AND x1.effdt <= SYSDATE)
/
alter session set events '10053 trace name context off';
@spooltrc
  • For the second execution, an UNNEST hint is used to force the optimizer to unnest the sub-query.
alter session set tracefile_identifier='unnest';
alter session set events '10053 trace name context forever, level 1';
select emplid, name, effdt, last_name
from ps_names x
where x.last_name = 'Smith'
and x.name_type = 'PRI'
and x.effdt = (
  SELECT /*+UNNEST*/ MAX(x1.effdt)
  FROM ps_names x1
  WHERE x1.emplid = x.emplid
  AND x1.name_type = x.name_type
  AND x1.effdt <= SYSDATE)
/
alter session set events '10053 trace name context off';
@spooltrc
This is the execution plan from the first trace file for the statement with the NO_UNNEST hint. The select query blocks are simply numbered sequentially and thus are called SEL$1 and SEL$2. SEL$2 is the sub-query that references PS_NAMES with the row source alias X1. No query transformation has occurred.
-------------------------------------------------------+-----------------------------------+
| Id  | Operation                            | Name    | Rows  | Bytes | Cost  | Time      |
-------------------------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT                     |         |       |       |   122 |           |
| 1   |  TABLE ACCESS BY INDEX ROWID BATCHED | PS_NAMES|     1 |    44 |   120 |  00:00:02 |
| 2   |   INDEX SKIP SCAN                    | PS_NAMES|    11 |       |   112 |  00:00:02 |
| 3   |    SORT AGGREGATE                    |         |     1 |    21 |       |           |
| 4   |     FIRST ROW                        |         |     1 |    21 |     2 |  00:00:01 |
| 5   |      INDEX RANGE SCAN (MIN/MAX)      | PS_NAMES|     1 |    21 |     2 |  00:00:01 |
-------------------------------------------------------+-----------------------------------+
Query Block Name / Object Alias (identified by operation id):
------------------------------------------------------------
 1 - SEL$1                / "X"@"SEL$1"
 2 - SEL$1                / "X"@"SEL$1"
 3 - SEL$2
 5 - SEL$2                / "X1"@"SEL$2"
------------------------------------------------------------
Predicate Information:
----------------------
1 - filter("X"."LAST_NAME"='Smith')
2 - access("X"."NAME_TYPE"='PRI')
2 - filter(("X"."NAME_TYPE"='PRI' AND "X"."EFFDT"=))
5 - access("X1"."EMPLID"=:B1 AND "X1"."NAME_TYPE"=:B2 AND "X1"."EFFDT"<=SYSDATE@!)
Now, let's look at the optimizer trace file for the statement with the UNNEST hint. First, we can see the statement as submitted with its SQL_ID.
Trace file /opt/oracle/psft/db/oracle-server/diag/rdbms/cdbhcm/CDBHCM/trace/CDBHCM_ora_21909_unnest.trc
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.7.0.0.0
…
----- Current SQL Statement for this session (sql_id=7r3mwa86fma5t) -----
select emplid, name, effdt, last_name
from ps_names x
where x.last_name = 'Smith'
and x.name_type = 'PRI'
and x.effdt = (
  SELECT /*+UNNEST*/ MAX(x1.effdt)
  FROM ps_names x1
  WHERE x1.emplid = x.emplid
  AND x1.name_type = x.name_type
  AND x1.effdt <= SYSDATE)
…
Later in the trace, we can see the fully expanded SQL statement preceded by the 'UNPARSED QUERY IS' message. 
  • All the SQL language keywords have been forced into upper case.
  • All the object and column names have been made upper case to match the objects.
  • Every column and table is double-quoted which makes them case sensitive.    
  • The columns all have row source aliases. 
  • The row sources (tables in this case) are fully qualified.
  • Only the literal 'Smith' is in mixed case.
Various unparsed queries may appear in the trace as the optimizer tries and costs different transformations. These are not nicely formatted, the expanded statements are just a long string of text.   The first one is the expanded form of the untransformed statement.
Stmt: ******* UNPARSED QUERY IS *******
SELECT "X"."EMPLID" "EMPLID","X"."NAME" "NAME","X"."EFFDT" "EFFDT","X"."LAST_NAME" "LAST_NAME" FROM "SYSADM"."PS_NAMES" 
"X" WHERE "X"."LAST_NAME"='Smith' AND "X"."NAME_TYPE"='PRI' AND "X"."EFFDT"= (SELECT /*+ UNNEST */ MAX("X1"."EFFDT") 
"MAX(X1.EFFDT)" FROM "SYSADM"."PS_NAMES" "X1" WHERE "X1"."EMPLID"="X"."EMPLID" AND "X1"."NAME_TYPE"="X"."NAME_TYPE" AND
"X1"."EFFDT"<=SYSDATE@!)
Here the sub-query has been transformed into an in-line view.  I have reformatted it to make it easier to read.
CVM:   Merging complex view SEL$683B0107 (#2) into SEL$C772B8D1 (#1).
qbcp:******* UNPARSED QUERY IS *******
SELECT "X"."EMPLID" "EMPLID","X"."NAME" "NAME","X"."EFFDT" "EFFDT","X"."LAST_NAME" "LAST_NAME" 
FROM  (SELECT /*+ UNNEST */ MAX("X1"."EFFDT") "MAX(X1.EFFDT)","X1"."EMPLID" "ITEM_0","X1"."NAME_TYPE" "ITEM_1" 
       FROM   "SYSADM"."PS_NAMES" "X1" 
       WHERE  "X1"."EFFDT"<=SYSDATE@! 
       GROUP BY "X1"."EMPLID","X1"."NAME_TYPE") "VW_SQ_1"
      ,"SYSADM"."PS_NAMES" "X" 
WHERE "X"."LAST_NAME"='Smith' 
AND   "X"."NAME_TYPE"='PRI' 
AND   "X"."EFFDT"="VW_SQ_1"."MAX(X1.EFFDT)" 
AND   "VW_SQ_1"."ITEM_0"="X"."EMPLID" 
AND   "VW_SQ_1"."ITEM_1"="X"."NAME_TYPE"
This is the final form of the statement that was executed and that produced the execution plan.  The in-line view has been merged into the parent query.  There will only be a final query section if any transformations have occurred. Again, I have reformatted it to make it easier to read.  
Final query after transformations:******* UNPARSED QUERY IS *******
SELECT /*+ UNNEST */ "X"."EMPLID" "EMPLID","X"."NAME" "NAME","X"."EFFDT" "EFFDT",'Smith' "LAST_NAME" 
FROM  "SYSADM"."PS_NAMES" "X1"
     ,"SYSADM"."PS_NAMES" "X" 
WHERE "X"."LAST_NAME"='Smith' 
AND   "X"."NAME_TYPE"='PRI' 
AND   "X1"."EMPLID"="X"."EMPLID" 
AND   "X1"."NAME_TYPE"="X"."NAME_TYPE" 
AND   "X1"."EFFDT"<=SYSDATE@! 
AND   "X1"."NAME_TYPE"='PRI' 
GROUP BY "X1"."NAME_TYPE","X".ROWID,"X"."EFFDT","X"."NAME","X"."EMPLID" 
HAVING "X"."EFFDT"=MAX("X1"."EFFDT")
…
  • PS_NAMES X1 has been moved from the subquery into the main from clause. Instead of a correlated subquery, we now have a two-table join. 
  • The query is grouped by the ROWID on row source X and the other selected columns. 
  • Instead of joining the tables on NAME_TYPE, the literal criterion has been duplicated in X1 
  • A having clause is used to join X.EFFDT to the maximum value of X1.EFFDT. 
  • Instead of selecting LAST_NAME from X, the literal value in the predicate has been put in the select clause. 
If we look at the execution plan for the unnested statement we can see X and X1 are now in query block SEL$841DDE77 that has been unnested and merged.
…
----- Explain Plan Dump -----
…
----------------------------------------+-----------------------------------+
| Id  | Operation             | Name    | Rows  | Bytes | Cost  | Time      |
----------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT      |         |       |       |   139 |           |
| 1   |  FILTER               |         |       |       |       |           |
| 2   |   SORT GROUP BY       |         |     1 |    77 |   139 |  00:00:02 |
| 3   |    NESTED LOOPS       |         |     3 |   231 |   138 |  00:00:02 |
| 4   |     TABLE ACCESS FULL | PS_NAMES|     2 |   112 |   136 |  00:00:02 |
| 5   |     INDEX RANGE SCAN  | PS_NAMES|     1 |    21 |     1 |  00:00:01 |
----------------------------------------+-----------------------------------+
Query Block Name / Object Alias (identified by operation id):
------------------------------------------------------------
 1 - SEL$841DDE77
 4 - SEL$841DDE77         / "X"@"SEL$1"
 5 - SEL$841DDE77         / "X1"@"SEL$2"
------------------------------------------------------------
Predicate Information:
----------------------
1 - filter("EFFDT"=MAX("X1"."EFFDT"))
4 - filter(("X"."LAST_NAME"='Smith' AND "X"."NAME_TYPE"='PRI'))
5 - access("X1"."EMPLID"="X"."EMPLID" AND "X1"."NAME_TYPE"='PRI' AND "X1"."EFFDT"<=SYSDATE@!)
…
The new query block name is a hash value based on the names of other blocks.  The presence of such a block name is an indication of query transformation occurring.  The query block name is stable and it is referenced in the outline of hints. 
"A question that we could ask about the incomprehensible query block names that Oracle generates is: 'are they deterministic?' – is it possible for the same query to give you the same plan while generating different query block names on different versions of Oracle (or different days of the week). The answer is (or should be) no; when Oracle generates a query block name (after supplying the initial defaults of sel$1, sel$2 etc.) it applies a hashing function to the query block names that have gone INTO a transformation to generate the name that it will use for the block that comes OUT of the transformation." - Jonathan Lewis: Query Blocks and Inline Views 
As Jonathan points out "the 'Outline Data' section of the report tells us that query block" in my example SEL$841DDE77 "is an 'outline_leaf', in other words, it is a 'final' query block that has actually been subject to independent optimization". We can also see other query block names referenced in OUTLINE hints.
  Outline Data:
  /*+
    BEGIN_OUTLINE_DATA
…
      OUTLINE_LEAF(@"SEL$841DDE77")
      MERGE(@"SEL$683B0107" >"SEL$C772B8D1")
      OUTLINE(@"SEL$C772B8D1")
      UNNEST(@"SEL$2")
      OUTLINE(@"SEL$683B0107")
      OUTLINE(@"SEL$7511BFD2")
      OUTLINE(@"SEL$2")
      OUTLINE(@"SEL$1")
      FULL(@"SEL$841DDE77" "X"@"SEL$1")
      INDEX(@"SEL$841DDE77" "X1"@"SEL$2" ("PS_NAMES"."EMPLID" "PS_NAMES"."NAME_TYPE" "PS_NAMES"."EFFDT"))
      LEADING(@"SEL$841DDE77" "X"@"SEL$1" "X1"@"SEL$2")
      USE_NL(@"SEL$841DDE77" "X1"@"SEL$2")
    END_OUTLINE_DATA
  */
We can see these query block names being generated in the trace as a number of transformations are applied with some description of the transformation.
Registered qb: SEL$683B0107 0xfc6e3030 (SUBQ INTO VIEW FOR COMPLEX UNNEST SEL$2)
Registered qb: SEL$7511BFD2 0xfc6c5c68 (VIEW ADDED SEL$1)
Registered qb: SEL$C772B8D1 0xfc6c5c68 (SUBQUERY UNNEST SEL$7511BFD2; SEL$2)
Registered qb: SEL$841DDE77 0xfc6d91e0 (VIEW MERGE SEL$C772B8D1; SEL$683B0107; SEL$C772B8D1)

Wednesday, September 14, 2022

Obtaining Trace Files without Access to the Database Server

Why Trace? 

For many years, I used database SQL Trace to investigate SQL performance problems. I would trace a process, obtain the trace file, profile it (with Oracle's TKPROF or another profiling tool such as the Method R profiler, TVD$XTAT, or OraSRP), and analyse the profile. 
Active Session History (ASH) was introduced in Oracle 10g.  Today, it is usually where I start to investigate performance problems. It has the advantage that it is always on, and I can just query ASH data from the Automatic Workload Repository (AWR). However, ASH is only available on Enterprise Edition and requires the Diagnostics Pack licence. 
Sometimes, even if available, ASH isn't enough. ASH is based on sampling database activity, while trace is a record of all the SQL activity in a session. Some short-lived behaviour, that doesn't generate many samples, is difficult to investigate with ASH. Sometimes, it is necessary to dig deeper and use SQL trace. 
On occasion, you might want to generate other forms trace.  For example, an optimizer trace (event 10053) in order to understand how an execution plan was arrived at.

Where is my Trace File? 

A trend that I have observed over the years is that is is becoming ever more difficult to get hold of the trace files. If you are not the production DBA, you are unlikely to get access to the database server. Frequently, I find that pre-production performance test databases, which are often clones of the production database, are treated as production systems. After all, they contain production data. The move to the cloud has accelerated that trend. On some cloud services, you have no access to the database server at all! 
In the past, I have blogged about using an external table from which the trace file can be queried, a variation of a theme others had also written about. It required certain privileges, a new external table was required for each trace file, and you had to know the name of the trace file, and on which RAC instance it was located. 
However, in version 12.2, it is much easier.  Oracle has provided some new views that report what trace files are available and then query their contents. 

Where Is This Session Writing Its Trace File?

The Automatic Diagnostic Repository (ADR) was first documented in 11g. The view V$DIAG_INFO was introduced in 12c, from which you can query the state of the ADR. This includes the various directory paths to which files are written and the name of the current trace file.
select dbid, con_dbid, name from v$database;
column inst_id format 99 heading 'Inst|ID'
column con_id format 99 heading 'Con|ID'
column name format a22
column value format a95
select * from v$diag_info;

Inst                                                                                                                        Con
  ID NAME                   VALUE                                                                                            ID
---- ---------------------- ----------------------------------------------------------------------------------------------- ---
   1 Diag Enabled           TRUE                                                                                              0
   1 ADR Base               /opt/oracle/psft/db/oracle-server                                                                 0
   1 ADR Home               /opt/oracle/psft/db/oracle-server/diag/rdbms/cdbhcm/CDBHCM                                        0
   1 Diag Trace             /opt/oracle/psft/db/oracle-server/diag/rdbms/cdbhcm/CDBHCM/trace                                  0
   1 Diag Alert             /opt/oracle/psft/db/oracle-server/diag/rdbms/cdbhcm/CDBHCM/alert                                  0
   1 Diag Incident          /opt/oracle/psft/db/oracle-server/diag/rdbms/cdbhcm/CDBHCM/incident                               0
   1 Diag Cdump             /opt/oracle/psft/db/oracle-server/diag/rdbms/cdbhcm/CDBHCM/cdump                                  0
   1 Health Monitor         /opt/oracle/psft/db/oracle-server/diag/rdbms/cdbhcm/CDBHCM/hm                                     0
   1 Default Trace File     /opt/oracle/psft/db/oracle-server/diag/rdbms/cdbhcm/CDBHCM/trace/CDBHCM_ora_27009_unnest.trc      0
   1 Active Problem Count   0                                                                                                 0
   1 Active Incident Count  0                                                                                                 0
   1 ORACLE_HOME            /opt/oracle/psft/db/oracle-server/19.3.0.0                                                        0

What files have been written? 

The available files are reported by V$DIAG_TRACE_FILE
column adr_home format a60
column trace_filename format a40
column change_time format a32
column modify_time format a32
column con_id format 999
select *
from v$DIAG_TRACE_FILE
where adr_home = '&adr_Home'
order by modify_time
/

ADR_HOME                                                     TRACE_FILENAME                           CHANGE_TIME                      MODIFY_TIME                      CON_ID
------------------------------------------------------------ ---------------------------------------- -------------------------------- -------------------------------- ------
…
/opt/oracle/psft/db/oracle-server/diag/rdbms/cdbhcm/CDBHCM   CDBHCM_ora_27674_no_unnest.trc           13-SEP-22 02.06.10.000 PM +00:00 13-SEP-22 02.06.10.000 PM +00:00      3
/opt/oracle/psft/db/oracle-server/diag/rdbms/cdbhcm/CDBHCM   CDBHCM_ora_27674_unnest.trc              13-SEP-22 02.06.11.000 PM +00:00 13-SEP-22 02.06.11.000 PM +00:00      3

What is in the file? 

I can then extract the contents of the file from V$DIAG_TRACE_FILE_CONTENTS. Each line of the trace is returned in a different row. 
This script spools the contents of the current trace file from SQL Plus locally to a file of the same name. It stores the name of the ADR home and its file path and the trace file name to SQL*Plus variables and then uses these to query the trace file contents. 
I can generate a trace and then run this script to extract it locally.
REM spooltrc.sql

clear screen
set heading on pages 99 lines 180 verify off echo off trimspool on termout on feedback off
column value format a95
column value new_value adr_home heading 'ADR Home' 
select value from v$diag_info where name = 'ADR Home';
column value new_value diag_trace heading 'Diag Trace'
select value from v$diag_info where name = 'Diag Trace';
column value new_value trace_filename heading 'Trace File'
select SUBSTR(value,2+LENGTH('&diag_trace')) value from v$diag_info where name = 'Default Trace File'
/
column adr_home format a60
column trace_filename format a40
column change_time format a32
column modify_time format a32
column con_id format 999
select *
from v$DIAG_TRACE_FILE
where adr_home = '&adr_home'
and trace_filename = '&trace_filename'
/
set head off pages 0 lines 5000 verify off echo off timi off termout off feedback off long 5000
spool &trace_filename
select payload
from v$diag_trace_file_contents
where adr_home = '&adr_home'
and trace_filename = '&trace_filename'
order by line_number
/
spool off
set head on pages 99 lines 180 verify on echo on termout on feedback on
The spooltrc.sql script is available on Github.  In a subsequent blog, I will demonstrate how to use it.
The payload is a VARCHAR2 column, so it is easy to search one or several trace files for specific text. This is useful if you are having trouble identifying the trace file of interest. 
See also: