Monday, November 14, 2016

ASH Analysis: Detecting and Profiling Adaptive Plans in 12c

Oracle 12c introduced adaptive SQL execution plans, this post by Maria Colgan introduced the concept in blog post: "The goal of adaptive plans is to avoid catastrophic behavior of bad plans on the first execution. If we detect during execution that the optimizer’s cardinality estimates were wrong, then some plan choices can be changed “on the fly” to better options. While we can’t completely change the plan during execution, there are certain local decisions that can be changed, like join method."
However, in this post, I will to look at how you detect whether a plan has adapted. I am going to look at a real-life example of a particular statement that I had to investigate. Plan adaption may or may not be a problem. In this case I think it wasn't, but I do think you need to know it is happening. There are some new columns on v$active_session_history and DBA_HIST_ACTIVE_SESS_HISTORY.
SQL> desc v$active_session_history
Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
…
 SQL_ADAPTIVE_PLAN_RESOLVED                         NUMBER
 SQL_FULL_PLAN_HASH_VALUE                           NUMBER
…
Different adaptive sub-plans for the same overall plan will have different SQL_PLAN_HASH_VALUEs, but will share the same SQL_FULL_PLAN_HASH_VALUE.
The next query shows that a particular SQL_ID has two values of SQL_PLAN_HASH_VALUES. The column SQL_ADAPTIVE_PLAN_RESOLVED indicates that not only is this an adaptive plan, but both sub-plans have been used.
SELECT sql_plan_hash_Value, sql_full_plan_hash_Value, sql_adaptive_plan_resolved
, COUNT(DISTINCT sql_exec_id), COUNT(*)
FROM dba_hist_active_Sess_history
WHERE sql_id = '4dszd9dysry0c'
GROUP BY sql_plan_hash_value, sql_full_plan_hash_value, sql_adaptive_plan_resolved
/

SQL_PLAN_HASH_VALUE SQL_FULL_PLAN_HASH_VALUE SQL_ADAPTIVE_PLAN_RESOLVED COUNT(DISTINCTSQL_EXEC_ID)   COUNT(*)
------------------- ------------------------ -------------------------- -------------------------- ----------
         4114868852               4059585501                          1                     253           283
         3412983073               4059585501                          1                     302           309
                  0               4059585501                          0                       0             5
Note that you must pass the SQL_PLAN_HASH_VALUE to DBMS_XPLAN. If you pass SQL_FULL_PLAN_HASH_VALUE you will not find the plan.
Error: cannot fetch plan for statement_id '4dszd9dysry0c'
I will now extract the execution plans from AWR with DBMS_XPLAN.  I am using DISPLAY rather DISPLAY_AWR because I have imported the AWR from another database, and created a view of the plan data called ASH_PLAN_TABLE).
Note that this plan only has line IDs 0 to 7.
select * from table(dbms_xplan.display('ASH_PLAN_TABLE','4dszd9dysry0c',null,'dbid=2783210685 and plan_hash_value = 3412983073'));

Plan hash value: 3412983073
---------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                |       |       |    13 (100)|          |
|   1 |  SORT ORDER BY                         |                |     9 |   747 |    13   (8)| 00:00:01 |
|   2 |   HASH JOIN                            |                |     9 |   747 |    12   (0)| 00:00:01 |
|   3 |    JOIN FILTER CREATE                  | :BF0000        |     9 |   747 |    12   (0)| 00:00:01 |
|   4 |     TABLE ACCESS BY INDEX ROWID BATCHED| PGRELS         |     9 |   585 |     5   (0)| 00:00:01 |
|   5 |      INDEX RANGE SCAN                  | LINKSOURCE_201 |     9 |       |     3   (0)| 00:00:01 |
|   6 |    JOIN FILTER USE                     | :BF0000        |   122 |  2196 |     7   (0)| 00:00:01 |
|   7 |     TABLE ACCESS STORAGE FULL          | USERGROUPS     |   122 |  2196 |     7   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Note
-----
   - this is an adaptive plan
If I used +ADAPTIVE option. DBMS_XPLAN will emit all the lines in the full plan, and mark those inactive in this adaptive plan with a minus in the ID column.  The statistics collector operation at line 6 shows where the optimizer collected statistics at run time.
select * from table(dbms_xplan.display('ASH_PLAN_TABLE','4dszd9dysry0c','+ADAPTIVE','dbid=2783210685 and plan_hash_value = 3412983073'));

Plan hash value: 3412983073
------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |                |       |       |    13 (100)|          |
|   1 |  SORT ORDER BY                            |                |     9 |   747 |    13   (8)| 00:00:01 |
|   2 |   HASH JOIN                               |                |     9 |   747 |    12   (0)| 00:00:01 | 
|   3 |    JOIN FILTER CREATE                     | :BF0000        |     9 |   747 |    12   (0)| 00:00:01 |
|-  4 |     NESTED LOOPS                          |                |     9 |   747 |    12   (0)| 00:00:01 |
|-  5 |      NESTED LOOPS                         |                |       |       |            |          |
|-  6 |       STATISTICS COLLECTOR                |                |       |       |            |          |
|   7 |        TABLE ACCESS BY INDEX ROWID BATCHED| PGRELS         |     9 |   585 |     5   (0)| 00:00:01 |
|   8 |         INDEX RANGE SCAN                  | LINKSOURCE_201 |     9 |       |     3   (0)| 00:00:01 |
|-  9 |       INDEX UNIQUE SCAN                   | SYS_C008784    |       |       |            |          |
|- 10 |      TABLE ACCESS BY INDEX ROWID          | USERGROUPS     |     1 |    18 |     7   (0)| 00:00:01 |
|  11 |    JOIN FILTER USE                        | :BF0000        |   122 |  2196 |     7   (0)| 00:00:01 |
|  12 |     TABLE ACCESS STORAGE FULL             | USERGROUPS     |   122 |  2196 |     7   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------

…
Note
-----
   - this is an adaptive plan (rows marked '-' are inactive)
And now I will produce the plan for the other hash value. The lines in the execution plan are the same, and the costs are the same, but the minus signs are now in different places indicating different lines are inactive.  The plan has flipped between indexed lookups in a nested loop joins and hash joined full scans.
select * from table(dbms_xplan.display('ASH_PLAN_TABLE','4dszd9dysry0c','+ADAPTIVE','dbid=2783210685 and plan_hash_value = 4114868852'));

Plan hash value: 4114868852
------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |                |       |       |    13 (100)|          |
|   1 |  SORT ORDER BY                            |                |     8 |   664 |    13   (8)| 00:00:01 |
|-  2 |   HASH JOIN                               |                |     8 |   664 |    12   (0)| 00:00:01 |
|   3 |    JOIN FILTER CREATE                     | :BF0000        |     8 |   664 |    12   (0)| 00:00:01 |
|   4 |     NESTED LOOPS                          |                |     8 |   664 |    12   (0)| 00:00:01 |
|   5 |      NESTED LOOPS                         |                |       |       |            |          |
|-  6 |       STATISTICS COLLECTOR                |                |       |       |            |          |
|   7 |        TABLE ACCESS BY INDEX ROWID BATCHED| PGRELS         |     8 |   520 |     5   (0)| 00:00:01 |
|   8 |         INDEX RANGE SCAN                  | LINKSOURCE_201 |     9 |       |     3   (0)| 00:00:01 |
|   9 |       INDEX UNIQUE SCAN                   | SYS_C008784    |       |       |            |          |
|  10 |      TABLE ACCESS BY INDEX ROWID          | USERGROUPS     |     1 |    18 |     7   (0)| 00:00:01 |
|- 11 |    JOIN FILTER USE                        | :BF0000        |   122 |  2196 |     7   (0)| 00:00:01 |
|- 12 |     TABLE ACCESS STORAGE FULL             | USERGROUPS     |   122 |  2196 |     7   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------

Note
-----
   - this is an adaptive plan (rows marked '-' are inactive)
So now I want to look at how much DB time has been spent where in which plan. The following query profiles the DB time by plan and line within the plan.
select sql_plan_hash_Value, sql_full_plan_hash_Value, sql_plan_line_id, sql_adaptive_plan_resolved
, sum(10) db_secs
from dba_hist_active_Sess_history
where sql_id = '4dszd9dysry0c'
group by dbid, sql_plan_hash_value, sql_full_plan_hash_Value, sql_plan_line_id, sql_adaptive_plan_resolved
order by 1,2,4
/
We can see that the plan with hash value 3412983073 spent time in line 12, but we saw earlier that the plan only has 7 lines if it was not generated with the +ADAPTIVE option.
NB: The line numbers reported in the ASH data correspond to the full adaptive plan, so be careful to compare profiles by line back to expanded +ADAPTIVE execution plans.
SQL_PLAN_HASH_VALUE SQL_FULL_PLAN_HASH_VALUE SQL_PLAN_LINE_ID SQL_ADAPTIVE_PLAN_RESOLVED    DB_SECS
------------------- ------------------------ ---------------- -------------------------- ----------
         3412983073               4059585501                1                          1         60
         3412983073               4059585501                2                          1       1540
         3412983073               4059585501                3                          1         80
         3412983073               4059585501                7                          1         50
         3412983073               4059585501                8                          1         50
         3412983073               4059585501               12                          1        750
         3412983073               4059585501                                           1        560

         4114868852               4059585501                                           0         10
         4114868852               4059585501                1                          1        230
         4114868852               4059585501                2                          1         20
         4114868852               4059585501                3                          1        200
         4114868852               4059585501                4                          1         40
         4114868852               4059585501                7                          1        140
         4114868852               4059585501                8                          1        150
         4114868852               4059585501                9                          1         70
         4114868852               4059585501               10                          1        800
         4114868852               4059585501                                           1       1180
The next query profiles database time by plan for this SQL statement.
break on sample_time skip 1
alter session set nls_date_Format = 'dd.mm.yy';
select sql_full_plan_hash_Value, sql_plan_hash_value, sql_adaptive_plan_resolved, trunc(sample_time) sample_time
, count(*)
from dba_hist_active_Sess_history
where sql_id = '4dszd9dysry0c'
and sql_plan_hash_value > 0
group by dbid, sql_plan_hash_value, sql_full_plan_hash_value, sql_adaptive_plan_resolved, trunc(sample_time)
order by sample_time
/
We can see that the adaptive plan of this statement is flipping backwards and forwards over several days.
SQL_FULL_PLAN_HASH_VALUE SQL_PLAN_HASH_VALUE SQL_ADAPTIVE_PLAN_RESOLVED SAMPLE_T   COUNT(*)
------------------------ ------------------- -------------------------- -------- ----------
              4059585501          4114868852                          1 06.11.16         35
              4059585501          3412983073                          1                  78

              4059585501          4114868852                          1 08.11.16        236

              4059585501          3412983073                          1 09.11.16        226
              4059585501          4114868852                          0                   1
              4059585501          4114868852                          1                  12

              4059585501          4114868852                          1 10.11.16         21
              4059585501          3412983073                          1                  74

              4059585501          3412983073                          1 11.11.16         20
              4059585501          4114868852                          1                   1
Then we would have to dig down further to work out why this might be happening.
 This is the statement. It is a vanilla query from a packaged system.
SELECT  item_t1.PK  
FROM pgrels item_t0 
JOIN usergroups item_t1 
ON  item_t0.TargetPK = item_t1.PK  
WHERE (    item_t0.Qualifier  = :1  
      AND  item_t0.SourcePK  = :2
      AND  item_t0.languagepk  IS NULL) 
AND   (item_t1.TypePkString IN  (:3 ,:4 ,:5 ,:6 ,:7 ) ) 
order by  item_t0.SequenceNumber  ASC , item_t0.PK  ASC
The changing values of the bind variables supplied to the SQL statement have a part to play in causing the plan to change, and in this case it is not a problem and the optimizer is functioning as designed.

No comments :