Monday, November 14, 2016

ASH Analysis: Detecting and Profiling Adaptive Plans in 12c

Oracle 12c introduced adaptive SQL execution plans, this post by Maria Colgan introduced the concept in blog post: "The goal of adaptive plans is to avoid catastrophic behavior of bad plans on the first execution. If we detect during execution that the optimizer’s cardinality estimates were wrong, then some plan choices can be changed “on the fly” to better options. While we can’t completely change the plan during execution, there are certain local decisions that can be changed, like join method."
However, in this post, I will to look at how you detect whether a plan has adapted. I am going to look at a real-life example of a particular statement that I had to investigate. Plan adaption may or may not be a problem. In this case I think it wasn't, but I do think you need to know it is happening. There are some new columns on v$active_session_history and DBA_HIST_ACTIVE_SESS_HISTORY.
SQL> desc v$active_session_history
Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
…
 SQL_ADAPTIVE_PLAN_RESOLVED                         NUMBER
 SQL_FULL_PLAN_HASH_VALUE                           NUMBER
…
Different adaptive sub-plans for the same overall plan will have different SQL_PLAN_HASH_VALUEs, but will share the same SQL_FULL_PLAN_HASH_VALUE.
The next query shows that a particular SQL_ID has two values of SQL_PLAN_HASH_VALUES. The column SQL_ADAPTIVE_PLAN_RESOLVED indicates that not only is this an adaptive plan, but both sub-plans have been used.
SELECT sql_plan_hash_Value, sql_full_plan_hash_Value, sql_adaptive_plan_resolved
, COUNT(DISTINCT sql_exec_id), COUNT(*)
FROM dba_hist_active_Sess_history
WHERE sql_id = '4dszd9dysry0c'
GROUP BY sql_plan_hash_value, sql_full_plan_hash_value, sql_adaptive_plan_resolved
/

SQL_PLAN_HASH_VALUE SQL_FULL_PLAN_HASH_VALUE SQL_ADAPTIVE_PLAN_RESOLVED COUNT(DISTINCTSQL_EXEC_ID)   COUNT(*)
------------------- ------------------------ -------------------------- -------------------------- ----------
         4114868852               4059585501                          1                     253           283
         3412983073               4059585501                          1                     302           309
                  0               4059585501                          0                       0             5
Note that you must pass the SQL_PLAN_HASH_VALUE to DBMS_XPLAN. If you pass SQL_FULL_PLAN_HASH_VALUE you will not find the plan.
Error: cannot fetch plan for statement_id '4dszd9dysry0c'
I will now extract the execution plans from AWR with DBMS_XPLAN.  I am using DISPLAY rather DISPLAY_AWR because I have imported the AWR from another database, and created a view of the plan data called ASH_PLAN_TABLE).
Note that this plan only has line IDs 0 to 7.
select * from table(dbms_xplan.display('ASH_PLAN_TABLE','4dszd9dysry0c',null,'dbid=2783210685 and plan_hash_value = 3412983073'));

Plan hash value: 3412983073
---------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                |       |       |    13 (100)|          |
|   1 |  SORT ORDER BY                         |                |     9 |   747 |    13   (8)| 00:00:01 |
|   2 |   HASH JOIN                            |                |     9 |   747 |    12   (0)| 00:00:01 |
|   3 |    JOIN FILTER CREATE                  | :BF0000        |     9 |   747 |    12   (0)| 00:00:01 |
|   4 |     TABLE ACCESS BY INDEX ROWID BATCHED| PGRELS         |     9 |   585 |     5   (0)| 00:00:01 |
|   5 |      INDEX RANGE SCAN                  | LINKSOURCE_201 |     9 |       |     3   (0)| 00:00:01 |
|   6 |    JOIN FILTER USE                     | :BF0000        |   122 |  2196 |     7   (0)| 00:00:01 |
|   7 |     TABLE ACCESS STORAGE FULL          | USERGROUPS     |   122 |  2196 |     7   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Note
-----
   - this is an adaptive plan
If I used +ADAPTIVE option. DBMS_XPLAN will emit all the lines in the full plan, and mark those inactive in this adaptive plan with a minus in the ID column.  The statistics collector operation at line 6 shows where the optimizer collected statistics at run time.
select * from table(dbms_xplan.display('ASH_PLAN_TABLE','4dszd9dysry0c','+ADAPTIVE','dbid=2783210685 and plan_hash_value = 3412983073'));

Plan hash value: 3412983073
------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |                |       |       |    13 (100)|          |
|   1 |  SORT ORDER BY                            |                |     9 |   747 |    13   (8)| 00:00:01 |
|   2 |   HASH JOIN                               |                |     9 |   747 |    12   (0)| 00:00:01 | 
|   3 |    JOIN FILTER CREATE                     | :BF0000        |     9 |   747 |    12   (0)| 00:00:01 |
|-  4 |     NESTED LOOPS                          |                |     9 |   747 |    12   (0)| 00:00:01 |
|-  5 |      NESTED LOOPS                         |                |       |       |            |          |
|-  6 |       STATISTICS COLLECTOR                |                |       |       |            |          |
|   7 |        TABLE ACCESS BY INDEX ROWID BATCHED| PGRELS         |     9 |   585 |     5   (0)| 00:00:01 |
|   8 |         INDEX RANGE SCAN                  | LINKSOURCE_201 |     9 |       |     3   (0)| 00:00:01 |
|-  9 |       INDEX UNIQUE SCAN                   | SYS_C008784    |       |       |            |          |
|- 10 |      TABLE ACCESS BY INDEX ROWID          | USERGROUPS     |     1 |    18 |     7   (0)| 00:00:01 |
|  11 |    JOIN FILTER USE                        | :BF0000        |   122 |  2196 |     7   (0)| 00:00:01 |
|  12 |     TABLE ACCESS STORAGE FULL             | USERGROUPS     |   122 |  2196 |     7   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------

…
Note
-----
   - this is an adaptive plan (rows marked '-' are inactive)
And now I will produce the plan for the other hash value. The lines in the execution plan are the same, and the costs are the same, but the minus signs are now in different places indicating different lines are inactive.  The plan has flipped between indexed lookups in a nested loop joins and hash joined full scans.
select * from table(dbms_xplan.display('ASH_PLAN_TABLE','4dszd9dysry0c','+ADAPTIVE','dbid=2783210685 and plan_hash_value = 4114868852'));

Plan hash value: 4114868852
------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |                |       |       |    13 (100)|          |
|   1 |  SORT ORDER BY                            |                |     8 |   664 |    13   (8)| 00:00:01 |
|-  2 |   HASH JOIN                               |                |     8 |   664 |    12   (0)| 00:00:01 |
|   3 |    JOIN FILTER CREATE                     | :BF0000        |     8 |   664 |    12   (0)| 00:00:01 |
|   4 |     NESTED LOOPS                          |                |     8 |   664 |    12   (0)| 00:00:01 |
|   5 |      NESTED LOOPS                         |                |       |       |            |          |
|-  6 |       STATISTICS COLLECTOR                |                |       |       |            |          |
|   7 |        TABLE ACCESS BY INDEX ROWID BATCHED| PGRELS         |     8 |   520 |     5   (0)| 00:00:01 |
|   8 |         INDEX RANGE SCAN                  | LINKSOURCE_201 |     9 |       |     3   (0)| 00:00:01 |
|   9 |       INDEX UNIQUE SCAN                   | SYS_C008784    |       |       |            |          |
|  10 |      TABLE ACCESS BY INDEX ROWID          | USERGROUPS     |     1 |    18 |     7   (0)| 00:00:01 |
|- 11 |    JOIN FILTER USE                        | :BF0000        |   122 |  2196 |     7   (0)| 00:00:01 |
|- 12 |     TABLE ACCESS STORAGE FULL             | USERGROUPS     |   122 |  2196 |     7   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------

Note
-----
   - this is an adaptive plan (rows marked '-' are inactive)
So now I want to look at how much DB time has been spent where in which plan. The following query profiles the DB time by plan and line within the plan.
select sql_plan_hash_Value, sql_full_plan_hash_Value, sql_plan_line_id, sql_adaptive_plan_resolved
, sum(10) db_secs
from dba_hist_active_Sess_history
where sql_id = '4dszd9dysry0c'
group by dbid, sql_plan_hash_value, sql_full_plan_hash_Value, sql_plan_line_id, sql_adaptive_plan_resolved
order by 1,2,4
/
We can see that the plan with hash value 3412983073 spent time in line 12, but we saw earlier that the plan only has 7 lines if it was not generated with the +ADAPTIVE option.
NB: The line numbers reported in the ASH data correspond to the full adaptive plan, so be careful to compare profiles by line back to expanded +ADAPTIVE execution plans.
SQL_PLAN_HASH_VALUE SQL_FULL_PLAN_HASH_VALUE SQL_PLAN_LINE_ID SQL_ADAPTIVE_PLAN_RESOLVED    DB_SECS
------------------- ------------------------ ---------------- -------------------------- ----------
         3412983073               4059585501                1                          1         60
         3412983073               4059585501                2                          1       1540
         3412983073               4059585501                3                          1         80
         3412983073               4059585501                7                          1         50
         3412983073               4059585501                8                          1         50
         3412983073               4059585501               12                          1        750
         3412983073               4059585501                                           1        560

         4114868852               4059585501                                           0         10
         4114868852               4059585501                1                          1        230
         4114868852               4059585501                2                          1         20
         4114868852               4059585501                3                          1        200
         4114868852               4059585501                4                          1         40
         4114868852               4059585501                7                          1        140
         4114868852               4059585501                8                          1        150
         4114868852               4059585501                9                          1         70
         4114868852               4059585501               10                          1        800
         4114868852               4059585501                                           1       1180
The next query profiles database time by plan for this SQL statement.
break on sample_time skip 1
alter session set nls_date_Format = 'dd.mm.yy';
select sql_full_plan_hash_Value, sql_plan_hash_value, sql_adaptive_plan_resolved, trunc(sample_time) sample_time
, count(*)
from dba_hist_active_Sess_history
where sql_id = '4dszd9dysry0c'
and sql_plan_hash_value > 0
group by dbid, sql_plan_hash_value, sql_full_plan_hash_value, sql_adaptive_plan_resolved, trunc(sample_time)
order by sample_time
/
We can see that the adaptive plan of this statement is flipping backwards and forwards over several days.
SQL_FULL_PLAN_HASH_VALUE SQL_PLAN_HASH_VALUE SQL_ADAPTIVE_PLAN_RESOLVED SAMPLE_T   COUNT(*)
------------------------ ------------------- -------------------------- -------- ----------
              4059585501          4114868852                          1 06.11.16         35
              4059585501          3412983073                          1                  78

              4059585501          4114868852                          1 08.11.16        236

              4059585501          3412983073                          1 09.11.16        226
              4059585501          4114868852                          0                   1
              4059585501          4114868852                          1                  12

              4059585501          4114868852                          1 10.11.16         21
              4059585501          3412983073                          1                  74

              4059585501          3412983073                          1 11.11.16         20
              4059585501          4114868852                          1                   1
Then we would have to dig down further to work out why this might be happening.
 This is the statement. It is a vanilla query from a packaged system.
SELECT  item_t1.PK  
FROM pgrels item_t0 
JOIN usergroups item_t1 
ON  item_t0.TargetPK = item_t1.PK  
WHERE (    item_t0.Qualifier  = :1  
      AND  item_t0.SourcePK  = :2
      AND  item_t0.languagepk  IS NULL) 
AND   (item_t1.TypePkString IN  (:3 ,:4 ,:5 ,:6 ,:7 ) ) 
order by  item_t0.SequenceNumber  ASC , item_t0.PK  ASC
The changing values of the bind variables supplied to the SQL statement have a part to play in causing the plan to change, and in this case it is not a problem and the optimizer is functioning as designed.

Monday, November 07, 2016

DBMS_LOCK: How to ensure only one instance of a PL/SQL procedure can execute concurrently.

I recently had a need to ensure that only one instance of a PL/SQL procedure was running concurrently. My solution was to create a named lock with Oracle-supplied DBMS_LOCK package. There are lots of other blogs and posts on the internet on this subject, but this is how I used it.
Just as a row level lock will block another session from obtaining a lock on the same row, a session holding a named lock created with DBMS_LOCK can prevent another session obtaining a lock with the same name.
(updated 21.11.2016) An essential difference is that by default these locks will survive a commit or rollback.  Although you can choose to release them on commit.
In my package MY_PACKAGE, I have a procedure MY_PROCEDURE. I want to ensure that only a single instance of MY_PROCEDURE can run at any one time in my database.
CREATE OR REPLACE PACKAGE BODY my_package AS 
…
k_module     CONSTANT VARCHAR2(48 CHAR) := $$PLSQL_UNIT; --name of package for instrumentation
l_lockhandle VARCHAR2(128); -- lock handle for session lock object
…
PROCEDURE my_procedure
          (/*other parameters*/
          ,p_lock_timeout   INTEGER  DEFAULT 0 /*override lock timeout*/
) IS
  l_retcode      NUMBER;
…
BEGIN
  unlock;/*unlock any session lock from previous execution in this session*/
  DBMS_LOCK.ALLOCATE_UNIQUE(k_module,l_lockhandle); /*take out an exclusive lock*/
  l_retcode:=DBMS_LOCK.REQUEST(lockhandle=>l_lockhandle
                              ,lockmode=>DBMS_LOCK.x_mode
                              ,timeout=>p_lock_timeout
                              /*,release_on_commit=>FALSE by default*/
                              ); /*check we have the lock*/

  IF l_retcode<>0 THEN
    raise_application_error(-20000,k_module||'.MY_PROCEDURE is already running');
  END IF;
…
/*do rest of package*/unlock;
END my_procedure;
  • A package global constant is set to $$PLSQL_UNIT$$. This evaluates to the name of the package, in this case, MY_PACKAGE. I often use this during instrumentation to pass the name of the package to DBMS_APPLICATION_INFO. In this case, it will be used as the name of the lock.
  • I also have a package global variable L_LOCKHANDLE which will be used as the handle of the lock. The contents of this variable will be available to subsequent instances of the package in the same session.
  • A unique lock identifier is allocated with DBMS_LOCK.ALLOCATE_UNIQUE. The lock is available for the life of the session, up to a default timeout of 10 days. The name of the lock is the name of the package taken ultimately from $$PLSQL_UNIT$$.
  • The lock is acquired with the DBMS_LOCK.REQUEST function. It is possible to pass a timeout in section to this function for which it will wait if it cannot acquire the lock.
  • The lock is requested in exclusive mode.  Other lock modes are available, but I only want one session to be able to acquire this lock.  
  • If the request function returns a non-zero code then the lock has not been acquired and the package raises an exception with a custom message. 
ERROR at line 1:
ORA-20000: MY_PACKAGE.MY_PROCEDURE is already running
If the package failed with an error it would continue to hold the lock in the session. Restarting the procedure in the same session would generate an error because the session could not acquire the lock again. I put the code to release the lock in a separate procedure.  The lock handle is stored in a package global variable so the handle will be available from previous executions of the package in the same session.  The first and last thing MY_PROCEDURE does is to call the unlock procedure.
I also found it useful to be able to call the unlock procedure during debugging, so I exposed it in the package header.
PROCEDURE unlock IS
  l_retcode NUMBER;
…
BEGIN
…
IF l_lockhandle IS NULL THEN 
    debug_msg('no lock handle',9);
  ELSE
    l_retcode:=DBMS_LOCK.RELEASE(l_lockhandle); /*release lock*/
    debug_msg('dbms_lock.release('||l_lockhandle||') return code '||l_retcode,8);
    IF l_retcode = 0 THEN 
      debug_msg('Lock Released',2);
    ELSIF l_retcode = 3 THEN
      debug_msg('Parameter Error',8);
    ELSIF l_retcode = 4 THEN
      debug_msg('Do not own lock specified by lockhandle '||l_lockhandle,8);
    ELSIF l_retcode = 5 THEN
      debug_msg('Illegal lock handle '||l_lockhandle,8);
    END IF;
  END IF;
…
END unlock;
DBMS_LOCK is a simple and robust mechanism that can easily be incorporated into a package.  Even if you can control execution concurrency via whatever you use to schedule the package, it is a useful safety mechanism.