DOAG 2019, Nuremberg

Tuesday, October 15, 2019

Reading the Active Session History Compressed Export File in eDB360/SQLd360 as an External Table

I am a regular user of SQLDB360 (the single distribution of Carlos Sierra's eDB360 and Mauro Pagano's SQLd360 tools) for remote performance analysis.  eDB360 reports on the whole database; SQLd360 reports on a single SQL ID.  eDB360 also runs SQLd360 reports for the top SQL statements.  Both tools extract ASH data to a flat file.  Originally, it was intended that these files would be loaded with the eAdam utility, but eDB360 no longer executes the eAdam scripts itself.
The eDB360/SQLd360 output zip files contain an export of the ASH data that is spooled from the database in SQL*Plus, compressed with gzip, and put into a tarball.  You should find a single .tar file that contains several gzipped files.  In eDB360, dba_hist_active_sess_history.txt.gz is simply the result of:
SELECT * FROM dba_hist_active_sess_history
In SQLd360 it will be for just the specified SQL_ID:
SELECT * FROM dba_hist_active_sess_history
WHERE sql_id = '…'
The exact structure of the file varies with the version of Oracle because columns have been added to the view in successive releases.  There used to be an eAdam structure control file from which the DDL for a repository could be generated, but it is no longer produced.  However, it is easy to deal with this manually because many of the reports query DBA_HIST_ACTIVE_SESS_HISTORY, and you will find a description of the view in each report.
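If you need to reconstruct the column list for a particular version, a simple approach (this is just a sketch, not part of the eDB360/SQLd360 scripts) is to query the data dictionary of a database of the same version:
SELECT column_id, column_name, data_type, data_length
FROM   all_tab_columns
WHERE  owner = 'SYS'
AND    table_name = 'DBA_HIST_ACTIVE_SESS_HISTORY'
ORDER BY column_id
/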

Loading Compressed File as an External Table

Two directory objects must be created in the database that will be used to read the data file.
GRANT CREATE ANY DIRECTORY TO scott;
CREATE OR REPLACE DIRECTORY edb360 AS '/media/sf_temp';
CREATE OR REPLACE DIRECTORY exec_dir AS '/usr/bin';
GRANT EXECUTE ON DIRECTORY exec_dir TO scott;
  • Directory edb360 points to where the data file is located.  
  • The external table will load the compressed file directly, uncompressing it on the fly with the preprocessor facility, without writing an uncompressed version of the file to disk.  Directory exec_dir is the location of the Unix zcat executable. 
    • N.B. It is important that this is the actual directory in which the executable resides, not a symbolically linked directory such as /bin.
Now you can create an external table.  I have created it in the SCOTT schema rather than SYS.
There are a few things to bear in mind:
  1. DISABLE_DIRECTORY_LINK_CHECK is used to suppress the check that the data file being loaded is not referenced via a directory link.  In my case, this is necessary because I am referencing a directory on the host machine from within a Linux VM running in VirtualBox.  However, this does not suppress the check for files executed by the preprocessor, so exec_dir must not point to a linked directory.
  2. The compressed data file is pre-processed (effectively piped) through the zcat command.  This is like importing a compressed dump file through a Unix FIFO.
  3. The # character has had to be removed from all column names.
  4. The format of date fields in the data file must be specified explicitly to convert them properly.
  5. The name of the compressed file to be imported is specified in the LOCATION clause.
  6. The reject limit has been deliberately set to 100, a small non-zero value.  The file is exported from DBA_HIST_ACTIVE_SESS_HISTORY, whose columns change from version to version, so if the definition of the external table does not match the definition in the source database you may get errors.  Limiting the rejections to 100 keeps the .log and .bad files manageable while resolving the error.  When the definitions match, no rows should be rejected.
DROP TABLE scott.ash_load PURGE;

CREATE TABLE scott.ash_load
(SNAP_ID NUMBER
,DBID NUMBER
,INSTANCE_NUMBER NUMBER
,SAMPLE_ID NUMBER
,SAMPLE_TIME TIMESTAMP(3)
,SESSION_ID NUMBER
,SESSION_SERIAL NUMBER
,SESSION_TYPE VARCHAR2(10)
,FLAGS NUMBER
,USER_ID NUMBER
,SQL_ID VARCHAR2(13)
,IS_SQLID_CURRENT VARCHAR2(1)
,SQL_CHILD_NUMBER NUMBER
,SQL_OPCODE NUMBER
,SQL_OPNAME VARCHAR2(64)
,FORCE_MATCHING_SIGNATURE NUMBER
,TOP_LEVEL_SQL_ID VARCHAR2(13)
,TOP_LEVEL_SQL_OPCODE NUMBER
,SQL_PLAN_HASH_VALUE NUMBER
,SQL_PLAN_LINE_ID NUMBER
,SQL_PLAN_OPERATION VARCHAR2(64)
,SQL_PLAN_OPTIONS VARCHAR2(64)
,SQL_EXEC_ID NUMBER
,SQL_EXEC_START DATE
,PLSQL_ENTRY_OBJECT_ID NUMBER
,PLSQL_ENTRY_SUBPROGRAM_ID NUMBER
,PLSQL_OBJECT_ID NUMBER
,PLSQL_SUBPROGRAM_ID NUMBER
,QC_INSTANCE_ID NUMBER
,QC_SESSION_ID NUMBER
,QC_SESSION_SERIAL NUMBER
,PX_FLAGS NUMBER
,EVENT VARCHAR2(64)
,EVENT_ID NUMBER
,SEQ NUMBER
,P1TEXT VARCHAR2(64)
,P1 NUMBER
,P2TEXT VARCHAR2(64)
,P2 NUMBER
,P3TEXT VARCHAR2(64)
,P3 NUMBER
,WAIT_CLASS VARCHAR2(64)
,WAIT_CLASS_ID NUMBER
,WAIT_TIME NUMBER
,SESSION_STATE VARCHAR2(7)
,TIME_WAITED NUMBER
,BLOCKING_SESSION_STATUS VARCHAR2(11)
,BLOCKING_SESSION NUMBER
,BLOCKING_SESSION_SERIAL NUMBER
,BLOCKING_INST_ID NUMBER
,BLOCKING_HANGCHAIN_INFO VARCHAR2(1)
,CURRENT_OBJ NUMBER
,CURRENT_FILE NUMBER
,CURRENT_BLOCK NUMBER
,CURRENT_ROW NUMBER
,TOP_LEVEL_CALL NUMBER
,TOP_LEVEL_CALL_NAME VARCHAR2(64)
,CONSUMER_GROUP_ID NUMBER
,XID RAW(8)
,REMOTE_INSTANCE NUMBER
,TIME_MODEL NUMBER
,IN_CONNECTION_MGMT VARCHAR2(1)
,IN_PARSE VARCHAR2(1)
,IN_HARD_PARSE VARCHAR2(1)
,IN_SQL_EXECUTION VARCHAR2(1)
,IN_PLSQL_EXECUTION VARCHAR2(1)
,IN_PLSQL_RPC VARCHAR2(1)
,IN_PLSQL_COMPILATION VARCHAR2(1)
,IN_JAVA_EXECUTION VARCHAR2(1)
,IN_BIND VARCHAR2(1)
,IN_CURSOR_CLOSE VARCHAR2(1)
,IN_SEQUENCE_LOAD VARCHAR2(1)
,CAPTURE_OVERHEAD VARCHAR2(1)
,REPLAY_OVERHEAD VARCHAR2(1)
,IS_CAPTURED VARCHAR2(1)
,IS_REPLAYED VARCHAR2(1)
,SERVICE_HASH NUMBER
,PROGRAM VARCHAR2(64)
,MODULE VARCHAR2(64)
,ACTION VARCHAR2(64)
,CLIENT_ID VARCHAR2(64)
,MACHINE VARCHAR2(64)
,PORT NUMBER
,ECID VARCHAR2(64)
,TM_DELTA_TIME NUMBER
,TM_DELTA_CPU_TIME NUMBER
,TM_DELTA_DB_TIME NUMBER
,DELTA_TIME NUMBER
,DELTA_READ_IO_REQUESTS NUMBER
,DELTA_WRITE_IO_REQUESTS NUMBER
,DELTA_READ_IO_BYTES NUMBER
,DELTA_WRITE_IO_BYTES NUMBER
,DELTA_INTERCONNECT_IO_BYTES NUMBER
,PGA_ALLOCATED NUMBER
,TEMP_SPACE_ALLOCATED NUMBER
)
ORGANIZATION EXTERNAL
(TYPE ORACLE_LOADER
 DEFAULT DIRECTORY edb360
 ACCESS PARAMETERS 
 (RECORDS DELIMITED BY newline 
  DISABLE_DIRECTORY_LINK_CHECK 
  PREPROCESSOR exec_dir:'zcat' 
  FIELDS TERMINATED BY '<,>'
  MISSING FIELD VALUES ARE NULL
  REJECT ROWS WITH ALL NULL FIELDS
  NULLIF = BLANKS
(SNAP_ID,DBID,INSTANCE_NUMBER,SAMPLE_ID
,SAMPLE_TIME CHAR(80) date_format TIMESTAMP MASK "YYYY-MM-DD/HH24:mi:ss.ff"
,SESSION_ID,SESSION_SERIAL,SESSION_TYPE
,FLAGS,USER_ID
,SQL_ID
,IS_SQLID_CURRENT
,SQL_CHILD_NUMBER
,SQL_OPCODE,SQL_OPNAME
,FORCE_MATCHING_SIGNATURE,TOP_LEVEL_SQL_ID,TOP_LEVEL_SQL_OPCODE
,SQL_PLAN_HASH_VALUE,SQL_PLAN_LINE_ID
,SQL_PLAN_OPERATION,SQL_PLAN_OPTIONS
,SQL_EXEC_ID
,SQL_EXEC_START CHAR(80) date_format TIMESTAMP MASK "YYYY-MM-DD/HH24:mi:ss.ff"
,PLSQL_ENTRY_OBJECT_ID,PLSQL_ENTRY_SUBPROGRAM_ID
,PLSQL_OBJECT_ID,PLSQL_SUBPROGRAM_ID
,QC_INSTANCE_ID,QC_SESSION_ID,QC_SESSION_SERIAL,PX_FLAGS
,EVENT,EVENT_ID,SEQ
,P1TEXT,P1,P2TEXT,P2,P3TEXT,P3
,WAIT_CLASS,WAIT_CLASS_ID,WAIT_TIME
,SESSION_STATE,TIME_WAITED
,BLOCKING_SESSION_STATUS,BLOCKING_SESSION,BLOCKING_SESSION_SERIAL
,BLOCKING_INST_ID,BLOCKING_HANGCHAIN_INFO
,CURRENT_OBJ,CURRENT_FILE,CURRENT_BLOCK,CURRENT_ROW
,TOP_LEVEL_CALL,TOP_LEVEL_CALL_NAME
,CONSUMER_GROUP_ID,XID,REMOTE_INSTANCE,TIME_MODEL
,IN_CONNECTION_MGMT,IN_PARSE,IN_HARD_PARSE,IN_SQL_EXECUTION
,IN_PLSQL_EXECUTION,IN_PLSQL_RPC,IN_PLSQL_COMPILATION
,IN_JAVA_EXECUTION,IN_BIND,IN_CURSOR_CLOSE,IN_SEQUENCE_LOAD
,CAPTURE_OVERHEAD,REPLAY_OVERHEAD,IS_CAPTURED,IS_REPLAYED
,SERVICE_HASH,PROGRAM,MODULE,ACTION,CLIENT_ID,MACHINE,PORT,ECID
,TM_DELTA_TIME,TM_DELTA_CPU_TIME,TM_DELTA_DB_TIME
,DELTA_TIME
,DELTA_READ_IO_REQUESTS,DELTA_WRITE_IO_REQUESTS
,DELTA_READ_IO_BYTES,DELTA_WRITE_IO_BYTES,DELTA_INTERCONNECT_IO_BYTES
,PGA_ALLOCATED,TEMP_SPACE_ALLOCATED
))
LOCATION ('dba_hist_active_sess_history.txt.gz')
) REJECT LIMIT 100
/
Now the external table can be queried in exactly the same way that you would query DBA_HIST_ACTIVE_SESS_HISTORY in order to profile DB time, although bear in mind that an external table can only be full-scanned.
set timi on lines 200 pages 99
COLUMN force_matching_signature FORMAT 99999999999999999999
COLUMN min(sample_time) FORMAT a26
COLUMN max(sample_time) FORMAT a26
SELECT min(sample_time), max(sample_time), COUNT(*) samples
FROM scott.ash_load
/

MIN(SAMPLE_TIME)           MAX(SAMPLE_TIME)              SAMPLES
-------------------------- -------------------------- ----------
21-JUN-19 05.00.36.115 AM  21-JUL-19 01.00.00.823 PM     1213333
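For example, a minimal sketch of a profile of DB time by wait event from the same external table (this assumes the usual 10-second sample interval of DBA_HIST_ACTIVE_SESS_HISTORY):
SELECT CASE WHEN session_state = 'ON CPU' THEN 'CPU' ELSE event END event
,      SUM(10) ash_secs      /*each AWR ASH sample represents ~10 seconds*/
,      COUNT(*) samples
FROM   scott.ash_load
GROUP BY CASE WHEN session_state = 'ON CPU' THEN 'CPU' ELSE event END
ORDER BY ash_secs DESC
/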

Wednesday, October 09, 2019

Purging SQL Statements and Execution Plans from AWR

A previous blog post discussed the problem posed by old execution plans captured by AWR being used by dbms_xplan.display_awr() to describe current executions of the same plan, while displaying the old costs.
Tanel Poder commented that there is a TIMESTAMP column on both V$SQL_PLAN and DBA_HIST_SQL_PLAN so you know when the plans were generated, but this is not used to purge plans from AWR.  It prompted me to look into purging of statements and plans in more detail.
There has been a long-standing problem of old plans not being purged at all from AWR.  There are various notes on the Oracle support site that refer to this behaviour.   However, it appears to have been addressed by at least Oracle 19c. 
There is no snapshot ID (SNAP_ID) column on either DBA_HIST_SQLTEXT or DBA_HIST_SQL_PLAN, but this column does exist on the underlying tables WRH$_SQLTEXT and WRH$_SQL_PLAN, where it is used to restrict the purge process to rows created before the retention period.  SQL statements and plans collected prior to the AWR retention limit, for which there are no current SQL statistics, are purged.  However, SQL statements are not purged if SQL baselines are present for them.
Even so, an old plan can still be retained in the AWR if the same SQL statement is still submitted, the same plan is still generated by the optimiser, and the statement is captured by the AWR snapshot either because it is one of the top statements or because the SQL is 'colored'.  dbms_xplan.display_awr() will show the plan with the original costs.  It will not be purged because there are SQL statistics more recent than the AWR retention limit.

Oracle Notes

There are various references on the Oracle support website to the lack of purging resulting in significant growth in the size of the AWR tables, including bugs in old releases.

dbms_workload_repository.purge_sql_details()

SQL statements and plans are automatically purged along with the AWR snapshots when they reach the retention limit.  There is also a procedure for manual purging.  A session trace of the purge procedure, including bind variable values, reveals the behaviour of the purge process.  It purges WRH$_SQLTEXT, WRH$_SQL_PLAN and WRH$_SQL_BIND_METADATA for SNAP_IDs in the range 0 to one less than the oldest snapshot that is not beyond the retention limit.  When purging manually, it is also possible to set a maximum number of rows to be purged from each table.
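A minimal sketch of invoking that manual purge follows.  I have omitted the optional parameters (such as the row limit mentioned above), so check the package specification in your release.  The recursive SQL shown below it was captured from a trace of this call.
BEGIN
  /* purge SQL text, plans and bind metadata beyond the AWR retention limit */
  dbms_workload_repository.purge_sql_details;
END;
/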
select min(snap_id) 
from   wrm$_snapshot  
where  dbid = :dbid 
and    (cast(SYSTIMESTAMP as TIMESTAMP) - end_interval_time) <= :retention
SQL statements are not purged if there are either SQL statistics for them in AWR or a SQL baseline on them.
delete from (
  select *       
  from  wrh$_sqltext  tab  
  where tab.dbid = :dbid 
  and :beg_snap <= tab.snap_id and tab.snap_id < :end_snap 
  and not exists (
    select 1
    from   wri$_sqltext_refcount  refc  
    where  refc.dbid = tab.dbid 
    and    refc.sql_id = tab.sql_id
    and    refc.con_dbid = tab.con_dbid and refc.ref_count > 0) 
  and not exists (
    select 1  
    from   wrh$_sqlstat  s         
    where  s.dbid = :dbid and s.con_dbid = tab.con_dbid 
    and    s.sql_id = tab.sql_id))
where rownum <= :bind1
Execution plans are not purged if there are SQL statistics for the statement in AWR.  Statistics older than the retention limit will have been purged by this stage.
delete from (
  select *       
  from  wrh$_sql_plan  tab  
  where tab.dbid = :dbid 
  and :beg_snap <= tab.snap_id and tab.snap_id < :end_snap 
  and not exists (
    select 1           
    from   wrh$_sqlstat s          
    where  s.dbid = :dbid and s.con_dbid = tab.con_dbid 
    and    s.sql_id = tab.sql_id))
where rownum <= :bind1
dbms_workload_repository.drop_snapshot_range() behaves similarly but does not need to test for SQL statistics because it is deleting them in that range.
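For comparison, a minimal sketch of dropping a range of snapshots (the snapshot IDs are purely illustrative):
BEGIN
  /* drop snapshots 66189 to 66217 together with their SQL statistics */
  dbms_workload_repository.drop_snapshot_range(low_snap_id => 66189, high_snap_id => 66217);
END;
/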

Comparison of Optimizer Cost of Captured Plans with Optimizer Cost of SQL Statistics

It is possible to compare the optimizer cost on the SQL statistics with the optimizer cost of the captured plan and see where they have diverged.  In this example, I am looking for an absolute difference in cost of at least 10 and also a difference of at least 10%.
set lines 140 pages 99
col costpctdiff heading 'Cost|%Diff' format 99999 
col costdiff heading 'Cost|Diff' format 99999999 
col plan_hash_value heading 'SQL Plan|Hash Value'
col child_number heading 'Child|No.' format 9999
col inst_id heading 'Inst|ID' format 999
col hcost heading 'AWR|Cost' format 9999999
col ccost heading 'Cursor|Cost' format 9999999
col htimestamp heading 'AWR|Timestamp'
col ctimestamp heading 'Cursor|Timestamp'
col end_interval_time format a26
col snap_id heading 'Snap|ID' format 99999
col awr_cost format 99999999
col optimizer_Cost heading 'Opt.|Cost' format 99999999
col optimizer_env_hash_value heading 'Opt. Env.|Hash Value'
col num_stats heading 'Num|Stats' format 9999
alter session set nls_date_format = 'hh24:mi:ss dd.mm.yy';
break on plan_hash_value skip 1 on sql_id on dbid
ttitle 'compare AWR/recent plan costs'

with h as ( /*captured plan outside retention limit*/
select  p.dbid, p.sql_id, p.plan_hash_Value, max(cost) cost
,       max(p.timestamp) timestamp
from    dba_hist_sql_plan p 
,       dba_hist_wr_control c
where   p.dbid = c.dbid
and     p.cost>0
and     (p.object_owner != 'SYS' OR p.object_owner IS NULL)  --omit SYS owned objects
and     p.timestamp < sysdate-c.retention 
group by p.dbid, p.sql_id, p.plan_hash_value
), s as ( /*SQL statistics*/
select  t.dbid, t.sql_id, t.plan_hash_value, t.optimizer_env_hash_value
,       t.optimizer_cost
,       MIN(t.snap_id) snap_id
,       MIN(s.end_interval_time) end_interval_time
,       COUNT(*) num_stats
from    dba_hist_snapshot s
,       dba_hist_sqlstat t
where   s.dbid = t.dbid
and     s.snap_id = t.snap_id
and     t.optimizer_cost > 0
GROUP BY t.dbid, t.sql_id, t.plan_hash_value, t.optimizer_env_hash_value
,       t.optimizer_cost
), x as (
Select  NVL(h.dbid,s.dbid) dbid
,       NVL(h.sql_id,s.sql_id) sql_id
,       NVL(h.plan_hash_value,s.plan_hash_value) plan_hash_value
,       h.cost hcost, h.timestamp htimestamp
,       s.snap_id, s.end_interval_time
,       s.optimizer_env_hash_value, s.optimizer_cost
,       s.num_stats
,       s.optimizer_cost-h.cost costdiff
,       100*s.optimizer_cost/NULLIF(h.cost,0) costpctdiff
From    h join s
on      h.plan_hash_value = s.plan_hash_value
and     h.sql_id = s.sql_id
and     h.dbid = s.dbid
), y as (
SELECT  x.*
,       MAX(ABS(costpctdiff)) OVER (PARTITION BY dbid, sql_id, plan_hash_value) maxcostpctdiff
,       MAX(ABS(costdiff)) OVER (PARTITION BY dbid, sql_id, plan_hash_value) maxcostabsdiff
FROM    x
)
SELECT  dbid, sql_id, plan_hash_value, hcost, htimestamp
,       snap_id, end_interval_time, optimizer_env_hash_value, optimizer_cost, num_stats, costdiff, costpctdiff
FROM    y
WHERE   maxcostpctdiff>=10
And     maxcostabsdiff>=10
order by plan_hash_value,sql_id,end_interval_time
/
break on report
ttitle off
We can see that, according to the statement statistics, costs for the same execution plan vary over time.  They can go up or down.  This can be caused by different object statistics or different bind variables.  It is also perfectly possible for different cursors to produce the same execution plan but with a different cost.
                                                       compare AWR/recent plan costs

                           SQL Plan      AWR AWR                 Snap                             Opt. Env.      Opt.   Num      Cost   Cost
      DBID SQL_ID        Hash Value     Cost Timestamp             ID END_INTERVAL_TIME          Hash Value      Cost Stats      Diff  %Diff
---------- ------------- ---------- -------- ----------------- ------ -------------------------- ---------- --------- ----- --------- ------
…
3719366004 1852wpfdvbbdb  428098007     5759 06:04:57 07.01.19  66189 18-JUL-19 01.00.29.158 PM  2275008064     15760    22     10001    274
                                        5759 06:04:57 07.01.19  66191 18-JUL-19 02.00.30.019 PM  2275008064      6434    40       675    112
                                        5759 06:04:57 07.01.19  66196 18-JUL-19 04.30.41.024 PM  2275008064      6435    24       676    112
                                        5759 06:04:57 07.01.19  66197 18-JUL-19 05.00.26.793 PM  2275008064      6448    10       689    112
                                        5759 06:04:57 07.01.19  66217 19-JUL-19 03.00.09.245 AM  2275008064      6444     2       685    112
3719366004 dnxpk0a7tmk2c                5761 14:25:15 07.01.19  66247 19-JUL-19 06.00.59.483 PM  2275008064      4258     2     -1503     74
…
3719366004 882x55gh06a0k 4034560521   321903 09:02:37 01.04.19  66208 18-JUL-19 10.30.45.355 PM  2275008064    471900     6    149997    147
                                      321903 09:02:37 01.04.19  66285 20-JUL-19 01.00.01.061 PM  2275008064    471848     2    149945    147
                                      321903 09:02:37 01.04.19  66326 21-JUL-19 09.30.18.514 AM  2275008064    473557     2    151654    147
…
3719366004 fgyja8zfdf4s6 4057647380      188 22:02:32 11.02.19  66214 19-JUL-19 01.30.10.479 AM    50414560       496     4       308    264

           9ud99dpmkdtq6 4059558786        3 03:56:20 01.11.18  66191 18-JUL-19 02.00.30.019 PM    50414560       170   226       167   5667

           3d6ph39wbq0zq 4078589457    62946 06:23:42 01.05.18  66270 20-JUL-19 05.30.15.218 AM  3570113222    196673     2    133727    312

           9ud99dpmkdtq6 4082397478        3 20:36:53 03.10.18  66189 18-JUL-19 01.00.29.158 PM    50414560      2296   174      2293  76533

           6rkn5ty3mpdg8 4086831979     1096 11:54:01 22.01.19  66201 18-JUL-19 07.00.07.631 PM    50414560       549    16      -547     50
                                        1096 11:54:01 22.01.19  66212 19-JUL-19 12.30.00.506 AM    50414560      6055    16      4959    552
…

Summary

Old execution plans will not be purged from AWR if there are recent SQL statistics for the same SQL ID and execution plan hash value.  You will see the old plan costs if you extract the plan with dbms_xplan.display_awr().

Is that a problem?
  • This does not have an impact on the behaviour of the database.  The data in AWR tells you what happened in the past.
  • I think the most important thing is to know how far you can trust display_awr().  The plan can be used as a guide to the path of execution, but not to the amount of work done.  It will show you the correct operations in the execution plan, but costs, bind variables, partition IDs and anything else that does not contribute to the plan hash value should be approached with caution.  This is why I often profile DB time for a SQL statement by SQL_PLAN_LINE_ID, so I can see how much time was spent in each operation (see the sketch after this list).
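A minimal sketch of that profile, assuming the standard 10-second AWR sampling interval and using bind placeholders for the SQL ID and plan hash value:
SELECT sql_plan_line_id, sql_plan_operation, sql_plan_options
,      SUM(10) ash_secs      /*each AWR ASH sample represents ~10 seconds*/
FROM   dba_hist_active_sess_history
WHERE  sql_id = :sql_id
AND    sql_plan_hash_value = :plan_hash_value
GROUP BY sql_plan_line_id, sql_plan_operation, sql_plan_options
ORDER BY sql_plan_line_id
/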
Oracle Doc ID 1478615.1 implies that it is permitted to manually purge specific plans.  A subsequent AWR snapshot might then capture a new version of the same execution plan.  However, there are potential problems with this approach.
You may have an example where the old estimates for a plan don't make sense for a recent execution, but if you replace the plan with an updated one to get recent estimates, those may not make sense for activity in the past.
You can have different child cursors with different costs for the same execution plan.  Therefore, it is effectively arbitrary as to which plan would be captured into AWR, so there is no guarantee that you will get a more representative plan.  It is not even certain that costs for one cursor make sense for another.

I am not sure that there is any material benefit that is worth the effort of implementing something to additionally purge old execution plans.  I would welcome comments and opinions on this subject.

Friday, September 06, 2019

Working with AWR: Old Statistics and Costs in Current Execution Plans

This is a blog about a little trap, into which it is easy to fall when analysing performance problems using ASH data and execution plans from AWR.
I will illustrate it with a real example taken from a production PeopleSoft system. A number of sessions were waiting on blocking locks.  I used ASH data to find statements running while holding a blocking lock.  That led me to a particular SQL ID and plan hash value.
SQL_ID        SQL_PLAN_HASH_VALUE   ASH_SECS
------------- ------------------- ----------
                                       29210
5st32un4a2y92          2494504609      10670
652mx4tffq415          1888029394       7030
artqgxug4z0f1             8450529        580
7qxdrwcn4yzhh          3723363341        270
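The profile above was produced by a query of essentially the following form.  This is a minimal sketch rather than the exact script: it attributes each sample of a blocked session to whatever the blocking session was running in the same sample.
WITH blockers AS ( /*sessions reported as blocking another session in the same sample*/
  SELECT DISTINCT dbid, snap_id, sample_id
  ,      blocking_inst_id, blocking_session, blocking_session_serial#
  FROM   dba_hist_active_sess_history
  WHERE  blocking_session_status = 'VALID')
SELECT h.sql_id, h.sql_plan_hash_value
,      SUM(10) ash_secs      /*each AWR ASH sample represents ~10 seconds*/
FROM   dba_hist_active_sess_history h
,      blockers b
WHERE  h.dbid            = b.dbid
AND    h.snap_id         = b.snap_id
AND    h.sample_id       = b.sample_id
AND    h.instance_number = b.blocking_inst_id
AND    h.session_id      = b.blocking_session
AND    h.session_serial# = b.blocking_session_serial#
GROUP BY h.sql_id, h.sql_plan_hash_value
ORDER BY ash_secs DESC
/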
So, I used DBMS_XPLAN.DISPLAY_AWR to extract the execution plan from the AWR repository.
SELECT * FROM table(dbms_xplan.display_awr('5st32un4a2y92',2494504609,NULL,'ADVANCED'));
I can see the statement, the execution plan, the costs of each operation in the plan, and the value of the bind variable. This is a statement that has consumed a lot of execution time, and it isn't executing that frequently, yet the cost of the statement is quite low (only 22). That, in itself, does not mean there is a problem here, but it is perhaps a suggestion of an inconsistency.
SQL_ID 5st32un4a2y92
--------------------
SELECT 'X' FROM PS_CDM_LIST  WHERE CONTENTID = :1

Plan hash value: 2494504609

------------------------------------------------------------------------------------
| Id  | Operation            | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |       |       |    22 (100)|          |
|   1 |  INDEX FAST FULL SCAN| PS_CDM_LIST |     1 |     5 |    22  (10)| 00:00:01 |
------------------------------------------------------------------------------------
…
Peeked Binds (identified by position):
--------------------------------------
   1 - :1 (NUMBER): 17776
Familiarity with the application is an advantage here. The value of CONTENTID comes from a sequence number (albeit generated from a table, not an Oracle sequence, because this is PeopleSoft). The system has been running for a while, and Process Instances and Content IDs were well in excess of 1,000,000, so why is the bind variable only 17776? That is another alarm bell.
I can extract the current execution plan from the library cache, if the statement is in memory, using DBMS_XPLAN.DISPLAY_CURSOR().
SELECT * FROM table(dbms_xplan.display_cursor('5st32un4a2y92',NULL,'ADVANCED')); 

Plan hash value: 2494504609

------------------------------------------------------------------------------------
| Id  | Operation            | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |     1 |     6 |  3178   (9)| 00:00:05 |
|*  1 |  INDEX FAST FULL SCAN| PS_CDM_LIST |     1 |     6 |  3178   (9)| 00:00:05 |
------------------------------------------------------------------------------------
…
Peeked Binds (identified by position):
--------------------------------------
   1 - :1 (NUMBER): 10897657

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("CONTENTID"=TO_NUMBER(:1))
DBMS_XPLAN.DISPLAY_CURSOR() shows the current bind variable value. It is immediately obvious it has moved on since the plan was captured by AWR. If the statement is not in memory, I could generate a new execution plan using EXPLAIN PLAN FOR, that reparses the statement, and then extract the execution plan with DBMS_XPLAN.DISPLAY(). However, this will not give me the bind variable value.
explain plan for SELECT 'X' FROM PS_CDM_LIST WHERE CONTENTID = :1
/

Explained.

select * from table(dbms_xplan.display(null,null,'ADVANCED'));
…
The execution plan has the same plan hash value, so it is the same plan. However, the cost has increased significantly (from 22 to 3178). This reflects how the table has grown over time, and that is the problem. The full scan of the index was fast when the table was small, but as the table has grown over time, the scan is also taking longer. By the time I came to look at it, it was quite significant.
In this case, we created a new index on PS_CDM_LIST, leading on CONTENTID, to satisfy this query. The result was to improve the performance of this statement and to reduce the time spent holding blocking locks, which had further benefits.

What is going on here? 

This SQL statement is a regular part of the application. The SQL ID doesn't change because it uses a bind variable. The execution plan hasn't changed either because the optimizer can't find a better option. The execution plan we obtained from AWR was the first such plan captured for that combination of SQL ID and plan hash value. AWR may, and in fact is very likely to, have captured that plan again subsequently. However, it did not update the plan in AWR, and the old plan was never purged.
The original cost (22) was the cost then, not now. The value of the bind variable (17776) was the value then, not now!
So how do you know you are looking at a plan with old costs and bind variables? You don't.  There is nothing explicit to warn you to go and check.
However, if things are not making sense, check whether the current cost for the same plan is close to the cost stored in AWR.
If you use EXPLAIN PLAN FOR you may get a different plan, in which case you will need to use the hints in the outline section of the execution plan (not shown above) to force it to produce the same execution plan before you can compare costs.
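If you want to see what cost was stored when the plan was captured, a minimal sketch is to query the top line of the captured plan in DBA_HIST_SQL_PLAN (using the SQL ID from this example):
SELECT plan_hash_value, timestamp, cost
FROM   dba_hist_sql_plan
WHERE  sql_id = '5st32un4a2y92'
AND    id = 0      /*the cost of the whole statement is held on the top operation*/
ORDER BY plan_hash_value, timestamp
/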

Monday, June 24, 2019

Importing and Working with Exported AWR/ASH data in an Oracle database in a VirtualBox VM

A lot of my performance tuning work involves analysis of ASH and AWR data.  Frequently, I do not have direct access to the databases in question.  Sometimes, I ask clients to run EDB360 on their databases and send me the results, but sometimes I also want to work directly with ASH or AWR metrics.  So, I ask for an export of their AWR repository.
Oracle distributes a pair of scripts in $ORACLE_HOME/rdbms/admin.
  • awrextr.sql exports the AWR repository (including the ASH data), optionally for a range of AWR snapshot IDs. 
  • awrload.sql loads the export into a staging schema and then moves it into the AWR repository in the current database, from where you can reference it via the usual DBA_HIST% views and work with it in the usual way.
I import this into an Oracle database in a virtual machine in VirtualBox and work on it there.  This note explains how I create and set up that machine and then work with the data.

Getting Started

1. Build Oracle VirtualBox Linux VM

I use Frits Hoogland's vagrant builder to create a standard Oracle database on a Linux VM in VirtualBox.  I always build the latest version of Oracle because I can import an AWR export into a higher version of the database, but not a lower one.

2. Make Life Easier For Yourself

I sometimes install other utilities into the virtual machine, such as git, using yum.
yum install git
The VirtualBox VM is only going to run a single Oracle database, and I am very lazy.  So, I set up the environment in the shell profile by adding the following commands to ~oracle/.bash_profile.
export ORACLE_SID=oracle
export ORAENV_ASK=NO
. oraenv
export ORAENV_ASK=YES

3. Share a Folder on the Host Machine

I usually leave the AWR export file on my host machine rather than copy it into the virtual machine.  I create a VirtualBox shared folder so that the VirtualBox VM can reference a folder on the host machine.  In this case, I have shared c:\temp on my host Windows laptop and called the filesystem temp.  Note that it is auto-mounted and permanent.

In the VirtualBox, the shared folder has been mounted as /media/sf_temp.
[oracle@oracle-database sql]$ df /media/sf_temp
Filesystem     1K-blocks      Used Available Use% Mounted on
temp           974661916 340278844 634383072  35% /media/sf_temp
The sf_temp directory is owned by root and accessible only to the vboxsf group.  Therefore, you must add any user who needs to reference this directory to that group.  In this case, the oracle user must be made a member of the vboxsf group.  For safety, this should be done with the usermod command rather than by editing /etc/group directly.
usermod -G vboxsf -a oracle
You can see the result of this command in the /etc/group file
grep oracle /etc/group
vboxsf:x:994:oracle
When logged in as Oracle, the id command will also report the groups of which it is a member.
[oracle@oracle-database ~]$ id
uid=54321(oracle) gid=54321(oinstall) groups=54321(oinstall),100(users),994(vboxsf),54322(dba)
Create a directory in the database to reference this directory on the filesystem.
CREATE OR REPLACE DIRECTORY TEMP as '/media/sf_temp';

4. Oracle FILESYSTEMIO_OPTIONS Parameter

Initially, I got the following error when trying to run a data pump import from the shared folder:
ORA-39000: bad dump file specification
ORA-31640: unable to open dump file "/media/sf_temp/awrdat_xxxx_xxxx.dmp" for read
ORA-27041: unable to open file
Linux-x86_64 Error: 22: Invalid argument
Additional information: 2
On VirtualBox, the Oracle parameter FILESYSTEMIO_OPTIONS will be set to setall by default.  To work around the import error set it to none and restart the database.
ALTER SYSTEM SET FILESYSTEMIO_OPTIONS=none SCOPE=SPFILE;
SHUTDOWN IMMEDIATE
STARTUP

5. Data Pump Import

Now you can import an AWR export (previously exported with awrextr.sql) with the script awrload.sql.
cd $ORACLE_HOME/rdbms/admin
sqlplus / as sysdba
@awrload

Working with Imported Data

The AWR data exported from another database will be visible through all the usual database views.  It can be distinguished by having the database ID of the original database.  If, like me, you are used to mining ASH data with SQL scripts, then they will work provided you specify the DBID, as in the sketch below.  Beyond that, I have a couple of tips for working with imported AWR data.
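For example, a minimal sketch of an hourly profile of the imported ASH data, restricted to the DBID of the source database (supplied here as a bind variable):
SELECT TRUNC(CAST(sample_time AS DATE),'HH24') sample_hour
,      SUM(10) ash_secs      /*each AWR ASH sample represents ~10 seconds*/
FROM   dba_hist_active_sess_history
WHERE  dbid = :dbid      /*the DBID of the database from which the AWR data was exported*/
GROUP BY TRUNC(CAST(sample_time AS DATE),'HH24')
ORDER BY 1
/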

SQLDB360 (EDB360 and SQLD360)

Sometimes I run an EDB360 report on the imported data.  The reports and graphs of ASH data and the system time model are particularly useful.
I can generate AWR reports for any pair of snapshots in the imported repository, not just the ones generated by the client and provided to me.  EDB360 generates AWR reports for peak periods by various criteria.
Having installed git, I can use it to download the latest version of SQLDB360.
cd ~/
git clone https://github.com/sqldb360/sqldb360.git
cd ~/sqldb360
Alternatively, I could have used curl instead.
cd ~/
curl -LO https://github.com/sqldb360/sqldb360/archive/unstable.zip
unzip unstable.zip
Obtain the DBID of the database that you have just imported by looking at one of the AWR views
col max(startup_time) format a26
col min(begin_interval_time) format a26
col max(end_interval_time) format a26
SELECT dbid, db_name, count(*), max(startup_time)
FROM   dba_hist_database_instance
GROUP BY dbid, db_name
/
SELECT dbid, min(begin_interval_time), max(end_interval_time)
FROM   dba_hist_snapshot
GROUP BY dbid
/
In this case, my database is ORACLE, and XXXXXXX is the database that I have just imported.
      DBID DB_NAME     COUNT(*) MAX(STARTUP_TIME)
---------- --------- ---------- --------------------------
1884370030 ORACLE             9 20-JUN-19 10.51.50.000 AM
3743596248 XXXXXXX            4 05-MAY-19 01.03.57.000 AM

      DBID MIN(BEGIN_INTERVAL_TIME)   MAX(END_INTERVAL_TIME)
---------- -------------------------- --------------------------
1884370030 18-JUN-19 07.03.44.000 PM  20-JUN-19 11.02.49.684 AM
3743596248 18-JUN-19 05.00.13.342 PM  19-JUN-19 11.00.20.020 AM
Create a custom configuration file in the sqldb360/sql directory
  • Set the database ID with the DBID from the export.
  • Specify edb360_conf_date_from/to parameters to match the range of dates in the import.
-- this is a sample custom configuration requesting to produce just one section
-- for all possible DEF values please review edb360_00_config.sql

-- section to report. null means all (default)
-- report column, or section, or range of columns or range of sections i.e. 3, 3-4, 3a, 3a-4c, 3-4c, 3c-4
DEF edb360_sections = '';

-- range of dates below supersede history days when values are other than YYYY-MM-DD
-- default values YYYY-MM-DD mean: use edb360_conf_days
-- actual values sample: 2016-04-26
DEF edb360_conf_date_from = '2019-06-18';
DEF edb360_conf_date_to = '2019-06-19';

-- use if you need tool to act on a dbid stored on AWR, but that is not the current v$database.dbid
DEF edb360_config_dbid = '3743596248';
Run EDB360 referencing the custom config file
@edb360 T custom_config_01.sql
@sqld360 5dcbnw1d9yfna T custom_config_01.sql
NB: EDB360 and SQLD360 reports generated in this way will contain a mixture of data from the local database and from the database from which the AWR export was taken.
  • Some reports are based on v$ and x$ views or data dictionary views (DBA_TABLES etc.).  These will report on the database in the virtual machine.
  • Only the reports based on DBA_HIST% views will report on the database from which the AWR data was exported.
It is important to keep this distinction in mind.
Nearly every section of the EDB360 report contains reports based on the AWR data, so it is not worth trying to exclude any sections when running EDB360 on imported data.

Extracting Execution Plans from AWR

We would normally extract the execution plan of a particular statement from the AWR of the local database using the DISPLAY_AWR function in the DBMS_XPLAN package as follows.
SELECT * FROM table(dbms_xplan.display_awr('burz0q5qjtgmv'));
You can specify the database ID to DISPLAY_AWR(), but if it is not the DBID of the current database, then it returns nothing.
SELECT * FROM table(dbms_xplan.display_awr('5dcbnw1d9yfna',db_id=>3743596248));
The various DISPLAY% functions in DBMS_XPLAN are formatted report generators for data held in a particular data structure.  DISPLAY_CURSOR() references V$SQL_PLAN, DISPLAY_AWR() references DBA_HIST_SQL_PLAN, and you can pass a table name to DISPLAY so that it can reference any table or view.  That is useful if you have your own private plan table.
In this case, instead of DISPLAY_AWR(), we can use the DISPLAY() function directly on the DBA_HIST_SQL_PLAN view, thus:
set long 5000 lines 150 pages 999
column sql_text format a150
SELECT sql_text from dba_hist_sqltext
where dbid=3743596248 AND sql_id = '5dcbnw1d9yfna'
/
SELECT * FROM table(dbms_xplan.display(
  table_name=>'DBA_HIST_SQL_PLAN'
, format=>'ADVANCED +ADAPTIVE'
, filter_preds=>'dbid=3743596248 AND sql_id = ''5dcbnw1d9yfna'''))
/
Note that:
  • The DBID and SQL_ID are passed as quoted literals to the function.
  • DBMS_XPLAN.DISPLAY() does not report the SQL statement, so that must be queried separately.

Monday, February 04, 2019

Regular Expression Functions are Considered To Be Non-Deterministic from Oracle 12.2

You cannot put non-deterministic functions into virtual columns or function-based indexes.  Regular expressions have always been sensitive to National Language Support (NLS) settings because what constitutes a character differs between languages.  However, from Oracle 12.2, regular expressions are considered to be non-deterministic because Oracle has fixed bug 20804063.

Background 

A developer came to me with the following query, complaining it was slow.
SELECT …
FROM   …
AND    LOWER(REGEXP_REPLACE(ASCIISTR(my_column), '[^a-zA-Z0-9]')) ='2732018'
…
Obviously, functions on columns in predicates prevent the use of indexes, so I suggested creating and indexing a virtual column (slightly more elegant and almost the same as a function-based index but without needing the function in the code to match the index).
This works fine in Oracle 12.1, but they came back saying it didn't work in 12.2.  Note the error messages below.
DROP TABLE t PURGE
/
CREATE TABLE t AS 
SELECT TO_CHAR(sysdate-level,'YYYY/DDD') a
FROM   dual
CONNECT BY LEVEL < 1000
/

CREATE INDEX t1 on t (REGEXP_REPLACE(a,'[^a-zA-Z0-9]'))
/
ORA-01743: only pure functions can be indexed

DROP INDEX t1
/
ALTER TABLE t 
ADD b AS (REGEXP_REPLACE(a,'[^a-zA-Z0-9]')) VIRTUAL
/
ORA-54002: only pure functions can be specified in a virtual column expression
I eventually came across this question and answer by Connor McDonald on AskTom, Adding constraint with REGEXP_REPLACE fails, which explains that Oracle fixed bug 20804063 in 12.2.  This fix affects virtual columns, function-based indexes (because they also create virtual columns), and constraints.
There are a number of workarounds, but none are completely satisfactory.

Workaround 1: Deterministic Function 

As Connor suggests, you can work around this unwanted behaviour by creating your own PL/SQL function and telling the database it is deterministic. This is not simply lying to the database.  Not all regular expressions are actually NLS-sensitive; it depends what you are doing.  However, it does appear that the validation is applied regardless. If the function is NLS-sensitive, that might have unwanted consequences, including incorrect query results. You would have to decide whether you can live with the risk, depending on what is actually in the column and the function in question.
CREATE OR REPLACE FUNCTION my_regexp_replace(p1 VARCHAR2, p2 VARCHAR2) 
RETURN varchar2 
DETERMINISTIC IS
BEGIN
RETURN REGEXP_REPLACE(p1, p2);
END;
/
show error

CREATE INDEX t1 ON t (my_regexp_replace(a,'[^a-zA-Z0-9]'))
/
DROP INDEX t1
/
ALTER TABLE t 
ADD b AS (my_REGEXP_REPLACE(a,'[^a-zA-Z0-9]')) VIRTUAL
/
CREATE INDEX t1 ON t (b)
/

Workaround 2: Refresh on Commit Materialized View 

Another possible workaround would be a materialized view that refreshes on commit, with the expression creating an additional column that is then indexed.
CREATE TABLE t AS 
SELECT rownum n
,      TO_CHAR(sysdate-level,'YYYY/DDD') a
FROM  dual
CONNECT BY LEVEL < 1000
/
ALTER TABLE t ADD CONSTRAINT t_pk PRIMARY KEY (n)
/
CREATE MATERIALIZED VIEW LOG ON t
/

CREATE MATERIALIZED VIEW T1_MV 
REFRESH ON COMMIT 
FAST 
WITH PRIMARY KEY 
ENABLE QUERY REWRITE AS 
SELECT t.*
,REGEXP_REPLACE(a,'[^a-zA-Z0-9]') b
FROM t
/

CREATE INDEX t1_mv_b ON t1_mv(b);

INSERT INTO t
SELECT ROWNUM+1000 n
,      TO_CHAR(sysdate-level-1000,'YYYY/DDD') a
FROM  dual
CONNECT BY LEVEL < 1000
/
commit
/

set autotrace on
SELECT * FROM t
WHERE REGEXP_REPLACE(a,'[^a-zA-Z0-9]') like '201720%';
And Oracle does indeed rewrite the query to use the materialized view and then uses the index on the materialized view.
Plan hash value: 3543552962

----------------------------------------------------------------------------------------------------------
| Id  | Operation                                      | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                               |         |     1 |    21 |     3   (0)| 00:00:01 |
|   1 |  MAT_VIEW REWRITE ACCESS BY INDEX ROWID BATCHED| T1_MV   |     1 |    21 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                             | T1_MV_B |     2 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------
However, this solution has the same weakness of being dependent on NLS settings; it is just that there is no validation to stop you! There is the further threat that the refresh on commit could become a performance problem if there is intense DML on the underlying table.

Workaround 3: Use a Non-NLS Dependent Function 

Ideally, it would be better to use a non-NLS-dependent function instead of a REGEXP% function. However, this may be easier said than done.  Regular expressions solve problems that are hard to solve with basic string-handling functions, and may require a PL/SQL function anyway to hold procedural code.  Although, in this relatively simple example, it is possible to remove the unwanted characters with the TRANSLATE and REPLACE functions.  There is no NLS restriction here.
SELECT a
,      REGEXP_REPLACE(a,'[^0-9]')
,      REPLACE(TRANSLATE(a,'/',' '),' ','')
FROM   t
WHERE rownum <= 10
/

A        REGEXP_REPLACE(A,'[^0-9]')       REPLACE(TRANSLATE(A,'/',''),'','
-------- -------------------------------- --------------------------------
2019/031 2019031                          2019031
2019/030 2019030                          2019030
2019/029 2019029                          2019029
2019/028 2019028                          2019028
2019/027 2019027                          2019027
2019/026 2019026                          2019026
2019/025 2019025                          2019025
2019/024 2019024                          2019024
2019/023 2019023                          2019023
2019/022 2019022                          2019022
…
And you can put this into a function-based index or virtual column thus
DROP INDEX t1 
/ 
ALTER TABLE t ADD b AS (REPLACE(TRANSLATE(a,'/',' '),' ','')) VIRTUAL 
/ 
CREATE INDEX t1 ON t (b) 
/