Tuesday, April 17, 2018

Choosing the Right Compression

Choosing the right database table compression matters because it affects both the size of the objects and the performance of your application.  Compression makes objects smaller, and for many systems that is an objective in itself, but it also affects performance.  As compressed objects are smaller, less physical I/O is required to access them.  However, that saving is paid for in CPU: more CPU is required to compress the data when it is written, and again (albeit usually less) to decompress it when it is read.  The question is whether the saving in I/O pays for the additional CPU.

There are a number of forms of compression to choose between, and they have different characteristics: the resulting objects are different sizes, and they require different amounts of CPU to compress and decompress.  The syntax for each is sketched after this list.
  • Basic Compression:  This is not really compression, but a form of block-level de-duplication.  It has been available since Oracle 9i.  It is available on all platforms.
  • Hybrid Columnar Compression (HCC) is only available on Engineered Systems, the FS Flash Storage System and the ZFS Storage Appliance (ZFSSA).  There are 4 different levels of compression that use 3 different compression algorithms.  HCC is frequently positioned as being highly effective at reducing the size of data warehouses, but it also has implications for performance.
    • Query Low: LZO algorithm
    • Query High: ZLIB algorithm
    • Archive Low: ZLIB algorithm at a higher level, with a larger block size.
    • Archive High: BZIP2 algorithm.  The higher compression ratio comes at the expense of significantly increased CPU overhead on decompression.
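As a rough illustration of the syntax (the table name is hypothetical, and these MOVE operations rewrite the table), a compression level might be specified like this; note that basic compression and HCC only apply to data inserted in direct-path mode, and HCC additionally requires supported storage:
ALTER TABLE my_big_table MOVE COMPRESS BASIC;             -- basic block-level de-duplication
ALTER TABLE my_big_table MOVE COMPRESS FOR QUERY LOW;     -- HCC, LZO
ALTER TABLE my_big_table MOVE COMPRESS FOR ARCHIVE HIGH;  -- HCC, BZIP2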
Which one is right for your application?  It depends.  Ultimately, you need to test it for yourself.  In this blog, I am going to relate my experience of one particular system, including how I tested it.  Our objective was optimising performance, rather than reducing database size.

My Test System

  • SuperCluster M7 (i.e. an Engineered System), Oracle 12.1.0.2, 2-node RAC, running PeopleSoft Financials 9.2.
  • In particular, this system runs a large number of complex PeopleSoft General Ledger nVision reports that use a number of reporting views on summary ledgers. There are materialized views built on the reporting views.  The summary ledgers are updated nightly.  Then the materialized views are fully refreshed in non-atomic mode so the tables are truncated and repopulated in direct-path mode.  Hence, they can be Hybrid Columnar Compressed without needing the Advanced Compression licence.  The materialized views are not indexed.  Thus they can only be full scanned, and these scans can be offloaded to the storage cells (i.e. smart scan).
The metrics presented here are extracted from a test system where the summary ledger tables are updated, the materialized views are refreshed, and then 136 General Ledger (GL) report books are run on a regular daily schedule (a sketch of the refresh follows this list).  This approach ensures that each test is, as far as possible, run with the same initial conditions, and only known parameters are changed between tests.  In each test, a different compression level was set on the materialized views.
  • I have not investigated ARCHIVE compression.
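A minimal sketch of how a compression level might be set on one of the materialized views and then picked up by the nightly refresh (the materialized view name is illustrative, borrowed from a later post):
ALTER MATERIALIZED VIEW ps_xx_sum_gcnsl_mv COMPRESS FOR QUERY LOW;

BEGIN
  -- non-atomic complete refresh: truncate, then repopulate in direct-path mode,
  -- so the compression attribute is applied to the reloaded data
  DBMS_MVIEW.REFRESH(list => 'PS_XX_SUM_GCNSL_MV', method => 'C', atomic_refresh => FALSE);
END;
/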

Writing

                    -------------------- Materialized Views --------------------
Description          Size (MB)   Comp. %   Comp. Ratio   Elapsed   Total DB Time
No Compression         151,868        0%           1.0      1:01           18:13
Basic Compression       34,720       77%           4.4      1:12           22:54
Query Low               22,666       85%           6.7      1:00           18:27
Query High              14,086       91%          10.8      1:05           22:56
(all times expressed in hours:minutes)
  • While basic compression does significantly reduce table sizes (4.4x), Hybrid Columnar Compression does better (6.7x – 10.8x).  
  • HCC Query Low only has a very small overhead while writing data.  The materialized views are refreshed in parallel, and while the total DB time is higher, the elapsed time is lower!
  • Both basic compression and HCC Query High take about 20% longer to write the data in direct path mode.

Reading

                    ------------------------------------ nVision DB Time ------------------------------------
Description         On CPU+CPU Wait   cell smart table scan   Others    Total   On CPU+CPU Wait   cell smart table scan   Others
No Compression               201:26                   97:12    18:15   316:54               64%                     31%       6%
Basic Compression            466:13                   66:03    31:07   563:24               83%                     12%       6%
Query Low                    241:13                   12:34    44:57   298:45               81%                      4%      15%
Query High                   278:20                    4:43    11:38   294:42               94%                      2%       4%
The nVision GL reports only read the materialized views (although they do write some temporary working storage tables).
  • As the level of compression of the materialized views increases, the time spent on cell smart table scan (physical read) decreases, mainly because the objects are smaller.
  • The slightly surprising result is that the CPU overhead of basic compression is higher than that of either form of HCC, and more than twice that of using uncompressed tables.  Hence it significantly increases the total database response time.
  • Both forms of query HCC perform better than the uncompressed table.  The increase in CPU time is more than compensated for by the reduction in time spent on physical I/O.
  • The Query Low test was affected by factors external to the test (a lot more time was spent on other wait events); otherwise, it would have outperformed Query High.  If we look at just CPU and physical read time, it is clearly the best option.
In an offloaded (i.e. smart) scan on an engineered system, decompression, as well as the physical I/O, is done in the storage cell.  The saving in time spent on physical I/O between Query Low and Query High is small because the disk subsystem performs so well on the engineered system.
On a non-engineered system, compression, decompression and I/O must all be done by the database server; there is nowhere else to do them.  You may well find that a different level of compression is optimal.
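As a rough check (not part of the original tests) that scans really are being offloaded, the offload-related columns in V$SQL can be inspected; the MODULE filter below is illustrative:
SELECT sql_id, io_cell_offload_eligible_bytes, io_cell_offload_returned_bytes
,      physical_read_bytes, cpu_time/1e6 cpu_secs, elapsed_time/1e6 ela_secs
FROM   v$sql
WHERE  module = 'RPTBOOK'                    -- nVision report books (illustrative filter)
AND    io_cell_offload_eligible_bytes > 0    -- statements eligible for smart scan
ORDER BY physical_read_bytes DESC FETCH FIRST 10 ROWS ONLY;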

Conclusion

  • Basic compression exhibits a surprisingly high CPU overhead.  In my test, the performance was poorer than with uncompressed tables.
  • HCC Query Low was generally the best option: the saving in I/O more than paid for the increase in CPU.
  • I also analysed ASH data for the different tests, broken down by materialized view (see the sketch after this list).  I found it was better to leave smaller materialized views uncompressed but to compress the larger ones.  This is not surprising, since HCC adds additional structures.
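A minimal sketch of the sort of ASH profile by object used here (not the original script; weighting each AWR ASH sample as roughly 10 seconds of DB time is an approximation, and the time filter is illustrative):
SELECT o.object_name, h.event, COUNT(*)*10 db_time_secs
FROM   dba_hist_active_sess_history h
       LEFT OUTER JOIN dba_objects o ON o.object_id = h.current_obj#
WHERE  h.sample_time >= TRUNC(SYSDATE)   -- restrict to the test window (illustrative)
GROUP BY o.object_name, h.event
ORDER BY db_time_secs DESC;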
These results and conclusions are from one set of tests on one particular application, on one particular 'engineered' hardware configuration, using one particular data set.  So you need to conduct your own comparative tests and decide what is best for your system.

Monday, March 05, 2018

Profiling Execution Plans

In my previous blog post, I demonstrated how to identify the active lines in an adaptive execution plan on DBA_HIST_SQL_PLAN so that I could profile behaviour. This post demonstrates a practical application of that technique. This statement comes out of the PeopleSoft Financials Multi-currency Processing. A process extracts groups of rows in elimination sets into a working storage table and joins that table to the ledger table for further processing. The number of rows extracted from the elimination set selector table (PS_ELIM_CF_SEL2001 in this case) can vary in a single process from a few to a few thousand. However, the process does regather optimizer statistics each time.
SQL_ID fwcdxjy41j23n
--------------------
SELECT L.BUSINESS_UNIT,L.ACCOUNT,L.DEPTID,L.PRODUCT,L.BUDGET_REF,L.AFFIL
IATE,L.CHARTFIELD1,L.BOOK_CODE,L.GL_ADJUST_TYPE,L.DATE_CODE,SUM(L.POSTED
_TOTAL_AMT) FROM PS_ELIM_CF_SEL2001 S, PS_LEDGER L WHERE
S.ELIMINATION_SET='UK_OTHER' AND S.TREE_NODE_NUM=1234567890 AND
S.PROCESS_INSTANCE=0001234567 AND L.FISCAL_YEAR=2018 AND
L.LEDGER=S.LEDGER AND L.BUSINESS_UNIT=S.BUSINESS_UNIT AND
L.AFFILIATE=S.AFFILIATE AND L.ACCOUNT=S.ACCOUNT AND L.CURRENCY_CD='GBP'
AND L.STATISTICS_CODE=' ' AND (L.ACCOUNTING_PERIOD<=001 AND
((L.ACCOUNTING_PERIOD>=0 AND S.BALANCE_FWD_SW='Y') OR
L.ACCOUNTING_PERIOD>=1 AND S.BALANCE_FWD_SW='N')) GROUP BY
L.BUSINESS_UNIT,L.ACCOUNT,L.DEPTID,L.PRODUCT,L.BUDGET_REF,L.AFFILIATE,L.
CHARTFIELD1,L.BOOK_CODE,L.GL_ADJUST_TYPE,L.DATE_CODE
The changing literal values mean that every SQL statement has a different SQL_ID, so SQL Baselines are not viable here. No one plan produces optimal performance, so a traditional SQL profile prescribing a particular plan is not the answer either. Oracle moves between three adaptive plans depending, at least partially, upon the number of rows in the elimination table.
Plan hash value: 42743070
-----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name               | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                    |       |       |   237K(100)|       |          |       |
|   1 |  SORT GROUP BY                         |                    |    13 |  1716 |   237K  (1)| 00:00:10 |       |       |
|-  2 |   HASH JOIN                            |                    |    13 |  1716 |   237K  (1)| 00:00:10 |       |       |
|   3 |    NESTED LOOPS                        |                    |    13 |  1716 |   237K  (1)| 00:00:10 |       |       |
|-  4 |     STATISTICS COLLECTOR               |                    |       |       |            |       |          |       |
|   5 |      TABLE ACCESS STORAGE FULL         | PS_ELIM_CF_SEL2001 |    13 |   728 |   138   (2)| 00:00:01 |       |       |
|   6 |     PARTITION RANGE OR                 |                    |     1 |    76 | 18249   (1)| 00:00:01 |KEY(OR)|KEY(OR)|
|   7 |      PARTITION LIST ITERATOR           |                    |     1 |    76 | 18249   (1)| 00:00:01 |   KEY |   KEY |
|   8 |       TABLE ACCESS BY LOCAL INDEX ROWID| PS_LEDGER          |     1 |    76 | 18249   (1)| 00:00:01 |   KEY |   KEY |
|   9 |        INDEX RANGE SCAN                | PSCLEDGER          |     1 |       | 18228   (1)| 00:00:01 |   KEY |   KEY |
|- 10 |    PARTITION RANGE OR                  |                    |     1 |    76 | 18249   (1)| 00:00:01 |KEY(OR)|KEY(OR)|
|- 11 |     PARTITION LIST JOIN-FILTER         |                    |     1 |    76 | 18249   (1)| 00:00:01 |:BF0000|:BF0000|
|- 12 |      TABLE ACCESS STORAGE FULL         | PS_LEDGER          |     1 |    76 | 18249   (1)| 00:00:01 |   KEY |   KEY |
-----------------------------------------------------------------------------------------------------------------------------

Plan hash value: 3968216348
-----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name               | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                    |       |       |  1679K(100)|       |          |       |
|   1 |  SORT GROUP BY                         |                    |    92 | 12236 |  1679K  (1)| 00:01:06 |       |       |
|   2 |   HASH JOIN                            |                    |    92 | 12236 |  1679K  (1)| 00:01:06 |       |       |
|-  3 |    NESTED LOOPS                        |                    |    92 | 12236 |  1679K  (1)| 00:01:06 |       |       |
|-  4 |     STATISTICS COLLECTOR               |                    |       |       |            |       |          |       |
|   5 |      TABLE ACCESS STORAGE FULL         | PS_ELIM_CF_SEL2001 |    92 |  5244 |   138   (2)| 00:00:01 |       |       |
|-  6 |     PARTITION RANGE OR                 |                    |     1 |    76 | 18249   (1)| 00:00:01 |KEY(OR)|KEY(OR)|
|-  7 |      PARTITION LIST ITERATOR           |                    |     1 |    76 | 18249   (1)| 00:00:01 |   KEY |   KEY |
|-  8 |       TABLE ACCESS BY LOCAL INDEX ROWID| PS_LEDGER          |     1 |    76 | 18249   (1)| 00:00:01 |   KEY |   KEY |
|-  9 |        INDEX RANGE SCAN                | PSCLEDGER          |     1 |       | 18228   (1)| 00:00:01 |   KEY |   KEY |
|  10 |    PARTITION RANGE OR                  |                    |     1 |    76 | 18249   (1)| 00:00:01 |KEY(OR)|KEY(OR)|
|  11 |     PARTITION LIST JOIN-FILTER         |                    |     1 |    76 | 18249   (1)| 00:00:01 |:BF0000|:BF0000|
|  12 |      TABLE ACCESS STORAGE FULL         | PS_LEDGER          |     1 |    76 | 18249   (1)| 00:00:01 |   KEY |   KEY |
-----------------------------------------------------------------------------------------------------------------------------

Plan hash value: 386495123
------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name               | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |                    |       |       |  1763K(100)|       |          |       |
|   1 |  SORT GROUP BY                          |                    |     1 |   132 |  1763K  (2)| 00:01:09 |       |       |
|   2 |   HASH JOIN                             |                    |  2105 |   271K|  1763K  (2)| 00:01:09 |       |       |
|   3 |    PART JOIN FILTER CREATE              | :BF0000            |  2105 |   271K|  1763K  (2)| 00:01:09 |       |       |
|-  4 |     NESTED LOOPS                        |                    |  2105 |   271K|  1763K  (2)| 00:01:09 |       |       |
|-  5 |      STATISTICS COLLECTOR               |                    |       |       |            |       |          |       |
|   6 |       TABLE ACCESS STORAGE FULL         | PS_ELIM_CF_SEL2001 |  2113 |   115K|   138   (2)| 00:00:01 |       |       |
|-  7 |      PARTITION RANGE OR                 |                    |     1 |    76 |  1763K  (2)| 00:01:09 |KEY(OR)|KEY(OR)|
|-  8 |       PARTITION LIST ITERATOR           |                    |     1 |    76 |  1763K  (2)| 00:01:09 |   KEY |   KEY |
|-  9 |        TABLE ACCESS BY LOCAL INDEX ROWID| PS_LEDGER          |     1 |    76 |  1763K  (2)| 00:01:09 |   KEY |   KEY |
|- 10 |         INDEX RANGE SCAN                | PSCLEDGER          |       |       |            |       |      KEY |   KEY |
|  11 |    PARTITION RANGE OR                   |                    |    24M|  1743M|  1763K  (2)| 00:01:09 |KEY(OR)|KEY(OR)|
|  12 |     PARTITION LIST JOIN-FILTER          |                    |    24M|  1743M|  1763K  (2)| 00:01:09 |:BF0000|:BF0000|
|  13 |      TABLE ACCESS STORAGE FULL          | PS_LEDGER          |    24M|  1743M|  1763K  (2)| 00:01:09 |   KEY |   KEY |
------------------------------------------------------------------------------------------------------------------------------
Only a small proportion of the statements run by the process is captured by AWR snapshots. With PeopleSoft, because there is so much dynamically generated non-sharable SQL, I generally recommend reducing the snapshot interval to 15 minutes and collecting the top 100 statements (the default is hourly and the top 30 statements). Even so, we only have a limited number of samples, with the larger statements being more likely to be captured.
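A minimal sketch of how those AWR capture settings might be changed (the values are the ones suggested above):
BEGIN
  DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS(
    interval => 15     -- minutes between AWR snapshots
  , topnsql  => 100);  -- number of Top SQL statements captured per snapshot
END;
/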
We are already using SQL Profiles to apply optimizer parameters to this statement, so I can identify entries in DBA_HIST_SQLSTAT by the matched profile, though I could also use the FORCE_MATCHING_SIGNATURE. I will start by extracting the SQL statistics for all matching statements into a temporary working storage table, including the OTHER_XML column from the SQL plan.
drop table my_stats purge;
create table my_stats as 
with s as (
 select s.dbid, s.sql_id, s.plan_hash_value, s.sql_profile, s.action
 ,      max(s.optimizer_cost) optimizer_cost
 ,      max(s.PX_SERVERS_EXECS_DELTA) px_servers
 ,      sum(ELAPSED_TIME_DELTA)/1e6 elapsed_time
 from   dba_hist_snapshot x
 ,      dba_hist_sqlstat s
 WHERE  s.SNAP_id = X.SNAP_id
 AND    s.dbid = x.dbid
 AND    s.instance_number = x.instance_number
 AND    s.module = 'GLPOCONS'
 and    s.sql_profile LIKE 'GLPOCONS_AD%B2%'
 and    s.plan_hash_value > 0
 group by s.dbid, s.sql_id, s.plan_hash_value, s.sql_profile, s.action
)
SELECT s.*
,      xmltype(other_xml) other_xml
FROM   s, dba_hist_sql_plan p
WHERE  p.dbid = s.dbid
and    p.sql_id = s.sql_id
and    p.plan_hash_value = s.plan_hash_value
and    p.other_xml IS NOT NULL
and    p.id = 1
/
select count(*) from my_stats 
/
I found it was necessary to break this up into two stages, otherwise I got ORA-32036: unsupported case for inlining of query name in WITH clause. This could be a manifestation of Bug 19684504 - ORA-32036 executing WITH statement with XML (Doc ID 19684504.8).
set pages 99 long 10
break on sql_profile skip 1
column sql_profile     format a17
column op              format 999
column dis             format 999
column par             format 999
column dep             format 999
column skp             format 999
column plan_hash_value format 9999999999 heading 'PHV'
column px_servers      format 9999       heading 'PX|Serv'
column ecost           format 9999       heading 'Elim|Cost'
column lcost           format 99999999   heading 'Ledger|Cost'
column ecard           format 99999      heading 'Elim|Card'
column lcard           format 99999999   heading 'Ledger|Card'
column optimizer_cost  format 99999999   heading 'Opt|Cost'
column elapsed_time    format 9999.99    heading 'Elapsed|Time'
column object_name     format a18 
column elim_obj        format a7         heading 'Bloom?' 
column epoptions       format a14        heading 'Join|Options'
column epoperation     format a16        heading 'Join|Operation'
column eoptions        format a14        heading 'Elim|Options'
column loptions        format a20        heading 'Ledger|Options'
column options         format a20
spool dmk
with y as (
select /*+MATERIALIZE*/ x.*
,      xt.*
,      p.object_type, p.object_name
,      p.operation, p.options, p.cost, p.cardinality
from   my_stats x
,      xmltable('/other_xml/display_map/row' PASSING x.other_xml
       COLUMNS "OP" NUMBER path '@op',
               "DIS" NUMBER path '@dis',
               "PAR" NUMBER path '@par',
               "DEP" NUMBER path '@dep',
               "SKP" NUMBER path '@skp'
       ) xt
,      dba_hist_sql_plan p
where  xt.skp = 0
and    p.dbid = x.dbid
and    p.sql_id = x.sql_id
and    p.plan_hash_Value = x.plan_hash_value
and    p.id = xt.op
)
select x.sql_profile, x.sql_id, x.plan_hash_value, x.px_servers
,      e.object_name, e.options eoptions, e.cardinality ecard, e.cost ecost
,      l.options loptions, l.cardinality lcard, l.cost lcost
,      ep.operation epoperation, ep.object_name elim_obj
,      x.optimizer_cost, x.elapsed_time
From   my_stats x
,      y e
,      y l
,      y ep
WHERE  e.dbid = x.dbid
And    e.sql_id = x.sql_id
And    e.plan_hash_Value = x.plan_hash_value
And    e.object_name like 'PS_ELIM_CF_SEL%'
And    e.object_type = 'TABLE'
AND    l.dbid = x.dbid
And    l.sql_id = x.sql_id
And    l.plan_hash_Value = x.plan_hash_value
And    l.object_name = 'PS_LEDGER'
And    l.object_type = 'TABLE'
And    ep.dbid = x.dbid
And    ep.sql_id = x.sql_id
And    ep.plan_hash_Value = x.plan_hash_value
And    ep.dis = e.par /*parent of display line*/
ORDER by x.sql_profile, e.cardinality, x.optimizer_cost, x.elapsed_time
/
spool off
The SQL query to profile the plans will be different for each set of plans that I want to investigate. The query shown in this blog was developed for the plans I am investigating here. I am interested in three aspects of the execution plan: the access method for each of the two tables, and the method by which they are joined. So, I have to join the SQL stats sub-query (X) back to the execution plan sub-query (Y) three times, selecting a different row from the plan each time.
  • E: table access of the PS_ELIM_CF_SEL% table, either by a FULL scan or by ROWID if an index was used.
  • L: table access of PS_LEDGER table. This will either be a FULL scan or by ROWID if an index was used.
  • EP: the parent of the table access of PS_ELIM_CF_SEL%. This will either be the HASH JOIN or the creation of the Bloom filter (:BF0000). If I order the result by the cardinality of PS_ELIM_CF_SEL%, I can see how and when the plan changes as the number of rows retrieved from it (the cardinality) changes.
                                               PX                    Elim             Elim  Elim Ledger                  Ledger    Ledger Join                           Opt Elapsed
SQL_PROFILE       SQL_ID                PHV  Serv OBJECT_NAME        Options          Card  Cost Options                   Card      Cost Operation        Bloom?       Cost    Time
----------------- ------------- ----------- ----- ------------------ -------------- ------ ----- -------------------- --------- --------- ---------------- ------- --------- -------
GLPOCONS_ADPB2001 b7hy0jfppdvn2    42743070     0 PS_ELIM_CF_SEL2001 STORAGE FULL      13   138 BY LOCAL INDEX ROWID         1     18249 NESTED LOOPS                237379      .40
…
                  9yz896xyn5aky    42743070     0 PS_ELIM_CF_SEL2001 STORAGE FULL      51   138 BY LOCAL INDEX ROWID         1     18249 NESTED LOOPS                930851     2.62
                  cbnz1jumm0x23    42743070     0 PS_ELIM_CF_SEL2001 STORAGE FULL      52   138 BY LOCAL INDEX ROWID         1     18249 NESTED LOOPS                949101      .66
…
                  cmajgmuyc3hc3    42743070     0 PS_ELIM_CF_SEL2001 STORAGE FULL      92   138 BY LOCAL INDEX ROWID         1     18249 NESTED LOOPS               1679071    21.89
                  3sbkyfwy6xk02  3968216348     0 PS_ELIM_CF_SEL2001 STORAGE FULL      92   138 STORAGE FULL                 1     18249 HASH JOIN                  1679071    51.95
                  27pxj6qu49zzr  3968216348     0 PS_ELIM_CF_SEL2001 STORAGE FULL      92   138 STORAGE FULL                 1     18249 HASH JOIN                  1679071    52.49
                  8nfbzan3hbzdg  3968216348     0 PS_ELIM_CF_SEL2001 STORAGE FULL      92   138 STORAGE FULL                 1     18249 HASH JOIN                  1679071    67.49
                  827sykfucq01t  3968216348     0 PS_ELIM_CF_SEL2001 STORAGE FULL      93   138 STORAGE FULL                 1     18249 HASH JOIN                  1697321    87.65
                  fmd3r4848nypk  3968216348     0 PS_ELIM_CF_SEL2001 STORAGE FULL      93   138 STORAGE FULL                 1     18249 HASH JOIN                  1697321    89.39
                  34h7j9rqf8y73  2660510816     0 PS_ELIM_CF_SEL2001 STORAGE FULL      97   138 BY LOCAL INDEX ROWID         1   1763685 NESTED LOOPS               1763970     1.54
                  0cfqyuz79qd09   386495123     0 PS_ELIM_CF_SEL2001 STORAGE FULL      98   138 STORAGE FULL          24054654   1763685 PART JOIN FILTER :BF0000   1763970    21.32
                  ff5hf8zpwhmd2   386495123     0 PS_ELIM_CF_SEL2001 STORAGE FULL      98   138 STORAGE FULL          24054654   1763685 PART JOIN FILTER :BF0000   1763970    26.64
                  9j0m5gq5aqssz   386495123     0 PS_ELIM_CF_SEL2001 STORAGE FULL      99   138 STORAGE FULL          24054654   1763685 PART JOIN FILTER :BF0000   1763970    34.28
                  bzbqrc7bw768q    42743070     0 PS_ELIM_CF_SEL2001 STORAGE FULL     102   138 BY LOCAL INDEX ROWID         1     18249 NESTED LOOPS               1861564      .76
                  bjjx0fxmtqzzt    42743070     0 PS_ELIM_CF_SEL2001 STORAGE FULL     102   138 BY LOCAL INDEX ROWID         1     18249 NESTED LOOPS               1861564     3.66
                  aqcxt7par2a22  2660510816     0 PS_ELIM_CF_SEL2001 STORAGE FULL     104   138 BY LOCAL INDEX ROWID         1   1763685 NESTED LOOPS               1763970     9.57
                  4cfsq52xzftpz   386495123     0 PS_ELIM_CF_SEL2001 STORAGE FULL     104   138 STORAGE FULL          24054654   1763685 PART JOIN FILTER :BF0000   1763970    25.39
                  76hq9zbfrm599   386495123     0 PS_ELIM_CF_SEL2001 STORAGE FULL     104   138 STORAGE FULL          24054654   1763685 PART JOIN FILTER :BF0000   1763970    26.11
                  2b8ck8kpyrvhg   386495123     0 PS_ELIM_CF_SEL2001 STORAGE FULL     104   138 STORAGE FULL          24054654   1763685 PART JOIN FILTER :BF0000   1763970    35.71
                  6gtu9tauhp55y   386495123     0 PS_ELIM_CF_SEL2001 STORAGE FULL     104   138 STORAGE FULL          24054654   1763685 PART JOIN FILTER :BF0000   1763970    39.77
…
                  88kyxqx2qbjxg   386495123     0 PS_ELIM_CF_SEL2001 STORAGE FULL    1021   138 STORAGE FULL          24054654   1763685 PART JOIN FILTER :BF0000   1763970    21.73
…
                  dh39mtw8rga5s   386495123     0 PS_ELIM_CF_SEL2001 STORAGE FULL    2113   138 STORAGE FULL          24054654   1763685 PART JOIN FILTER :BF0000   1763970    40.26
I can see that the optimizer goes backwards and forwards between different plans at different times as the cardinality increases. Nevertheless, I can make some informed decisions about how to adjust the optimizer parameters for this statement with hints introduced by a SQL Profile.
  • I can move the point where the optimizer switches from the index access to the full scan and Bloom filter to a lower cardinality by increasing the cost of the index with a higher value of OPTIMIZER_INDEX_COST_ADJ.
  • I can enable automatic parallelism with parallel_degree_policy, scale the calculated degree of parallelism with parallel_degree_level, but limit the degree of parallelism with parallel_degree_limit. The threshold for parallelism can be set with parallel_min_time_threshold. The parallelism threshold can be set at a slightly lower cardinality than the switch to the full scan.
  • The settings can be tested using EXPLAIN PLAN FOR with the CARDINALITY hint to set the number of rows returned from the PS_ELIM_CF_SEL% table.
explain plan for
SELECT /*+cardinality(s 300)
NO_EXPAND 
OPT_PARAM('parallel_degree_policy','AUTO')
OPT_PARAM('parallel_min_time_threshold',220)
OPT_PARAM('parallel_degree_limit',40)
OPT_PARAM('optimizer_index_cost_adj',10912)
*/
L.BUSINESS_UNIT,L.ACCOUNT,L.DEPTID,L.PRODUCT,L.BUDGET_REF
…
/
set echo off trimspool on pages 999 lines 200 autotrace off
select * from table(dbms_xplan.display(null,null,'ADVANCED +ADAPTIVE -PROJECTION -PREDICATE -OUTLINE'));
When satisfactory values have been determined, they can be passed back into the SQL Profile.
…
h := SYS.SQLPROF_ATTR(
q'[BEGIN_OUTLINE_DATA]',
q'[OPT_PARAM('parallel_degree_policy','AUTO')]',
q'[OPT_PARAM('parallel_min_time_threshold',60)]',
q'[OPT_PARAM('parallel_degree_limit',40)]',
q'[OPT_PARAM('optimizer_index_cost_adj',10912)]',
q'[NO_EXPAND]',
…
q'[END_OUTLINE_DATA]');
…
DBMS_SQLTUNE.IMPORT_SQL_PROFILE (
sql_text    => sql_txt,
profile     => h,
name        => 'GLPOCONS_'||i.id,
…
replace     => TRUE,
force_match => TRUE /* TRUE:FORCE (match even when different literals in SQL). FALSE:EXACT (similar to CURSOR_SHARING) */ 
);
…

Friday, March 02, 2018

How to Identify Active Lines in an Adaptive Execution Plan in DBA_HIST_SQL_PLAN

When investigating performance problems, I spend a lot of my time profiling ASH and AWR data. I sometimes want to join back to captured plans in DBA_HIST_SQL_PLAN to see how an object was accessed, which index was used, or how many rows the optimizer thought it would get.
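For example, a join from ASH back to the captured plan lines might look something like this (a sketch, not taken from the original post; the SQL_ID is the one used later in this post, and weighting each AWR ASH sample as 10 seconds is an approximation):
SELECT p.id, p.operation, p.options, p.object_name, COUNT(*)*10 db_time_secs
FROM   dba_hist_active_sess_history h
,      dba_hist_sql_plan p
WHERE  p.dbid = h.dbid
AND    p.sql_id = h.sql_id
AND    p.plan_hash_value = h.sql_plan_hash_value
AND    p.id = h.sql_plan_line_id
AND    h.sql_id = 'bk26thygs9c8n'
GROUP BY p.id, p.operation, p.options, p.object_name
ORDER BY db_time_secs DESC;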
From Oracle 12c, we can get adaptive execution plans, where the optimizer considers different plans at runtime. Mostly it is a choice between a nested loop and a hash join of tables.
For example, in the execution plan below, DBMS_XPLAN includes a note to say that inactive lines are marked with a dash.  Thus we can see that the nested loop lookup of PS_LEDGER using the PSCLEDGER index was rejected in favour of a Bloom filter on a full scan of the ledger table.
SQL_ID bk26thygs9c8n

Plan hash value: 38031479
-----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name               | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                    |       |       | 18275 (100)|       |          |       |
|   1 |  SORT GROUP BY                         |                    |     1 |   135 | 18275   (1)| 00:00:01 |       |       |
|-  2 |   HASH JOIN                            |                    |     1 |   135 | 18274   (1)| 00:00:01 |       |       |
|   3 |    NESTED LOOPS                        |                    |     1 |   135 | 18274   (1)| 00:00:01 |       |       |
|-  4 |     STATISTICS COLLECTOR               |                    |       |       |            |       |          |       |
|   5 |      TABLE ACCESS STORAGE FULL         | PS_ELIM_CF_SEL2002 |     1 |    59 |    25   (0)| 00:00:01 |       |       |
|   6 |     PARTITION RANGE OR                 |                    |     1 |    76 | 18249   (1)| 00:00:01 |KEY(OR)|KEY(OR)|
|   7 |      PARTITION LIST ITERATOR           |                    |     1 |    76 | 18249   (1)| 00:00:01 |   KEY |   KEY |
|   8 |       TABLE ACCESS BY LOCAL INDEX ROWID| PS_LEDGER          |     1 |    76 | 18249   (1)| 00:00:01 |   KEY |   KEY |
|   9 |        INDEX RANGE SCAN                | PSCLEDGER          |     1 |       | 18228   (1)| 00:00:01 |   KEY |   KEY |
|- 10 |    PARTITION RANGE OR                  |                    |     1 |    76 | 18249   (1)| 00:00:01 |KEY(OR)|KEY(OR)|
|- 11 |     PARTITION LIST JOIN-FILTER         |                    |     1 |    76 | 18249   (1)| 00:00:01 |:BF0000|:BF0000|
|- 12 |      TABLE ACCESS STORAGE FULL         | PS_LEDGER          |     1 |    76 | 18249   (1)| 00:00:01 |   KEY |   KEY |
-----------------------------------------------------------------------------------------------------------------------------
…
Note
-----
…
   - this is an adaptive plan (rows marked '-' are inactive)
When I come to look at this plan in DBA_HIST_SQL_PLAN, I will find all 14 lines but there is no column in the view that indicates whether the line is active in the adaptive plan.
select id, operation, options, object_name, cardinality, cost from dba_hist_sql_plan 
where sql_id = 'bk26thygs9c8n' and plan_hash_value = 38031479

        ID OPERATION                      OPTIONS                        OBJECT_NAME        CARDINALITY       COST
---------- ------------------------------ ------------------------------ ------------------ ----------- ----------
         0 SELECT STATEMENT                                                                                1529564
         1 SORT                           GROUP BY                                                    1    1529564
         2 HASH JOIN                                                                                359    1529563
         3 PART JOIN FILTER               CREATE                         :BF0000                    359    1529563
         4 NESTED LOOPS                                                                             359    1529563
         5 STATISTICS COLLECTOR
         6 TABLE ACCESS                   STORAGE FULL                   PS_ELIM_CF_SEL2002         360        172
         7 PARTITION RANGE                OR                                                          1    1529289
         8 PARTITION LIST                 ITERATOR                                                    1    1529289
         9 TABLE ACCESS                   BY LOCAL INDEX ROWID           PS_LEDGER                    1    1529289
        10 INDEX                          RANGE SCAN                     PSCLEDGER
        11 PARTITION RANGE                OR                                                   16845834    1529289
        12 PARTITION LIST                 JOIN-FILTER                                          16845834    1529289
        13 TABLE ACCESS                   STORAGE FULL                   PS_LEDGER             16845834    1529289
The information about whether a line in the execution plan is active, and whether it should be displayed by DBMS_XPLAN, is encoded in the display map, inside an XML structure stored in the OTHER_XML column on line 1 of the plan.
<other_xml>
  <info type="nodeid/pflags">000000000001</info>
…
  <info type="adaptive_plan" note="y">yes</info>
  <outline_data>
    <hint><![CDATA[IGNORE_OPTIM_EMBEDDED_HINTS]]></hint>
    <hint><![CDATA[OPTIMIZER_FEATURES_ENABLE('12.1.0.2')]]></hint>
    <hint><![CDATA[DB_VERSION('12.1.0.2')]]></hint>
    <hint><![CDATA[OPT_PARAM('_unnest_subquery' 'false')]]></hint>
…
  </outline_data>
  <display_map>
    <row op="1" dis="1" par="0" prt="0" dep="1" skp="0"/>
    <row op="2" dis="2" par="1" prt="0" dep="2" skp="0"/>
    <row op="3" dis="3" par="2" prt="0" dep="3" skp="0"/>
    <row op="4" dis="3" par="3" prt="0" dep="3" skp="1"/>
    <row op="5" dis="3" par="3" prt="0" dep="3" skp="1"/>
    <row op="6" dis="4" par="3" prt="0" dep="4" skp="0"/>
    <row op="7" dis="4" par="3" prt="4" dep="3" skp="1"/>
    <row op="8" dis="4" par="3" prt="4" dep="3" skp="1"/>
    <row op="9" dis="4" par="3" prt="3" dep="3" skp="1"/>
    <row op="10" dis="4" par="3" prt="3" dep="3" skp="1"/>
    <row op="11" dis="5" par="2" prt="5" dep="3" skp="0"/>
    <row op="12" dis="6" par="5" prt="6" dep="4" skp="0"/>
    <row op="13" dis="7" par="6" prt="6" dep="5" skp="0"/>
  </display_map>
</other_xml>
OP is the plan line ID. SKP=0 indicates an active line. This data can be extracted as a set of rows and columns with the XMLTABLE function.
My thanks to Tim Hall. I was only able to work this out with the aid of his excellent article XMLTABLE: Convert XML Data into Rows and Columns using SQL and associated video.
(Update 5.3.2018): Since publishing this I have also found Martin Bach's Blog Adaptive plans and v$sql_plan and related views.
column plan_hash_value format 9999999999 heading 'PHV'
column op          format 999
column dis         format 999
column par         format 999
column dep         format 999
column skp         format 999
column object_name format a20
column operation   format a20
column options     format a20
column partition_start format a10 heading 'PStart'

with x as (
select  sql_id, plan_hash_value
,  xmltype(other_xml) xml_data
from dba_hist_sql_plan p
where sql_id IN ('bk26thygs9c8n')
and id = 1
and other_xml IS NOT NULL
), y as (
select sql_id, plan_hash_value
, xt.*
from x
,    xmltable('/other_xml/display_map/row' PASSING x.xml_data
     COLUMNS 
       "OP" NUMBER path '@op',
       "DIS" NUMBER path '@dis',
       "PAR" NUMBER path '@par',
       "DEP" NUMBER path '@dep',
       "SKP" NUMBER path '@skp'
     ) xt
where xt.skp = 0
)
select y.* 
, p.object_name, p.operation, p.options, p.partition_start
from y
, dba_hist_sql_plan p
where p.sql_id = y.sql_id
and p.plan_hash_value = y.plan_hash_value
and y.op = p.id
/
  • OTHER_XML is a CLOB, so it needs to be converted to an XMLTYPE before being passed to XMLTABLE.
  • We effectively make a Cartesian product between the row source of SQL plans in sub-query x, and XMLTABLE().
  • The XML structure is attributed rather than tagged, so you need the '@' in the path in the columns clause.
  • Inactive plan lines are filtered out with a criterion on SKP.
Now I have a SQL query that returns only the active rows in the execution plan. I can use this technique in other places to profile statements by aspects of the execution plan.
SQL_ID                PHV   OP  DIS  PAR  DEP  SKP OBJECT_NAME          OPERATION            OPTIONS              PStart
------------- ----------- ---- ---- ---- ---- ---- -------------------- -------------------- -------------------- ----------
bk26thygs9c8n    38031479    1    1    0    1    0                      SORT                 GROUP BY
bk26thygs9c8n    38031479    2    2    1    2    0                      HASH JOIN
bk26thygs9c8n    38031479    3    3    2    3    0 :BF0000              PART JOIN FILTER     CREATE
bk26thygs9c8n    38031479    6    4    3    4    0 PS_ELIM_CF_SEL2002   TABLE ACCESS         STORAGE FULL
bk26thygs9c8n    38031479   11    5    2    3    0                      PARTITION RANGE      OR                   KEY(OR)
bk26thygs9c8n    38031479   12    6    5    4    0                      PARTITION LIST       JOIN-FILTER          :BF0000
bk26thygs9c8n    38031479   13    7    6    5    0 PS_LEDGER            TABLE ACCESS         STORAGE FULL         KEY
I demonstrate a practical use of this technique in my next blog.

Monday, December 18, 2017

Hints, Patches, Force Matching and SQL Profiles

Sometimes, when all else fails, it is necessary to add hints to a piece of SQL in order to make the optimizer use a particular execution plan. We might directly add the hint to the code. However, even if it is possible to access the code directly, that may not be a good idea. In the future, if we need to change the hint, possibly due to a change in optimizer behaviour on database upgrade, then we would again need to make a code change.
Instead, we could look at Oracle's plan stability technologies.  There are 4 plan stability features in the database. They are different, but conceptually they all involve associating a set of hints with a statement that will be picked up by the optimizer at runtime.
  • SQL Outlines can be collected by the database at runtime. They can exactly match a SQL text or force match. However, they are deprecated from 11g. 
  • SQL Baselines can be collected by the database at runtime either on request or automatically. They feed real experience of previous executions of a statement back into subsequent executions. They match by SQL_ID. 
  • SQL Patches are a mechanism for manually inserting specific hints into a specific SQL ID 
  • SQL Profiles are also a mechanism for inserting specific hints into a specific SQL statement, but they can also do forced matching. We are probably most used to creating profiles with Carlos Sierra's coe_xfr_sql_profile.sql script (part of Oracle support's SQLTXPLAIN), but as this blog will show, that is not the only option. However, SQL Profiles require the database tuning pack to be licenced, which is not the case with the other mechanisms. 
There are several posts on Oracle's optimizer team blog that explain how to use SQL Patches to inject a hint or disable an embedded hint.
However, SQL Patches work by attaching to a specific SQL_ID. If your application has literal values that change instead of bind variables, you will not be able to inject hints with SQL Patches.
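For reference, a minimal sketch of creating a SQL Patch for a single statement, assuming Oracle 12.2 or later where DBMS_SQLDIAG.CREATE_SQL_PATCH is a documented function (on 12.1 only an internal equivalent existed); the SQL_ID and hint text are purely illustrative:
DECLARE
  l_patch_name VARCHAR2(128);
BEGIN
  l_patch_name := DBMS_SQLDIAG.CREATE_SQL_PATCH(
                    sql_id    => 'bk26thygs9c8n'                         -- illustrative SQL_ID
                  , hint_text => q'[PX_JOIN_FILTER(PS_XX_SUM_GCNSL_MV)]' -- illustrative hint
                  , name      => 'my_sql_patch');
END;
/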
The following example SQL was generated by PeopleSoft GL reporting tool, nVision. The code is dynamically generated by the tool, and there is no way to add a hint directly. The generated code contains literal values, in particular, the SELECTOR_NUM will be different for every execution. Therefore the SQL_ID will be different for every execution. Therefore, it is not possible to use a SQL Patch to inject a hint.
SELECT L2.TREE_NODE_NUM,L3.TREE_NODE_NUM,SUM(A.POSTED_TOTAL_AMT) 
FROM PS_XX_SUM_CONSOL_VW A, PSTREESELECT05 L2, PSTREESELECT10 L3 
WHERE A.LEDGER='S_USMGT' 
AND A.FISCAL_YEAR=2017 
AND A.ACCOUNTING_PERIOD BETWEEN 0 AND 12 
AND (A.DEPTID BETWEEN 'A0000' AND 'A8999' OR A.DEPTID BETWEEN 'B0000' AND 'B9149' 
OR A.DEPTID='B9156' OR A.DEPTID='B9158' OR A.DEPTID BETWEEN 'B9165' AND 'B9999' 
OR A.DEPTID BETWEEN 'C0000' AND 'C9999' OR A.DEPTID BETWEEN 'D0000' AND 'D9999' 
OR A.DEPTID BETWEEN 'G0000' AND 'G9999' OR A.DEPTID BETWEEN 'H0000' AND 'H9999' 
OR A.DEPTID='B9150' OR A.DEPTID=' ') 
AND L2.SELECTOR_NUM=10228 
AND A.BUSINESS_UNIT=L2.RANGE_FROM_05 
AND L3.SELECTOR_NUM=10231 
AND A.ACCOUNT=L3.RANGE_FROM_10 
AND A.CHARTFIELD1='0012345' 
AND A.CURRENCY_CD='GBP' 
GROUP BY L2.TREE_NODE_NUM,L3.TREE_NODE_NUM
/
If only the literal values differ, then the SQL statements will have the same force matching signature, and one SQL profile will match all similar statements (although statements will not force match if the number of predicates changes). Nonetheless, I can inject hints with a SQL profile. This is the plan I get initially, without a profile. It doesn't perform well, and it is not the plan I want.
Plan hash value: 808840077
-----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                     | Name              | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                              |                   |       |       | 10408 (100)|          |       |       |
|   1 |  HASH GROUP BY                                |                   |   517 | 50666 | 10408   (1)| 00:00:01 |       |       |
|   2 |   HASH JOIN                                   |                   |   517 | 50666 | 10407   (1)| 00:00:01 |       |       |
|   3 |    PARTITION RANGE SINGLE                     |                   |   731 | 13158 |     3   (0)| 00:00:01 | 10228 | 10228 |
|   4 |     INDEX FAST FULL SCAN                      | PSAPSTREESELECT05 |   731 | 13158 |     3   (0)| 00:00:01 | 10228 | 10228 |
|   5 |    HASH JOIN                                  |                   |   518 | 41440 | 10404   (1)| 00:00:01 |       |       |
|   6 |     PARTITION RANGE SINGLE                    |                   |   249 |  5727 |     2   (0)| 00:00:01 | 10231 | 10231 |
|   7 |      INDEX FAST FULL SCAN                     | PSAPSTREESELECT10 |   249 |  5727 |     2   (0)| 00:00:01 | 10231 | 10231 |
|   8 |     PARTITION RANGE ITERATOR                  |                   |  7785 |   433K| 10402   (1)| 00:00:01 |    28 |    40 |
|   9 |      TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| PS_X_LEDGER_ACCTS |  7785 |   433K| 10402   (1)| 00:00:01 |    28 |    40 |
|  10 |       SORT CLUSTER BY ROWID BATCHED           |                   |  5373 |       |  5177   (1)| 00:00:01 |       |       |
|  11 |        INDEX SKIP SCAN                        | PS_X_LEDGER_ACCTS |  5373 |       |  5177   (1)| 00:00:01 |    28 |    40 |
-----------------------------------------------------------------------------------------------------------------------------------
These are the hints I want to introduce. I want to force materialized view rewrite, invoke parallelism if the statement is estimated to run for at least 2 seconds, and use a Bloom filter on the materialized view.
SELECT /*+OPT_PARAM('parallel_degree_policy','AUTO') OPT_PARAM('parallel_min_time_threshold',2) 
OPT_PARAM('parallel_degree_limit',4) REWRITE PX_JOIN_FILTER(PS_XX_SUM_GCNSL_MV)*/…
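It is worth checking how many distinct statements collapse to each force matching signature; a sketch (not from the original post) along these lines can help, with an illustrative MODULE filter:
SELECT force_matching_signature
,      COUNT(DISTINCT sql_id) sql_ids
,      SUM(executions) execs
,      SUM(elapsed_time)/1e6 ela_secs
FROM   v$sql
WHERE  module = 'RPTBOOK'   -- nVision report books (illustrative filter)
GROUP BY force_matching_signature
ORDER BY ela_secs DESC;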
This application produces many statements that I may want to control with a profile, but with forced matching this comes down to a not unmanageable number. I have created a data-driven framework to create the profiles. I have created a working storage table that I will populate with details of each force matching signature for which I want to create a profile.
CREATE TABLE dmk_fms_profiles
(force_matching_signature    NUMBER NOT NULL
,sql_id                      VARCHAR2(13) 
,plan_hash_value             NUMBER
,module                      VARCHAR2(64)
,action                      VARCHAR2(64)
,report_id                   VARCHAR2(32) /*Application Specific*/
,tree_list                   CLOB         /*Application Specific*/
,sql_profile_name            VARCHAR2(30)
,parallel_min_time_threshold NUMBER
,parallel_degree_limit       NUMBER
,other_hints                 CLOB
,delete_profile              VARCHAR2(1)
,sql_text                    CLOB
,CONSTRAINT dmk_fms_profiles_pk PRIMARY KEY (force_matching_signature)
,CONSTRAINT dmk_fms_profiles_u1 UNIQUE (sql_id)
,CONSTRAINT dmk_fms_profiles_u2 UNIQUE (sql_profile_name)
)
/
Using conditional parallelism with PARALLEL_MIN_TIME_THRESHOLD, but limited by PARALLEL_DEGREE_LIMIT, is a frequently used tactic with this application, so I have specified columns for those hints; any other hints are injected via a string. I identified the problematic SQL by analysis with ASH, and hence I also obtained the FORCE_MATCHING_SIGNATURE for each statement. The metadata is keyed by FORCE_MATCHING_SIGNATURE. I am specifying an additional hint to force materialized view rewrite and use a Bloom filter on the ledger table. I have specified a meaningful name for the SQL profile.
INSERT INTO dmk_fms_profiles (force_matching_signature, parallel_min_time_threshold, parallel_degree_limit, other_hints, sql_profile_name) 
VALUES (16625752171077499412, 1, 4, 'REWRITE PX_JOIN_FILTER(PS_FT_SUM_GCNSL_MV)', 'NVS_GBGL123I_BU_CONSOL_ACCOUNT');
COMMIT;
Profiles are created on the text of a SQL statement rather than on the SQL_ID or FORCE_MATCHING_SIGNATURE directly. Therefore the SQL_TEXT must be extracted from the AWR, so this method also requires that the statement has been captured by an AWR snapshot.
UPDATE dmk_fms_profiles a
SET    (module, action, sql_id, plan_hash_value, sql_text)
=      (SELECT s.module, s.action, s.sql_id, s.plan_hash_value, t.sql_text
        FROM   dba_hist_sqlstat s
        ,      dba_hist_sqltext t
        WHERE  t.dbid = s.dbid
        AND    t.sql_id = s.sql_id
        AND    s.force_matching_signature = a.force_matching_signature
        AND    s.snap_id = (
   SELECT MAX(s1.snap_id)
  FROM   dba_hist_sqlstat s1
  WHERE  s1.force_matching_signature = a.force_matching_signature
  AND    s1.module = 'RPTBOOK'     /*Application Specific*/
   AND    s1.action LIKE 'PI=%:%:%' /*Application Specific*/)
  AND    s.module = 'RPTBOOK'             /*Application Specific*/
  AND    s.action LIKE 'PI=%:%:%'         /*Application Specific*/
  AND    ROWNUM = 1)
WHERE sql_id IS NULL
/

MERGE INTO dmk_fms_profiles u
USING (
SELECT a.sql_id, a.force_matching_signature, p.name
FROM   dmk_fms_profiles a
,      dba_sql_profiles p
WHERE  p.signature = a.force_matching_signature
) s
ON (s.force_matching_signature = u.force_matching_signature)
WHEN MATCHED THEN UPDATE
SET u.sql_profile_name = s.name
/
Columns REPORT_ID and TREE_LIST contain application specific information extracted from the application instrumentation and logging. They only mean something in the context of the application.
/*Application Specific - extract report ID from ACTION*/
UPDATE dmk_fms_profiles a
SET    report_id = substr(regexp_substr(s.action,':([A-Za-z0-9_-])+',1,1),2)
WHERE  report_id IS NULL
AND    action IS NOT NULL
/
Now I can produce a simple report of the metadata in order to see what profiles should be created.
column sql_text word_wrapped on format a110
column module format a8
column report_id heading 'nVision|Report ID'
column tree_list word_wrapped on format a20
column plan_hash_value             heading 'SQL Plan|Hash Value'         format 9999999999
column parallel_min_time_threshold heading 'Parallel|Min Time|Threshold' format 999
column parallel_degree_limit       heading 'Parallel|Degree|Limit'       format 999
set long 500
SELECT * FROM dmk_fms_profiles
/

                                          SQL Plan                                                                           
FORCE_MATCHING_SIGNATURE SQL_ID         Hash Value MODULE   ACTION                                                           
------------------------ ------------- ----------- -------- ---------------------------------------------------------------- 
                                                                                      Parallel Parallel
nVision                                                                               Min Time   Degree
Report ID                        TREE_LIST            SQL_PROFILE_NAME               Threshold    Limit D
-------------------------------- -------------------- ------------------------------ --------- -------- -
OTHER_HINTS
--------------------------------------------------------------------------------
SQL_TEXT
--------------------------------------------------------------------------------------------------------------
    12803175998948432502 5pzxhha3392cs   988048519 RPTBOOK  PI=3186222:USGL233I:10008                                        
USGL233I                         BU_GAAP_CONSOL,      NVS_GBGL123I_BU_CONSOL_ACCOUNT                   1        4
                                 GAAP_ACCOUNT
REWRITE PX_JOIN_FILTER(PS_XX_SUM_GCNSL_MV)
SELECT L2.TREE_NODE_NUM,A.ACCOUNT,SUM(A.POSTED_TOTAL_AMT) FROM PS_LEDGER A, PSTREESELECT05 L2, PSTREESELECT10 L3
WHERE A.LEDGER='S_GBMGT' AND A.FISCAL_YEAR=2017 AND A.ACCOUNTING_PERIOD BETWEEN 0 AND 12 AND (A.DEPTID BETWEEN
'A0000' AND 'A8999' OR A.DEPTID BETWEEN 'B0000' AND 'B9149' OR A.DEPTID='B9156' OR A.DEPTID='B9158' OR A.DEPTID
BETWEEN 'B9165' AND 'B9999' OR A.DEPTID BETWEEN 'C0000' AND 'C9999' OR A.DEPTID BETWEEN 'D0000' AND 'D9999' OR
A.DEPTID BETWEEN 'G0000' AND 'G9999' OR A.DE
Next, this PL/SQL block will create or recreate SQL profiles from the metadata. The various hints can be concatenated into a single string and passed as a parameter to SQLPROF_ATTR. The SQL text is passed as a parameter when the profile is created.
set serveroutput on
DECLARE
  l_signature NUMBER;
  h       SYS.SQLPROF_ATTR;
  e_no_sql_profile EXCEPTION;
  PRAGMA EXCEPTION_INIT(e_no_sql_profile, -13833);
  l_description CLOB;
BEGIN

FOR i IN (
  SELECT f.*, s.name
  FROM   dmk_fms_profiles f
    LEFT OUTER JOIN dba_sql_profiles s
    ON f.force_matching_signature = s.signature
) LOOP

  BEGIN 
    IF i.name IS NOT NULL AND i.delete_profile = 'Y' THEN
      dbms_sqltune.drop_sql_profile(name => i.name);
    END IF;
    EXCEPTION WHEN e_no_sql_profile THEN NULL;
  END;

  IF i.delete_profile = 'Y' THEN 
    NULL;
  ELSIF i.sql_text IS NOT NULL THEN
    h := SYS.SQLPROF_ATTR(
q'[BEGIN_OUTLINE_DATA]',
CASE WHEN i.parallel_min_time_threshold>=0 THEN 'OPT_PARAM(''parallel_degree_policy'',''AUTO'') ' END||
CASE WHEN i.parallel_degree_limit      >=0 THEN 'OPT_PARAM(''parallel_degree_limit'','      ||i.parallel_degree_limit      ||') ' END||
CASE WHEN i.parallel_min_time_threshold>=0 THEN 'OPT_PARAM(''parallel_min_time_threshold'','||i.parallel_min_time_threshold||') ' END||
i.other_hints,
q'[END_OUTLINE_DATA]');

    l_signature := DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(i.sql_text);
    l_description := 'coe nVision '||i.report_id||' '||i.tree_list||' '||i.force_matching_signature||'='||l_signature;
    dbms_output.put_line(i.sql_profile_name||' '||l_description);

    DBMS_SQLTUNE.IMPORT_SQL_PROFILE (
sql_text    => i.sql_text,
profile     => h,
name        => i.sql_profile_name,
description => l_description,
category    => 'DEFAULT',
validate    => TRUE,
replace     => TRUE,
force_match => TRUE /* TRUE:FORCE (match even when different literals in SQL). FALSE:EXACT (similar to CURSOR_SHARING) */ );

  END IF;
END LOOP;
END;
/
I can verify that the profile has been created, and the hints that it contains, thus:
SELECT x.profile_name,
       xmltype(x.comp_data) as xmlval
FROM   dmk_fms_profiles p
,      dba_sql_profiles s
,      dbmshsxp_sql_profile_attr  x
WHERE  s.name = p.sql_profile_name
AND    x.profile_name = s.name
AND    s.status = 'ENABLED'
ORDER BY 1
/

PROFILE_NAME                                                                                                                                                                                            
------------------------------
XMLVAL                                                                                                                                                                                                  
------------------------------------------------------------------------------------------------
NVS_GBGL123I_BU_CONSOL_ACCOUNT
<![CDATA[BEGIN_OUTLINE_DATA]]>                                                                                                                                                           
  <![CDATA[OPT_PARAM('parallel_degree_policy','AUTO') OPT_PARAM('parallel_degree_limit',4) OPT_PARAM('parallel_min_time_threshold',1) REWRITE PX_JOIN_FILTER(PS_FT_SUM_GCNSL_MV)]]>                                           
  <![CDATA[END_OUTLINE_DATA]]>
And now when the application runs, I get the plan that I wanted.
  • The query runs in parallel.
  • The SQL is rewritten to use the materialized view.
  • There are no indexes on the materialized view, so it must be full scanned.
  • It generates a Bloom filter from PSTREESELECT10 and applies it to the materialized view.
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                         | Name               | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                                  |                    |       |       |  2219 (100)|          |       |       |        |      |            |
|   1 |  PX COORDINATOR                                   |                    |       |       |         |             |       |       |        |      |            |
|   2 |   PX SEND QC (RANDOM)                             | :TQ10004           |   111 |  9879 |  2219   (6)| 00:00:01 |       |       |  Q1,04 | P->S | QC (RAND)  |
|   3 |    HASH GROUP BY                                  |                    |   111 |  9879 |  2219   (6)| 00:00:01 |       |       |  Q1,04 | PCWP |            |
|   4 |     PX RECEIVE                                    |                    |   111 |  9879 |  2219   (6)| 00:00:01 |       |       |  Q1,04 | PCWP |            |
|   5 |      PX SEND HASH                                 | :TQ10003           |   111 |  9879 |  2219   (6)| 00:00:01 |       |       |  Q1,03 | P->P | HASH       |
|   6 |       HASH GROUP BY                               |                    |   111 |  9879 |  2219   (6)| 00:00:01 |       |       |  Q1,03 | PCWP |            |
|   7 |        HASH JOIN                                  |                    |   536 | 47704 |  2218   (6)| 00:00:01 |       |       |  Q1,03 | PCWP |            |
|   8 |         PX RECEIVE                                |                    |   536 | 38056 |  2215   (6)| 00:00:01 |       |       |  Q1,03 | PCWP |            |
|   9 |          PX SEND HYBRID HASH                      | :TQ10002           |   536 | 38056 |  2215   (6)| 00:00:01 |       |       |  Q1,02 | P->P | HYBRID HASH|
|  10 |           STATISTICS COLLECTOR                    |                    |       |       |         |             |       |       |  Q1,02 | PCWC |            |
|  11 |            HASH JOIN                              |                    |   536 | 38056 |  2215   (6)| 00:00:01 |       |       |  Q1,02 | PCWP |            |
|  12 |             BUFFER SORT                           |                    |       |       |         |             |       |       |  Q1,02 | PCWC |            |
|  13 |              JOIN FILTER CREATE                   | :BF0000            |   236 |  3776 |     2   (0)| 00:00:01 |       |       |  Q1,02 | PCWP |            |
|  14 |               PX RECEIVE                          |                    |   236 |  3776 |     2   (0)| 00:00:01 |       |       |  Q1,02 | PCWP |            |
|  15 |                PX SEND BROADCAST                  | :TQ10000           |   236 |  3776 |     2   (0)| 00:00:01 |       |       |        | S->P | BROADCAST  |
|  16 |                 PARTITION RANGE SINGLE            |                    |   236 |  3776 |     2   (0)| 00:00:01 | 36774 | 36774 |        |      |            |
|  17 |                  INDEX FAST FULL SCAN             | PSAPSTREESELECT10  |   236 |  3776 |     2   (0)| 00:00:01 | 36774 | 36774 |        |      |            |
|  18 |             JOIN FILTER USE                       | :BF0000            |  8859 |   475K|  2213   (6)| 00:00:01 |       |       |  Q1,02 | PCWP |            |
|  19 |              PX BLOCK ITERATOR                    |                    |  8859 |   475K|  2213   (6)| 00:00:01 |    29 |    41 |  Q1,02 | PCWC |            |
|  20 |               MAT_VIEW REWRITE ACCESS STORAGE FULL| PS_XX_SUM_GCNSL_MV |  8859 |   475K|  2213   (6)| 00:00:01 |    29 |    41 |  Q1,02 | PCWP |      
|  21 |         BUFFER SORT                               |                    |       |       |         |             |       |       |  Q1,03 | PCWC |            |
|  22 |          PX RECEIVE                               |                    |   731 | 13158 |     3   (0)| 00:00:01 |       |       |  Q1,03 | PCWP |            |
|  23 |           PX SEND HYBRID HASH                     | :TQ10001           |   731 | 13158 |     3   (0)| 00:00:01 |       |       |        | S->P | HYBRID HASH|
|  24 |            PARTITION RANGE SINGLE                 |                    |   731 | 13158 |     3   (0)| 00:00:01 | 36773 | 36773 |        |      |            |
|  25 |             INDEX FAST FULL SCAN                  | PSAPSTREESELECT05  |   731 | 13158 |     3   (0)| 00:00:01 | 36773 | 36773 |        |      |            |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------

Conclusion

SQL Profiles can be used in much the same way as SQL Patches to introduce hints into application SQL without changing the code, the difference being that SQL Profiles can force match SQL. However, SQL Profiles require the Tuning Pack to be licenced, whereas SQL Patches do not. Some hints do not work because the profile is applied too late in the parse process; for example, if you use cursor sharing then the CURSOR_SHARING_EXACT hint will not take effect.
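One way (not shown in the original post) to confirm which mechanism has attached to a statement is to look at the corresponding columns in V$SQL; the FORCE_MATCHING_SIGNATURE below is the one used earlier in this post:
SELECT sql_id, child_number, sql_profile, sql_patch, sql_plan_baseline
FROM   v$sql
WHERE  force_matching_signature = 16625752171077499412;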

Friday, December 08, 2017

Changing Physical Index Attributes without Down Time

Normally, we make an index invisible before we drop it, in order to see whether we can manage without it; if not, we can make it visible again.  In this blog, I will demonstrate how to use index invisibility to introduce an index change that I cannot make with an online operation.  I am also able to reverse the change immediately, whereas an online operation would take time.

Problem Statement

I have a large partitioned table, but the primary key index on it is not partitioned.  Testing has shown that performance would improve if the index were locally partitioned.  It is not possible to introduce the partitioning by rebuilding the index online.  I cannot afford the downtime to drop and rebuild the index, and in any case I want an easy way back to the original state in case of a problem.

Demonstration

I encountered this problem in a PeopleSoft system with a unique index, but here I will demonstrate it with a primary key constraint also.  I will create and populate a simple range partitioned table with a primary key constraint.
DROP TABLE t PURGE;

CREATE TABLE t
(a number not null
,b number not null
,c number
,d number) 
partition by range (a) interval (42)
(partition values less than (42))
/
ALTER TABLE t ADD CONSTRAINT t_pk PRIMARY KEY (a,b)
/
TRUNCATE TABLE t
/
INSERT INTO t
WITH x AS (select rownum n from dual connect by level <= 100)
SELECT a.n, b.n, a.n*b.n, a.n+b.n
FROM x a, x b
/
Note that the table is partitioned, but the unique index is not. I did not have to explicitly build it without partitioning; that is the default.
set long 5000
select dbms_metadata.get_ddl('INDEX','T_PK')
from dual
/
CREATE UNIQUE INDEX "SCOTT"."T_PK" ON "SCOTT"."T" ("A", "B")
PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
TABLESPACE "USERS"
I cannot just create the new index because the column list is already indexed.
CREATE UNIQUE INDEX SCOTT.T_PK_NEW ON SCOTT.T (A, B)
                                               *
ERROR at line 1:
ORA-01408: such column list already indexed
So, now I will create a new unique index on the same columns, but I will create it invisible. I will also create it online so that I do not block the application from performing DML on the table while the index is built.
CREATE UNIQUE INDEX SCOTT.T_PK_NEW ON SCOTT.T (A, B)
LOCAL INVISIBLE ONLINE
/
So now I have two unique indexes: one visible and one invisible.
set autotrace off
column constraint_name format a20
column table_name format a10
column index_name format a10
SELECT index_name, visibility FROM user_indexes WHERE table_name = 'T'
/

INDEX_NAME VISIBILIT
---------- ---------
T_PK       VISIBLE
T_PK_NEW   INVISIBLE
I cannot make the new index visible while the original index is also visible, because I cannot have two visible indexes defined on the same set of columns.
ALTER INDEX SCOTT.T_PK_NEW VISIBLE
/
*
ERROR at line 1:
ORA-14147: There is an existing VISIBLE index defined on the same set of columns.
Instead, I have to make the original index invisible first. However, even if both indexes are invisible the unique constraint is still enforced.
ALTER INDEX SCOTT.T_PK INVISIBLE
/
INSERT INTO t VALUES (1,2,3,4)
/
*
ERROR at line 1:
ORA-00001: unique constraint (SCOTT.T_PK) violated

ALTER INDEX SCOTT.T_PK_NEW VISIBLE
/
Now my new index is visible.
SELECT index_name, visibility FROM user_indexes WHERE table_name = 'T'
/

INDEX_NAME VISIBILIT
---------- ---------
T_PK       INVISIBLE
T_PK_NEW   VISIBLE
The constraint still points to the original index.
INSERT INTO t VALUES(1,2,3,4)
/
*
ERROR at line 1:
ORA-00001: unique constraint (SCOTT.T_PK) violated

SELECT constraint_name, table_name, index_name
FROM   user_constraints
WHERE  constraint_name = 'T_PK'
/

CONSTRAINT_NAME      TABLE_NAME INDEX_NAME
-------------------- ---------- ----------
T_PK                 T          T_PK
However, queries now use the new index.
set autotrace on lines 160
SELECT * FROM t WHERE a = 1 and b=2
/
Plan hash value: 3024084754
---------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |          |     1 |    52 |     2   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE SINGLE            |          |     1 |    52 |     2   (0)| 00:00:01 |     1 |     1 |
|   2 |   TABLE ACCESS BY LOCAL INDEX ROWID| T        |     1 |    52 |     2   (0)| 00:00:01 |     1 |     1 |
|*  3 |    INDEX UNIQUE SCAN               | T_PK_NEW |     1 |       |     1   (0)| 00:00:01 |     1 |     1 |
---------------------------------------------------------------------------------------------------------------
The original index can now be removed. However, unique indexes used by primary key constraints cannot be dropped directly. Instead, modifying the primary key constraint to use the new index drops the original index.
ALTER TABLE scott.t 
MODIFY CONSTRAINT t_pk USING INDEX t_pk_new
/

SELECT constraint_name, table_name, index_name
FROM   user_constraints
WHERE  constraint_name = 'T_PK'
/
CONSTRAINT_NAME      TABLE_NAME INDEX_NAME
-------------------- ---------- ----------
T_PK                 T          T_PK_NEW

SELECT index_name, visibility FROM user_indexes WHERE table_name = 'T'
/

INDEX_NAME VISIBILIT
---------- ---------
T_PK_NEW   VISIBLE
Finally, the new index can be renamed.
ALTER INDEX t_pk_new RENAME TO t_pk
/