Monday, December 16, 2019

Extended Column Group Statistics, Composite Index Statistics, Histograms and an EDB360 Enhancement to Detect the Coincidence

In this post:
  • A simple demonstration to show the behaviour of extended statistics and how it can be disabled by the presence of histograms.  None of this is new, there are many other blog posts on this topic. I provide links to some of them.
  • I have added an enhancement to the EDB360 utility to detect histograms on columns in extended statistics.

Introduction

'Extended statistics were introduced in Oracle 11g to allow statistics to be gathered on groups of columns, to highlight the relationship between them, or on expressions. Oracle 11gR2 makes the process of gathering extended statistics for column groups easier'. [Tim Hall: https://oracle-base.com/articles/11g/extended-statistics-enhancements-11gr2]

Example 1: Cardinality from Extended Statistics

Without extended statistics, Oracle will simply multiply column cardinalities together.  Here is a simple example.  I will create a table with 10000 rows, where two columns each have the same 100 rows of 100 values, so they correlate perfectly.  I will gather statistics, but no histograms.
create table t
(k number
,a number
,b number
,x varchar2(1000)
);

insert into t
with n as (select rownum n from dual connect by level <= 100)
select rownum, a.n, a.n, TO_CHAR(TO_DATE(rownum,'J'),'Jsp')
from n a, n b
order by a.n, b.n;

exec dbms_stats.gather_table_stats(user,'T',method_opt=>'FOR ALL COLUMNS SIZE 1');
I will deliberately disable optimizer feedback so that Oracle cannot learn from experience about the cardinality misestimates.
alter session set statistics_level=ALL;
alter session set "_optimizer_use_feedback"=FALSE;

select count(*) from t where a = 42 and b=42;

  COUNT(*)
----------
       100
Oracle estimates that it will get 1 row but actually gets 100.
It estimates 1 because it is 1/100 * 1/100 * 10000 rows
select * from table(dbms_xplan.display_cursor(null,null,format=>'ADVANCED +ALLSTATS LAST, IOSTATS -PROJECTION -OUTLINE'));

Plan hash value: 1071362934
---------------------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |       |    21 (100)|          |      1 |00:00:00.01 |      73 |
|   1 |  SORT AGGREGATE    |      |      1 |      1 |    26 |            |          |      1 |00:00:00.01 |      73 |
|*  2 |   TABLE ACCESS FULL| T    |      1 |      1 |    26 |    21   (0)| 00:00:01 |    100 |00:00:00.01 |      73 |
---------------------------------------------------------------------------------------------------------------------
Now I will create extended statistics on the column group.  I can do that in one of two ways:
  • either by explicitly creating the definition and then creating them by gathering statistics:
SELECT dbms_stats.create_extended_stats(ownname=>user, tabname=>'t', extension=>'(a,b)')
FROM dual;
exec dbms_stats.gather_table_stats(user,'T',method_opt=>'FOR ALL COLUMNS SIZE 1');
  • Or, I can create extended statistics directly in one go by specifying them in the method opt clause.
exec dbms_stats.gather_table_stats(user,'T',method_opt=>'FOR ALL COLUMNS SIZE 1, FOR COLUMNS SIZE 1 (A,B)');
Now Oracle correctly estimates that the same query will fetch 100 rows because it directly knows the cardinality for the two columns in the query.
Plan hash value: 1071362934
---------------------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |       |    21 (100)|          |      1 |00:00:00.01 |      73 |
|   1 |  SORT AGGREGATE    |      |      1 |      1 |     6 |            |          |      1 |00:00:00.01 |      73 |
|*  2 |   TABLE ACCESS FULL| T    |      1 |    100 |   600 |    21   (0)| 00:00:01 |    100 |00:00:00.01 |      73 |
---------------------------------------------------------------------------------------------------------------------

Example 2: Cardinality from Index Statistics

I can get exactly the same effect by creating an index on the two columns.
drop table t purge;

create table t
(k number
,a number
,b number
,x varchar2(1000)
);

insert into t
with n as (select rownum n from dual connect by level <= 100)
select rownum, a.n, a.n, TO_CHAR(TO_DATE(rownum,'J'),'Jsp')
from n a, n b
order by a.n, b.n;

create index t_ab on t(a,b) compress;
This time I have not collected any statistics on the table.  Statistics are automatically collected on the index when it is built.  I have used a hint to stop the query using the index to look up the rows, nonetheless, Oracle has correctly estimated that it will get 100 rows because it has used the number of distinct keys from the index statistics.
SQL_ID  711banpfgfa18, child number 0
-------------------------------------
select /*+FULL(t)*/ count(*) from t where a = 42 and b=42

Plan hash value: 1071362934
---------------------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |       |    21 (100)|          |      1 |00:00:00.01 |      74 |
|   1 |  SORT AGGREGATE    |      |      1 |      1 |    26 |            |          |      1 |00:00:00.01 |      74 |
|*  2 |   TABLE ACCESS FULL| T    |      1 |    100 |  2600 |    21   (0)| 00:00:01 |    100 |00:00:00.01 |      74 |
---------------------------------------------------------------------------------------------------------------------
Note, that there is nothing in the execution plan to indicate that the index statistics were used to estimate the number of rows returned!

Example 3: Histograms Disable the Use of Extended Statistics

There have long been blogs that refer to behaviour that Oracle has documented as Bug 6972291: Column group selectivity is not used when there is a histogram on one column:
'As of 10.2.0.4 CBO can use the selectivity of column groups but this option is disabled if there is a histogram defined on any of the columns of the column group.
Note:  This fix is disabled by default. To enable the fix set "_fix_control"="6972291:ON"
When ENABLED the code will use multi-column stats regardless of whether there is a histogram on one of the columns or not.  When DISABLED (default) CBO will not use multi-column stats if there is a histogram on one of the columns in the column group.'
  • Christian Antognini, 2014: https://antognini.ch/2014/02/extension-bypassed-because-of-missing-histogram/
  • Jonathan Lewis, 2012: https://jonathanlewis.wordpress.com/2012/04/11/extended-stats/
    • Maria Colgan also commented: 'This … was a deliberate design decision to prevent over-estimations when one of the values supplied is ‘out of range’. We can’t ignore the ‘out of range’ scenario just because we have a column group. Extended statistics do not contain the min, max values for the column group so we rely on the individual column statistics to check for ‘out of range’ scenarios like yours. When one of the columns is ‘out of range’, we revert back to the column statistics because we know it is going to generate a lower selectivity range and if one of the columns is ‘out of range’ then the number of rows returned will be lower or none at all, as in your example'
In this example, I explicitly create a histogram on one of the columns in my extended statistics.  However, in the real world that can happen automatically if the application references one column and not another.
exec dbms_stats.gather_table_stats(user,'T',method_opt=>'FOR ALL COLUMNS SIZE AUTO, FOR COLUMNS SIZE 100 A, FOR COLUMNS SIZE 1 B (A,B)');
My cardinality estimate goes back to 1 because Oracle does use the extended statistics in the presence of a histogram on any of the constituent columns.  Exactly the same happens if the number of distinct values on the combination of columns comes from composite index statistics.  A histogram similarly disables their use.
SQL_ID  8trj2kacqhm6f, child number 1
-------------------------------------
select count(*) from t where a = 42 and b=42

Plan hash value: 1071362934
---------------------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |       |    21 (100)|          |      1 |00:00:00.01 |      73 |
|   1 |  SORT AGGREGATE    |      |      1 |      1 |     6 |            |          |      1 |00:00:00.01 |      73 |
|*  2 |   TABLE ACCESS FULL| T    |      1 |      1 |     6 |    21   (0)| 00:00:01 |    100 |00:00:00.01 |      73 |
---------------------------------------------------------------------------------------------------------------------
This is likely to happen in real-life systems because histograms can be automatically created when statistics are gathered.

Out-of-Range Predicates

If you have one or more predicates on columns that are part of an extended statistics, and that predicate goes out of range when compared to the column statistics, then Oracle still doesn’t use the extended statistics (see also https://jonathanlewis.wordpress.com/2012/04/11/extended-stats/), irrespective of whether it has a histogram or not, or whether fix control 6972291 is set or not.
The extended histogram uses a virtual column whose value is derived from SYS_OP_COMBINED_HASH().  You can see this in the default data value for the column.  Therefore the optimizer cannot use the minimum/maximum value (see also https://jonathanlewis.wordpress.com/2018/08/02/extended-histograms-2/).
Instead, Oracle does the linear decay of the density of the column predicates, and if there is a frequency or top-frequency histogram then it uses half the density of the lowest frequency bucket and applies linear decay to that.

Example 4: Extended Histograms

This time I will create a histogram on my extended statistics as well as histograms on the underlying columns.
exec dbms_stats.gather_table_stats(user,'T',method_opt=>'FOR ALL COLUMNS SIZE AUTO, FOR COLUMNS SIZE 100 A B (A,B)');
I am back to getting the correct cardinality estimate.
SQL_ID  8trj2kacqhm6f, child number 0
-------------------------------------
select count(*) from t where a = 42 and b=42

Plan hash value: 1071362934
---------------------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |       |    21 (100)|          |      1 |00:00:00.01 |      73 |
|   1 |  SORT AGGREGATE    |      |      1 |      1 |     6 |            |          |      1 |00:00:00.01 |      73 |
|*  2 |   TABLE ACCESS FULL| T    |      1 |    100 |   600 |    21   (0)| 00:00:01 |    100 |00:00:00.01 |      73 |
---------------------------------------------------------------------------------------------------------------------
And this is also something that has been blogged about previously:

Threats

In this blog, Jonathan Lewis comments (https://jonathanlewis.wordpress.com/2012/04/11/extended-stats/) on certain weaknesses in the implementations.  He also references other bloggers.
Either creating or removing histograms on columns in either extended statistics or composite indexes may result in the execution plan changing because those extended statistics may change.  This could happen automatically when gathering statistics as data skew and predicate usage changes.
If I drop a composite index, maybe because it is not used, or maybe it is redundant because it is a subset of another index, then I should replace it with an extended histogram on the same set of columns.

Detecting the Problem

I have added a report to section 3c of EDB360 to detect the problem.  The SQL query is shown below.  It will report on histograms on columns in either:
  • composite indexes where there are no extended column group statistics, or
  • extended column group statistics for which there are no histograms.

3c.25. Columns with Histograms in Extended Statistics (DBA_STAT_EXTENSIONS)

#
Table
Owner
Table
Name
Object
Type
Index/Extension Name
Number of
Distinct 
Values
Number of
Buckets
EXTENSION
Column
Name
Column
Number of
Distinct
Values

Column
Number of
Buckets

Column
Histogram
Type
1
HR
JOB_HISTORY
Index
JHIST_EMP_ID_ST_DATE_PK
10
("EMPLOYEE_ID","START_DATE")
EMPLOYEE_ID
7
7
FREQUENCY
2
OE
INVENTORIES
Index
INVENTORY_IX
1112
("WAREHOUSE_ID","PRODUCT_ID")
PRODUCT_ID
208
208
FREQUENCY
3
OE
ORDER_ITEMS
Index
ORDER_ITEMS_PK
665
("ORDER_ID","LINE_ITEM_ID")
ORDER_ID
105
105
FREQUENCY
4
OE
ORDER_ITEMS
Index
ORDER_ITEMS_UK
665
("ORDER_ID","PRODUCT_ID")
ORDER_ID
105
105
FREQUENCY
5
OE
ORDER_ITEMS
Index
ORDER_ITEMS_UK
665
("ORDER_ID","PRODUCT_ID")
PRODUCT_ID
185
185
FREQUENCY
6
SCOTT
T
Extension
SYS_STUNA$6DVXJXTP05EH56DTIR0X
100
1
("A","B")
A
100
100
FREQUENCY
7
SCOTT
T
Extension
SYS_STUNA$6DVXJXTP05EH56DTIR0X
100
1
("A","B")
B
100
100
FREQUENCY
8
SOE
ORDERS
Index
ORD_WAREHOUSE_IX
10270
("WAREHOUSE_ID","ORDER_STATUS")
ORDER_STATUS
10
10
FREQUENCY
9
SOE
ORDER_ITEMS
Index
ORDER_ITEMS_PK
13758515
("ORDER_ID","LINE_ITEM_ID")
LINE_ITEM_ID
7
7
FREQUENCY

Just because something is reported by this test, does not necessarily mean that you need to change anything.
  • Providing fix control 6972291 is not enabled, should you wish to drop or alter any reported index, you at least know that it cannot be used to provide column group statistics.  Though you would still need to consider SQL that might use the index directly.
  • You might choose to add column group histograms, and sometimes that will involve adding column statistics.  However, the number of distinct values on the column group will usually be higher than on the individual columns and can easily be greater than the number of buckets you can have in a frequency histogram.  In such cases, from 12c, you may end up with either a Top-frequency histogram or a hybrid histogram.
  • Or you might choose to remove the histograms from the individual columns so that the column group statistics are used.
  • Or you might choose to enforce the status quo, by setting table statistics preferences to ensure currently existing histograms are preserved and currently, non-existent histograms are not introduced.
Whatever you choose to do regarding statistics and histogram collection, I would certainly recommend doing so declaratively, by defining a table statistic preference.  For example, here I will preserve the histograms on the columns in the column group, but I will also build a histogram on the column group:
exec dbms_stats.gather_table_stats(user,'T',method_opt=>'FOR ALL COLUMNS SIZE AUTO, FOR COLUMNS SIZE 254 A B (A,B)');
  • Or, you might even enable the fix_control. You can also do that at session level or even statement level (but beware of disabling any other fix controls that may be set). 
SQL_ID  16judk2v0uf7w, child number 0
-------------------------------------
select /*+FULL(t) OPT_PARAM('_fix_control','6972291:on')*/ count(*)
from t where a = 42 and b=42

Plan hash value: 1071362934
---------------------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |       |    21 (100)|          |      1 |00:00:00.01 |      73 |
|   1 |  SORT AGGREGATE    |      |      1 |      1 |     6 |            |          |      1 |00:00:00.01 |      73 |
|*  2 |   TABLE ACCESS FULL| T    |      1 |    100 |   600 |    21   (0)| 00:00:01 |    100 |00:00:00.01 |      73 |
---------------------------------------------------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      DB_VERSION('19.1.0')
      OPT_PARAM('_fix_control' '6972291:1')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      FULL(@"SEL$1" "T"@"SEL$1")
      END_OUTLINE_DATA
  */

EDB360 Test Query

This is the SQL query that produces the report in EDB360.
WITH i as ( /*composite indexes*/
SELECT i.table_owner, i.table_name, i.owner index_owner, i.index_name, i.distinct_keys
,        '('||(LISTAGG('"'||c.column_name||'"',',') WITHIN GROUP (order by c.column_position))||')' column_list
FROM    dba_indexes i
,       dba_ind_columns c
WHERE   i.table_owner = c.table_owner
AND     i.table_name = c.table_name
AND     i.owner = c.index_owner
AND     i.index_name = c.index_name
AND     i.table_name NOT LIKE 'BIN$%'
AND     i.table_owner NOT IN ('ANONYMOUS','APEX_030200','APEX_040000','APEX_040200','APEX_SSO','APPQOSSYS','CTXSYS','DBSNMP','DIP','EXFSYS','FLOWS_FILES','MDSYS','OLAPSYS','ORACLE_OCM','ORDDATA','ORDPLUGINS','ORDSYS','OUTLN','OWBSYS')
AND     i.table_owner NOT IN ('SI_INFORMTN_SCHEMA','SQLTXADMIN','SQLTXPLAIN','SYS','SYSMAN','SYSTEM','TRCANLZR','WMSYS','XDB','XS$NULL','PERFSTAT','STDBYPERF','MGDSYS','OJVMSYS')
GROUP BY i.table_owner, i.table_name, i.owner, i.index_name, i.distinct_keys
HAVING COUNT(*) > 1 /*index with more than one column*/
), e as ( /*extended stats*/
SELECT  e.owner, e.table_name, e.extension_name
,       CAST(e.extension AS VARCHAR(1000)) extension
,       se.histogram, se.num_buckets, se.num_distinct
FROM    dba_stat_extensions e
,       dba_tab_col_statistics se
WHERE   e.creator = 'USER'
AND     se.owner = e.owner
AND     se.table_name = e.table_name
AND     se.column_name = e.extension_name
AND     e.table_name NOT LIKE 'BIN$%'
AND     e.owner NOT IN ('ANONYMOUS','APEX_030200','APEX_040000','APEX_040200','APEX_SSO','APPQOSSYS','CTXSYS','DBSNMP','DIP','EXFSYS','FLOWS_FILES','MDSYS','OLAPSYS','ORACLE_OCM','ORDDATA','ORDPLUGINS','ORDSYS','OUTLN','OWBSYS')
AND     e.owner NOT IN ('SI_INFORMTN_SCHEMA','SQLTXADMIN','SQLTXPLAIN','SYS','SYSMAN','SYSTEM','TRCANLZR','WMSYS','XDB','XS$NULL','PERFSTAT','STDBYPERF','MGDSYS','OJVMSYS')
)
SELECT  e.owner, e.table_name
,       'Extension' object_type
,       e.extension_name object_name, e.num_distinct, e.num_buckets, e.extension
,       sc.column_name
,       sc.num_distinct col_num_distinct
,       sc.num_buckets col_num_buckets
,       sc.histogram col_histogram
FROM    e
,       dba_tab_col_statistics sc
WHERE   e.histogram = 'NONE'
AND     e.extension LIKE '%"'||sc.column_name||'"%'
AND     sc.owner = e.owner
AND     sc.table_name = e.table_name
AND     sc.histogram != 'NONE'
AND     sc.num_buckets > 1 /*histogram on column*/
AND     e.num_buckets = 1 /*no histogram on extended stats*/
UNION ALL
SELECT  /*+  NO_MERGE  */ /* 3c.25 */
        i.table_owner, i.table_name
,       'Index' object_type
,       i.index_name object_name, i.distinct_keys, TO_NUMBER(null), i.column_list
,       sc.column_name
,       sc.num_distinct col_num_distinct
,       sc.num_buckets col_num_buckets
,       sc.histogram col_histogram
From    i
,       dba_ind_columns ic
,       dba_tab_col_statistics sc
WHERE   ic.table_owner = i.table_owner
AND     ic.table_name = i.table_name
AND     ic.index_owner = i.index_owner
AND     ic.index_name = i.index_name
AND     sc.owner = i.table_owner
AND     sc.table_name = ic.table_name
AND     sc.column_name = ic.column_name
AND     sc.histogram != 'NONE' 
AND     sc.num_buckets > 1 /*histogram on column*/
AND NOT EXISTS( /*report index if no extension*/
        SELECT 'x'
        FROM    e
        WHERE   e.owner = i.table_owner
        AND     e.table_name = i.table_name
        AND     e.extension = i.column_list)
ORDER BY 1,2,3,4;

Tuesday, December 03, 2019

Oracle 19c Real-Time and High-Frequency Automatic Statistics Collection

I gave this presentation at the UKOUG Techfest 19 conference.  This video was produced as a part of the preparation for that session.  The slide deck is also available on my website.

It takes a look at the pros and cons of these new 19c features.  They are only available on Engineered Systems.  Both features aim to address the challenge of using data that has been significantly updated before the statistics maintenance window has run again.
  • Real-Time Statistics uses table monitoring to augment existing statistics with simple corrections
  • High-Frequency Automatic Optimizer Statistics is an extra statistics maintenance window that runs regularly to update the stalest statistics.
As your statistics change, so there are opportunities for SQL execution plans, and therefore application performance to change. DBAs and developers need to be aware of the implications.

Monday, December 02, 2019

Practical Application Performance Tuning: Applying Theory in Practice

I gave this presentation at the UKOUG Techfest 19 conference.  It is closely based on a previous presentation about PeopleSoft nVision performance tuning, and uses the experience of a PeopleSoft project as a case study, so I am also posting here and on my PeopleSoft blog.
This video was produced as a part of the preparation for this session.  The slide deck is also available on my website.

Learning about and understanding the principles and mechanics of the Oracle database is fundamentally important for both DBAs and developers. It is one of the reasons we still attend physical conferences.
This presentation tells the story of a performance tuning project for the GL reporting on a Financials system on an Oracle engineered system. It required various database techniques and features to be brought to bear. Having a theoretical understanding of how the database and various features work allowed us to make reasonable predictions about whether they would be effective in our environment. Some ideas were discounted, some were taken forward.
We will look at instrumentation, ASH, statistics collection, partitioning, hybrid columnar compression, Bloom filtering, SQL profiles. All of them played a part in the solution, some were simple and effective, some will require periodic maintenance, some added further complications that had to be worked around, some had to be carefully configured in-line with the application, and some required some reconfiguration of the application into order to work properly.

Monday, November 11, 2019

Consumer Group Mapping Rules Use Pattern Matching from 12.1

I recently noticed a small, but I think significant, change in the way consumer group mapping rules behave from Oracle 11.2.04.  Session attributes can be matched to resource groups using LIKE expressions and simple regular expressions specified in the matching rules, though only for certain attributes.
(Updated 12.11.2019) I am grateful to Mikhail Velikikh for his comment.  It depends on which version of Oracle's documentation for 11.2 you read.  Pattern matching does work in 11.2.0.4 for the attributes listed in the 12.1 documentation. My testing indicates that pattern matching does not happen in 11.2.0.3.
You cannot pattern match the SERVICE_NAME in 11.2.  The attribute value is validated against the list of valid services.
exec DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING(DBMS_RESOURCE_MANAGER.SERVICE_NAME, 'WIBBLE', 'MYGROUP1');
ORA-00042: Unknown Service name WIBBLE
However, you can pattern match SERVICE_NAME in 12.1, although SERVICE_NAME is not in the list of attributes for which the documentation says pattern matching is available.  This may be a documentation bug (see Oracle support note 1992704.1).
The parameters to procedure DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING are:
DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING(
   attribute        IN VARCHAR2, 
   value            IN VARCHAR2, 
   consumer_group   IN VARCHAR2 DEFAULT NULL); 

11.2 Documentation (c)2010

You use the SET_CONSUMER_GROUP_MAPPING procedure to map a session attribute/value pair to a consumer group. The parameters for this procedure are the following:
Parameter
Description
attributeThe session attribute type, specified as a package constant
valueThe value of the attribute
consumer groupThe consumer group to map to for this attribute/value pair

11.2 Documentation (c)2015

This procedure adds, deletes, or modifies entries that map sessions to consumer groups, based on the session's login and runtime attributes.
Parameter
Description
attributeMapping attribute to add or modify. It can be one of the Constants listed.
valueAttribute value to match. This includes both absolute mapping and regular expressions.
consumer groupName of the mapped consumer group, or NULL to delete a mapping

Usage Notes 
If no mapping exists for the given attribute and value, a mapping to the given consumer group will be created. If a mapping already exists for the given attribute and value, the mapped consumer group will be updated to the one given. If the consumer_group argument is NULL, then any mapping from the given attribute and value will be deleted.
The subprogram supports simple regex expressions for the value parameter. It implements the same semantics as the SQL 'LIKE' operator. Specifically, it uses '%' as a multi-character wildcard and '_' as a single character wildcard. The '\' character can be used to escape the wildcards. Note that wildcards can only be used if the attribute is one of the following:

  • CLIENT_OS_USER
  • CLIENT_PROGRAM*
  • CLIENT_MACHINE
  • MODULE_NAME
  • MODULE_NAME_ACTION
  • SERVICE_MODULE
  • SERVICE_MODULE_ACTION 
*Consumer group mapping comparisons for DBMS_RESOURCE_MANAGER.CLIENT_PROGRAM are performed by stripping the @ sign and following characters from V$SESSION.PROGRAM before comparing it to the CLIENT_PROGRAM value supplied.

Autonomous Database Example

The mappings are reported in the view DBA_RSRC_GROUP_MAPPINGS. An example of using a LIKE expression in a mapping can be found in the standard configuration of autonomous cloud databases. There are 5 standard mappings for 5 services to 5 consumer groups. The automatically created service names are prefixed with the database name, then HIGH, LOW, MEDIUM, TP or TPURGENT., and are suffixed .atp.oraclecloud.com.
SELECT * FROM v$services ORDER BY 1;

NAME                                                            
----------------------------------------------------------------
KZPKSBZK3RK4G1X_GOFASTER1A_high.atp.oraclecloud.com
KZPKSBZK3RK4G1X_GOFASTER1A_low.atp.oraclecloud.com
KZPKSBZK3RK4G1X_GOFASTER1A_medium.atp.oraclecloud.com
KZPKSBZK3RK4G1X_GOFASTER1A_tp.atp.oraclecloud.com
KZPKSBZK3RK4G1X_GOFASTER1A_tpurgent.atp.oraclecloud.com
kzpksbzk3rk4g1x_gofaster1a
However, the mappings are defined using pattern matching rather than referencing the explicit name of the service, thus exactly the same mappings can be defined on every autonomous database.
select * from DBA_RSRC_GROUP_MAPPINGS order by 1,2;

                             Consumer 
ATTRIBUTE    VALUE           Group      STATUS
------------ --------------- ---------- ----------
ORACLE_USER  GGADMIN         LOW
SERVICE_NAME %\_HIGH.%       HIGH
SERVICE_NAME %\_LOW.%        LOW
SERVICE_NAME %\_MEDIUM.%     MEDIUM
SERVICE_NAME %\_TP.%         TP
SERVICE_NAME %\_TPURGENT.%   TPURGENT
Note:
  • By default, the ORACLE_USER attribute's priority 6 takes precedence over SERVICE_NAME whose priority is 7. Therefore, GGADMIN, the GoldenGate user, always uses the LOW consumer group irrespective of the service that it connects to. The default priorities can be changed.
  • Wildcards cannot be used for the ORACLE_USER attribute.

Tuesday, October 15, 2019

Reading the Active Session History Compressed Export File in eDB360/SQLd360 as an External Table

I am a regular user of SQLDB360 (the single distribution for Carlos Sierra's eBD360, and Mauro Pagano's SQLd360 tools) for remote performance analysis.  eDB360 reports on the whole database, SQLd360 reports on a single SQL ID.  eDB360 also runs SQLd360 reports for the top SQL statements.  Both tools extract ASH data to a flat-file.  Originally, it was intended that these were loaded with the eAdam utility, but eDB360 no longer executes the eAdam scripts itself.
The eDB360/SQLd360 output zip files contain an export of the ASH data that is spooled from the database in SQL Plus, compressed with gzip, and put into a tarball.  You should find a single .tar file that contains several gzipped files.  In eDB360, dba_hist_active_sess_history.txt.gz is simply the result of:
SELECT * FROM dba_hist_active_sess_history
In SQLd360 it will be for just the specified SQL_ID:
SELECT * FROM dba_hist_active_sess_history
WHERE sql_id = '…'
The exact structure of the file varies with the version of Oracle, as more columns have been added to the view with successive releases.  There used to be an eAdam structure control file from which the DDL for a repository can be generated, but it is no longer generated.  However, it is easy to deal with this manually because many of the reports query DBA_HIST_ACTIVE_SESS_HISTORY and you will find a description of the view in each report.

Loading Compressed File as an External Table

Two database directories must be created in the database used to read the data file.
GRANT CREATE ANY DIRECTORY TO scott;
CREATE OR REPLACE DIRECTORY temp as '/media/sf temp';
CREATE OR REPLACE DIRECTORY exec_dir AS '/usr/bin';
GRANT READ, EXECUTE ON DIRECTORY exec_dir TO scott;
  • Directory edb360 points to where the data file is located.  
  • The external table will load the compressed zip file directly, uncompressing it on the fly, without writing an uncompressed version of the file to disk, using the preprocessor facility.  Directory exec_dir is the location of the Unix zcat executable. 
    • N.B. It is important that this is the actual directory where the executable resides, not the linked /bin directory
Now you can create an external table.  I have created it in the SCOTT schema rather than SYS.
There are a few things to bear in mind
  1. DISABLE_DIRECTORY_LINK_CHECK is used to suppress the check that the data file being loaded is not be referenced from a directory link.  In my case, this is necessary because I am referencing a directory on a host machine from within a Linux VM running in VirtualBox.  However, this does not suppress the check for files executed by the preprocessor.  exec_dir must not point to a linked directory.
  2. The compressed data file is pre-processed (effectively piped) through the zcat command.  This is like importing a compressed dump file through a Unix FIFO
  3. The # character has had to be removed from all column names.
  4. The format of date fields in the data file must be specified explicitly to convert them properly.
  5. The name of the compressed file to be imported is specified in the LOCATION clause.
  6. The reject limit has been deliberately set to 100.  A small non-zero value.  The file is exported from DBA_HIST_ACTIVE_SESS_HISTORY.  The actual columns change from version to version, so if the definition of the external table does not match the definition in the database you may get errors.  Limiting the rejections to 100 means that the .log and .bad files are easier to manage when resolving the error.  When the definitions match no rows should be rejected.
CREATE TABLE scott.ash_load
(SNAP_ID NUMBER
,DBID NUMBER
,INSTANCE_NUMBER NUMBER
,SAMPLE_ID NUMBER
,SAMPLE_TIME TIMESTAMP(3)
,SESSION_ID NUMBER
,"SESSION_SERIAL#" NUMBER
,SESSION_TYPE VARCHAR2(10)
,FLAGS NUMBER
,USER_ID NUMBER
-----------------------------------------
,SQL_ID VARCHAR2(13)
,IS_SQLID_CURRENT VARCHAR2(1)
,SQL_CHILD_NUMBER NUMBER
,SQL_OPCODE NUMBER
,SQL_OPNAME VARCHAR2(64)
,FORCE_MATCHING_SIGNATURE NUMBER
,TOP_LEVEL_SQL_ID VARCHAR2(13)
,TOP_LEVEL_SQL_OPCODE NUMBER
,SQL_PLAN_HASH_VALUE NUMBER
,SQL_FULL_PLAN_HASH_VALUE NUMBER /*added 12.1*/
-----------------------------------------
,SQL_ADAPTIVE_PLAN_RESOLVED NUMBER /*added 12.1*/
,SQL_PLAN_LINE_ID NUMBER
,SQL_PLAN_OPERATION VARCHAR2(64)
,SQL_PLAN_OPTIONS VARCHAR2(64)
,SQL_EXEC_ID NUMBER
,SQL_EXEC_START DATE
,PLSQL_ENTRY_OBJECT_ID NUMBER
,PLSQL_ENTRY_SUBPROGRAM_ID NUMBER
,PLSQL_OBJECT_ID NUMBER
,PLSQL_SUBPROGRAM_ID NUMBER
-----------------------------------------
,QC_INSTANCE_ID NUMBER
,QC_SESSION_ID NUMBER
,"QC_SESSION_SERIAL#" NUMBER
,PX_FLAGS NUMBER
,EVENT VARCHAR2(64)
,EVENT_ID NUMBER
,SEQ NUMBER
,P1TEXT VARCHAR2(64)
,P1 NUMBER
,P2TEXT VARCHAR2(64)
-----------------------------------------
,P2 NUMBER
,P3TEXT VARCHAR2(64)
,P3 NUMBER
,WAIT_CLASS VARCHAR2(64)
,WAIT_CLASS_ID NUMBER
,WAIT_TIME NUMBER
,SESSION_STATE VARCHAR2(7)
,TIME_WAITED NUMBER
,BLOCKING_SESSION_STATUS VARCHAR2(11)
,BLOCKING_SESSION NUMBER
-----------------------------------------
,"BLOCKING_SESSION_SERIAL#" NUMBER
,BLOCKING_INST_ID NUMBER
,BLOCKING_HANGCHAIN_INFO VARCHAR2(1)
,"CURRENT_OBJ#" NUMBER
,"CURRENT_FILE#" NUMBER
,"CURRENT_BLOCK#" NUMBER
,"CURRENT_ROW#" NUMBER
,"TOP_LEVEL_CALL#" NUMBER
,TOP_LEVEL_CALL_NAME VARCHAR2(64)
,CONSUMER_GROUP_ID NUMBER
-----------------------------------------
,XID RAW(8)
,"REMOTE_INSTANCE#" NUMBER
,TIME_MODEL NUMBER
,IN_CONNECTION_MGMT VARCHAR2(1)
,IN_PARSE VARCHAR2(1)
,IN_HARD_PARSE VARCHAR2(1)
,IN_SQL_EXECUTION VARCHAR2(1)
,IN_PLSQL_EXECUTION VARCHAR2(1)
,IN_PLSQL_RPC VARCHAR2(1)
,IN_PLSQL_COMPILATION VARCHAR2(1)
-----------------------------------------
,IN_JAVA_EXECUTION VARCHAR2(1)
,IN_BIND VARCHAR2(1)
,IN_CURSOR_CLOSE VARCHAR2(1)
,IN_SEQUENCE_LOAD VARCHAR2(1)
,IN_INMEMORY_QUERY VARCHAR2(1) /*added 12.1*/
,IN_INMEMORY_POPULATE VARCHAR2(1) /*added 12.1*/
,IN_INMEMORY_PREPOPULATE VARCHAR2(1) /*added 12.1*/
,IN_INMEMORY_REPOPULATE VARCHAR2(1) /*added 12.1*/
,IN_INMEMORY_TREPOPULATE VARCHAR2(1) /*added 12.1*/
,CAPTURE_OVERHEAD VARCHAR2(1)
-----------------------------------------
,REPLAY_OVERHEAD VARCHAR2(1)
,IS_CAPTURED VARCHAR2(1)
,IS_REPLAYED VARCHAR2(1)
,SERVICE_HASH NUMBER
,PROGRAM VARCHAR2(64)
,MODULE VARCHAR2(64)
,ACTION VARCHAR2(64)
,CLIENT_ID VARCHAR2(64)
,MACHINE VARCHAR2(64)
,PORT NUMBER
-----------------------------------------
,ECID VARCHAR2(64)
,DBREPLAY_FILE_ID NUMBER /*added 12.1*/
,DBREPLAY_CALL_COUNTER NUMBER /*added 12.1*/
,TM_DELTA_TIME NUMBER
,TM_DELTA_CPU_TIME NUMBER
,TM_DELTA_DB_TIME NUMBER
,DELTA_TIME NUMBER
,DELTA_READ_IO_REQUESTS NUMBER
,DELTA_WRITE_IO_REQUESTS NUMBER
,DELTA_READ_IO_BYTES NUMBER
-----------------------------------------
,DELTA_WRITE_IO_BYTES NUMBER
,DELTA_INTERCONNECT_IO_BYTES NUMBER
,PGA_ALLOCATED NUMBER
,TEMP_SPACE_ALLOCATED NUMBER
,DBOP_NAME VARCHAR2(64) /*added 12.1*/
,DBOP_EXEC_ID NUMBER /*added 12.1*/
,CON_DBID NUMBER /*added 12.1*/
,CON_ID NUMBER /*added 12.1*/
)
ORGANIZATION EXTERNAL
(TYPE ORACLE_LOADER
 DEFAULT DIRECTORY temp
 ACCESS PARAMETERS 
 (RECORDS DELIMITED BY newline 
  DISABLE_DIRECTORY_LINK_CHECK 
  PREPROCESSOR exec_dir:'zcat' 
  FIELDS TERMINATED BY '<,>'
  MISSING FIELD VALUES ARE NULL
  REJECT ROWS WITH ALL NULL FIELDS
  NULLIF = BLANKS
(SNAP_ID,DBID,INSTANCE_NUMBER,SAMPLE_ID,SAMPLE_TIME CHAR(80) date_format TIMESTAMP MASK "YYYY-MM-DD/HH24:mi:ss.ff"
,SESSION_ID,"SESSION_SERIAL#",SESSION_TYPE,FLAGS,USER_ID
,SQL_ID,IS_SQLID_CURRENT,SQL_CHILD_NUMBER,SQL_OPCODE,SQL_OPNAME
,FORCE_MATCHING_SIGNATURE,TOP_LEVEL_SQL_ID,TOP_LEVEL_SQL_OPCODE,SQL_PLAN_HASH_VALUE,SQL_FULL_PLAN_HASH_VALUE
,SQL_ADAPTIVE_PLAN_RESOLVED,SQL_PLAN_LINE_ID,SQL_PLAN_OPERATION,SQL_PLAN_OPTIONS,SQL_EXEC_ID
,SQL_EXEC_START CHAR(80) date_format TIMESTAMP MASK "YYYY-MM-DD/HH24:mi:ss.ff",PLSQL_ENTRY_OBJECT_ID,PLSQL_ENTRY_SUBPROGRAM_ID,PLSQL_OBJECT_ID,PLSQL_SUBPROGRAM_ID
,QC_INSTANCE_ID,QC_SESSION_ID,"QC_SESSION_SERIAL#",PX_FLAGS,EVENT
,EVENT_ID,SEQ,P1TEXT,P1,P2TEXT,P2,P3TEXT,P3,WAIT_CLASS,WAIT_CLASS_ID
,WAIT_TIME,SESSION_STATE,TIME_WAITED,BLOCKING_SESSION_STATUS,BLOCKING_SESSION
,"BLOCKING_SESSION_SERIAL#",BLOCKING_INST_ID,BLOCKING_HANGCHAIN_INFO,"CURRENT_OBJ#","CURRENT_FILE#"
,"CURRENT_BLOCK#","CURRENT_ROW#","TOP_LEVEL_CALL#",TOP_LEVEL_CALL_NAME,CONSUMER_GROUP_ID
,XID,"REMOTE_INSTANCE#",TIME_MODEL,IN_CONNECTION_MGMT,IN_PARSE
,IN_HARD_PARSE,IN_SQL_EXECUTION,IN_PLSQL_EXECUTION,IN_PLSQL_RPC,IN_PLSQL_COMPILATION
,IN_JAVA_EXECUTION,IN_BIND,IN_CURSOR_CLOSE,IN_SEQUENCE_LOAD,IN_INMEMORY_QUERY
,IN_INMEMORY_POPULATE,IN_INMEMORY_PREPOPULATE,IN_INMEMORY_REPOPULATE,IN_INMEMORY_TREPOPULATE,CAPTURE_OVERHEAD
,REPLAY_OVERHEAD,IS_CAPTURED,IS_REPLAYED,SERVICE_HASH,PROGRAM
,MODULE,ACTION,CLIENT_ID,MACHINE,PORT
,ECID,DBREPLAY_FILE_ID,DBREPLAY_CALL_COUNTER,TM_DELTA_TIME,TM_DELTA_CPU_TIME
,TM_DELTA_DB_TIME,DELTA_TIME,DELTA_READ_IO_REQUESTS,DELTA_WRITE_IO_REQUESTS,DELTA_READ_IO_BYTES 
,DELTA_WRITE_IO_BYTES,DELTA_INTERCONNECT_IO_BYTES,PGA_ALLOCATED,TEMP_SPACE_ALLOCATED,DBOP_NAME
,DBOP_EXEC_ID,CON_DBID,CON_ID
))
LOCATION ('dba_hist_active_sess_history.txt.gz')
) REJECT LIMIT 100
/

Now the external table can be queried in exactly the same way that you would query DBA_HIST_ACTIVE_SESS_HISTORY in order to profile DB time, although bear in mind than an external table can only be full scanned.
set timi on lines 200 pages 99
COLUMN force_matching_signature FORMAT 99999999999999999999
COLUMN min(sample_time) FORMAT a26
COLUMN max(sample_time) FORMAT a26
SELECT min(sample_time), max(sample_time), COUNT(*) samples
FROM scott.ash_load
/

MIN(SAMPLE_TIME)           MAX(SAMPLE_TIME)              SAMPLES
-------------------------- -------------------------- ----------
21-JUN-19 05.00.36.115 AM  21-JUL-19 01.00.00.823 PM     1213333