Tuesday, December 19, 2023

Using Attribute Clustering to Improve Compression, Response Time and CPU Consumption: 2. An Example

Attribute Clustering reorders data in a table so that similar data values are clustered together.  This can improve both basic and columnar compression, resulting in better response time and lower CPU consumption.

This is the second of a two-part blog post.

  1. Introduction
  2. Example and Test Results

An Example of Attribute Clustering

This test illustrates the potential benefits of attribute clustering (the scripts are available on GitHub).  It simulates the fact table in a data warehouse or, in my use case, the General Ledger table in a Financials system.  The table will have 20 million rows.  Each dimension column will randomly have one of 256 distinct values, padded to 8 characters.  In this case, the distribution of data values is skewed by the square root function; the commented-out alternative section produces uniformly distributed data.
create table t0(a varchar2(8 char), b varchar2(8 char), c varchar2(8 char), x number);
truncate table t0;
BEGIN
  FOR i IN 1..2 LOOP
    insert  /*+APPEND PARALLEL*/ into t0
    select  /*+PARALLEL*/
/*--------------------------------------------------------------------------------------------------------------
            RPAD(LPAD(LTRIM(TO_CHAR(FLOOR(dbms_random.value(0,256)),'XX')),2,'0'),8,'X') a
    ,       RPAD(LPAD(LTRIM(TO_CHAR(FLOOR(dbms_random.value(0,256)),'XX')),2,'0'),8,'X') b
    ,       RPAD(LPAD(LTRIM(TO_CHAR(FLOOR(dbms_random.value(0,256)),'XX')),2,'0'),8,'X') c
--------------------------------------------------------------------------------------------------------------*/
            RPAD(LPAD(LTRIM(TO_CHAR(FLOOR(SQRT(dbms_random.value(0,65535))),'XX')),2,'0'),8,'X') a
    ,       RPAD(LPAD(LTRIM(TO_CHAR(FLOOR(SQRT(dbms_random.value(0,65535))),'XX')),2,'0'),8,'X') b
    ,       RPAD(LPAD(LTRIM(TO_CHAR(FLOOR(SQRT(dbms_random.value(0,65535))),'XX')),2,'0'),8,'X') c
--------------------------------------------------------------------------------------------------------------*/
    ,       dbms_random.value(1,1e6)
    from dual connect by level <= 1e7;
    COMMIT;
  end loop;
end;
/
exec dbms_stats.gather_table_stats(user,'T0');
I will create a second table with the same structure, and register a materialized view on it as a prebuilt table.
create table mv(a varchar2(8 char), b varchar2(8 char), c varchar2(8 char), x number);
create materialized view mv on prebuilt table enable query rewrite as select * from t0;
For each test, I can set different attributes and then fully refresh the materialized view in non-atomic mode.  The various attributes take effect as the materialized view is truncated and repopulated in direct-path mode.
truncate table MV drop storage;
--------------------------------------------------
rem set compression
--------------------------------------------------
--alter materialized view MV nocompress;
--alter materialized view MV compress;
alter materialized view MV compress for query low;
--------------------------------------------------
rem set in memory
--------------------------------------------------
alter table mv inmemory;
--------------------------------------------------
rem set clustering and number of clustering columns
--------------------------------------------------
alter table mv drop clustering;
--alter table mv add clustering by interleaved order (b);
alter table mv add clustering by interleaved order (b, c);
--alter table mv add clustering by interleaved order (b, c, a);
--------------------------------------------------
exec dbms_mview.refresh('MV',atomic_refresh=>FALSE);
exec dbms_inmemory.repopulate(user,'MV');
Then I can see how large the physical and In-Memory segments are.
select * from user_mviews where mview_name = 'MV';
select table_name, tablespace_name, num_rows, blocks, compression, compress_for, inmemory, inmemory_compression 
from user_tables where table_name IN('MV','T0');
select segment_name, segment_type, tablespace_name, bytes/1024/1024 table_MB, blocks, extents, inmemory, inmemory_compression
from user_segments where segment_name IN('MV','T0');

with x as (
select segment_type, owner, segment_name, inmemory_compression, inmemory_priority
, count(distinct inst_id) instances
, count(distinct segment_type||':'||owner||'.'||segment_name||'.'||partition_name) segments
, sum(inmemory_size)/1024/1024 inmemory_mb, sum(bytes)/1024/1024 tablespace_Mb
from   gv$im_segments i
where segment_name = 'MV'
group by segment_type, owner, segment_name, inmemory_compression, inmemory_priority)
select x.*, inmemory_mb/tablespace_mb*100-100 pct from x
order by owner, segment_type, segment_name
/
I will use a simple test query to see how the performance changes.
select b,c, count(a), sum(x) from t0 where b='2AXXXXXX' group by b,c fetch first 10 rows only;
I tested:
  • Uniformly distributed data -v- skewed data
  • Without table compression -v- basic compression -v- Hybrid Columnar Compression (HCC)
  • No attribute clustering -v- interleaved clustering on 1, 2, and 3 columns
Table      Tablespace                                                                                         
Name       Name         NUM_ROWS     BLOCKS COMPRESS COMPRESS_FOR                   INMEMORY INMEMORY_COMPRESS
---------- ---------- ---------- ---------- -------- ------------------------------ -------- -----------------
MV         PSDEFAULT    20000000      60280 ENABLED  QUERY LOW                      ENABLED  FOR QUERY LOW    
T0         PSDEFAULT    20000000     150183 DISABLED                                DISABLED                  

                      Tablespace      Table                                                 
Segment Na Segment Ty Name               MB     BLOCKS    EXTENTS INMEMORY INMEMORY_COMPRESS
---------- ---------- ---------- ---------- ---------- ---------- -------- -----------------
MV         TABLE      PSDEFAULT       472.0      60416        130 ENABLED  FOR QUERY LOW    
T0         TABLE      PSDEFAULT     1,220.0     156160        203 DISABLED                  

                                                                                 In Memory Tablespace           
Segment Ty OWNER    Segment Na INMEMORY_COMPRESS INMEMORY  INSTANCES   SEGMENTS         MB         MB        PCT
---------- -------- ---------- ----------------- -------- ---------- ---------- ---------- ---------- ----------
TABLE      SYSADM   MV         FOR QUERY LOW     NONE              2          1      829.2      936.6 -11.4662752
With query rewrite on the materialized view and the materialized view in the In-Memory store, we see Oracle rewrite the query from the underlying table to the materialized view and then to an in-memory query.
select b,c, sum(x) from t0 where b='2AXXXXXX' group by b,c;

Plan hash value: 389206685
-----------------------------------------------------------------------------------------------
| Id  | Operation                              | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |      |   182 |  7280 |   876  (31)| 00:00:01 |
|   1 |  HASH GROUP BY                         |      |   182 |  7280 |   876  (31)| 00:00:01 |
|*  2 |   MAT_VIEW REWRITE ACCESS INMEMORY FULL| MV   | 78125 |  3051K|   875  (31)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Test Results and Conclusions

  • Without any table compression, attribute clustering does not affect the size of the table in the tablespace, but the size of the table in the In-Memory store is reduced, and query performance is improved.
  • With either basic or Hybrid Columnar Compression, attribute clustering reduces the size of the table both in the tablespace and in the In-Memory store.
  • All forms of compression and attribute clustering increase the duration of the materialized view refresh.  The degradation of the refresh due to clustering was less severe with HCC than with either no compression or basic compression.
  • Query performance degraded when I combined interleaved clustering with basic compression (even though that produced a smaller in-memory segment than HCC), but it improved with HCC.
  • Uniform data compressed marginally better than skewed data.  Otherwise, they produced very similar results.
  • You do not have to take advantage of compression on the physical segment to take advantage of the compression in In-Memory, but you may get better performance if you do (see the sketch after this list).
  • With this test data set, optimal performance was achieved when clustering on 2 dimension columns. When clustering on all three columns I obtained worse compression and query performance. This varies with the data. With real-world data, I have had examples with better compression and performance with the maximum of 4 clustering column groups. Generally, the best performance corresponds to the attribute clustering that gives the best columnar compression. This is not always the case for simple compression.
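As a minimal sketch of that compression point (using the MV from this example): the physical segment can be left uncompressed while the In-Memory store still applies its own columnar compression.

alter materialized view mv nocompress;
alter table mv inmemory memcompress for query low;
-- the attributes take effect when the MV is repopulated in direct-path mode
exec dbms_mview.refresh('MV',atomic_refresh=>FALSE);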

Using Attribute Clustering to Improve Compression, Response Time and CPU Consumption: 1. Introduction

Attribute Clustering reorders data in a table so that similar data values are clustered together.  This can improve both basic and columnar compression, resulting in better response time and lower CPU consumption.

This is the first of a two-part blog post.

  1. Introduction
  2. Example and Test Results

Use Case

I am working on a Financials system running on an engineered system.  It runs a daily batch of GL reports on summary ledgers that have unindexed materialized views.  The materialized views are also hybrid column compressed (HCC) to reduce their size and improve reporting performance.  

We also put the materialized views into the In-Memory store.  Initially, we used 'free' base-level In-Memory and worked within the 16GB/instance limit.  Having moved to Exadata Cloud@Customer, we can use the fully licensed version of In-Memory.

Now I have introduced Attribute Clustering for the materialized views.

Attribute Clustering

Attribute Clustering has been available in Enterprise Edition since Oracle 12.1.0.2.  Data is clustered in close physical proximity according to certain columns.  Linear Ordering stores the data in the order of the specified clustering columns.  Interleaved Ordering uses a Z-order curve to cluster data in multiple dimensions (there is a helpful illustration in Oracle's documentation).

The GL reports have multiple combinations of different predicates.  Therefore, as recommended by Oracle, we used interleaved ordering.  Linear ordering is not suitable in this case because there is no single column order that suits every report.  Linear ordering also extended the runtime of the materialized view refresh much more than interleaved ordering because it has to sort the data.
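For illustration, a hedged sketch of the two forms of DDL on a hypothetical fact table (a table can only have one clustering clause at a time, so in practice you would choose one, or drop any existing clustering first):

-- linear ordering: rows are stored in the sort order of the listed columns
alter table my_fact add clustering by linear order (account, deptid);

-- interleaved ordering: a Z-order curve clusters all the listed dimensions
-- at once, so different combinations of predicates can all benefit
alter table my_fact add clustering by interleaved order (account, deptid);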

We have not introduced Zone Maps; that is to say, we tested them and then removed them.  Zone maps can be thought of as a coarse index of the zones in the attribute clustering, and would normally be expected to improve access to the data.  You can see them being used in the execution plans to access the table both in the tablespace and in the In-Memory store.  However, our application dynamically generates a lot of SQL and therefore performs a lot of SQL parse.  We found that the additional work to process the zone map significantly degraded performance.
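For completeness, a zone map can be declared as part of the clustering definition.  This sketch (again with hypothetical names) shows the form of syntax we tested and later removed:

alter table my_fact add clustering by interleaved order (account, deptid)
with materialized zonemap;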

Attribute Clustering is not enforced for every DML operation.  It only takes effect during direct-path insert operations, data movement, or table creation.  That makes it easy to implement for segments that are already HCC, which also relies on direct-path operations.  The materialized views were created to introduce HCC, hence they are refreshed in non-atomic mode, which truncates and repopulates them in direct-path mode.  Thus attribute clustering specified on the materialized views will be implemented as they refresh.

Historical, and therefore static, partitions in the ledger tables are marked for HCC, and we schedule an online rebuild to compress them.  Now, that will also apply attribute clustering.  This process could be automated with Automatic Storage Compression.
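A hedged sketch of such a rebuild (the partition name is hypothetical): because a partition move is a data movement operation, it applies both the HCC setting and any attribute clustering defined on the table.

alter table ps_ledger move partition ledger_2021
compress for query low online;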

Compression

Simply by storing similar data values together, we obtained better compression from HCC.  The tables underlying the materialized views were smaller.  

In-Memory also uses columnar compression.  Attribute clustering produced a reduction in the size of segments in the In-Memory store.  If we were still working within the constraints of Base-Level In-Memory, we would have been able to store more segments in In-Memory.

We are using attribute clustering not to directly improve data access, but to harness a secondary effect, that of improved compression.  We are seeing a reduction in the runtime of the reports.  Most of the database time is already spent on the CPU (as most of the fact tables are in In-Memory), so this translates to a reduction in CPU consumption.  We can consider running more reports concurrently to complete the batch earlier.  We can also consider reducing the number of CPUs and therefore reduce cloud subscription costs.

The second part of this blog will show an example test script and results.

Monday, November 27, 2023

Database Constraints: Enforced for New Data Now, but Not Validated for Existing Data

Recently, while discussing a problem, somebody said to me 'I would like to make this column NOT NULL to stop [this problem] from occurring, but first I would need to go back and fix all the historical data'. 
In Oracle, a constraint that is enabled will apply during DML, so as you insert or update the data, the constraint is applied to the rows that are being changed. Only when a constraint is validated does the database check that all the data in the table conforms to the constraint.
If you create a constraint that is enforced, but not validated, you may be able to prevent a problem from getting worse while you fix the data you already have.

A Demonstration 

I will create a table with a primary key and two other nullable columns.  I put some data in the table.  Column B is null for some of the rows, but not for others.
create table t (a number, b number, c number, constraint t_pk primary key(A));

insert into t (a,b)
select level, CASE WHEN MOD(level,2)=1 then 1 end --B is null on alternate rows
from dual connect by level<=10;

select * from t;

         A          B          C
---------- ---------- ----------
         1          1           
         2                      
         3          1           
         4                      
         5          1           
         6                      
         7          1           
         8                      
         9          1           
        10                      

10 rows selected.
I would like to make B a NOT NULL column (to stop the application from writing an invalid value to the database), but cannot because I already have some invalid values in the database.
SQL> alter table t modify b not null ;

ORA-02296: cannot enable (SCOTT.) - null values found
02296. 00000 - "cannot enable (%s.%s) - null values found"
*Cause:    an alter table enable constraint failed because the table
           contains values that do not satisfy the constraint.
*Action:   Obvious
However, I can create the constraint with the NOVALIDATE option.
SQL> alter table t modify b not null novalidate;

Table T altered.

SQL> select constraint_name, search_condition_vc from user_constraints where table_name = 'T';

CONSTRAINT_NAME SEARCH_CONDITION_VC                                         
--------------- ------------------------------------------------------------
T_PK                                                                        
SYS_C00221684   "B" IS NOT NULL
Note that at the moment column B is not described as NOT NULL because although the constraint is enforced, it has not been validated.
SQL> desc t
Name Null?    Type   
---- -------- ------ 
A    NOT NULL NUMBER 
B             NUMBER 
C             NUMBER
If I try to add more rows where some of the data in column B is null, the constraint prevents it.
SQL> insert into t (a,b)
  2  select level+10, CASE WHEN MOD(level,2)=1 then 1 end 
  3  from dual connect by level<=10;

ORA-01400: cannot insert NULL into ("SCOTT"."T"."B")
I can set B column to a NOT NULL value, but I cannot set it back to null.
SQL> update t set b = 2 where a=2;

1 row updated.

SQL> update t set b = NULL where a=2;

ORA-01407: cannot update ("SCOTT"."T"."B") to NULL
I can successfully update a different column on a row where B is null and therefore does not meet the constraint.  I do not get an error because I have not updated column B.
SQL> update t set c = a;

10 rows updated.
Eventually, I will want to validate the constraint so that I know that B has a not-null value in all the rows.  But I can't do that while there are still some null values.
SQL> select * from t;

         A          B          C
---------- ---------- ----------
         1          1          1
         2          2          2
         3          1          3
         4                     4
         5          1          5
         6                     6
         7          1          7
         8                     8
         9          1          9
        10                    10

SQL> DECLARE
  2    l_sql CLOB;
  3  BEGIN
  4    FOR I IN (select * from user_constraints where table_name = 'T' AND constraint_type ='C' 
  5              and validated != 'VALIDATED' and search_condition_vc = '"B" IS NOT NULL') LOOP
  6      l_sql := 'alter table '||i.table_name||' modify constraint '||i.constraint_name||' VALIDATE';
  7      dbms_output.put_line(l_sql);
  8      EXECUTE IMMEDIATE l_sql;
  9    END LOOP;
 10  END;
 11  /
alter table T modify constraint SYS_C00221684 VALIDATE

ORA-02293: cannot validate (SCOTT.SYS_C00221684) - check constraint violated
ORA-06512: at line 7
ORA-06512: at line 7
02293. 00000 - "cannot validate (%s.%s) - check constraint violated"
*Cause:    an alter table operation tried to validate a check constraint to
           populated table that had nocomplying values.
*Action:   Obvious
First I have to fix the data, and then I can validate the constraint.
SQL> UPDATE t set b=a where b is null;

4 rows updated.

SQL> REM now validate the constraint
SQL> BEGIN
  2    FOR I IN (select * from user_constraints where table_name = 'T' AND constraint_type ='C' 
  3              and validated != 'VALIDATED' and search_condition_vc = '"B" IS NOT NULL') LOOP
  4      EXECUTE IMMEDIATE 'alter table '||i.table_name||' modify constraint '||i.constraint_name||' VALIDATE';
  5    END LOOP;
  6  END;
  7  /

PL/SQL procedure successfully completed.
Only now that the new constraint has been validated is the column described as NOT NULL.
SQL> desc t
Name Null?    Type   
---- -------- ------ 
A    NOT NULL NUMBER 
B    NOT NULL NUMBER 
C             NUMBER

Wednesday, June 21, 2023

The Goal


One of my favourite books on Oracle performance, "Optimizing Oracle Performance" by Cary Millsap & Jeff Holt, introduced me to "The Goal" by Eli Goldratt and Jeff Cox.  
The Goal is all about performance, without being anything to do with computers.  It is a story of a man who has to save his manufacturing plant from closure by making it profitable.  The language is about manufacturing, but it applies to any system of processes, including any software application and your Oracle (or any other) database!  
Recently, I was checking a quote from it, and I ended up reading it again.  It is 20 years since I first read these two books.  They completely changed how I thought about performance.  Both remain as valid today as they were then.  
It is good to be reminded of these fundamental principles every now and then.
"So this is the goal: To make money by increasing net profit, while simultaneously increasing return on investment, and simultaneously increasing cash flow." 
"There are three measurements which express the goal of making money ... throughput, inventory and operational expense"
"Throughput is the rate at which the system generates money through sales.
Inventory is all the money that the system has invested in purchasing things which it intends to sell.
Operational expense is all the money the system spends in order to turn inventory into throughput."
"A plant in which everyone is working all the time is very inefficient."
"A bottleneck is any resource whose capacity is equal to or less than the demand placed upon it.  And a non-bottleneck is any resource whose capacity is greater than the demand placed on it."
"What does lost time on a bottleneck mean?  It means you have lost throughput."
"The capacity of a plant is equal to the capacity of its bottlenecks."
"A system of local optimums is not an optimum system at all; it is a very inefficient system."
"An hour lost at a bottleneck is an hour lost for the entire system.
An hour saved at a non-bottleneck is worthless."
"1. IDENTIFY the system's constraint(s).
2. Decide how to EXPLOIT the system's constraint(s).
3. SUBORDINATE everything else to the above decision.
4. ELEVATE the system's constraint(s).
5. WARNING!!!! If in the previous steps, a constraint has been broken, go back to step 1, but do not allow INERTIA to cause a system's constraint."
"I started to have a very good guideline; if it comes from cost accounting it must be wrong."

Performance optimisation is sometimes viewed as a black art.  It is not.  Instead, like detection, it "is, or ought to be, an exact science, and should be treated in the same cold and unemotional manner".  

Thursday, June 15, 2023

More Bang for your Buck in the Cloud with Resource Manager

Much of the cost in database IT is tied to the number of CPUs.  Oracle database licencing is priced per CPU.  The dominant factor in determining your cloud subscription cost is also CPU, although disk, memory, and network can also be cost factors.

That incentivises you to minimise your CPU.  I believe it is inevitable that cloud systems will be configured with fewer CPUs and it will become more common to see them running either close to or beyond the point of having 0% idle CPU.  In fact, I'll go further:  

In the cloud, if your system is not constrained by CPU, at least some of the time, you are probably spending too much money on renting too many CPUs.

What happens to an Oracle database when it runs out of CPU?

The resource manager has been part of the Oracle database since 8i, but in my experience, it is rarely used.

Every process has to demand CPU and if necessary wait on the CPU run queue.  If you don't have a resource manager plan, then all the Oracle processes will have equal priority on that queue.  The resource manager will not intervene.  

However, not all processes are created equal.  Instead, the users of an application will consider some things more important or urgent than others.  Some processes are on a critical path to delivering something by a deadline, while others can wait.  That implies a hierarchy of priority.  A resource manager plan allocates CPU to higher-priority processes ahead of lower-priority ones, within the constraint of a minimum guaranteed CPU allocation for each group, and it can also restrict the degree of parallelism.
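As an illustration only (the plan, group names, shares, and limits here are hypothetical, not this system's actual configuration), a simple plan that favours one group might be created like this:

BEGIN
  dbms_resource_manager.create_pending_area;
  dbms_resource_manager.create_consumer_group('REPORTING_GROUP', 'Critical-path reporting');
  dbms_resource_manager.create_plan('DEMO_PLAN', 'Prioritise reporting over everything else');
  dbms_resource_manager.create_plan_directive(
    plan                     => 'DEMO_PLAN',
    group_or_subplan         => 'REPORTING_GROUP',
    comment                  => 'High priority: users are waiting for these',
    mgmt_p1                  => 75,              -- level-1 CPU share
    parallel_degree_limit_p1 => 8);              -- cap parallelism
  dbms_resource_manager.create_plan_directive(
    plan                     => 'DEMO_PLAN',
    group_or_subplan         => 'OTHER_GROUPS',  -- mandatory catch-all group
    comment                  => 'Everything else waits if necessary',
    mgmt_p2                  => 100);            -- only gets CPU at level 2
  dbms_resource_manager.validate_pending_area;
  dbms_resource_manager.submit_pending_area;
END;
/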

Note that "By default, all predefined maintenance windows use the resource plan DEFAULT_MAINTENANCE_PLAN".  When you introduce your own resource manager plan you don't need to alter the predefined windows.

A resource manager plan that reflects the business priorities can enable a system to meet its objectives with fewer resources, particularly CPU resources.  In a cloud system, using fewer resources, particularly CPU resources, will tend to save money on cloud subscription costs.

A User Story

Let me tell you about a PeopleSoft Financials system at an insurance company.  Like all insurance companies, they like to slice and dice their General Ledger in lots of different ways and produce lots of reports every night.

Data flows through the system from GL transaction processing via summary ledgers, on which materialized views are built, and then reports are run:

Transactions -> Post to Ledger -> Summary Ledgers -> Materialised Views -> Reports

A fundamentally important thing this company did was to provide a quantitative definition of acceptable performance.  

  • "GL reports must be finished by the time continental Europe starts work at 8am CET / 2am EST"
  • "Without making the system unavailable to Asia/Pac users" 
  • "At night (in the US), some other things can wait, but need to be available at the start of the US working day."

They were running on a two-node RAC database on an engineered system, on-premises.  When the overnight GL batch was designed and configured on the old hardware, parallelism was increased until it consumed the entire box.

The system has now moved to an Exadata cloud-at-customer machine.  It is still a two-node RAC cluster.  We have a choice of up to 10 OCPUs (20 virtual CPUs) per node.  During testing, we progressively reduced the CPU count until we could only just meet that target.  Every time we reduced the CPU by 1 OCPU on each of the two nodes, we reduced the cost of the cloud subscription by approximately US$2000/month.

Implicit in that statement of adequate performance is also a statement of what is important to the business.  We started to create a hierarchy of processes.  

  • If the business is waiting on the output of a process then that is a high-priority process that is guaranteed a high proportion of available CPU. 
  • If a process is finished before the business needs it then it has a lower priority.  For example, a set of processes was building reporting tables that were not needed until the start of the US working day, so their start time was pushed back, and they were put in a lower-priority consumer group that also restricted their degree of parallelism. 

Sometimes, it can be hard to determine whether the users are waiting and whether the performance is adequate, but usually, they will tell you!  However, with an overnight batch process, it is straightforward.  If it is outside office hours, then the users aren't waiting for it, but it needs to be there when they come into the office in the morning.

Like so many physical things in life, nearly everything that happens in computing involves putting a task on a queue and waiting for it to come back.  Most computer systems are chains of inbound and outbound queues.  Along the way, other requests for resources may be invoked that also have to be queued.  Ultimately, every system is bound by its resources; on a computer, those are CPU, memory, disk, and network.  A critical process whose performance is degraded, because it is not getting enough of the right kind of resource, becomes a bottleneck.

"Time lost at a bottleneck is lost across the system." 

One of my favourite books on Oracle performance is Optimizing Oracle Performance by Cary Millsap & Jeff Holt.  It introduced me to another book, The Goal by Eli Goldratt and Jeff Cox.  Its central theme is the nature of bottlenecks, otherwise called constraints.   "A bottleneck is any resource whose capacity is equal to or less than the demand placed upon it."

It is all about performance, without being anything to do with computers.  It is a Socratic case study of how to implement the 5-step strategy dubbed "The Theory of Constraints" to improve the performance of a system.  The five steps are set out plainly and then again in another book by Goldratt "What is This Thing Called Theory of Constraints and How Should It Be Implemented?"

  • IDENTIFY the system's constraint(s).
  • Decide how to EXPLOIT the system's constraint(s).
  • SUBORDINATE everything else to the above decision.
  • ELEVATE the system's constraint(s).
  • WARNING!!!! If in the previous steps, a constraint has been broken, go back to step 1, but do not allow INERTIA to cause a system's constraint.
In the factory in The Goal, the goal is to increase throughput while simultaneously reducing inventory and operating expense.

In the cloud, the goal is to increase system throughput while simultaneously reducing response time and the cost of resources.

The Resource Manager Plan

The hierarchy of processes then determines who should get access to the CPU in preference to whom.  It translates into a database resource manager plan.  This is the 4th of Goldratt's 5 steps.  The higher-priority processes on the critical processing path get precedence for CPU so that they can make progress.  The lower-priority processes may have to wait for CPU so that they don't impede higher-priority processes (this is the 3rd step).

The resource plan also manages the degree of parallelism that can be used within each consumer group, so that we don't run out of parallel query servers.  Higher-priority processes may not have a high PQ limit because more of them run concurrently.  Processes are mostly allocated to consumer groups through mappings of module, action, and program name; some are mapped explicitly using triggers.
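A hedged sketch of such a mapping (the program and module values are illustrative, not the actual configuration):

BEGIN
  dbms_resource_manager.create_pending_area;
  -- map sessions by client program name
  dbms_resource_manager.set_consumer_group_mapping(
    attribute      => dbms_resource_manager.client_program,
    value          => 'PSNVS',
    consumer_group => 'NVISION_GROUP');
  -- map sessions by module
  dbms_resource_manager.set_consumer_group_mapping(
    attribute      => dbms_resource_manager.module_name,
    value          => 'GLPOCONS',
    consumer_group => 'PSFT_GROUP');
  dbms_resource_manager.submit_pending_area;
END;
/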

Over the years, the resource manager plan for this particular system has gone through three main design iterations.  The 4 lowest-priority consumer groups were added to restrict the consumption of those groups when the higher-priority groups were active.

Priority  Consumer Group                Description of Consumer Group
--------  ----------------------------  --------------------------------------------------------------
1         PSFT_GROUP                    General group for PeopleSoft application and batch processes.
                                        PQ limit = ½ of CPU_COUNT (3rd iteration).
2         HIGH_GROUP                    For the weekly stats collection process of 2 multi-billion row
                                        tables (LEDGER and JRNL_LN).  PQ limit = 2x CPU_COUNT.
3         SUML_GROUP                    Processes that refresh summary ledger tables, and MVs on
                                        summary ledgers.  PQ limit = ¾ of CPU_COUNT.
4         NVISION_GROUP                 nVision General Ledger reporting processes.
                                        PQ limit ≈ 3/8 of CPU_COUNT.
5         GLXX_GROUP                    Processes that build GLXX reporting tables, and do some
                                        reporting.  Run concurrently with nVision, but it is more
                                        important to complete GL reporting.  PQ limit = 1 (no
                                        parallelism).
6         PSQUERY_GROUP,                Other queries submitted via the PeopleSoft ad-hoc query tool
          NVSRUN_GROUP                  and ad-hoc nVision.  PQ limit = 3 - 4.
7         ESSBASE_GROUP                 Essbase processes.  PQ limit = 2 - 4.
8         LOW_GROUP,                    Other processes.  Also deals with an Oracle bug that causes
          LOW_LIMITED_GROUP             AQ$_PLSQL_NTFN% jobs to run continuously, consuming CPU
                                        (actual/estimated time limit).

This approach has certainly prevented the processes in GLXX_GROUP, ad-hoc queries in the PSQUERY_GROUP, and other processes in the LOW_GROUP from taking CPU away from critical processes in PSFT_GROUP, NVISION_GROUP and SUML_GROUP.  We also adjusted the configuration of the application to reduce the number of processes that can run concurrently.

What if we decide to change the number of CPUs?

When this system ran on an on-premises machine we had a single resource plan because the number of CPUs was fixed.

Now that it has moved to the cloud, we can choose how many CPUs to pay for.  Performance was tested with various configurations.  Consequently, we have created several different resource plans, with different PQ limits, for different numbers of CPUs.  When we change the number of CPUs, we just specify the corresponding resource manager plan.

Some other database parameters have been set to lower, non-default values to restrict overall SQL parallelism and the number of concurrent processes on the database job scheduler.  These are also changed in line with the number of CPUs.

alter system set RESOURCE_MANAGER_PLAN=PSFT_PLAN_CPU8 scope=both sid='*';
alter system set JOB_QUEUE_PROCESSES=8 scope=both sid='*';
alter system set PARALLEL_MAX_SERVERS=40 scope=both sid='*';
alter system set PARALLEL_SERVERS_TARGET=40 scope=both sid='*';

In the future, we might automate changing the number of CPUs on a schedule.  It is then easy to switch resource manager plans by simply setting an initialisation parameter.

At the moment, we have one plan in force at all times.  It is also possible to change plans on a schedule using scheduler windows, and you can still intervene manually by opening a window.
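For example, a hedged sketch (the window name and timings are hypothetical):

BEGIN
  dbms_scheduler.create_window(
    window_name     => 'GL_BATCH_WINDOW',
    resource_plan   => 'PSFT_PLAN_CPU8',
    start_date      => systimestamp,
    repeat_interval => 'FREQ=DAILY;BYHOUR=21;BYMINUTE=0',
    duration        => interval '8' hour,
    comments        => 'Switch to the batch resource plan overnight');
END;
/
-- manual intervention: force the window open for an hour
exec dbms_scheduler.open_window('GL_BATCH_WINDOW', duration => interval '1' hour, force => TRUE);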

TL;DR In the Cloud, Performance is Instrumented as Cost

You can have as much CPU and performance as you are willing to pay for.

By configuring the resource manager to prioritise CPU allocation to high-priority processes, ones for which users are waiting, over lower-priority ones, a system can achieve its performance objectives while consuming fewer resources.

Monday, April 17, 2023

Investigating Unfamiliar PL/SQL with the Hierarchical Profiler

The PL/SQL profiler can tell you how much time you spend where in your PL/SQL code (see my presentation Performance Tuning with the PL/SQL Performance Profilers).  Therefore it also tells you which code blocks were executed, and in which test run.  If you are debugging code with which you are unfamiliar, this can provide insight into where to focus attention and determine what is going on.

Example Problem

I was looking at a third-party application that uses the database job scheduler to run multiple concurrent batch jobs.  I needed to work out why they were not balancing properly between the database instances.  This application has its own scheduler package, driven by metadata that defines the jobs to be submitted.  This package then calls the delivered DBMS_SCHEDULER package.  However, this third-party package is quite complicated, there are lots of similar calls, and it is difficult to work out what was executed just by reading the code.

I ran the application having enabled the hierarchical profiler, DBMS_HPROF.  I was able to query the profiler tables to find the calls to DBMS_SCHEDULER that were executed.
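For reference, the profile was captured with a pattern roughly like this (the directory and file names are illustrative; the DBMSHP tables must already exist, created by dbms_hprof.create_tables, or the dbmshptab.sql script in older versions):

VARIABLE runid NUMBER
exec dbms_hprof.start_profiling(location => 'HPROF_DIR', filename => 'sched_test.trc');
-- ... run the application code being investigated ...
exec dbms_hprof.stop_profiling
-- load the raw trace file into the DBMSHP_* tables; ANALYZE returns the run ID
exec :runid := dbms_hprof.analyze(location => 'HPROF_DIR', filename => 'sched_test.trc');
print runid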

Querying the Profiler Tables

Each time the profiler is run, the data is tagged with a separate run ID, so if I do different tests I can easily separate them.  However, in this example, I have run only one test.

SELECT * 
FROM   dbmshp_runs
ORDER BY runid;

 Run
  ID RUN_TIMESTAMP                  TOTAL_ELAPSED_TIME RUN_COMMENT                                          TRACE_ID
---- ------------------------------ ------------------ -------------------------------------------------- ----------
   3 28-MAR-23 19.23.20.891595000             78254498                                                             2
Usually, I am interested in improving performance, so I look for the code that took the most time and profile the code blocks by elapsed time.  However, this time, I have sorted them by module and line number so I can see which code blocks were executed.
BREAK ON OWNER ON TYPE ON module skip 1
SELECT fi.symbolid, fi.owner, fi.type, fi.module, fi.function, fi.line#, fi.namespace, fi.calls, fi.function_elapsed_time, fi.sql_id
FROM   dbmshp_function_info fi
WHERE  fi.runid = 3
ORDER BY fi.owner, fi.module, fi.line#;
The profile includes the application code and also the Oracle packages owned by SYS.
                                                                                                                            Function
 Symbol                                                                                                        Line Name             Elapsed
     ID OWNER              TYPE            MODULE                    FUNCTION                                     # Space   CALLS       Time SQL_ID
------- ------------------ --------------- ------------------------- ---------------------------------------- ----- ----- ------- ---------- -------------
      8 XXXXX_CUST         PACKAGE BODY    CUST_PARALLEL_JOBS        ISJOBSRUNNING                                6 PLSQL      38        708
      9                                                              ISJOBSRUNNING.C_RUNNING_JOBS_CNT            12 PLSQL      38        266
    137                                                              __static_sql_exec_line13                    13 SQL        38       9681 2y7y7t8bf4ykw
    133                                                              __sql_fetch_line23                          23 SQL        38    2026632 2y7y7t8bf4ykw
      6                                                              GETJOBSSTATUS                               42 PLSQL       7        150
      7                                                              GETJOBSSTATUS.C_JOB_STATUS                  48 PLSQL       7         59
    138                                                              __static_sql_exec_line49                    49 SQL         7        565 d5g73bnmxjuqd
    134                                                              __sql_fetch_line59                          59 SQL         7        238 d5g73bnmxjuqd
      3                                                              CUST_SIMULATE_SURRENDER                    105 PLSQL       1       1232
      5                                                              CUST_SIMULATE_SURRENDER.C_JOB_GROUPS       110 PLSQL       1         16
    135                                                              __static_sql_exec_line111                  111 SQL         1        159 1xrrajz8mgbhs
      4                                                              CUST_SIMULATE_SURRENDER.C_CFG              119 PLSQL       1         12
    136                                                              __static_sql_exec_line120                  120 SQL         1         90 9ytv0rhjjp3mr
    131                                                              __sql_fetch_line160                        160 SQL         1        118 1xrrajz8mgbhs
    132                                                              __sql_fetch_line165                        165 SQL         1        135 9ytv0rhjjp3mr
…
     54 XXXXX_SCHEDULER    PACKAGE BODY    SCHEDULER_ENGINE          __pkg_init                                   0 PLSQL       1          5
     55 XXXXX_SCHEDULER    PACKAGE SPEC    SCHEDULER_ENGINE          __pkg_init                                   0 PLSQL       1          5
     52 XXXXX_SCHEDULER    PACKAGE BODY    SCHEDULER_ENGINE          RUN_JOB                                    770 PLSQL       7        176
     53                                                              SET_JOB_ARGUMENT                          1317 PLSQL      21        202
    178                                                              __static_sql_exec_line1355                1355 SQL        21       4733 3h8uatusjv84c
…
    118 SYS                PACKAGE BODY    DBMS_SCHEDULER            CREATE_PROGRAM                              15 PLSQL       1         24
    121                                                              DROP_PROGRAM                                43 PLSQL       2         98
    119                                                              DEFINE_PROGRAM_ARGUMENT                    112 PLSQL       3        186
    122                                                              DROP_PROGRAM_ARGUMENT                      211 PLSQL       6        363
    117                                                              CREATE_JOB                                 432 PLSQL       7        428
    124                                                              RUN_JOB                                    546 PLSQL       7        239
    120                                                              DROP_JOB                                   696 PLSQL      14       7484
    123                                                              ENABLE                                    2992 PLSQL       1         87
    125                                                              SET_ATTRIBUTE                             3063 PLSQL      14        957
    126                                                              SET_ATTRIBUTE                             3157 PLSQL      14       2923
    127                                                              SET_ATTRIBUTE_NULL                        3274 PLSQL       7         42
    116                                                              CHECK_SYS_PRIVS                           3641 PLSQL      69     153470
…

The hierarchical profiler tracks which code blocks call which code blocks, so I can perform a hierarchical query starting where the parent is null.
SELECT symbolid, parentsymid,
       RPAD(' ', (level-1)*2, ' ') || a.name AS name, 
       a.line#, a.calls,
       a.subtree_elapsed_time, 
       a.function_elapsed_time       
FROM   (SELECT fi.symbolid,
               pci.parentsymid,
               RTRIM(fi.owner || '.' || fi.module || '.' || NULLIF(fi.function, fi.module), '.') AS name,
               fi.line#,
               NVL(pci.subtree_elapsed_time, fi.subtree_elapsed_time) AS subtree_elapsed_time, 
               NVL(pci.function_elapsed_time, fi.function_elapsed_time) AS function_elapsed_time, 
               NVL(pci.calls, fi.calls) AS calls
        FROM   dbmshp_function_info fi
               LEFT JOIN dbmshp_parent_child_info pci ON fi.runid = pci.runid AND fi.symbolid = pci.childsymid
        WHERE  fi.runid = 3
        AND    NOT fi.module LIKE 'DBMS_HPROF%'
        ) a
CONNECT BY a.parentsymid = PRIOR a.symbolid
START WITH a.parentsymid IS NULL;
    
I can see that CUST_PARALLEL_JOBS.CUST_SIMULATE_SURRENDER calls XXXXX_SCHEDULER.SCHEDULER_ENGINE.RUN_JOB and that calls DBMS_SCHEDULER.RUN_JOB.
    
Symbol  Parent                                                                                                       Line            Elapsed    Elapsed
     ID  Sym ID NAME                                                                                                     #   CALLS       Time       Time
------- ------- ---------------------------------------------------------------------------------------------------- ----- ------- ---------- ----------
     18         XXXXX_CUST_ADDON.CUST_SCHED_SIMSURRENDERS                                                                1       1   78254334        570
      3      18   XXXXX_CUST.CUST_PARALLEL_JOBS.CUST_SIMULATE_SURRENDER                                                105       1   77139478       1232
      4       3     XXXXX_CUST.CUST_PARALLEL_JOBS.CUST_SIMULATE_SURRENDER.C_CFG                                        119       1        102         12
    136       4       XXXXX_CUST.CUST_PARALLEL_JOBS.__static_sql_exec_line120                                          120       1         90         90
…
     52       3     XXXXX_SCHEDULER.SCHEDULER_ENGINE.RUN_JOB                                                           770       7      58708        176
     56      52       XXXXX_SCHEDULER.SCHEDULER_UTILS.LOG_AUDIT_EVENT                                                  173       7         45         40
    115      56         SYS.DBMS_OUTPUT.PUT_LINE                                                                       109      41         43         43
     57      52       XXXXX_SCHEDULER.SCHEDULER_UTILS.SCHEMA_OWNER                                                     238       7         24         24
    124      52       SYS.DBMS_SCHEDULER.RUN_JOB                                                                       546       7      58463        239
    104     124         SYS.DBMS_ISCHED.CHECK_COMPAT                                                                  3509       7         11         11
    112     124         SYS.DBMS_ISCHED.RUN_JOB                                                                        242       7      44391      44391
…
    
Now I know which code to examine.  This query outer joins the profiler data to the source code.  NB: any wrapped code will not be available in the ALL_SOURCE view.  You might want to unwrap it, at least in a test environment (see Philipp Salvisberg's PL/SQL Unwrapper for SQL Developer).
break on owner on name skip 1 on type
SELECT s.owner, s.type, s.name, h.function, s.line, 
       h.function_elapsed_time/1e6 function_elapsed_time, h.calls, s.text
FROM   all_source s
  LEFT OUTER JOIN dbmshp_function_info h
    ON s.owner = h.owner and s.name = h.module and s.type = h.type and s.line = h.line# and h.runid = 3
WHERE ((         s.owner = 'XXXXX_CUST'
             AND s.name = 'CUST_PARALLEL_JOBS'
             AND s.type = 'PACKAGE BODY'
             AND s.line between 100 and 300
       ) OR (    s.owner = 'XXXXX_SCHEDULER'
             AND s.name = 'SCHEDULER_ENGINE'
             AND s.type = 'PACKAGE BODY'
             AND s.line between 770 and 858
      ))
ORDER BY s.owner, s.name, s.type, s.line
/
    
Now, I can scan through the code and see how the code blocks were called.
                                                                                 Function
                                                                                  Elapsed
OWNER           TYPE         NAME                 FUNCTION                   LINE        Time   CALLS TEXT
--------------- ------------ -------------------- ------------------------- ----- ----------- ------- -------------------------------------------------------------------------------------------------------------
XXXXX_CUST      PACKAGE BODY CUST_PARALLEL_JOBS   CUST_SIMULATE_SURRENDER     105     .001232       1 PROCEDURE Cust_Simulate_Surrender (pi_bus_in IN SrvContext, pio_err  IN OUT SrvErr)
                                                                              106                     IS
…
                                                                              213                                     -- run current job when it is not started yet
                                                                              214                                     IF l_cfg_tbl(indx_job).allowed = 'Y' -- flag Y - to be started
                                                                              215                                     THEN
                                                                              216                                         -- run current job
                                                                              217                                         XXXXX_scheduler.scheduler_engine.Run_Job (l_cfg_tbl(indx_job).XXXXX_job_name);
                                                                              218                                         --XXXXX_scheduler.scheduler_engine.enable_Job (l_cfg_tbl(indx_job).XXXXX_job_name);
…
XXXXX_SCHEDULER PACKAGE BODY SCHEDULER_ENGINE     RUN_JOB                     770     .000176       7 PROCEDURE RUN_JOB( PI_JOB_NAME SCHEDULER_JOBS.JOB_NAME%TYPE )
…
                                                                              778                     IS
                                                                              779                     BEGIN
                                                                              780                         DBMS_SCHEDULER.RUN_JOB(
                                                                              781                             SCHEDULER_UTILS.SCHEMA_OWNER || '."' || PI_JOB_NAME || '"', USE_CURRENT_SESSION=>FALSE );
                                                                              782
                                                                              783
                                                                              784
                                                                              785                         SCHEDULER_UTILS.LOG_AUDIT_EVENT( 'RunJob', TRUE, PI_OBJECT_NAME => PI_JOB_NAME );
                                                                              786                     EXCEPTION
                                                                              787                       WHEN OTHERS THEN
…
                                                                              798                     END;
                                                                              799
                                                                              800
                                                                              801                     --dmk 29.3.2023 added
                                                                              802                     PROCEDURE ENABLE_JOB( PI_JOB_NAME SCHEDULER_JOBS.JOB_NAME%TYPE )
…
                                                                              810                     IS
                                                                              811                     BEGIN
                                                                              812                         DBMS_SCHEDULER.Enable(
                                                                              813                             SCHEDULER_UTILS.SCHEMA_OWNER || '."' || PI_JOB_NAME || '"');
                                                                              814
                                                                              815
                                                                              816
                                                                              817                         SCHEDULER_UTILS.LOG_AUDIT_EVENT( 'Enable_Job', TRUE, PI_OBJECT_NAME => PI_JOB_NAME );
                                                                              818                     EXCEPTION
                                                                              819                       WHEN OTHERS THEN
…
                                                                              830                     END;
    
By following the profiler data, I have found that DBMS_SCHEDULER.RUN_JOB was used.  I was then able to add an alternative procedure that calls DBMS_SCHEDULER.ENABLE and call that from the custom application code.

Thursday, April 13, 2023

Using SQL Profiles to Tackle High Parse Time and CPU Consumption

The Challenge of Dynamic SQL with Literals

The following example is taken from a PeopleSoft General Ledger system.  The SQL was generated by the nVision reporting tool (some literal values have been obfuscated).
SELECT L4.TREE_NODE_NUM,SUM(A.POSTED_TOTAL_AMT) 
FROM PS_XX_SUM_XXXXX_VW A, PSTREESELECT10 L4, PSTREESELECT10 L2 
WHERE A.LEDGER='X_UKMGT' 
AND A.FISCAL_YEAR=2022 AND A.ACCOUNTING_PERIOD=1 
AND L4.SELECTOR_NUM=415 AND A.CHARTFIELD3=L4.RANGE_FROM_10 
AND L2.SELECTOR_NUM=416 AND A.ACCOUNT=L2.RANGE_FROM_10 
AND (A.DEPTID BETWEEN '10000' AND '18999' OR
A.DEPTID BETWEEN '20000' AND '29149' OR A.DEPTID='29156' OR
A.DEPTID='29158' OR A.DEPTID BETWEEN '29165' AND '29999' OR A.DEPTID
BETWEEN '30000' AND '39022' OR A.DEPTID BETWEEN '39023' AND '39999' OR
A.DEPTID BETWEEN '40000' AND '49999' OR A.DEPTID BETWEEN '50000' AND
'59999' OR A.DEPTID BETWEEN '60000' AND '69999' OR A.DEPTID BETWEEN
'70000' AND '79999' OR A.DEPTID BETWEEN '80000' AND '89999' OR
A.DEPTID='29150' OR A.DEPTID=' ') 
AND A.CHARTFIELD1='0120413' 
AND A.CURRENCY_CD='GBP' 
GROUP BY L4.TREE_NODE_NUM
    
Plan hash value: 1653134809

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                                   | Name               | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                                            |                    |       |       |    27 (100)|          |       |       |        |      |            |
|   1 |  PX COORDINATOR                                             |                    |       |       |            |          |       |       |        |      |            |
|   2 |   PX SEND QC (RANDOM)                                       | :TQ10006           |     1 |    29 |    27  (63)| 00:00:01 |       |       |  Q1,06 | P->S | QC (RAND)  |
|   3 |    HASH GROUP BY                                            |                    |     1 |    29 |    27  (63)| 00:00:01 |       |       |  Q1,06 | PCWP |            |
|   4 |     PX RECEIVE                                              |                    |     1 |    29 |    27  (63)| 00:00:01 |       |       |  Q1,06 | PCWP |            |
|   5 |      PX SEND HASH                                           | :TQ10005           |     1 |    29 |    27  (63)| 00:00:01 |       |       |  Q1,05 | P->P | HASH       |
|   6 |       HASH GROUP BY                                         |                    |     1 |    29 |    27  (63)| 00:00:01 |       |       |  Q1,05 | PCWP |            |
|   7 |        HASH JOIN                                            |                    |     1 |    29 |    27  (63)| 00:00:01 |       |       |  Q1,05 | PCWP |            |
|   8 |         JOIN FILTER CREATE                                  | :BF0000            |     1 |    16 |    25  (68)| 00:00:01 |       |       |  Q1,05 | PCWP |            |
|   9 |          PX RECEIVE                                         |                    |     1 |    16 |    25  (68)| 00:00:01 |       |       |  Q1,05 | PCWP |            |
|  10 |           PX SEND HYBRID HASH                               | :TQ10003           |     1 |    16 |    25  (68)| 00:00:01 |       |       |  Q1,03 | P->P | HYBRID HASH|
|  11 |            STATISTICS COLLECTOR                             |                    |       |       |            |          |       |       |  Q1,03 | PCWC |            |
|  12 |             VIEW                                            | VW_GBC_10          |     1 |    16 |    25  (68)| 00:00:01 |       |       |  Q1,03 | PCWP |            |
|  13 |              HASH GROUP BY                                  |                    |     1 |    67 |    25  (68)| 00:00:01 |       |       |  Q1,03 | PCWP |            |
|  14 |               PX RECEIVE                                    |                    |     1 |    67 |    25  (68)| 00:00:01 |       |       |  Q1,03 | PCWP |            |
|  15 |                PX SEND HASH                                 | :TQ10002           |     1 |    67 |    25  (68)| 00:00:01 |       |       |  Q1,02 | P->P | HASH       |
|  16 |                 HASH GROUP BY                               |                    |     1 |    67 |    25  (68)| 00:00:01 |       |       |  Q1,02 | PCWP |            |
|  17 |                  HASH JOIN                                  |                    |    60 |  4020 |    24  (67)| 00:00:01 |       |       |  Q1,02 | PCWP |            |
|  18 |                   JOIN FILTER CREATE                        | :BF0001            |    60 |  3120 |    22  (73)| 00:00:01 |       |       |  Q1,02 | PCWP |            |
|  19 |                    PX RECEIVE                               |                    |    60 |  3120 |    22  (73)| 00:00:01 |       |       |  Q1,02 | PCWP |            |
|  20 |                     PX SEND HYBRID HASH                     | :TQ10000           |    60 |  3120 |    22  (73)| 00:00:01 |       |       |  Q1,00 | P->P | HYBRID HASH|
|  21 |                      STATISTICS COLLECTOR                   |                    |       |       |            |          |       |       |  Q1,00 | PCWC |            |
|  22 |                       PX BLOCK ITERATOR                     |                    |    60 |  3120 |    22  (73)| 00:00:01 |    29 |    29 |  Q1,00 | PCWC |            |
|  23 |                        MAT_VIEW REWRITE ACCESS INMEMORY FULL| PS_XX_SUM_XXXXX_MV |    60 |  3120 |    22  (73)| 00:00:01 |    29 |    29 |  Q1,00 | PCWP |            |
|  24 |                   PX RECEIVE                                |                    |   306 |  4590 |     2   (0)| 00:00:01 |       |       |  Q1,02 | PCWP |            |
|  25 |                    PX SEND HYBRID HASH                      | :TQ10001           |   306 |  4590 |     2   (0)| 00:00:01 |       |       |  Q1,01 | P->P | HYBRID HASH|
|  26 |                     JOIN FILTER USE                         | :BF0001            |   306 |  4590 |     2   (0)| 00:00:01 |       |       |  Q1,01 | PCWP |            |
|  27 |                      PX BLOCK ITERATOR                      |                    |   306 |  4590 |     2   (0)| 00:00:01 |   416 |   416 |  Q1,01 | PCWC |            |
|  28 |                       TABLE ACCESS STORAGE FULL             | PSTREESELECT10     |   306 |  4590 |     2   (0)| 00:00:01 |   416 |   416 |  Q1,01 | PCWP |            |
|  29 |         PX RECEIVE                                          |                    |   202 |  2626 |     2   (0)| 00:00:01 |       |       |  Q1,05 | PCWP |            |
|  30 |          PX SEND HYBRID HASH                                | :TQ10004           |   202 |  2626 |     2   (0)| 00:00:01 |       |       |  Q1,04 | P->P | HYBRID HASH|
|  31 |           JOIN FILTER USE                                   | :BF0000            |   202 |  2626 |     2   (0)| 00:00:01 |       |       |  Q1,04 | PCWP |            |
|  32 |            PX BLOCK ITERATOR                                |                    |   202 |  2626 |     2   (0)| 00:00:01 |   415 |   415 |  Q1,04 | PCWC |            |
|  33 |             TABLE ACCESS STORAGE FULL                       | PSTREESELECT10     |   202 |  2626 |     2   (0)| 00:00:01 |   415 |   415 |  Q1,04 | PCWP |            |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$6240F0FF
  12 - SEL$B80655F7 / VW_GBC_10@SEL$9C8D6CC0
  13 - SEL$B80655F7
  23 - SEL$B80655F7 / PS_XX_SUM_XXXXX_MV@SEL$CAD4EEF6
  28 - SEL$B80655F7 / L2@SEL$1
  33 - SEL$6240F0FF / L4@SEL$1
… 
In my example, ASH sampled 276 different SQL IDs.  Each one was executed only once.  There may have been more statements that were not captured, because ASH samples only once per second, and only one sample in ten is persisted to the AWR repository. Cumulatively, they consumed 2843 seconds of DB time in SQL hard parse.
                                                                                                                                              Plan
                                                                                              SQL Plan        Force Matching    SQL   Plan   Parse
     # OPRID        RUNCNTLID              ACTION                           SQL_ID          Hash Value             Signature    IDs  Execs    Secs Table Name
    -- ------------ ---------------------- -------------------------------- ------------- ------------ --------------------- ------ ------ ------- ------------------ 
     1 NVISION      NVS_RPTBOOK_99         PI=9984520:UKGL999I:12345        01g5hvs91k4hn   1653134809   1995330195085985689    276    276    2843 PS_XX_SUM_XXXXX_MV 
    … 
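The figures in the table above can be estimated from the ASH samples persisted to AWR.  This is only a sketch (it assumes the Diagnostics Pack licence, and the MODULE and date predicates are illustrative); each persisted sample represents roughly 10 seconds of DB time.
select count(distinct h.sql_id) sql_ids
,      count(*)*10 parse_secs
from   dba_hist_active_sess_history h
where  h.in_hard_parse = 'Y'
and    h.module = 'RPTBOOK'
and    h.sample_time >= sysdate - 1
/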
    
This is one of at least 276 different SQL statements that all have the same force-matching signature. The statements are essentially the same but differ in some of their literal values. That means the database has to treat each one as a distinct SQL statement that must be fully parsed separately. 
SQL parse involves checking that the statement is syntactically correct and that the user has permission to access the objects. Then, during the SQL optimization stage, the optimizer decides how to execute the statement, before moving on to row source generation. 
If the statement has been parsed previously and is still in the shared pool, Oracle can skip the optimization and row source generation stages. This is often called a soft parse. 
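Statements that differ only in their literal values can be identified in the shared pool because they share a force-matching signature.  Something along these lines (a sketch, not the exact query behind the report above) will find them:
select force_matching_signature
,      count(distinct sql_id) sql_ids
,      sum(parse_calls) parse_calls
from   v$sql
where  force_matching_signature > 0
group by force_matching_signature
having count(distinct sql_id) > 10
order by sql_ids desc
/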

    SQL Optimization 

During the optimization stage, the optimizer calculates the 'cost' of different possible execution plans. Depending upon the SQL, it considers different table join orders, different join methods, and different SQL transformations. The cost is an estimate of the time it will take to execute a particular plan, in units roughly equivalent to the duration of a single-block read. Candidate plans are abandoned as soon as they become more expensive than the cheapest plan found so far; thus the 'cost-based' optimizer produces the cheapest plan it can find. However, the process of optimization itself consumes time and CPU. 
If I write SQL that is executed many times with bind variables rather than literals, then I avoid some hard parses and the associated CPU consumption. Oracle has always recommended using bind variables rather than literals, both to improve performance and to protect against SQL injection. However, many applications still use literals, particularly in dynamically generated SQL; every statement has to be hard parsed, and the cumulative CPU consumption can become significant. PeopleSoft is one such application that does this in some areas of the product, but it is by no means an isolated example. 
Oracle produced a feature called Cursor Sharing, in which literals in statements are automatically converted to bind variables. It can be very effective at reducing SQL parse, but it can sometimes also produce undesirable side effects, because the execution plan may not change as the bind variable values change. 
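For example (a sketch to illustrate the behaviour; the statements are simplified versions of the nVision SQL shown later), with CURSOR_SHARING=FORCE two statements that differ only in a literal collapse into a single cursor:
alter session set cursor_sharing = FORCE;
select sum(posted_total_amt) from ps_xx_sum_xxxxx_vw where fiscal_year = 2023;
select sum(posted_total_amt) from ps_xx_sum_xxxxx_vw where fiscal_year = 2022;
-- both appear in v$sql as a single statement with a system-generated bind:
-- select sum(posted_total_amt) from ps_xx_sum_xxxxx_vw where fiscal_year = :"SYS_B_0"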

    Hints 

Hints are directives to the optimizer. They tell it to do something or, more generally, not to do something else. If I add optimizer hints to a statement that produce the same, or a similar, execution plan, then the optimizer should do less work, consume less CPU, and take less time to come to the same or similar conclusion. 
For example, if I add a LEADING hint to force the optimizer to start with a particular object, that reduces the number of join orders to be considered (in general, an n-table query has n! possible join orders; fixing the first table reduces that to (n-1)!). 
• A two-table query has 2 possible join orders; a LEADING hint will reduce it to 1. 
• A three-table query has 6 possible join orders; a LEADING hint on a single table will reduce it to 2. 
Often, it is not possible to add hints directly to the application code because it is all dynamically generated inside a package, or it may not be desirable to alter third-party code. In my example, the SQL was generated by compiled code within the nVision reporting tool that I cannot alter. I can't use a SQL Patch because I would need a patch for every SQL_ID, and I can't predict the SQL_IDs. Instead, I can create a force-matching SQL Profile that will match every statement with the same force-matching signature. 
    N.B. SQL Profiles require the SQL Tuning pack licence. 

    Example SQL Profile 

I don't have to use the full outline of hints from the execution plan; I have chosen to apply just a few: 
• LEADING(L2): I want the query to start with the dimension table PSTREESELECT10. This will result in a change to the execution plan. 
• REWRITE: PS_XX_SUM_XXXXX_MV is a materialized view built on the view PS_XX_SUM_XXXXX_VW of an underlying summary ledger. Rewriting the SQL to use the materialized view is a cost-based decision. Oracle usually decides to rewrite it to use the materialized view, but this hint ensures that it always happens. 
• NO_PARALLEL: This query selects only a single accounting period, so it only scans a single partition; therefore, I don't want to invoke parallel query. 
• PX_JOIN_FILTER(PS_XX_SUM_XXXXX_MV@SEL$CAD4EEF6): The dimension table is equi-joined to the fact table, so it is a good candidate for using a Bloom filter on the scan of the fact table. This doesn't always happen naturally with this statement, so I have had to use the query block name taken from the execution plan of the rewritten statement.  The query block name is stable; it is a hash value based on the object name and the operation.
    The profile is then created with DBMS_SQLTUNE.IMPORT_SQL_PROFILE.
set serveroutput on
DECLARE
  l_sql_text  CLOB;
  l_signature NUMBER;
  h           SYS.SQLPROF_ATTR;
…
BEGIN
…
  -- the hints to be applied to every matching statement
  h := SYS.SQLPROF_ATTR(
         q'[BEGIN_OUTLINE_DATA]',
         q'[NO_PARALLEL]',
         q'[LEADING(L2)]',
         q'[PX_JOIN_FILTER(PS_XX_SUM_XXXXX_MV@SEL$CAD4EEF6)]',
         q'[REWRITE]',
         q'[END_OUTLINE_DATA]');

  -- calculate the signature of the statement text
  l_signature := DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(l_sql_text);

  -- force_match=>TRUE makes the profile apply to every statement
  -- with the same force-matching signature
  DBMS_SQLTUNE.IMPORT_SQL_PROFILE (
    sql_text    => l_sql_text,
    profile     => h,
    name        => 'NVS_UKGL999I_FUNC_ACEXP1',
    category    => 'DEFAULT',
    validate    => TRUE,
    replace     => TRUE,
    force_match => TRUE);
…
END;
/
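The profile can then be confirmed by querying DBA_SQL_PROFILES (note that FORCE_MATCHING should be YES):
select name, category, status, force_matching
from   dba_sql_profiles
where  name = 'NVS_UKGL999I_FUNC_ACEXP1'
/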
This is the execution plan with the SQL Profile. The note section confirms that the SQL Profile was used, and the hint report shows the hints it applied.  
Note that the SELECTOR_NUM and CHARTFIELD1 predicates have changed: this is a different statement with different literal values, but it has the same force-matching signature, so the profile still applies.
    SELECT L4.TREE_NODE_NUM,SUM(A.POSTED_TOTAL_AMT) 
    FROM PS_XX_SUM_XXXXX_VW A, PSTREESELECT10 L4, PSTREESELECT10 L2 
    WHERE A.LEDGER='X_UKMGT' 
    AND A.FISCAL_YEAR=2023 AND A.ACCOUNTING_PERIOD=1 
    AND L4.SELECTOR_NUM=433 AND A.CHARTFIELD3=L4.RANGE_FROM_10 
    AND L2.SELECTOR_NUM=434 AND A.ACCOUNT=L2.RANGE_FROM_10 
    AND (A.DEPTID BETWEEN '10000' AND '18999' OR
    A.DEPTID BETWEEN '20000' AND '29149' OR A.DEPTID='29156' OR
    A.DEPTID='29158' OR A.DEPTID BETWEEN '29165' AND '29999' OR A.DEPTID
    BETWEEN '30000' AND '39022' OR A.DEPTID BETWEEN '39023' AND '39999' OR
    A.DEPTID BETWEEN '40000' AND '49999' OR A.DEPTID BETWEEN '50000' AND
    '59999' OR A.DEPTID BETWEEN '60000' AND '69999' OR A.DEPTID BETWEEN
    '70000' AND '79999' OR A.DEPTID BETWEEN '80000' AND '89999' OR
    A.DEPTID='29150' OR A.DEPTID=' ') 
    AND A.CHARTFIELD1='0051001' 
    AND A.CURRENCY_CD='GBP' 
    GROUP BY L4.TREE_NODE_NUM
    
    Plan hash value: 3033847137
    
    ---------------------------------------------------------------------------------------------------------------------------------
    | Id  | Operation                                  | Name               | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
    ---------------------------------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                           |                    |       |       |   214 (100)|          |       |       |
    |   1 |  SORT GROUP BY                             |                    |     5 |   400 |   214  (62)| 00:00:01 |       |       |
    |   2 |   HASH JOIN                                |                    |  2347 |   183K|   213  (62)| 00:00:01 |       |       |
    |   3 |    HASH JOIN                               |                    |  2347 |   153K|   210  (63)| 00:00:01 |       |       |
    |   4 |     JOIN FILTER CREATE                     | :BF0000            |   306 |  4590 |     3   (0)| 00:00:01 |       |       |
    |   5 |      PARTITION RANGE SINGLE                |                    |   306 |  4590 |     3   (0)| 00:00:01 |   434 |   434 |
    |   6 |       TABLE ACCESS STORAGE FULL            | PSTREESELECT10     |   306 |  4590 |     3   (0)| 00:00:01 |   434 |   434 |
    |   7 |     JOIN FILTER USE                        | :BF0000            | 26468 |  1344K|   206  (64)| 00:00:01 |       |       |
    |   8 |      PARTITION RANGE SINGLE                |                    | 26468 |  1344K|   206  (64)| 00:00:01 |    42 |    42 |
    |   9 |       MAT_VIEW REWRITE ACCESS INMEMORY FULL| PS_XX_SUM_XXXXX_MV | 26468 |  1344K|   206  (64)| 00:00:01 |    42 |    42 |
    |  10 |    PARTITION RANGE SINGLE                  |                    |   202 |  2626 |     3   (0)| 00:00:01 |   433 |   433 |
    |  11 |     TABLE ACCESS STORAGE FULL              | PSTREESELECT10     |   202 |  2626 |     3   (0)| 00:00:01 |   433 |   433 |
    ---------------------------------------------------------------------------------------------------------------------------------
    
    Query Block Name / Object Alias (identified by operation id):
    -------------------------------------------------------------
    
       1 - SEL$38F8C49D
       6 - SEL$38F8C49D / L2@SEL$1
       9 - SEL$38F8C49D / PS_XX_SUM_XXXXX_MV@SEL$CAD4EEF6
      11 - SEL$38F8C49D / L4@SEL$1
    
    Hint Report (identified by operation id / Query Block Name / Object Alias):
    Total hints for statement: 4 (U - Unused (1))
    ---------------------------------------------------------------------------
    
       0 -  STATEMENT
             U -  NO_PARALLEL
    …
       1 -  SEL$38F8C49D
               -  LEADING(L2)
               -  REWRITE
    
       9 -  SEL$38F8C49D / PS_XX_SUM_XXXXX_MV@SEL$CAD4EEF6
               -  PX_JOIN_FILTER(PS_XX_SUM_XXXXX_MV@SEL$CAD4EEF6)
    
    Note
    -----
    …
       - SQL profile "NVS_UKGL999I_FUNC_ACEXP1" used for this statement
    
    • The new execution plan does indeed start with the dimension table 
    • The query was rewritten to use the materialized view 
    • A Bloom filter was used on the materialized view that is now the fact table
    • The NO_PARALLEL hint wasn't used because Oracle chose not to parallelise this statement anyway.
    
                                                                                     Plan
                                                                                              SQL Plan        Force Matching    SQL   Plan   Parse
     # OPRID        RUNCNTLID              ACTION                           SQL_ID          Hash Value             Signature    IDs  Execs    Secs Table Name
    -- ------------ ---------------------- -------------------------------- ------------- ------------ --------------------- ------ ------ ------- ------------------ 
    …
     1 NVISION      NVS_RPTBOOK_99         PI=9984933:UKGL278I:12345        03nwc4yy1r1r7   3033847137   1995330195085985689    138    138    1428 PS_XX_SUM_XXXXX_MV
    
Now just 1428s is spent on parse. We only found 138 SQL IDs, but that is simply because less time is being spent parsing, so there are fewer ASH samples. 
    In this case, adding these hints with a SQL Profile has halved the time spent parsing this set of SQL statements.

    Tuesday, April 11, 2023

    Reading Trace files with SQL

Oracle 12.2 introduced some new views that enable trace files to be read via SQL. Previously, it had been possible to do this by creating external tables, but the new views make it much easier. You can query which trace files exist, and then read their contents, without needing any access to the database server. 

This is particularly useful on cloud platforms such as Autonomous Database, where there is no server access, even for the DBA. However, the technique is applicable to all Oracle databases.  Now not just the DBA, but also developers, can easily obtain trace files.
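Of course, there must first be a trace file to read. A session can trace itself and tag its trace file with an identifier; this is just a minimal sketch (the identifier is illustrative, and in the post referenced below I enabled trace on a separate application server process):
alter session set tracefile_identifier = 'MY_TEST';
exec dbms_session.session_trace_enable(waits=>TRUE, binds=>FALSE)
-- run the SQL of interest, then:
exec dbms_session.session_trace_disable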

Lots of other people have blogged about this, but Chris Antognini makes the point extremely well.
In a post on my PeopleSoft blog, I demonstrated enabling trace on an application server process, and specified a trace file identifier. Now I can query the trace files that exist, restricting the query by filename or date.
    set pages 99
    select * from gv$diag_trace_file f
    where 1=1
    and f.modify_time > trunc(sysdate)-1
    and f.trace_filename like 'finprod%ora%.trc'
    order by modify_time desc
    /
    
       INST_ID ADR_HOME                                                     TRACE_FILENAME                 CHANGE_TIME                          MODIFY_TIME                              CON_ID
    ---------- ------------------------------------------------------------ ------------------------------ ------------------------------------ ------------------------------------ ----------
             1 /u02/app/oracle/diag/rdbms/finprod/finprod1                  finprod1_ora_306641.trc        23/03/2023 21.25.41.000000000 -05:00 23/03/2023 21.25.41.000000000 -05:00          0
    
    Then I can also query the trace file contents, and even just spool it to a local file.
    clear screen 
    set head off pages 0 feedback off
    with x as (
    select /*+LEADING(F)*/ f.trace_filename, c.line_number, c.payload
    --, max(c.line_number) over (partition by c.trace_filename) max_line_number
from gv$diag_trace_file f, gv$diag_trace_file_contents c
    where c.adr_home = f.adr_home
    and c.trace_filename = f.trace_filename
    and f.modify_time > trunc(sysdate)-1
    and f.trace_filename like 'finprod%ora%306641.trc'
    )
    select payload from x
    ORDER BY line_number
    /
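To keep a local copy of the trace file, I can wrap a similar query in SQL*Plus spool commands (a sketch; the file name is illustrative):
set head off pages 0 feedback off trimspool on linesize 32767
spool finprod1_ora_306641_copy.trc
select c.payload
from   gv$diag_trace_file_contents c
where  c.trace_filename = 'finprod1_ora_306641.trc'
order by c.line_number
/
spool off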
    
The contents of the spool file look just like the trace file.  I can profile it with tkprof or another trace profiler.
    Trace file /u02/app/oracle/diag/rdbms/finprod/finprod1/trace/finprod1_ora_306641.trc
    Oracle Database 19c EE Extreme Perf Release 19.0.0.0.0 - Production
    Version 19.16.0.0.0
    Build label:    RDBMS_19.16.0.0.0DBRU_LINUX.X64_220701
    ORACLE_HOME:    /u02/app/oracle/product/19.0.0.0/dbhome_1
    System name:      Linux 
    Node name:  naukp-aora101
    Release:    4.14.35-2047.514.5.1.2.el7uek.x86_64 
    Version:    #2 SMP Thu Jul 28 15:33:31 PDT 2022 
    Machine:    x86_64 
    Storage:    Exadata 
    Instance name: finprod1
    Redo thread mounted by this instance: 1
    Oracle process number: 225
    Unix process pid: 306641, image: oracle@xxxxp-aora102
    
    *** 2023-03-23T21:46:34.632063-04:00
    *** SESSION ID:(2337.13457) 2023-03-23T21:46:34.632080-04:00
    *** CLIENT ID:(NVRUNCNTL) 2023-03-23T21:46:34.632086-04:00
    *** SERVICE NAME:(finprod.acme.com) 2023-03-23T21:46:34.632161-04:00
    *** MODULE NAME:(RPTBOOK) 2023-03-23T21:46:34.632166-04:00
    *** ACTION NAME:(PI=9980346:NVGL0042:42001) 2023-03-23T21:46:34.632171-04:00
    *** CLIENT DRIVER:() 2023-03-23T21:46:34.632177-04:00
    
    IPCLW:[0.0]{-}[RDMA]:RC: [1679622394631549]Connection 0x7f83ee131550 not formed (2). Returning retry.
    IPCLW:[0.1]{E}[RDMA]:PUB: [1679622394631549]RDMA lport 0x400012c62778 dst 100.107.2.7:40056 bid 0x1805ea7b58 rval 2
    

    Monday, March 06, 2023

    "In the Cloud, Performance is Instrumented as Cost"

About 5 years ago, I was at a conference where someone put this statement up on a PowerPoint slide.  (I would like to be able to credit the author correctly, but I can't remember who it was.)  We all looked at it, thought about it, and said 'yes, of course' to ourselves.  However, as a consultant who specialises in performance optimisation, it is only recently that I have started to have conversations with clients that reflect that idea.

    In the good old/bad old days of 'on premises'

It is not that long ago that the only option for procuring new hardware was to go through a sizing exercise: guessing how much capacity you needed, allowing for future growth in data and processing volumes, deciding how much you were actually willing to pay, purchasing the hardware, and finally wheeling it into your data centre and hoping for the best.

    It was then normal to want to get the best possible performance out of whatever system was installed on that hardware.  It would inevitably slow down over time.  Eventually, after the hardware purchase had been fully depreciated, you would have to start the whole cycle again and replace the hardware with newer hardware.

The same applied to Oracle licensing.  You would have to license Oracle for all your CPUs (there are a few exceptions where you can associate specific CPUs with specific VMs and only license Oracle for the CPUs in those VMs).  You would also have to decide which Oracle features you licensed.  Standard or Enterprise Edition?  Diagnostics? Tuning? RAC? Partitioning? Compression? In-Memory?

    "You are gonna need a bigger boat"

Then, when you encountered performance problems, you did the best you could with what you had. As a consultant, there was rarely any point in telling a customer that they had run out of resources and needed more.  The answer was usually along the lines of 'we have spent our money on that, it has to last for five years, we have no additional budget, and it has to work'. So you got on with pulling the rabbit out of the hat.

    In the cloud, instead of purchasing hardware as a capital expense, you rent hardware as an operational expense.

You can bring your own Oracle licence (BYOL), and then you have exactly what you were previously licensed for.  "At a high level, one Oracle Processor License maps to two OCPUs."

With Oracle's cloud licensing there are still lots of choices to make, not just how many CPUs and how much memory.  You can choose Infrastructure as a Service (IaaS), where you rent the server and install and license Oracle on it just as you did on-premises.  You can choose different storage systems with different I/O profiles.  There are different levels of PaaS with different database features, all the way up to Extreme Performance on Exadata.  All of these choices have a cost consequence.  Oracle provides a cloud cost estimator tool (other consultancies have produced their own versions).  These tools make the link between the choices and their costs very clear.

    You can have as much performance as you are willing to pay for

I have been working with a customer who is moving a PeopleSoft system from Supercluster on-premises to Exadata Cloud-at-Customer (so it is physically on-site, but in all other respects it is in the cloud).  They are not bringing their own licence (BYOL); instead, they are on a tariff of US$1.3441/OCPU/hr.  We have found it easier to talk about this as roughly US$1,000/OCPU/month (US$1.3441 × 24 hours × 31 days ≈ US$1,000).

Just as you would with an on-premises system, they went through a sizing exercise that predicted they needed 6 OCPUs on each of 2 RAC nodes during the day, and 10 at night. 

    It has been very helpful to have a clear quantitative definition of acceptable performance for the critical part of the system, the overnight reporting batch.  "The reports need to be available to users by the start of the working day in continental Europe, at 8am CET", which is 2am EST.  There is no benefit in providing additional resources to allow the batch to finish any earlier.  Instead, we only need to provide as much as is necessary to reliably meet the target.

A performance tuning and testing exercise quickly showed that fewer CPUs were actually needed than predicted: 2-4 OCPUs/node during the day is looking comfortable.  The new Exadata has fewer, but much faster, CPUs, and as we adjusted the application configuration to match, we found we were able to reduce the number of OCPUs. 

If we hadn't already been using the base-level In-Memory feature on Supercluster, then to complete the overnight batch in time for the start of the European working day we would probably have needed 10 OCPUs/node.  The base-level In-Memory option brought that down to around 7.  This shows the huge value of careful use of database features and techniques to reduce CPU overhead.

We are not using BYOL, so we can use the fully featured In-Memory option with a larger store.  Increasing the In-Memory store from 16GB to 40GB per node saved another OCPU, but cost nothing extra.  Had we been using BYOL, we would have had to pay additionally for the fully featured In-Memory option, and I doubt the marginal benefit would have justified the cost.
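Resizing the store itself is a single parameter change (a sketch; INMEMORY_SIZE is a static parameter, so each instance must be restarted for it to take effect):
alter system set inmemory_size = 40G scope=spfile sid='*';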

The customer has been considering switching on the extra OCPUs overnight to facilitate the batch.  Doing so costs US$1.3441 per OCPU per hour, and at the end of the month they get an invoice from Oracle.  That has concentrated minds and changed behaviours.  The customer understands that there is a real dollar cost or saving attached to their business decisions.

    One day I was asked: "What happens if we reduce the number of CPUs from 6 to 4?"

Essentially, the batch will take longer.  We are already using the database resource manager to prioritise processes when all of the CPU is in use.  The resource manager plan has been built to reflect the business priorities, and so keeps things fair for all users.  For example, it ensures that users of the online part of the application get CPU in preference to batch processes; this is important for users in Asia, who are online when the batch runs overnight in North America.  We also use the resource plan to impose different parallel query limits on different groups of processes.  If we are going to vary the number of CPUs, we will have to switch between different resource manager plans with different limits.  We will also have to reduce the number of reports that the application can execute concurrently, so some application configuration has to go hand in hand with the database configuration.
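As a minimal sketch of what that might look like (all names and values here are illustrative, not the customer's actual configuration), a daytime plan could throttle batch work with a low parallel query limit, and the plan could be switched as the OCPU count changes:
BEGIN
  dbms_resource_manager.create_pending_area();
  dbms_resource_manager.create_consumer_group('BATCH_GROUP','Overnight reporting processes');
  dbms_resource_manager.create_plan('DAY_PLAN','Online users take priority during the day');
  dbms_resource_manager.create_plan_directive(
    plan => 'DAY_PLAN', group_or_subplan => 'BATCH_GROUP',
    comment => 'throttle batch while online users are active',
    mgmt_p1 => 20, parallel_degree_limit_p1 => 2);
  dbms_resource_manager.create_plan_directive(
    plan => 'DAY_PLAN', group_or_subplan => 'OTHER_GROUPS',
    comment => 'everything else, including online users',
    mgmt_p1 => 80);
  dbms_resource_manager.submit_pending_area();
END;
/
-- switch plans when the OCPU count changes
ALTER SYSTEM SET resource_manager_plan = 'DAY_PLAN';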

Effective caching by the database meant that we already did relatively little physical I/O during the reporting; most of the time was already spent on CPU.  Use of In-Memory further reduced physical I/O, so that now nearly all the time is spent on CPU, but it also reduced the overall CPU consumption and therefore the response time.

When we did vary the number of CPUs, we were not surprised to observe, from the Active Session History (ASH), that the total amount of database time spent on CPU by the nVision reporting processes is roughly constant (indicated by the blue area in the charts below).  If we reduce the number of concurrent processes, the batch simply runs for longer.


There is no question that effective design and tuning are as important as they ever were.  The laws of physics are the same in the cloud as they are in your own data centre.  We worked hard to get the reporting to this level of performance and down to this level of CPU usage. 
The difference is that now you can measure exactly how much that effort is saving you on your cloud subscription, and you can choose to spend more or less on that subscription in order to achieve your business objectives.

    Determining the benefit to the business, in terms of the quantity and cost of users' time, remains as difficult as ever.  However, it was not a major consideration in this example because this all happens before the users are at work.