tag:blogger.com,1999:blog-146540182024-03-16T01:10:14.625+00:00The /*+Go-Faster*/ Oracle BlogThe PeopleSoft stuff is at <a href="http://blog.psftdba.com">blog.psftdba.com</a>.<br>This is the non-PeopleSoft Oracle Database blog. Mostly about performance.David Kurtzhttp://www.blogger.com/profile/00468908370233805717noreply@blogger.comBlogger95125tag:blogger.com,1999:blog-14654018.post-48608197587892392992024-02-22T09:44:00.002+00:002024-02-22T09:46:15.822+00:00Table Clusters: 6. Testing the Cluster & Conclusion (TL;DR)<p><i>This post is the last part of a <a href="https://blog.go-faster.co.uk/2024/01/table-clusters1.html">series</a> that discusses table clustering in Oracle.</i></p><ol>
<li><a href="https://blog.go-faster.co.uk/2024/01/table-clusters1.html">Introduction and Ancient History</a></li>
<li><a href="https://blog.go-faster.co.uk/2023/12/table-clusters2.html">Cluster & Cluster Key Design Considerations</a></li>
<li><a href="https://blog.go-faster.co.uk/2024/02/tablecluster3.html">Populating the Cluster with DBMS_PARALLEL_EXECUTE</a></li>
<li><a href="https://blog.go-faster.co.uk/2024/02/table-clusters.html">Checking the Cluster Key</a></li>
<li><a href="https://blog.go-faster.co.uk/2024/02/table-custers5.html">Using the Cluster Key Index instead of the Primary/Unique Key Index</a></li>
<li><a href="https://blog.go-faster.co.uk/2024/02/table-clusers6.html">Testing the Cluster & Conclusion (TL;DR)</a></li>
</ol><h3 style="text-align: left;">Testing</h3><p style="text-align: left;">We did get improved performance with the clustered tables. More significantly, we encountered less inter-process contention, so we were able to run more concurrent processes, and the overall elapsed time across all the processes was reduced.</p><p style="text-align: left;">Looking at just the performance of the bulk delete statements on the result tables, there is a significant reduction in DB time and physical I/O time on the clustered tables. The reduction in physical I/O is not only because the table is smaller: because there is no need to perform consistent read recovery on the blocks, there are fewer reads from the undo segments, and less CPU is consumed creating consistent read copies in the buffer cache.</p>
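<p style="text-align: left;">The reduction in consistent-read work can be confirmed from the session statistics. This is only a sketch (not necessarily how the figures below were collected): snapshot these V$MYSTAT values before and after each test, and compare the deltas between the heap and clustered runs. The statistic names are standard V$STATNAME entries.</p><pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>SELECT sn.name, st.value
FROM   v$statname sn
JOIN   v$mystat   st ON st.statistic# = sn.statistic#
WHERE  sn.name IN ('consistent gets'
                  ,'consistent changes'
                  ,'data blocks consistent reads - undo records applied'
                  ,'CPU used by this session')
ORDER BY sn.name;</code></span></pre>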
<table border="1" cellpadding="0" cellspacing="0" style="border-collapse: collapse; border: none;">
<tbody><tr><td colspan="2" rowspan="2" style="border: 1pt solid windowtext; padding: 0cm 2pt;" valign="middle">Statement</td>
<td style="border-bottom: 1pt solid windowtext; border-image: initial; border-left: none; border-right: 1pt solid windowtext; border-top: 1pt solid windowtext; border: 1pt solid windowtext; padding: 0cm 2pt;">Heap Table</td>
<td style="border-bottom: 1pt solid windowtext; border-image: initial; border-left: none; border-right: 1pt solid windowtext; border-top: 1pt solid windowtext; border: 1pt solid windowtext; padding: 0cm 2pt;">Clustered Table</td></tr>
<tr><td colspan="2" style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;">DELETE FROM PS_GP_RSLT_ACUM…</td></tr>
<tr><td colspan="2" style="border-bottom: 1pt solid windowtext; border-image: initial; border-left: 1pt solid windowtext; border-right: 1pt solid windowtext; border-top: none; border: 1pt solid windowtext; padding: 0cm 2pt;">DB Time (s)</td>
<td style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;"><p style="text-align: right;">2182</p></td>
<td style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;"><p style="text-align: right;">1662</p></td></tr>
<tr><td rowspan="2" style="border-bottom: 1pt solid windowtext; border-image: initial; border-left: 1pt solid windowtext; border-right: 1pt solid windowtext; border-top: none; border: 1pt solid windowtext; padding: 0cm 2pt;">delete statement only</td>
<td style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;">db file sequential read</td>
<td style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;"><p style="text-align: right;">1451</p></td>
<td style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;"><p style="text-align: right;">891</p></td></tr>
<tr><td style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;">CPU</td>
<td style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;"><p style="text-align: right;">941</p></td>
<td style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;"><p style="text-align: right;">531</p></td></tr>
</tbody></table><br /><table border="1" cellpadding="0" cellspacing="0" style="border-collapse: collapse; border: none;">
<tbody><tr>
<td colspan="2" rowspan="2" style="border: 1pt solid windowtext; padding: 0cm 2pt;" valign="middle">Statement</td>
<td style="border-bottom: 1pt solid windowtext; border-image: initial; border-left: none; border-right: 1pt solid windowtext; border-top: 1pt solid windowtext; border: 1pt solid windowtext; padding: 0cm 2pt;">Heap Table</td>
<td style="border-bottom: 1pt solid windowtext; border-image: initial; border-left: none; border-right: 1pt solid windowtext; border-top: 1pt solid windowtext; border: 1pt solid windowtext; padding: 0cm 2pt;">Clustered Table</td> </tr>
<tr><td colspan="2" style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;">DELETE FROM PS_GP_RSLT_ABS…</td></tr>
<tr><td colspan="2" style="border-bottom: 1pt solid windowtext; border-image: initial; border-left: 1pt solid windowtext; border-right: 1pt solid windowtext; border-top: none; border: 1pt solid windowtext; padding: 0cm 2pt;">DB Time (s)</td>
<td style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;"><p style="text-align: right;">781</p></td>
<td style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;"><p style="text-align: right;">330</p></td></tr>
<tr><td rowspan="2" style="border-bottom: 1pt solid windowtext; border-image: initial; border-left: 1pt solid windowtext; border-right: 1pt solid windowtext; border-top: none; border: 1pt solid windowtext; padding: 0cm 2pt;">delete statement only</td>
<td style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;">db file sequential read</td>
<td style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;"><p style="text-align: right;">340</p></td>
<td style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;"><p style="text-align: right;">210</p></td></tr>
<tr><td style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;">CPU</td>
<td style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;"><p style="text-align: right;">300</p></td> <td style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;"><p style="text-align: right;">120</p></td></tr></tbody></table>
<div>GP_RSLT_PIN is another, albeit smaller, result table and a candidate for clustering; however, it was not clustered for this test and therefore did not show any significant improvement. It was subsequently clustered.</div>
<div><table border="1" cellpadding="0" cellspacing="0" style="border-collapse: collapse; border: none;">
<tbody><tr><td colspan="2" rowspan="2" style="border: 1pt solid windowtext; padding: 0cm 2pt;" valign="middle"><p style="margin-bottom: 0cm;">Statement</p></td>
<td style="border-bottom: 1pt solid windowtext; border-image: initial; border-left: none; border-right: 1pt solid windowtext; border-top: 1pt solid windowtext; border: 1pt solid windowtext; padding: 0cm 2pt;">Heap Table</td>
<td style="border-bottom: 1pt solid windowtext; border-image: initial; border-left: none; border-right: 1pt solid windowtext; border-top: 1pt solid windowtext; border: 1pt solid windowtext; padding: 0cm 2pt;">Heap in Cluster Test</td> </tr>
<tr><td colspan="2" style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;">DELETE FROM PS_GP_RSLT_PIN…</td></tr>
<tr><td colspan="2" style="border-bottom: 1pt solid windowtext; border-image: initial; border-left: 1pt solid windowtext; border-right: 1pt solid windowtext; border-top: none; border: 1pt solid windowtext; padding: 0cm 2pt;">DB Time (s)</td>
<td style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;"><p style="text-align: right;">270</p></td>
<td style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;"><p style="text-align: right;">250</p></td> </tr>
<tr><td rowspan="2" style="border-bottom: 1pt solid windowtext; border-image: initial; border-left: 1pt solid windowtext; border-right: 1pt solid windowtext; border-top: none; border: 1pt solid windowtext; padding: 0cm 2pt;">delete statement only</td>
<td style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;">db file sequential read</td>
<td style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;"><p style="text-align: right;">110</p></td>
<td style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;"><p style="text-align: right;">120</p></td></tr>
<tr><td style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;">CPU</td>
<td style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;"><p style="text-align: right;">110</p></td>
<td style="border-bottom: 1pt solid windowtext; border-left: none; border-right: 1pt solid windowtext; border-top: none; padding: 0cm 2pt;">
<p style="text-align: right;">90</p></td> </tr>
</tbody></table>
<div><div style="text-align: left;">The execution plans for some queries on clustered tables changed to use the cluster key index, which resulted in poorer performance. I had to introduce some SQL profiles to reinstate the original execution plans. </div><div style="text-align: left;">However, the execution plans for these delete statements also switched to the cluster key index, resulting in improved performance. So it depends.</div><h3 style="text-align: left;">Conclusion (TL;DR)</h3><div>Table partitioning can help you find data efficiently by allowing the database to eliminate partitions that cannot contain the data. However, you must be running Enterprise Edition and license the partitioning option.</div><div>Table clustering is effective when you regularly query data from multiple tables with similar keys, so that rows with the same key can be stored in the same data blocks, saving the overhead of retrieving multiple blocks. It is available on any Oracle database and does not require any additional licence.</div><div>Both partitioning and clustering can help avoid the overhead of read consistency by storing dissimilar data in different blocks.</div></div></div><div>Sometimes, using the cluster key index can result in worse performance than using the original indexes. A SQL profile or SQL baseline may be needed to stabilise some execution plans.</div><div class="blogger-post-footer"><a href="http://www.go-faster.co.uk/">©David Kurtz</a></div>David Kurtzhttp://www.blogger.com/profile/08139761793598085235noreply@blogger.com0tag:blogger.com,1999:blog-14654018.post-81870437848532667902024-02-19T15:22:00.006+00:002024-02-22T09:46:36.784+00:00Table Clusters: 5. Using the Cluster Key Index instead of the Primary/Unique Key Index<p><i>This post is part of a <a href="https://blog.go-faster.co.uk/2024/01/table-clusters1.html">series</a> that discusses table clustering in Oracle.</i></p><ol>
<li><a href="https://blog.go-faster.co.uk/2024/01/table-clusters1.html">Introduction and Ancient History</a></li>
<li><a href="https://blog.go-faster.co.uk/2023/12/table-clusters2.html">Cluster & Cluster Key Design Considerations</a></li>
<li><a href="https://blog.go-faster.co.uk/2024/02/tablecluster3.html">Populating the Cluster with DBMS_PARALLEL_EXECUTE</a></li>
<li><a href="https://blog.go-faster.co.uk/2024/02/table-clusters.html">Checking the Cluster Key</a></li>
<li><a href="https://blog.go-faster.co.uk/2024/02/table-custers5.html">Using the Cluster Key Index instead of the Primary/Unique Key Index</a></li>
<li><a href="https://blog.go-faster.co.uk/2024/02/table-clusers6.html">Testing the Cluster & Conclusion (TL;DR)</a></li>
</ol><p style="text-align: left;">In my test case, the cluster key index is made up of the first 7 columns of the unique key index. One side-effect of this similarity of the keys is that the optimizer may choose to use the cluster key index where previously it used the unique index. </p><p style="text-align: left;">The cluster key index is a unique index. It contains only one entry for each distinct cluster key value that points to the first block that contains rows with those cluster key values. As we saw in the previous post, there are many rows in the table for each distinct cluster key. Therefore, the cluster key index is much smaller than the unique index on any table in the cluster. This contributes to making it appear cheaper to access.</p><div style="text-align: left;">The clustering factor is fundamental to determining the cost of using an index. It is a measure of how many I/Os the database would perform if it were to read every row in that table via the index in index order. Notwithstanding that blocks may be cached, every time the scan changes to a different data block in the table, that is another I/O. </div><div style="text-align: left;"><br /></div><div style="text-align: left;">In my case, the clustering factor of the cluster key index is also the same value as the number of rows and the number of distinct keys. This is because I have set the cluster size equal to the block size so that each cluster key value points to a different block, and each block only contains rows for a single cluster key value. The clustering factor of the cluster key index is much lower than that of the unique indexes, also making it look cheaper to access.</div>
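<p style="text-align: left;">The index statistics in the listing that follows come from the data dictionary. A query along these lines will retrieve them (a sketch using USER_INDEXES; the exact formatting of the original report may differ):</p><pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>SELECT table_name, index_name, uniqueness, prefix_length
,      leaf_blocks, distinct_keys, num_rows, clustering_factor
FROM   user_indexes
WHERE  index_name IN ('PS_GP_RSLT_CLUSTER_IDX','PS_GP_RSLT_ABS','PS_GP_RSLT_ACUM','PS_GP_RSLT_PIN')
ORDER BY table_name;</code></span></pre>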
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: xx-small;"><code>TABLE_NAME INDEX_NAME UNIQUENES PREFIX_LENGTH LEAF_BLOCKS DISTINCT_KEYS NUM_ROWS CLUSTERING_FACTOR
-------------------- ------------------------ --------- ------------- ----------- ------------- ---------- -----------------
PS_GP_RSLT_CLUSTER PS_GP_RSLT_CLUSTER_IDX UNIQUE 111541 8875383 8875383 8875383
PS_GP_RSLT_ABS PS_GP_RSLT_ABS UNIQUE 8 1271559 152019130 152019130 10806251
PS_GP_RSLT_ACUM PS_GP_RSLT_ACUM UNIQUE 8 8421658 762210387 762210387 101166426
PS_GP_RSLT_PIN PS_GP_RSLT_PIN UNIQUE 9 3894799 327189471 327189471 31774871
</code></span></pre><p style="text-align: left;">
I still need to create the unique indexes on the tables to enforce uniqueness. I have found that the optimizer tends to choose the cluster key index in preference to the unique index. The cost of accessing the cluster key index is lower because it is smaller and has a lower clustering factor. When I increased the length of the cluster key from 3 to 7 columns, I also found that the size and clustering factor of the cluster key index increased, and the clustering factor of the unique indexes decreased, partly because the rows are less disordered with respect to the index key, and partly because the table became smaller, since each cluster key value is only stored once. Although this reduced the cost of accessing the unique indexes, I still find the optimizer tends to choose the cluster key index over the unique index.</p><div><div style="text-align: left;">Sometimes, the switch to the cluster key index is beneficial, but sometimes performance degrades, as in the case of this query.</div><pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>SELECT …
FROM PS_GP_RSLT_ACUM RA ,PS_GP_ACCUMULATOR A ,PS_GP_PYE_HIST_WRK H
WHERE H.EMPLID BETWEEN :1 AND :2 AND H.CAL_RUN_ID=:3
AND H.RUN_CNTL_ID=:4 AND H.OPRID=:5
<b>AND H.EMPLID=RA.EMPLID
AND H.EMPL_RCD=RA.EMPL_RCD
AND H.GP_PAYGROUP=RA.GP_PAYGROUP
AND H.CAL_ID=RA.CAL_ID
AND H.ORIG_CAL_RUN_ID=RA.ORIG_CAL_RUN_ID
AND H.HIST_CAL_RUN_ID=RA.CAL_RUN_ID
AND H.RSLT_SEG_NUM=RA.RSLT_SEG_NUM</b>
AND RA.PIN_NUM=A.PIN_NUM
AND RA.ACM_PRD_OPTN<>'1'
AND(H.CALC_TYPE=A.CALC_TYPE OR H.HIST_TYPE= 'G')
ORDER BY RA.EMPLID,H.PRC_ORD_TS,RA.EMPL_RCD,RA.PIN_NUM</code></span></pre>PS_GP_PYE_HIST_WRK is equi-joined to PS_GP_RSLT_ACUM by all 7 cluster key columns, so the cluster key index can satisfy this join. The plan has switched to using the cluster key index.</div><div><br /><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>Plan hash value: 4007126853
-------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 2369 (100)| |
| 1 | SORT ORDER BY | | 133 | 36841 | 2369 (1)| 00:00:01 |
|* 2 | FILTER | | | | | |
|* 3 | HASH JOIN | | 133 | 36841 | 2368 (1)| 00:00:01 |
| 4 | NESTED LOOPS | | 393 | 103K| 2348 (1)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID BATCHED| PS_GP_PYE_HIST_WRK | 1164 | 156K| 12 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | PS_GP_PYE_HIST_WRK | 1 | | 11 (0)| 00:00:01 |
|* 7 | TABLE ACCESS CLUSTER | PS_GP_RSLT_ACUM | 1 | 132 | 3 (0)| 00:00:01 |
|* 8 | INDEX UNIQUE SCAN | <b>PS_GP_RSLT_CLUSTER_IDX</b> | 1 | | 1 (0)| 00:00:01 |
| 9 | INDEX FAST FULL SCAN | PSBGP_ACCUMULATOR | 9208 | 64456 | 20 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------------</code></span></pre>The profile of the ASH data by plan line ID shows that most of the time is spent on physical I/O on line 7 of the plan, physically scanning the blocks in the cluster for each cluster key.<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code> SQL Plan SQL Plan                                                          H P E      ASH
Hash Value Line ID EVENT P x Secs
----------- ---------------- --------------------------------------------- --- - --- --------
4007126853 7 db file sequential read N N Y 120
4007126853 db file sequential read N N Y 80</code></span></pre>
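<p style="text-align: left;">An ASH profile by plan line, such as the one above, can be produced with a query of this general form. This is a sketch against DBA_HIST_ACTIVE_SESS_HISTORY (which requires the Diagnostics Pack licence); each AWR-retained sample represents approximately 10 seconds of DB time, and the H/P/E flag columns of the original report are omitted here:</p><pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>SELECT h.sql_plan_hash_value
,      h.sql_plan_line_id
,      NVL(h.event,'CPU+CPU Wait') event
,      10*COUNT(*) ash_secs           /*AWR retains one ASH sample every ~10 seconds*/
FROM   dba_hist_active_sess_history h
WHERE  h.sql_id = :sql_id             /*SQL_ID of the statement of interest*/
GROUP BY h.sql_plan_hash_value, h.sql_plan_line_id, NVL(h.event,'CPU+CPU Wait')
ORDER BY ash_secs DESC;</code></span></pre>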
</div>I can force the plan back to using the unique index on PS_GP_RSLT_ACUM with a hint, SQL Profile, SQL Patch, or SQL Plan Baseline, and there is a reduction in database response time.<div>NB: You cannot make a cluster key index invisible.<br /><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>Plan hash value: 1843812660
------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 845 (100)| |
| 1 | SORT ORDER BY | | 1 | 277 | 845 (1)| 00:00:01 |
| * 2 | FILTER | | | | | |
| * 3 | HASH JOIN | | 1 | 277 | 844 (1)| 00:00:01 |
|- 4 | NESTED LOOPS | | 1 | 277 | 844 (1)| 00:00:01 |
|- 5 | STATISTICS COLLECTOR | | | | | |
| 6 | NESTED LOOPS | | 1 | 270 | 843 (1)| 00:00:01 |
| 7 | TABLE ACCESS BY INDEX ROWID | PS_GP_PYE_HIST_WRK | 416 | 57408 | 6 (0)| 00:00:01 |
| * 8 | INDEX RANGE SCAN | PS_GP_PYE_HIST_WRK | 1 | | 5 (0)| 00:00:01 |
| * 9 | TABLE ACCESS BY INDEX ROWID BATCHED| <b>PS_GP_RSLT_ACUM</b> | 1 | 132 | 5 (0)| 00:00:01 |
| * 10 | INDEX RANGE SCAN | <b>PS_GP_RSLT_ACUM</b> | 1 | | 4 (0)| 00:00:01 |
|- * 11 | INDEX RANGE SCAN | PSBGP_ACCUMULATOR | 1 | 7 | 1 (0)| 00:00:01 |
| 12 | INDEX FAST FULL SCAN | PSBGP_ACCUMULATOR | 1 | 7 | 1 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------------
SQL Plan SQL Plan H P E ASH
Hash Value Line ID EVENT P x Secs
----------- ---------------- --------------------------------------------- --- - --- --------
1843812660 10 db file sequential read N N Y 70
1843812660 9 db file sequential read N N Y 60
1843812660 CPU+CPU Wait N N Y 50</code></span></pre><h4 style="text-align: left;">
Table Cached Blocks </h4><p style="text-align: left;">The <i>table_cached_blocks</i> statistics preference specifies the average number of blocks assumed to be cached in the buffer cache when calculating the index clustering factor.
When DBMS_STATS calculates the clustering factor of an index, it does not count visits to table blocks that are assumed to be cached because they were among the last <i>n</i> distinct table blocks visited, where <i>n</i> is the value to which <i>table_cached_blocks</i> is set.</p><p style="text-align: left;">We have already seen that with 7 cluster key columns, no more than 7 blocks are required to hold any one cluster key. If I set <i>table_cached_blocks</i> to at least 7, then when Oracle scans the table blocks in unique key order (which matches the cluster key order for the first 7 columns), it does not count additional visits to blocks for the same cluster key.
Thus we see a reduction in the clustering factor on the unique index. There is no advantage to a higher value of this setting.
We do not see a significant reduction in the clustering factor on other indexes with different leading columns. </p><div style="text-align: left;"><b>TCB=1
</b><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: xx-small;"><code>TABLE_NAME INDEX_NAME PREFIX_LENGTH LEAF_BLOCKS NUM_ROWS CLUSTERING_FACTOR DEGREE LAST_ANALYZED
-------------------- ------------------------ ------------- ----------- ---------- ----------------- ---------- -----------------
PS_GP_RSLT_ABS PS_GP_RSLT_ABS 8 1271559 152019130 <b>10806251</b> 1 12-01-24 15:33:02
PS_GP_RSLT_ACUM PS_GP_RSLT_ACUM 8 8421658 762210387 <b>101166426 </b>1 12-01-24 15:37:55
PS_GP_RSLT_PIN PS_GP_RSLT_PIN 9 3894799 327189471 <b>31774872 </b>1 12-01-24 15:39:00
</code></span></pre><b>
TCB=8
</b><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: xx-small;"><code> TABLE_NAME INDEX_NAME PREFIX_LENGTH LEAF_BLOCKS NUM_ROWS CLUSTERING_FACTOR DEGREE LAST_ANALYZED
-------------------- ------------------------ ------------- ----------- ---------- ----------------- ---------- -----------------
PS_GP_RSLT_ABS PS_GP_RSLT_ABS 8 1271559 152019130 <b>8217000 </b>1 12-01-24 15:05:42
PS_GP_RSLT_ACUM PS_GP_RSLT_ACUM 8 8421658 762210387 <b>16658798 </b>1 12-01-24 15:10:40
PS_GP_RSLT_PIN PS_GP_RSLT_PIN 9 3894799 327189471 <b>11321888 </b>1 12-01-24 15:01:37
</code></span></pre><p style="text-align: left;"><b>
TCB=16
</b></p><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: xx-small;"><code>TABLE_NAME INDEX_NAME PREFIX_LENGTH LEAF_BLOCKS NUM_ROWS CLUSTERING_FACTOR DEGREE LAST_ANALYZED
-------------------- ------------------------ ------------- ----------- ---------- ----------------- ---------- -----------------
PS_GP_RSLT_ABS PS_GP_RSLT_ABS 8 1271559 152019130 8217000 1 12-01-24 15:44:25
PS_GP_RSLT_ACUM PS_GP_RSLT_ACUM 8 8421658 762210387 16658710 1 12-01-24 15:49:29
PS_GP_RSLT_PIN PS_GP_RSLT_PIN 9 3894799 327189471 11321888 1 12-01-24 15:50:36
</code></span></pre><p style="text-align: left;">
The reduction in the clustering factor can mitigate the optimizer's tendency to use the cluster key index, but it may still occur.</p><p style="text-align: left;">NB: <i>table_cached_blocks</i> <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_STATS.html#d996745e20759" target="_blank">applies only when gathering statistics with DBMS_STATS</a>, and not to CREATE INDEX or REBUILD INDEX operations, which always use the default value of 1. This is not a bug; it is documented behaviour (see the <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_STATS.html#d996745e20759" target="_blank">DBMS_STATS documentation</a>). </p></div><div style="text-align: left;">See also</div><div style="text-align: left;"><ul style="text-align: left;"><li><a href="https://richardfoote.wordpress.com/category/table_cached_blocks/" target="_blank">Richard Foote's Blog: Table Cached Blocks</a>. In particular:</li><ul><li><a href="https://richardfoote.wordpress.com/2013/05/08/important-clustering-factor-calculation-improvement-fix-you/" target="_blank">Important!! Clustering Factor Calculation Improvement</a></li></ul><ul><li><a href="https://richardfoote.wordpress.com/2018/07/17/rebuilding-indexes-danger-with-clustering-factor-calculation-chilly-down/" target="_blank">Rebuilding Indexes: Danger With Clustering Factor Calculation</a></li></ul><li><a href="https://jonathanlewis.wordpress.com/2018/07/02/clustering_factor-5/" target="_blank">Jonathan Lewis' Oracle Scratchpad: Clustering_Factor</a></li></ul></div><h4 style="text-align: left;">TL;DR</h4><div>The statistics on the cluster key index may lead the optimizer to determine that the cost of using it is lower than that of the unique index. The switch from the unique/primary key index to the cluster key index may result in poorer performance. Setting the <i>table_cached_blocks</i> statistics preference on the tables in the cluster may help. 
However, you may still need to use SQL Profiles/SQL Plan Baselines/SQL Patches to force the optimizer to continue to use the unique indexes.</div></div><div class="blogger-post-footer"><a href="http://www.go-faster.co.uk/">©David Kurtz</a></div>David Kurtzhttp://www.blogger.com/profile/08139761793598085235noreply@blogger.com0tag:blogger.com,1999:blog-14654018.post-33126083445016013972024-02-16T18:21:00.007+00:002024-02-22T09:46:53.975+00:00Table Clusters: 4. Checking the Cluster Key<i>This post is part of a <a href="https://blog.go-faster.co.uk/2024/01/table-clusters1.html">series</a> that discusses table clustering in Oracle.</i>
<div><ol>
<li><a href="https://blog.go-faster.co.uk/2024/01/table-clusters1.html">Introduction and Ancient History</a></li>
<li><a href="https://blog.go-faster.co.uk/2023/12/table-clusters2.html">Cluster & Cluster Key Design Considerations</a></li>
<li><a href="https://blog.go-faster.co.uk/2024/02/tablecluster3.html">Populating the Cluster with DBMS_PARALLEL_EXECUTE</a></li>
<li><a href="https://blog.go-faster.co.uk/2024/02/table-clusters.html">Checking the Cluster Key</a></li>
<li><a href="https://blog.go-faster.co.uk/2024/02/table-custers5.html">Using the Cluster Key Index instead of the Primary/Unique Key Index</a></li>
<li><a href="https://blog.go-faster.co.uk/2024/02/table-clusers6.html">Testing the Cluster & Conclusion (TL;DR)</a></li>
</ol><p>This query uses DBMS_ROWID to extract the data block number from each row's ROWID. It then counts the number of distinct blocks used by each cluster key, and reports how many cluster keys use each number of blocks.</p>
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>with x as ( --cluster key and rowid of each row
select emplid, cal_run_id, empl_rcd, gp_paygroup, cal_id, ORIG_CAL_RUN_ID, RSLT_SEG_NUM
, DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid) block_no from ps_gp_rslt_abs
), y as ( --count number of rows per cluster key and block number
select /*+MATERIALIZE*/ emplid, cal_run_id, empl_rcd, gp_paygroup, cal_id, ORIG_CAL_RUN_ID, RSLT_SEG_NUM
, block_no, count(*) num_rows
from x
group by emplid, cal_run_id, empl_rcd, gp_paygroup, cal_id, ORIG_CAL_RUN_ID, RSLT_SEG_NUM, block_no
), z as ( --count number of blocks and rows per cluster key
select /*+MATERIALIZE*/ emplid, cal_run_id, empl_rcd, gp_paygroup, cal_id, ORIG_CAL_RUN_ID, RSLT_SEG_NUM
, count(distinct block_no) num_blocks, sum(num_rows) num_rows
from y
group by emplid, cal_run_id, empl_rcd, gp_paygroup, cal_id, ORIG_CAL_RUN_ID, RSLT_SEG_NUM
)
select num_blocks, count(distinct emplid) emplids
, sum(num_rows) sum_rows
, median(num_rows) median_rows
, median(num_rows)/num_blocks median_rows_per_block
from z
group by num_blocks
order by num_blocks
/</code></span></pre>
<p>The answer you get depends on the data, so your mileage will vary.</p><p>Initially, I built the cluster with 3 columns in the key. In my case, 81% of rows belonged to cluster keys that occupied no more than 2 data blocks.</p>
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>NUM_BLOCKS EMPLIDS SUM_ROWS MEDIAN_ROWS MEDIAN_ROWS_PER_BLOCK
---------- ---------- ---------- ----------- ---------------------
1 69638 46809975 12 12
2 47629 78370682 34 17
3 12120 14330976 68 22.6666667
4 4598 4395844 94 23.5
5 2376 6941389 124 24.8
6 652 2510790 155 25.8333333
7 27 34527 185 26.4285714
8 14 12330 217 27.125
9 9 40633 248 27.5555556
10 1 14607 279 27.9
11 1 310 310 28.1818182
12 2 2212 310 25.8333333
13 1 1476 372 28.6153846
14 1 372 372 26.5714286</code></span></pre>I rebuilt the cluster with 7 key columns. Now no cluster key has more than 7 blocks, most of the keys are in a single block, and 85% are in no more than 2. Increasing the length of the cluster key also resulted in the table being smaller because each cluster key is only stored once.<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>NUM_BLOCKS EMPLIDS SUM_ROWS MEDIAN_ROWS MEDIAN_ROWS_PER_BLOCK
---------- ---------- ---------- ----------- ---------------------
1 74545 71067239 14 14
2 52943 57481538 40 20
3 13553 11185685 73 24.3333333
4 4567 8949787 120 30
5 1327 3251707 150 30
6 144 81977 160 26.6666667
7 3 1197 204.5 29.2142857</code></span></pre>There is now only a small number of employees whose data is spread across many cluster blocks. They might be slower to access, but I think I have a reasonable balance.</div><div class="blogger-post-footer"><a href="http://www.go-faster.co.uk/">©David Kurtz</a></div>David Kurtzhttp://www.blogger.com/profile/08139761793598085235noreply@blogger.com0tag:blogger.com,1999:blog-14654018.post-53354081982564280512024-02-15T16:15:00.007+00:002024-02-22T09:47:11.022+00:00Table Clusters: 3. Populating the Cluster with DBMS_PARALLEL_EXECUTE<i>This post is part of a <a href="https://blog.go-faster.co.uk/2024/01/table-clusters1.html">series</a> that discusses table clustering in Oracle.</i>
<div><ol>
<li><a href="https://blog.go-faster.co.uk/2024/01/table-clusters1.html">Introduction and Ancient History</a></li>
<li><a href="https://blog.go-faster.co.uk/2023/12/table-clusters2.html">Cluster & Cluster Key Design Considerations</a></li>
<li><a href="https://blog.go-faster.co.uk/2024/02/tablecluster3.html">Populating the Cluster with DBMS_PARALLEL_EXECUTE</a></li>
<li><a href="https://blog.go-faster.co.uk/2024/02/table-clusters.html">Checking the Cluster Key</a></li>
<li><a href="https://blog.go-faster.co.uk/2024/02/table-custers5.html">Using the Cluster Key Index instead of the Primary/Unique Key Index</a></li>
<li><a href="https://blog.go-faster.co.uk/2024/02/table-clusers6.html">Testing the Cluster & Conclusion (TL;DR)</a></li>
</ol></div><p>The result tables being clustered are also large, containing hundreds of millions of rows. Normally, when I have to rebuild these as non-clustered tables, I would do so in direct-path mode and with both parallel insert and parallel query. However, this is not effective for table clusters, particularly if you put multiple tables in one cluster, as rows with the same cluster key have to go into the same data blocks.</p><p>Instead, for each result table in the cluster, I have used <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_PARALLEL_EXECUTE.html#GUID-D13B6975-09B5-4711-AD43-45F68228C1CC" rel="nofollow" target="_blank">DBMS_PARALLEL_EXECUTE</a> to take a simple INSERT…SELECT statement, and break it into pieces that can be run concurrently on the database job scheduler. I get the parallelism, though I also have to accept the redo on the insert.</p>
<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>exec DBMS_PARALLEL_EXECUTE.DROP_TASK('CLUSTER_GP_RSLT_ABS');
DECLARE
l_recname VARCHAR2(15) := 'GP_RSLT_ABS';
l_src_prefix VARCHAR2(10) := 'ORIG_';
l_task VARCHAR2(30);
l_sql_stmt CLOB;
l_col_list CLOB;
BEGIN
l_task := 'CLUSTER_'||l_recname;
SELECT LISTAGG(column_name,',') WITHIN GROUP(ORDER BY column_id)
INTO l_col_list
FROM user_tab_cols WHERE table_name = l_src_prefix||l_recname;
l_sql_stmt := 'insert into PSY'||l_recname||' ('||l_col_list||') SELECT '||l_col_list
||' FROM '||l_src_prefix||l_recname||' WHERE rowid BETWEEN :start_id AND :end_id';
DBMS_PARALLEL_EXECUTE.CREATE_TASK (l_task);
DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID(l_task, 'SYSADM', l_src_prefix||l_recname, true, 2e6);
DBMS_PARALLEL_EXECUTE.RUN_TASK(l_task, l_sql_stmt, DBMS_SQL.NATIVE, parallel_level => 24);
END;
/</code></span></pre><p>The performance of this process is the first indication of whether the cluster key is correct. With too few key columns, the population of the table will be much slower because each row must go into a block already allocated to its cluster key or, if those blocks are full, into a newly allocated block. </p><p>NB: Chunking the data by ROWID only works where the source table is a regular table. It does not work for clustered or index-organised tables. The alternative is to chunk by the value of a numeric column, and that doesn't work well in this case because most of the key columns are strings or dates.</p><h4 style="text-align: left;">Monitoring DBMS_PARALLEL_EXECUTE</h4><div>There are several views provided by Oracle that can be used to monitor tasks created by DBMS_PARALLEL_EXECUTE.</div><pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span><code><span style="font-size: x-small;">SELECT * FROM <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/refrn/DBA_PARALLEL_EXECUTE_TASKS.html#GUID-AEFFCE5D-AB5B-4EE1-9CB3-4491F3D3D0E7" target="_blank">user_parallel_execute_tasks</a>;
</span><span style="font-size: xx-small;"> Number
TASK_NAME CHUNK_TYPE STATUS TABLE_OWNER TABLE_NAME Column TASK_COMMENT JOB_PREFIX
-------------------- ------------ ---------- ----------- ------------------ ---------- ------------------------------ ------------
Apply
Lang X Ed Fire_ Parallel
SQL_STMT Flag EDITION Trigger Apply Level JOB_CLASS
-------------------------------------------------------------------------------- ---- -------- ------- ----- -------- -----------------
CLUSTER_GP_RSLT_ABS ROWID_RANGE FINISHED SYSADM PS_GP_RSLT_ABS TASK$_38380
insert into PSYGP_RSLT_ABS (EMPLID,CAL_RUN_ID,EMPL_RCD,GP_PAYGROUP,CAL_ID,ORIG_C 1 ORA$BASE TRUE 24 DEFAULT_JOB_CLASS
CLUSTER_GP_RSLT_ACUM ROWID_RANGE FINISHED SYSADM PS_GP_RSLT_ACUM TASK$_38382
insert into PSYGP_RSLT_ACUM (EMPLID,CAL_RUN_ID,EMPL_RCD,GP_PAYGROUP,CAL_ID,ORIG_ 1 ORA$BASE TRUE 32 DEFAULT_JOB_CLASS<br /></span></code></span></pre>Each task is broken into chunks.<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span><code><span style="font-size: x-small;">SELECT task_name, status, count(*) chunks
, min(start_ts) min_start_ts, max(end_ts) max_end_ts
, max(end_ts)-min(start_ts) duration
FROM <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/refrn/DBA_PARALLEL_EXECUTE_CHUNKS.html#GUID-E3E6748D-0F8C-4EE5-B9B5-DE5B40C4FBE3" target="_blank">user_parallel_execute_chunks</a>
group by task_name, status
order by min_start_ts nulls last
/</span><span style="font-size: xx-small;">
TASK_NAME            STATUS         CHUNKS MIN_START_TS            MAX_END_TS              DURATION
-------------------- ---------- ---------- ----------------------- ----------------------- -------------------
CLUSTER_GP_RSLT_ABS  PROCESSED          80 22/12/2023 09.58.37.712 22/12/2023 10.06.32.264 +00 00:07:54.551373
CLUSTER_GP_RSLT_ACUM PROCESSED         402 22/12/2023 10.08.58.257 22/12/2023 10.38.36.820 +00 00:29:38.562700<br /></span></code></span></pre>In this case, each chunk processes a range of ROWIDs. Each chunk is allocated to a database scheduler job.<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><code><span style="font-size: x-small;">SELECT chunk_id, task_name, status, start_rowid, end_rowid, job_name, start_ts, end_ts, error_code, error_message
FROM user_parallel_execute_chunks
WHERE task_name = 'CLUSTER_GP_RSLT_ABS'
ORDER BY chunk_id
/
</span><span style="font-size: 65%;">Chunk
   ID TASK_NAME            STATUS     START_ROWID        END_ROWID          JOB_NAME        START_TS                END_TS                  ERROR_CODE ERROR_MESSAGE
----- -------------------- ---------- ------------------ ------------------ --------------- ----------------------- ----------------------- ---------- -----------------
    1 CLUSTER_GP_RSLT_ABS  PROCESSED  AAAUzUAAmAAAZgAAAA AAAUzUAAmAADp7VH// TASK$_38380_1   22/12/2023 09:58:37.712 22/12/2023 10:00:21.622
    2 CLUSTER_GP_RSLT_ABS  PROCESSED  AAAUzUAAmAADp7WAAA AAAUzUAAmAAGwkrH// TASK$_38380_3   22/12/2023 09:58:37.713 22/12/2023 10:00:20.107
    3 CLUSTER_GP_RSLT_ABS  PROCESSED  AAAUzUAAmAAGwksAAA AAAUzUAAmAAHvwBH// TASK$_38380_2   22/12/2023 09:58:37.713 22/12/2023 10:00:14.939
    4 CLUSTER_GP_RSLT_ABS  PROCESSED  AAAUzUAAmAAHvwCAAA AAAUzUAAmAAIn5XH// TASK$_38380_9   22/12/2023 09:58:37.864 22/12/2023 10:00:28.963
    5 CLUSTER_GP_RSLT_ABS  PROCESSED  AAAUzUAAmAAIn5YAAA AAAUzUAAmAAJ58tH// TASK$_38380_12  22/12/2023 09:58:37.865 22/12/2023 10:00:30.494
    6 CLUSTER_GP_RSLT_ABS  PROCESSED  AAAUzUAAmAAJ58uAAA AAAUzUAAmAAKzADH// TASK$_38380_8   22/12/2023 09:58:37.865 22/12/2023 10:00:26.049
    7 CLUSTER_GP_RSLT_ABS  PROCESSED  AAAUzUAAmAAKzAEAAA AAAUzUAAmAALf7ZH// TASK$_38380_4   22/12/2023 09:58:37.865 22/12/2023 10:00:28.017
    8 CLUSTER_GP_RSLT_ABS  PROCESSED  AAAUzUAAmAALf7aAAA AAAUzUAAmAAMHGvH// TASK$_38380_10  22/12/2023 09:58:37.885 22/12/2023 10:00:23.326
    9 CLUSTER_GP_RSLT_ABS  PROCESSED  AAAUzUAAmAAMHGwAAA AAAUzUAAmAAP5aFH// TASK$_38380_13  22/12/2023 09:58:37.907 22/12/2023 10:00:22.660
   10 CLUSTER_GP_RSLT_ABS  PROCESSED  AAAUzUAAmAAP5aGAAA AAAUzUAAnAACr1bH// TASK$_38380_5   22/12/2023 09:58:37.929 22/12/2023 10:00:21.959
…</span></code></pre><div>However, one job may process many chunks.</div><div>
<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><code><span style="font-size: x-small;">SELECT t.task_name, t.chunk_type, t.table_name, c.chunk_id, c.job_name, c.start_ts, c.end_ts
, d.actual_start_date, d.run_duration, d.instance_id, d.session_id
FROM user_parallel_execute_tasks t
JOIN user_parallel_execute_chunks c ON c.task_name = t.task_name
JOIN user_scheduler_job_run_details d ON d.job_name = c.job_name
WHERE t.task_name = 'CLUSTER_GP_RSLT_ABS'
ORDER BY t.task_name, c.job_name, c.start_ts
/</span><span style="font-size: x-small;">
</span><span style="font-size: 60%;">                                                  Chunk                                                                                                             Inst
TASK_NAME            CHUNK_TYPE   TABLE_NAME         ID JOB_NAME        START_TS                END_TS                  ACTUAL_START_DATE       RUN_DURATION          ID SESSION_ID
-------------------- ------------ --------------- ----- --------------- ----------------------- ----------------------- ----------------------- ------------------- ---- ------------
CLUSTER_GP_RSLT_ABS  ROWID_RANGE  PS_GP_RSLT_ABS      1 TASK$_38380_1   22/12/2023 09:58:37.712 22/12/2023 10:00:21.622 22/12/2023 09:58:37.660 +00 00:07:52.000000    1 3406,24003
CLUSTER_GP_RSLT_ABS  ROWID_RANGE  PS_GP_RSLT_ABS     23 TASK$_38380_1   22/12/2023 10:00:21.710 22/12/2023 10:02:01.916 22/12/2023 09:58:37.660 +00 00:07:52.000000    1 3406,24003
CLUSTER_GP_RSLT_ABS  ROWID_RANGE  PS_GP_RSLT_ABS     44 TASK$_38380_1   22/12/2023 10:02:02.008 22/12/2023 10:03:31.546 22/12/2023 09:58:37.660 +00 00:07:52.000000    1 3406,24003
CLUSTER_GP_RSLT_ABS  ROWID_RANGE  PS_GP_RSLT_ABS     57 TASK$_38380_1   22/12/2023 10:03:31.640 22/12/2023 10:05:05.398 22/12/2023 09:58:37.660 +00 00:07:52.000000    1 3406,24003
CLUSTER_GP_RSLT_ABS  ROWID_RANGE  PS_GP_RSLT_ABS     73 TASK$_38380_1   22/12/2023 10:05:05.494 22/12/2023 10:06:29.262 22/12/2023 09:58:37.660 +00 00:07:52.000000    1 3406,24003
CLUSTER_GP_RSLT_ABS  ROWID_RANGE  PS_GP_RSLT_ABS      8 TASK$_38380_10  22/12/2023 09:58:37.885 22/12/2023 10:00:23.326 22/12/2023 09:58:37.877 +00 00:07:54.000000    1 4904,44975
CLUSTER_GP_RSLT_ABS  ROWID_RANGE  PS_GP_RSLT_ABS     27 TASK$_38380_10  22/12/2023 10:00:23.394 22/12/2023 10:01:59.096 22/12/2023 09:58:37.877 +00 00:07:54.000000    1 4904,44975
CLUSTER_GP_RSLT_ABS  ROWID_RANGE  PS_GP_RSLT_ABS     42 TASK$_38380_10  22/12/2023 10:01:59.185 22/12/2023 10:03:37.657 22/12/2023 09:58:37.877 +00 00:07:54.000000    1 4904,44975
CLUSTER_GP_RSLT_ABS  ROWID_RANGE  PS_GP_RSLT_ABS     61 TASK$_38380_10  22/12/2023 10:03:37.742 22/12/2023 10:05:12.680 22/12/2023 09:58:37.877 +00 00:07:54.000000    1 4904,44975
CLUSTER_GP_RSLT_ABS  ROWID_RANGE  PS_GP_RSLT_ABS     79 TASK$_38380_10  22/12/2023 10:05:12.776 22/12/2023 10:06:32.142 22/12/2023 09:58:37.877 +00 00:07:54.000000    1 4904,44975
…</span></code></pre></div><div>You can also judge how well the clustering is working by looking at how much database time was consumed by the various events. PS_GP_RSLT_ABS was inserted first, then PS_GP_RSLT_ACUM. We can see that more time was spent on the second table that was inserted, and more time spent on physical read operations as rows have to go into specific blocks with the same cluster keys.</div><div><pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span><code><span style="font-size: x-small;">select c.task_name, c.status, count(distinct c.chunk_id) chunks, h.module, h.event
, sum(usecs_per_Row)/1e6 ash_secs
from gv$active_session_history h
, user_parallel_execute_chunks c
, user_parallel_execute_tasks t
where h.sample_time BETWEEN c.start_ts AND NVL(c.end_ts,SYSDATE)
and t.task_name = c.task_name
and h.action like c.job_name
group by c.task_name, c.status, h.module, h.event
order by task_name, ash_Secs desc
/
</span><span style="font-size: xx-small;">TASK_NAME            STATUS     CHUNKS MODULE          EVENT                                                            ASH_SECS
-------------------- ---------- ------ --------------- ---------------------------------------------------------------- --------
CLUSTER_GP_RSLT_ABS  PROCESSED      80 DBMS_SCHEDULER                                                                       3534
                     PROCESSED      78 DBMS_SCHEDULER  enq: FB - contention                                                 1184
                     PROCESSED      80 DBMS_SCHEDULER  db file parallel read                                                1161
                     PROCESSED      80 DBMS_SCHEDULER  buffer busy waits                                                     674
                     PROCESSED      79 DBMS_SCHEDULER  db file scattered read                                                490
…
CLUSTER_GP_RSLT_ACUM PROCESSED     401 DBMS_SCHEDULER                                                                      10174
                     PROCESSED     401 DBMS_SCHEDULER  db file sequential read                                              8813
                     PROCESSED      32 DBMS_SCHEDULER  log file switch (archiving needed)                                   4623
                     PROCESSED     389 DBMS_SCHEDULER  db file parallel read                                                1396
                     PROCESSED     383 DBMS_SCHEDULER  db file scattered read                                               1346
                     PROCESSED     295 DBMS_SCHEDULER  buffer busy waits                                                     769
                     PROCESSED     287 DBMS_SCHEDULER  enq: FB - contention                                                  715
…<br /></span></code></span></pre><br /></div><div class="blogger-post-footer"><a href="http://www.go-faster.co.uk/">©David Kurtz</a></div>David Kurtzhttp://www.blogger.com/profile/08139761793598085235noreply@blogger.com0tag:blogger.com,1999:blog-14654018.post-56670480228239954752024-02-14T12:49:00.008+00:002024-02-22T09:47:27.079+00:00Table Clusters: 2. Cluster & Cluster Key Design Considerations<i>This post is part of a <a href="https://blog.go-faster.co.uk/2024/01/table-clusters1.html">series</a> that discusses table clustering in Oracle.</i>
<div><ol>
<li><a href="https://blog.go-faster.co.uk/2024/01/table-clusters1.html">Introduction and Ancient History</a></li>
<li><a href="https://blog.go-faster.co.uk/2023/12/table-clusters2.html">Cluster & Cluster Key Design Considerations</a></li>
<li><a href="https://blog.go-faster.co.uk/2024/02/tablecluster3.html">Populating the Cluster with DBMS_PARALLEL_EXECUTE</a></li>
<li><a href="https://blog.go-faster.co.uk/2024/02/table-clusters.html">Checking the Cluster Key</a></li>
<li><a href="https://blog.go-faster.co.uk/2024/02/table-custers5.html">Using the Cluster Key Index instead of the Primary/Unique Key Index</a></li>
<li><a href="https://blog.go-faster.co.uk/2024/02/table-clusers6.html">Testing the Cluster & Conclusion (TL;DR)</a></li>
</ol></div><p style="text-align: left;"><span style="font-size: small;"><span style="font-weight: 400;">At the beginning of each PeopleSoft payroll calculation process, all the previously calculated result data that is about to be recalculated by that process is deleted; one delete statement for each result table. The new result data is inserted as each employee is calculated. As multiple calculation processes run concurrently, their data tends to get mixed up in the result tables. So the delete statements will concurrently update different rows in the same data block, leading to the database needing to do additional work to ensure read consistency. <br /></span></span>The result tables are not subsequently updated. Therefore, they are reasonable candidates for building in a table cluster.</p><h3 style="text-align: left;">Cluster Design Considerations</h3><p style="text-align: left;">The original purpose of <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/cncpt/tables-and-table-clusters.html#GUID-04AADD81-E5C2-498B-B857-DF2A37DD3520" target="_blank">table clusters</a> was to co-locate rows from different tables that would generally be queried together, in the same data blocks. This makes retrieval easier by reducing disk I/Os and access time. Less storage is required because cluster keys are not repeated in either the cluster or the cluster key index. As disks have become bigger and faster, and memory has become more plentiful, this is less often a consideration.</p><p style="text-align: left;">In this case, I am interested in avoiding read consistency contention. I want each data block in the cluster to contain only rows with a single distinct cluster key value so that different transactions relating to different employees, and therefore different cluster keys, will be involved in different data blocks. 
Therefore, each data block in the cluster will be subject to no more than one concurrent transaction, and the database will not have to maintain multiple read-consistent versions. I will still avoid the read consistency overhead whether I store multiple tables in one cluster or different tables in different clusters.</p><p style="text-align: left;">The size attribute of the CREATE CLUSTER command specifies the amount of space in bytes reserved to store all rows with the same cluster key value. Oracle will round it up to the next divisor of the block size. Thus, if it is greater than half the size of the data block, the database will reserve at least one whole data block for each cluster value. In my case, the data blocks are 8192 bytes (the default size), so I have set the size equal to the block size. </p><p style="text-align: left;">I don't know in advance how many distinct cluster key values my data will have, and it will change over time. Therefore, I will be creating <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/cncpt/tables-and-table-clusters.html#GUID-CC31365B-83B0-4E09-A047-BF1B79AC887A" target="_blank">indexed clusters</a>, and I have to build a B-tree index on the cluster key.</p><p style="text-align: left;">I have found that the optimizer tends to choose the cluster key index rather than the longer unique index to search the table because it only has one row per cluster key and is, therefore, smaller and cheaper. However, it may then have to scan all the blocks for that cluster key, which may in practice take longer.</p><p style="text-align: left;">If one table already frequently fills or exceeds a single block for each cluster key, there is unlikely to be any advantage to adding another table to the same cluster because if Oracle uses the cluster key index, it will then scan all the blocks for that key. 
</p><p style="text-align: left;">In my case, I have found that two of the three tables that I plan to cluster, each require more than one block per cluster key, and the third almost fills a block per cluster key. Therefore, I have decided to put each table in a separate cluster, albeit with the same cluster key.</p><h3 style="text-align: left;">Cluster Key Design Considerations</h3><p style="text-align: left;">The columns listed in the <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/CREATE-CLUSTER.html#GUID-4DBC701F-AFC3-486D-AA32-B5CB1D6946F7" target="_blank">CREATE CLUSTER</a> command specify the cluster key. They will be used to group data together. The tables in the cluster have many unique key columns in common. The first 7 columns of the unique key have been used for cluster key columns. This is enough to prevent the number of rows per cluster key from growing indefinitely, but not so many that you end up with only a few rows per cluster key, which would result in most table blocks being only partially filled. This would consume space and increase I/O.</p><div>The cluster key is indexed to help find the data blocks for a particular key, just as you would on any other table. You do not specify columns when creating this index, because it uses the cluster key columns.</div><div>
<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: xx-small;"><code>CREATE CLUSTER cluster_gp_rslt_abs
(<b>EMPLID </b>VARCHAR2(11), <b>CAL_RUN_ID </b>VARCHAR2(18), <b>EMPL_RCD </b>SMALLINT, <b>GP_PAYGROUP </b>VARCHAR2(10)
,<b>CAL_ID </b>VARCHAR2(18), <b>ORIG_CAL_RUN_ID </b>VARCHAR2(18), <b>RSLT_SEG_NUM </b>SMALLINT)
SIZE 8192 <i>/*one block per cluster value*/</i>
TABLESPACE GPAPP
/
CREATE INDEX cluster_gp_rslt_abs_idx ON CLUSTER cluster_gp_rslt_abs
/
CREATE TABLE psygp_rslt_abs (EMPLID VARCHAR2(11) NOT NULL,
CAL_RUN_ID VARCHAR2(18) NOT NULL,
EMPL_RCD SMALLINT NOT NULL,
GP_PAYGROUP VARCHAR2(10) NOT NULL,
CAL_ID VARCHAR2(18) NOT NULL,
ORIG_CAL_RUN_ID VARCHAR2(18) NOT NULL,
RSLT_SEG_NUM SMALLINT NOT NULL,
…
) CLUSTER cluster_gp_rslt_abs (<b>EMPLID, CAL_RUN_ID, EMPL_RCD, GP_PAYGROUP, CAL_ID, ORIG_CAL_RUN_ID, RSLT_SEG_NUM</b>)
/
CREATE CLUSTER cluster_gp_rslt_acum
(<b>EMPLID </b>VARCHAR2(11), <b>CAL_RUN_ID </b>VARCHAR2(18), <b>EMPL_RCD </b>SMALLINT, <b>GP_PAYGROUP </b>VARCHAR2(10)
,<b>CAL_ID </b>VARCHAR2(18), <b>ORIG_CAL_RUN_ID </b>VARCHAR2(18), <b>RSLT_SEG_NUM </b>SMALLINT) SIZE 8192 TABLESPACE GPAPP
/
CREATE INDEX cluster_gp_rslt_acum_idx ON CLUSTER cluster_gp_rslt_acum
/
CREATE TABLE psygp_rslt_acum (EMPLID VARCHAR2(11) NOT NULL,
…
) CLUSTER cluster_gp_rslt_acum (<b>EMPLID, CAL_RUN_ID, EMPL_RCD, GP_PAYGROUP, CAL_ID, ORIG_CAL_RUN_ID, RSLT_SEG_NUM</b>)
/
CREATE CLUSTER cluster_gp_rslt_pin
(<b>EMPLID </b>VARCHAR2(11), <b>CAL_RUN_ID </b>VARCHAR2(18), <b>EMPL_RCD </b>SMALLINT, <b>GP_PAYGROUP </b>VARCHAR2(10)
,<b>CAL_ID </b>VARCHAR2(18), <b>ORIG_CAL_RUN_ID </b>VARCHAR2(18), <b>RSLT_SEG_NUM </b>SMALLINT) SIZE 8192 TABLESPACE GPAPP
/
CREATE INDEX cluster_gp_rslt_pin_idx ON CLUSTER cluster_gp_rslt_pin
/
CREATE TABLE PSYGP_RSLT_PIN (EMPLID VARCHAR2(11) NOT NULL,
…
) CLUSTER cluster_gp_rslt_pin (<b>EMPLID, CAL_RUN_ID, EMPL_RCD, GP_PAYGROUP, CAL_ID, ORIG_CAL_RUN_ID, RSLT_SEG_NUM</b>)
/
…</code></span></pre><p style="text-align: left;">The application's indexes on the result tables, including the unique key indexes, were recreated after the tables had been rebuilt in the cluster and repopulated. I have only shown the DDL for the unique indexes below. Building an index on a clustered table is no different from building one on an ordinary heap table.</p><pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: xx-small;"><code>CREATE UNIQUE INDEX PS_GP_RSLT_ABS ON PS_GP_RSLT_ABS
(<b>EMPLID, CAL_RUN_ID, EMPL_RCD, GP_PAYGROUP, CAL_ID, ORIG_CAL_RUN_ID, RSLT_SEG_NUM</b>, ABSENCE_DATE, PIN_TAKE_NUM)
PCTFREE 1 COMPRESS 8 … TABLESPACE PSINDEX
/
…
CREATE UNIQUE INDEX PS_GP_RSLT_ACUM ON PS_GP_RSLT_ACUM
(<b>EMPLID, CAL_RUN_ID, EMPL_RCD, GP_PAYGROUP, CAL_ID, ORIG_CAL_RUN_ID, RSLT_SEG_NUM</b>, PIN_NUM, EMPL_RCD_ACUM
,ACM_FROM_DT, ACM_THRU_DT, SLICE_BGN_DT, SEQ_NUM8)
PCTFREE 1 COMPRESS 8 … TABLESPACE PSINDEX
/
…
CREATE UNIQUE INDEX PS_GP_RSLT_PIN ON PS_GP_RSLT_PIN
(<b>EMPLID, CAL_RUN_ID, EMPL_RCD, GP_PAYGROUP, CAL_ID, ORIG_CAL_RUN_ID, RSLT_SEG_NUM</b>, INSTANCE, PIN_NUM, SLICE_BGN_DT, SLICE_END_DT)
PCTFREE 1 COMPUTE STATISTICS COMPRESS 9 … TABLESPACE PSINDEX
/
…</code></span></pre>
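<p style="text-align: left;">Once the clusters have been rebuilt and repopulated, the data dictionary can confirm how each cluster is configured and which tables it contains. A quick sanity check might look like this (a sketch; the <code>CLUSTER_GP_RSLT%</code> name pattern assumes the cluster names used above):</p><pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: xx-small;"><code>-- cluster type, SIZE setting (reported as KEY_SIZE) and tablespace for each cluster
SELECT cluster_name, cluster_type, key_size, tablespace_name
FROM   user_clusters
WHERE  cluster_name LIKE 'CLUSTER_GP_RSLT%';

-- which tables were built in which cluster
SELECT cluster_name, table_name
FROM   user_tables
WHERE  cluster_name LIKE 'CLUSTER_GP_RSLT%'
ORDER  BY cluster_name, table_name;</code></span></pre>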
<p>See also <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/admin/managing-clusters.html#GUID-A4315FEA-FDFF-4918-9320-FEEF593B34E5" rel="nofollow" target="_blank">Oracle 19c DBA Guide, Guidelines for Managing Clusters</a></p></div><div class="blogger-post-footer"><a href="http://www.go-faster.co.uk/">©David Kurtz</a></div>David Kurtzhttp://www.blogger.com/profile/08139761793598085235noreply@blogger.com0tag:blogger.com,1999:blog-14654018.post-33182934733305849912024-02-13T16:52:00.008+00:002024-02-22T09:47:46.136+00:00Table Clusters: 1. An Alternative to Partitioning? - Introduction & Ancient History<i>This post is the first part of a <a href="https://blog.go-faster.co.uk/2024/01/table-clusters1.html">series</a> that discusses table clustering in Oracle.</i><div><i>Links will appear as sections are posted.<br /></i><div><ol>
<li><a href="https://blog.go-faster.co.uk/2024/01/table-clusters1.html">Introduction and Ancient History</a></li>
<li><a href="https://blog.go-faster.co.uk/2023/12/table-clusters2.html">Cluster & Cluster Key Design Considerations</a></li>
<li><a href="https://blog.go-faster.co.uk/2024/02/tablecluster3.html">Populating the Cluster with DBMS_PARALLEL_EXECUTE</a></li>
<li><a href="https://blog.go-faster.co.uk/2024/02/table-clusters.html">Checking the Cluster Key</a></li>
<li><a href="https://blog.go-faster.co.uk/2024/02/table-custers5.html">Using the Cluster Key Index instead of the Primary/Unique Key Index</a></li>
<li><a href="https://blog.go-faster.co.uk/2024/02/table-clusers6.html">Testing the Cluster & Conclusion (TL;DR)</a></li>
</ol></div><h3 style="text-align: left;">Introduction</h3><div>Table clustering and table partitioning are very different technologies. However, they both create a relationship between the logical value of the data and its physical location. Similar data values are stored together, and therefore dissimilar data values are kept apart. </div><p>The advantage of storing similar values together is to reduce I/O and improve access time. However, this series of blogs looks at the characteristic of keeping dissimilar values apart that, as with partitioning, can be harnessed to avoid the need to maintain read consistency during concurrent processing and therefore avoid its overhead.</p><p>Partitioning is only available in the Enterprise Edition of Oracle, and then you have to license the partitioning option. Table clustering is available in all database versions and doesn't require any additional licence. So you might consider clustering when partitioning is not an option.</p><h3 style="text-align: left;">Ancient History</h3><p>The last time I put tables into a cluster was in 2001 on Oracle 7.3.3 (partitioning didn't become available until Oracle 8.0). Our problem was that multiple instances of the PeopleSoft Global Payroll calculation were concurrently updating different rows in the same data blocks leading the database to generate read consistent copies of each block for each session. That consumed lots of CPU, required additional space in the buffer cache, generated additional physical reads on the undo segments, and generated additional writes due to delayed block cleanout of dirty data blocks in the buffer cache. This significantly degraded performance, and very soon overall performance became worse as we increased the number of concurrent processes.</p><p>I had the idea of clustering the payroll tables on employee ID. 
Thus I could ensure the data for different employees was in different data blocks and the database wouldn't have to do read-consistent recovery on the blocks in those tables. There might still be some contention on indexes, but this would be less severe on indexes that lead on the cluster key columns because index entries are sorted in key order.</p><p><i><a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/cncpt/tables-and-table-clusters.html#GUID-04AADD81-E5C2-498B-B857-DF2A37DD3520" rel="nofollow" target="_blank">"A table cluster is a group of tables that share common columns and store related data in the same blocks … Because table clusters store related rows of different tables in the same data blocks, properly used table clusters offer the following benefits over non-clustered tables:</a></i></p><p></p><ul style="text-align: left;"><li><i><a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/cncpt/tables-and-table-clusters.html#GUID-04AADD81-E5C2-498B-B857-DF2A37DD3520" rel="nofollow" target="_blank">Disk I/O is reduced for joins of clustered tables.</a></i></li><li><i><a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/cncpt/tables-and-table-clusters.html#GUID-04AADD81-E5C2-498B-B857-DF2A37DD3520" rel="nofollow" target="_blank">Access time improves for joins of clustered tables.</a></i></li><li><i><a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/cncpt/tables-and-table-clusters.html#GUID-04AADD81-E5C2-498B-B857-DF2A37DD3520" rel="nofollow" target="_blank">Less storage is required to store related table and index data because the cluster key value is not stored repeatedly for each row."</a></i></li></ul><p></p><p><span style="font-size: x-small;"><a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/cncpt/tables-and-table-clusters.html#GUID-04AADD81-E5C2-498B-B857-DF2A37DD3520" rel="nofollow" target="_blank">see <i>Oracle 19c Database Concepts: Overview of Table 
Clusters</i></a></span></p><p>Table clusters were not fashionable then, and have certainly not become more so since. Yet we all use them every day: the Oracle catalogue has 37 tables in 10 clusters. In 19c, the <i>C_OBJ#</i> cluster contains 17 tables! When I proposed table clustering, the Swiss DBA turned to me and said 'If you build a cluster, I am going to a Kloster!' (this pun works in German: a '<a href="https://translate.google.com/?sl=de&tl=en&text=kloster&op=translate" rel="nofollow" target="_blank">Kloster</a>' is a monastery or convent). This rebuke has stayed with me ever since.</p><p>Nonetheless, we rebuilt our result tables in a cluster, and it delivered a performance improvement until the data volumes grew to the point where we had multiple data blocks per cluster key, and then the performance was much worse! Our mistake was not having enough columns in the cluster key, which illustrates that the choice of cluster key is very important.</p><p>In the end, that forced the upgrade to Oracle 8i, and we started to use table partitioning, such that a partition corresponded to the data processed by each concurrent payroll process. That approach works very well, certainly better than clustering, for many customers who use this product and are licensed for partitioning. They could generally scale the number of streams until they fully loaded either the CPU or the disk subsystem.
</p><p>Now in 2023, I am looking at another large PeopleSoft HCM implementation using the same calculation engine for absence, but this customer isn't licensed for partitioning, so we are back to table clusters.</p><p><a href="https://blog.go-faster.co.uk/2023/12/table-clusters2.html">Now read on</a>.</p></div><div class="blogger-post-footer"><a href="http://www.go-faster.co.uk/">©David Kurtz</a></div>David Kurtzhttp://www.blogger.com/profile/08139761793598085235noreply@blogger.com0tag:blogger.com,1999:blog-14654018.post-43462823223588043142024-01-26T17:57:00.003+00:002024-01-28T16:43:25.876+00:00Just because the execution plan says INMEMORY, it doesn't mean it is using In-Memory<h4 style="text-align: left;">Parallel Query
</h4><div><p style="text-align: left;">If you are using RAC, and you have in-memory objects populated across nodes (i.e. distribution by ROWID range) or you have objects populated in only 1 node (i.e. distribution by partition or sub-partition) then you need to use parallel query to access data populated on a node to which the query is not connected. <br /></p><ul style="text-align: left;"><li>There is no cache fusion with Database In-Memory. Oracle does not ship In-Memory Compression Units (IMCUs) across the RAC interconnect.</li><li>Similarly, if you have set PARALLEL_FORCE_LOCAL=TRUE the parallel query will not be able to access the remote nodes.</li></ul>In-memory improves performance by avoiding physical I/O, but the reduction in CPU consumption can be more significant. In the cloud, this can save money by reducing your cloud subscription costs. However, parallel query can be a brutal way of using CPU to complete a query faster. It often increases total CPU consumption, thus negating some of the benefits of in-memory.<p></p><h4 style="text-align: left;">Options:</h4><div>A query that is not executing in parallel will only be able to access objects in the local in-memory store. You can ensure that a segment is stored in the in-memory store on every RAC node by specifying DUPLICATE ALL. Parallel queries will also use the local in-memory store. </div><div><ul style="text-align: left;"><li>This option can improve performance but the in-memory stores consume more memory. On a 2-node RAC database, it doubles the memory consumption of In-Memory.</li><li>The DUPLICATE option is only available on Exadata. On other platforms, it is ignored (see also <a href="https://blogs.oracle.com/in-memory/post/oracle-database-in-memory-on-rac-part-3">Oracle Database In-Memory on RAC - Part 3</a>).</li></ul>Alternatively, you can use database services to create node affinity. 
<br /><ul style="text-align: left;"><li>A process can connect using a database service that specifies a specific node or nodes. </li><li>Parallel queries can be restricted to specific nodes by setting <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/refrn/PARALLEL_INSTANCE_GROUP.html" target="_blank">PARALLEL_INSTANCE_GROUP</a> to use a service (see also <a href="https://blogs.oracle.com/in-memory/post/oracle-database-in-memory-on-rac-part-2">Oracle Database In-Memory on RAC - Part 2</a>).</li><li>In-memory segments can be placed in the in-memory store on specific nodes by distributing them with a specific service (see also <a href="https://blogs.oracle.com/in-memory/post/how-to-control-where-objects-are-populated-into-memory-on-rac">How to control where objects are populated into [In-]memory on RAC</a>).</li><li>You may prefer to create different services for the query processes and in-memory population processes. In the case of node failure, you probably want the query process connection to fail over to another node. However, you may not want that to happen for in-memory distribution processes because of the additional memory overhead.</li></ul></div><div>Otherwise, on a 2-node RAC, a non-parallel query has a 50% chance of finding the segment in the in-memory store because it has a 50% chance of connecting to the node where it is stored!</div><h3 style="text-align: left;">Is It Using In Memory?</h3><div>I am going to demonstrate this using a table with 2 partitions.</div></div><pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: xx-small;"><code>CREATE TABLE t (a number, b number, c VARCHAR2(1000)) PARTITION BY RANGE (b)
(partition t1 VALUES LESS THAN(50)
,partition t2 VALUES LESS THAN(MAXVALUE)
) <b>INMEMORY</b>;
INSERT INTO t SELECT level, MOD(level,100), RPAD(TO_CHAR(TO_DATE(level,'j'),'Jsp'),100,'.')
FROM DUAL CONNECT BY LEVEL <= 1e5;
commit;
</code></span></pre><h4 style="text-align: left;">Serial Query</h4><div>I am going to generate execute plans for two similar queries that each query different partitions of a table. The execution plans have the same plan hash value. The only difference is that the first query accesses only the first partition, and the second query only accesses the second partition. </div><div>Both plans claim they are doing an INMEMORY full scan of the table. However, this is only a statement of intent.<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: xx-small;"><code>explain plan for SELECT sum(a), sum(b), count(*) FROM t WHERE b=42;
…
Plan hash value: 2993254470
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 26 | 11 (0)| 00:00:01 | | |
| 1 | SORT AGGREGATE | | 1 | 26 | | | | |
| 2 | PARTITION RANGE SINGLE | | 942 | 24492 | 11 (0)| 00:00:01 | 1 | 1 |
|* 3 | <b>TABLE ACCESS INMEMORY FULL</b>| T | 942 | 24492 | 11 (0)| 00:00:01 | 1 | 1 |
-----------------------------------------------------------------------------------------------------
</code></span></pre>
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: xx-small;"><code>explain plan for SELECT sum(a), sum(b), count(*) FROM t WHERE b=56;
…
Plan hash value: 2993254470
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 26 | 11 (0)| 00:00:01 | | |
| 1 | SORT AGGREGATE | | 1 | 26 | | | | |
| 2 | PARTITION RANGE SINGLE | | 926 | 24076 | 11 (0)| 00:00:01 | 2 | 2 |
|* 3 | <b>TABLE ACCESS INMEMORY FULL</b>| T | 926 | 24076 | 11 (0)| 00:00:01 | 2 | 2 |
-----------------------------------------------------------------------------------------------------
</code></span></pre>
Oracle distributes the partitions across the in-memory stores on the RAC nodes. In my case, the first partition is on instance 1, and the second partition is on instance 2.<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: xx-small;"><code>select inst_id, owner, segment_name, partition_name, inmemory_size, bytes, bytes_not_populated, populate_status, inmemory_duplicate
from gv$im_segments where segment_name = 'T' order by inst_id;
INST_ID OWNER SEGMENT_NAME PARTITION_NAME INMEMORY_SIZE BYTES BYTES_NOT_POPULATED POPULATE_STAT INMEMORY_DUPL
---------- ---------- ------------ -------------- ------------- ---------- ------------------- ------------- -------------
1 SYSADM T T1 6422528 8241152 0 COMPLETED <b>NO DUPLICATE</b>
2 SYSADM T T2 6422528 8241152 0 COMPLETED <b>NO DUPLICATE</b>
</code></span></pre>I will run the queries, capturing the session statistics to a temporary table. I use the Oracle-delivered global temporary table <a href="https://docs.oracle.com/en/database/oracle/oracle-database/21/refrn/PLAN_TABLE.html#GUID-0CAFEAD1-8C79-4200-8658-947D04BDFFE2" target="_blank"><i>plan_table</i></a> so that I don't have to create my own table.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: xx-small;"><code>delete from plan_table;
insert into plan_table (statement_id, plan_id, id, cost, parent_id) select '_', s.* from v$mystat s;
SELECT sum(a), sum(b), count(*) FROM t WHERE b=42;
insert into plan_table (statement_id, plan_id, id, cost, parent_id) select 'A', s.* from v$mystat s;
SELECT sum(a), sum(b), count(*) FROM t WHERE b=56;
insert into plan_table (statement_id, plan_id, id, cost, parent_id) select 'B', s.* from v$mystat s;
</code></span></pre>
Then I can simply query where the IM statistics are different.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: xx-small;"><code>with x (scenario, sid, statistic#, value) as (select statement_id, plan_id, id, cost from plan_table)
select x.statistic#, n.name
, a.value-x.value diff_a
, b.value-a.value diff_b
from v$statname n, x, x a, x b
where x.scenario = '_'
and x.sid = a.sid and x.statistic# = a.statistic# and a.scenario = 'A'
and x.sid = b.sid and x.statistic# = b.statistic# and b.scenario = 'B'
and (x.value < a.value OR a.value < b.value)
and n.statistic# = x.statistic#
and n.name like 'IM %' and not n.name like 'IM %populate%'
order by x.statistic#;
</code></span></pre>
I only got an in-memory query for the partition populated on instance 2, the instance to which my session was connected. For the partition populated on instance 1, there is a single <i>IM scan segments disk</i> operation. This statistic is the 'number of times a segment marked for in-memory was accessed entirely from the buffer cache/direct read' (see <a href="https://blogs.oracle.com/in-memory/post/popular-statistics-with-database-in-memory">Popular Statistics with Database In-Memory</a>), indicating that there was no in-memory query. </div><div><div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: xx-small;"><code>STATISTIC# NAME DIFF_A DIFF_B
---------- -------------------------------------------------- ---------- ----------
772 IM scan CUs no cleanout 0 1
802 IM scan CUs current 0 1
830 IM scan CUs readlist creation accumulated time 0 2
832 IM scan CUs readlist creation number 0 1
838 IM scan delta - only base scan 0 1
1376 IM scan CUs pcode aggregation pushdown 0 3
1377 IM scan rows pcode aggregated 0 1000
1379 IM scan CUs pcode pred evaled 0 1
1385 IM scan dict engine results reused 0 3
1480 IM scan CUs memcompress for query low 0 1
1493 <b>IM scan segments disk 1</b> 0
1494 IM scan bytes in-memory 0 5940559
1495 IM scan bytes uncompressed 0 5444950
1496 IM scan CUs columns accessed 0 2
1498 IM scan CUs columns theoretical max 0 3
1505 IM scan rows 0 50000
1506 IM simd compare calls 0 3
1512 IM simd decode unpack calls 0 6
1513 IM simd decode symbol calls 0 2
1520 IM simd decode unpack selective calls 0 6
1527 IM scan rows valid 0 50000
1533 IM scan rows projected 0 1
1538 IM scan CUs split pieces 0 1
1571 IM scan CUs predicates received 0 1
1572 IM scan CUs predicates applied 0 1
1577 IM scan segments minmax eligible 0 1
1611 IM SubCU-MM CUs Examined 0 1
</code></span></pre><h4>
Parallel Query </h4></div><div>I will repeat the test, but use a parallel hint to enable parallel query.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: xx-small;"><code>SELECT <b>/*+PARALLEL*/</b> sum(a), sum(b), count(*) FROM t WHERE b=42;
</code></span></pre>Now, I get a parallel execution plan:
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: xx-small;"><code>Plan hash value: 943991435
-----------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 26 | 6 (0)| 00:00:01 | | | | | |
| 1 | SORT AGGREGATE | | 1 | 26 | | | | | | | |
| 2 | PX COORDINATOR | | | | | | | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10000 | 1 | 26 | | | | | Q1,00 | P->S | QC (RAND) |
| 4 | SORT AGGREGATE | | 1 | 26 | | | | | Q1,00 | PCWP | |
| 5 | PX BLOCK ITERATOR | | 942 | 24492 | 6 (0)| 00:00:01 | 1 | 1 | Q1,00 | PCWC | |
|* 6 | TABLE ACCESS <b>INMEMORY FULL</b>| T | 942 | 24492 | 6 (0)| 00:00:01 | 1 | 1 | Q1,00 | PCWP | |
-----------------------------------------------------------------------------------------------------------------------------------------
</code></span></pre>
The IM statistics show that both the queries performed an in-memory query.<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: xx-small;"><code>STATISTIC# NAME DIFF_A DIFF_B
---------- -------------------------------------------------- ---------- ----------
772 IM scan CUs no cleanout 3 3
802 IM scan CUs current 3 3
830 IM scan CUs readlist creation accumulated time 4 4
832 IM scan CUs readlist creation number 3 3
838 IM scan delta - only base scan 3 3
1376 IM scan CUs pcode aggregation pushdown 9 9
1377 IM scan rows pcode aggregated 1000 1000
1379 IM scan CUs pcode pred evaled 3 3
1385 IM scan dict engine results reused 9 9
1480 IM scan CUs memcompress for query low 3 3
1494 IM scan bytes in-memory 17819283 17821701
1495 IM scan bytes uncompressed 16328826 16334850
1496 IM scan CUs columns accessed 6 6
1498 IM scan CUs columns theoretical max 9 9
1505 IM scan rows 150000 150000
1506 IM simd compare calls 9 9
1512 IM simd decode unpack calls 18 18
1513 IM simd decode symbol calls 6 6
1520 IM simd decode unpack selective calls 18 18
1527 IM scan rows valid 50000 50000
1529 IM scan rows range excluded 100000 100000
1533 IM scan rows projected 3 3
1538 IM scan CUs split pieces 6 3
1571 IM scan CUs predicates received 3 3
1572 IM scan CUs predicates applied 3 3
1577 IM scan segments minmax eligible 3 3
1611 IM SubCU-MM CUs Examined 3 3
</code></span></pre><h4>
Duplicate In-Memory Store </h4></div><div>This time, I will repeat the test with a duplicate in-memory store. The DUPLICATE option stores the segment in the in-memory store on one other RAC node, the DUPLICATE ALL option stores it on all RAC nodes. On a 2-node RAC they come to the same thing.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: xx-small;"><code>CREATE TABLE t (a number, b number, c VARCHAR2(1000)) PARTITION BY RANGE (b)
(partition t1 VALUES LESS THAN(50)
,partition t2 VALUES LESS THAN(MAXVALUE)
) <b>INMEMORY DUPLICATE ALL</b>;
</code></span></pre>
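<div>Alternatively, rather than rebuilding the table, the duplicate attribute should be settable on an existing table with a simple DDL command, after which the in-memory stores will repopulate. This is a sketch that I have not run as part of this test:</div><pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>ALTER TABLE t INMEMORY DUPLICATE ALL;</code></span></pre>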
Now, both partitions are stored on both instances.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: xx-small;"><code>select inst_id, owner, segment_name, partition_name, inmemory_size, bytes, bytes_not_populated, populate_status, inmemory_duplicate
from gv$im_segments where segment_name = 'T' order by inst_id, segment_name, partition_name;
INST_ID OWNER SEGMENT_NAME PARTITION_NAME INMEMORY_SIZE BYTES BYTES_NOT_POPULATED POPULATE_STAT INMEMORY_DUPL
---------- ---------- ------------ -------------- ------------- ---------- ------------------- ------------- -------------
1 SYSADM T T1 6422528 8241152 0 COMPLETED <b>DUPLICATE</b>
1 SYSADM T T2 6422528 8241152 0 COMPLETED <b>DUPLICATE</b>
2 SYSADM T T1 6422528 8241152 0 COMPLETED <b>DUPLICATE</b>
2 SYSADM T T2 9568256 8241152 0 COMPLETED <b>DUPLICATE</b>
</code></span></pre>
I will return to the original queries without the parallel hints:
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: xx-small;"><code>SELECT sum(a), sum(b), count(*) FROM t WHERE b=42;
SELECT sum(a), sum(b), count(*) FROM t WHERE b=56;
</code></span></pre>
The in-memory statistics are the same for both queries, indicating that an in-memory query was successfully performed for both partitions, because both partitions are now populated in the in-memory store on both instances.<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: xx-small;"><code>STATISTIC# NAME DIFF_A DIFF_B
---------- -------------------------------------------------- ---------- ----------
772 IM scan CUs no cleanout 1 1
802 IM scan CUs current 1 1
830 IM scan CUs readlist creation accumulated time 3 2
832 IM scan CUs readlist creation number 1 1
838 IM scan delta - only base scan 1 1
1376 IM scan CUs pcode aggregation pushdown 3 3
1377 IM scan rows pcode aggregated 1000 1000
1379 IM scan CUs pcode pred evaled 1 1
1385 IM scan dict engine results reused 3 3
1480 IM scan CUs memcompress for query low 1 1
1494 IM scan bytes in-memory 5939777 5940563
1495 IM scan bytes uncompressed 5442942 5444950
1496 IM scan CUs columns accessed 2 2
1498 IM scan CUs columns theoretical max 3 3
1505 IM scan rows 50000 50000
1506 IM simd compare calls 3 3
1512 IM simd decode unpack calls 6 6
1513 IM simd decode symbol calls 2 2
1520 IM simd decode unpack selective calls 6 6
1527 IM scan rows valid 50000 50000
1533 IM scan rows projected 1 1
1538 IM scan CUs split pieces 2 2
1571 IM scan CUs predicates received 1 1
1572 IM scan CUs predicates applied 1 1
1577 IM scan segments minmax eligible 1 1
1611 IM SubCU-MM CUs Examined 1 1
</code></span></pre><h3 style="text-align: left;">
TL;DR </h3><div>The presence of an in-memory operation in an execution plan does not mean that the statement is definitely using in-memory. Rather, it means that the statement will perform an in-memory query if it finds the segment in the in-memory store, and that content is up to date. </div><div>To determine whether a query really did use in-memory, look at the session-level statistics, as I have demonstrated in this blog, or at a SQL Monitor active report (see <a href="https://blogs.oracle.com/in-memory/post/oracle-database-in-memory-on-rac-part-1-revised">Oracle Database In-Memory on RAC - Part 1 (revised)</a>). </div><div>Parallel query is needed to access an object stored in-memory on a node other than the one to which the session is connected. If the query does not run in parallel, it will not be able to access that object; this will be indicated by the 'IM scan segments disk' statistic.
Alternatives are to duplicate the in-memory store on Exadata or to use services to create node affinity.</div></div></div><p style="text-align: left;"><i>My thanks to <a href="https://blogs.oracle.com/authors/andy-rivenes" target="_blank">Andy Rivenes</a> for the initial comment that sent me off into this subject, and to the various articles that he and <a href="https://blogs.oracle.com/authors/maria-colgan" target="_blank">Maria Colgan</a> have posted on <a href="https://blogs.oracle.com/in-memory/#" target="_blank">Oracle Database In-Memory</a> blog that I have linked in this note.</i></p><div class="blogger-post-footer"><a href="http://www.go-faster.co.uk/">©David Kurtz</a></div>David Kurtzhttp://www.blogger.com/profile/08139761793598085235noreply@blogger.com0tag:blogger.com,1999:blog-14654018.post-16669238069452397112024-01-03T08:40:00.000+00:002024-01-03T08:40:34.835+00:00Job Chains<div style="text-align: left;">I have a requirement to run several concurrent jobs, and then, only when they have all finished, I want to run another job. Rather than create several stand-alone jobs, I can create a chain of sub-jobs.</div><p></p><ul style="text-align: left;"><li>Each step in the chain maps to a program that can invoke either a PL/SQL procedure, a PL/SQL block, a SQL script, or an external program.</li><li>Each step has a rule that includes a condition that determines when it starts. Thus, it can be after one or more other steps have been completed or succeeded.</li><li>A priority can be specified on the program that will determine the order in which programs will be run on the scheduler, all other factors being equal.</li><li>The number of jobs that are permitted to run concurrently can be controlled with a user-defined scheduler resource. The resource is defined as having a number of units. The number of units consumed by a job can be specified in an attribute of a stand-alone job. 
In a job chain, the resource consumption attribute is applied to the program called from the chain step, rather than to the job. Only as many jobs as there are resource units available are executed concurrently.</li></ul><p></p>
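<div>The resource and the constraint are defined with two DBMS_SCHEDULER calls. This sketch, with illustrative names (the full calls appear in the demonstration below), creates a resource of 10 units and declares that a program consumes 3 of them whenever a job runs it:</div><pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>BEGIN
  DBMS_SCHEDULER.create_resource (
    resource_name => 'MY_RESOURCE',   -- illustrative name
    units         => 10);
  DBMS_SCHEDULER.set_resource_constraint (
    object_name   => 'MY_PROGRAM',    -- illustrative name
    resource_name => 'MY_RESOURCE',
    units         => 3);
END;
/</code></span></pre>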
<h3 style="text-align: left;">Job Chain Parameters</h3>
User-defined parameters can be passed into a stand-alone job, but not (as far as I have been able to find out) into the steps of a job chain. Instead, job chain metadata, including the job name and sub-name, can be specified as parameters, and the application parameters can then be looked up for each step in a parameter table. <p>This naturally leads to a data-driven approach to managing chains, starting with a parameter table containing metadata from which to create a job chain. Then, when the chain executes, the programs can look up their parameters from the same table, and update other values on it for logging.</p>
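<div>For example, the following sketch (the program name is illustrative) maps the job sub-name, which in a chain is the step name, onto the first argument of the stored procedure called by a program:</div><pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>BEGIN
  DBMS_SCHEDULER.define_metadata_argument (
    program_name       => 'MY_PROGRAM',   -- illustrative name
    metadata_attribute => 'job_subname',  -- resolves to the chain step name at run time
    argument_position  => 1);
END;
/</code></span></pre>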
<h3 style="text-align: left;">Demonstration</h3><div style="text-align: left;"><ul style="text-align: left;"><li>In this example, all the jobs will execute a procedure in a PL/SQL package.</li><li>10 jobs that will all run for different specified amounts of time. </li><li>I want to run the longest ones first, so they will be given higher priority.</li><li>The jobs will each consume a different number of units of a user-defined resource. Therefore, it will constrain how many jobs can run concurrently.</li><li>A final job will only run when the first 10 jobs have all been completed.</li></ul></div><h4 style="text-align: left;">Parameter Table</h4><p>I will start by creating a parameter table that will be used to create the job chain. It will contain a row for each step in the chain. .
</p><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>create table test_chain
(seq INTEGER
,chain_name VARCHAR2(128)
,step_name VARCHAR2(24)
,program_name VARCHAR2(128)
,program_action VARCHAR2(128)
,resource_units NUMBER
,priority INTEGER DEFAULT 3 NOT NULL CONSTRAINT test_chain_priority_chk CHECK (priority IN(1,2,3,4,5))
,condition VARCHAR2(4000) DEFAULT 'TRUE' NOT NULL
,end_step VARCHAR2(1) DEFAULT 'N' NOT NULL CONSTRAINT test_chain_end_step_chk CHECK (end_step IN('Y','N'))
,seconds number
,begindttm timestamp
,enddttm timestamp
,CONSTRAINT test_chain_uk PRIMARY KEY (chain_name, step_name)
);</code></span></pre>
The parameter table is populated with the chain steps.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>truncate table test_chain;
insert into test_chain
(seq, chain_name, step_name, program_name, program_action, resource_units, condition, priority, seconds)
select 1, 'TEST_CHAIN_1', 'CHAIN_STEP_'||level, 'TEST_PROGRAM_'||level, 'TEST_PROCEDURE'
, level resource_units, 'TRUE' condition, NTILE(5) OVER (order by level desc) priority, 10*level seconds
from dual connect by level <= 10
/
insert into test_chain
(seq, chain_name, step_name, program_name, program_action, seconds)
select 2, 'TEST_CHAIN_1', 'CHAIN_STEP_LAST', 'TEST_PROGRAM_LAST', 'TEST_PROCEDURE', 1
from dual
/
update test_chain c
set condition = (
SELECT LISTAGG(':'||b.step_name||'.state=''SUCCEEDED''',' AND ') WITHIN GROUP (ORDER BY b.step_name)
FROM test_chain b
WHERE b.seq = c.seq-1
and b.chain_name = c.chain_name)
where seq = 2 and chain_name = 'TEST_CHAIN_1'
/
insert into test_chain
(seq, chain_name, step_name, end_step, condition, seconds)
select 3, 'TEST_CHAIN_1', 'CHAIN_STEP_END', 'Y', ':CHAIN_STEP_LAST.state=''SUCCEEDED''', 1
from dual
/
commit;</code></span></pre>
The chain steps are in 3 sequenced groups. <div><ol style="text-align: left;"><li>10 concurrent jobs that run first.</li><li>A job that runs after the first 10 jobs have been completed. The initiation criteria are generated with a <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/LISTAGG.html" target="_blank">LISTAGG()</a> function that lists the 10 steps in sequence 1.</li><li>A step that specifies the end of the chain. It is dependent on the job in sequence 2. There is no program for this step.</li></ol></div><div>
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span><code><span style="font-size: x-small;">column seq format 99
column chain_name format a20
column step_name format a20
column program_name format a25
column program_action format a25 wrapped on
column resource_units heading 'Res.|Units' format 99999
column condition format a40
column units format 999
column seconds format 999
</span><span style="font-size: xx-small;">select c.*
from test_chain c
where chain_name = 'TEST_CHAIN_1'
order by seq, resource_units;

                                                      Res.
SEQ CHAIN_NAME STEP_NAME PROGRAM_NAME PROGRAM_ACTION Units PRIORITY CONDITION E SECONDS
--- ------------- ---------------- ------------------ ------------------ ----- ---------- ---------------------------------------- - -------
1 TEST_CHAIN_1 CHAIN_STEP_1 TEST_PROGRAM_1 TEST_PROCEDURE 1 5 TRUE N 10
1 TEST_CHAIN_1 CHAIN_STEP_2 TEST_PROGRAM_2 TEST_PROCEDURE 2 5 TRUE N 20
1 TEST_CHAIN_1 CHAIN_STEP_3 TEST_PROGRAM_3 TEST_PROCEDURE 3 4 TRUE N 30
1 TEST_CHAIN_1 CHAIN_STEP_4 TEST_PROGRAM_4 TEST_PROCEDURE 4 4 TRUE N 40
1 TEST_CHAIN_1 CHAIN_STEP_5 TEST_PROGRAM_5 TEST_PROCEDURE 5 3 TRUE N 50
1 TEST_CHAIN_1 CHAIN_STEP_6 TEST_PROGRAM_6 TEST_PROCEDURE 6 3 TRUE N 60
1 TEST_CHAIN_1 CHAIN_STEP_7 TEST_PROGRAM_7 TEST_PROCEDURE 7 2 TRUE N 70
1 TEST_CHAIN_1 CHAIN_STEP_8 TEST_PROGRAM_8 TEST_PROCEDURE 8 2 TRUE N 80
1 TEST_CHAIN_1 CHAIN_STEP_9 TEST_PROGRAM_9 TEST_PROCEDURE 9 1 TRUE N 90
1 TEST_CHAIN_1 CHAIN_STEP_10 TEST_PROGRAM_10 TEST_PROCEDURE 10 1 TRUE N 100
2 TEST_CHAIN_1 CHAIN_STEP_LAST TEST_PROGRAM_LAST TEST_PROCEDURE 3 :CHAIN_STEP_1.state='SUCCEEDED' AND :CHA N 1
IN_STEP_10.state='SUCCEEDED' AND :CHAIN_
STEP_2.state='SUCCEEDED' AND :CHAIN_STEP
_3.state='SUCCEEDED' AND :CHAIN_STEP_4.s
tate='SUCCEEDED' AND :CHAIN_STEP_5.state
='SUCCEEDED' AND :CHAIN_STEP_6.state='SU
CCEEDED' AND :CHAIN_STEP_7.state='SUCCEE
DED' AND :CHAIN_STEP_8.state='SUCCEEDED'
AND :CHAIN_STEP_9.state='SUCCEEDED'
3 TEST_CHAIN_1 CHAIN_STEP_END 3 :CHAIN_STEP_LAST.state='SUCCEEDED' Y 1 </span></code></span></pre>It is only necessary to have multiple programs if you need to execute different procedures, use different priorities, or use different amounts of a resource. In this example, each step has a different program even though they all execute the same procedure because I want to demonstrate the effect of different amounts of resource consumption and different priorities.</div><h4 style="text-align: left;">Test Procedure </h4><div>This procedure will be called by the chain steps. The chain step name will be passed to the test procedure as a parameter. The first update statement both updates BEGINDTTM on the parameter table and fetches the number of seconds for which the procedure is to sleep.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>create or replace procedure test_procedure
(p_step_name VARCHAR2) as
k_module CONSTANT v$session.module%TYPE := $$PLSQL_UNIT;
l_module v$session.module%TYPE;
l_action v$session.action%TYPE;
l_seconds test_chain.seconds%TYPE;
BEGIN
dbms_application_info.read_module(l_module, l_action);
dbms_application_info.set_module(k_module, p_step_name);
UPDATE test_chain
SET begindttm = SYSTIMESTAMP
WHERE step_name = p_step_name
RETURNING seconds INTO l_seconds;
COMMIT;
dbms_output.put_line(k_module||'.'||p_step_name||':'||l_seconds);
dbms_lock.sleep(l_seconds);
UPDATE test_chain
SET enddttm = SYSTIMESTAMP
WHERE step_name = p_step_name;
COMMIT;
dbms_application_info.set_module(l_module, l_action);
EXCEPTION
WHEN OTHERS THEN
dbms_application_info.set_module(l_module, l_action);
RAISE;
END;
/</code></span></pre><h4 style="text-align: left;">
Creating the Chain </h4></div><div>Then the parameter table is used to create the chain, programs, chain rules, and job that will be executed.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span><code><span style="font-size: x-small;">DECLARE
e_scheduler_chain_does_not_exist EXCEPTION;
PRAGMA exception_init(e_scheduler_chain_does_not_exist,-23308);
e_scheduler_job_does_not_exist EXCEPTION;
PRAGMA exception_init(e_scheduler_job_does_not_exist,-27475);
e_scheduler_object_does_not_exist EXCEPTION;
PRAGMA exception_init(e_scheduler_object_does_not_exist,-27476);
e_scheduler_object_already_exists EXCEPTION;
PRAGMA exception_init(e_scheduler_object_already_exists,-27477);
l_job_suffix CONSTANT VARCHAR2(10) := '_JOB';
l_resource_suffix CONSTANT VARCHAR2(10) := '_RESOURCE';
BEGIN
FOR i IN (SELECT DISTINCT chain_name FROM test_chain) LOOP
BEGIN --drop resource if already present
<a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_SCHEDULER.html#GUID-C17588F0-887D-41F2-90ED-BE399A44DD1D" target="_blank">DBMS_SCHEDULER.drop_resource</a> (resource_name => i.chain_name||l_resource_suffix);
EXCEPTION WHEN e_scheduler_object_does_not_exist THEN NULL;
END;
<a href="http://DBMS_SCHEDULER.create_resource" target="_blank">DBMS_SCHEDULER.create_resource</a> ( --recreate resource
resource_name => i.chain_name||l_resource_suffix,
units => 10,
status => 'ENFORCE_CONSTRAINTS', -- Default
constraint_level => 'JOB_LEVEL'); -- Default
BEGIN --drop scheduler job if already present
<a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_SCHEDULER.html#GUID-25291853-146D-4F10-B181-856B40FA684A" target="_blank">DBMS_SCHEDULER.drop_job</a>(job_name => i.chain_name||l_job_suffix);
EXCEPTION WHEN e_scheduler_job_does_not_exist THEN NULL;
END;
BEGIN --drop chain if already present
<a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_SCHEDULER.html#GUID-4B98C092-28F8-4331-BDF9-6F0A84F9B351" target="_blank">DBMS_SCHEDULER.drop_chain</a> (chain_name => i.chain_name, force=>TRUE);
EXCEPTION WHEN e_scheduler_chain_does_not_exist THEN NULL;
END;
<a href="http://DBMS_SCHEDULER.create_chain" target="_blank">DBMS_SCHEDULER.create_chain</a> ( --recreate chain
chain_name => i.chain_name,
rule_set_name => NULL,
evaluation_interval => NULL);
END LOOP;
FOR i IN (
select c.* from test_chain c
ORDER BY seq, priority, resource_units desc
) LOOP
dbms_output.put_line(i.chain_name||', Step:'||i.step_name||', Condition:'||i.condition);
IF i.program_name IS NOT NULL THEN
BEGIN
<a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_SCHEDULER.html#GUID-F41A5779-1915-4D5D-A7F5-87727320B742" target="_blank">DBMS_SCHEDULER.create_program</a> ( --create program to call stored procedure
program_name => i.program_name,
program_type => 'STORED_PROCEDURE',
program_action => i.program_action,
number_of_arguments => 1,
enabled => FALSE,
comments => 'Program for chain:'||i.chain_name||', step:'||i.step_name);
<a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_SCHEDULER.html#GUID-6EEDB8ED-5C48-4459-B66E-F760AA38C365" target="_blank">DBMS_SCHEDULER.DEFINE_METADATA_ARGUMENT</a>( --pass job_subname as first parameter
program_name => i.program_name,
metadata_attribute => 'job_subname',
argument_position => 1);
<a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_SCHEDULER.html#GUID-D7A11F8A-8746-4815-91C4-BC8DDBA4C74A" target="_blank">DBMS_SCHEDULER.set_attribute</a> ( --apply priority to program
name => i.program_name,
attribute => 'job_priority',
value => i.priority);
<a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_SCHEDULER.html#GUID-2D8930DD-1042-4FA9-A0C0-2E4C7A7BFE9B" target="_blank">DBMS_SCHEDULER.set_resource_constraint</a> ( --apply resource consumption constraint to program
object_name => i.program_name, --cannot go on step
resource_name => i.chain_name||l_resource_suffix,
units => i.resource_units);
dbms_scheduler.enable(i.program_name);
dbms_output.put_line(i.chain_name||', Step:'||i.step_name||', Program:'||i.program_name);
EXCEPTION WHEN e_scheduler_object_already_exists THEN NULL;
END;
<a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_SCHEDULER.html#GUID-0D52F163-7C2B-480D-88C8-84FBA8143D88" target="_blank">DBMS_SCHEDULER.define_chain_step</a> ( --create chain step to call program
chain_name => i.chain_name,
step_name => i.step_name,
program_name => i.program_name);
END IF;
IF i.end_step = 'Y' THEN --if last step in chain
<a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_SCHEDULER.html#GUID-BF7D99FE-C33F-444E-8725-BBC24DD33027" target="_blank">DBMS_SCHEDULER.define_chain_rule</a> ( -- create job chain end step
chain_name => i.chain_name,
condition => i.condition,
action => 'END',
rule_name => i.step_name,
comments => 'End of chain '||i.chain_name);
<a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_SCHEDULER.html#GUID-33CD9F19-8448-4BA8-AAB3-3B82A670085D" target="_blank">DBMS_SCHEDULER.enable</a> (i.chain_name); --enable the chain
<a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_SCHEDULER.html#GUID-7E744D62-13F6-40E9-91F0-1569E6C38BBC" target="_blank">dbms_scheduler.create_job</a> ( --create a job to execute the chain once
job_name=> i.chain_name||l_job_suffix,
job_type=> 'CHAIN',
job_action=> i.chain_name,
start_date=> sysdate,
enabled=> FALSE);
ELSE --otherwise create an ordinary job rule for each step
DBMS_SCHEDULER.define_chain_rule (
chain_name => i.chain_name,
condition => i.condition,
action => 'START "'||i.step_name||'"',
rule_name => i.step_name,
comments => 'Sequence '||i.seq);
END IF;
END LOOP;
END;
/
</span><span style="font-size: 70%;">TEST_CHAIN_1, Step:CHAIN_STEP_10, Condition:TRUE
TEST_CHAIN_1, Step:CHAIN_STEP_10, Program:TEST_PROGRAM_10
TEST_CHAIN_1, Step:CHAIN_STEP_9, Condition:TRUE
TEST_CHAIN_1, Step:CHAIN_STEP_9, Program:TEST_PROGRAM_9
TEST_CHAIN_1, Step:CHAIN_STEP_8, Condition:TRUE
TEST_CHAIN_1, Step:CHAIN_STEP_8, Program:TEST_PROGRAM_8
TEST_CHAIN_1, Step:CHAIN_STEP_7, Condition:TRUE
TEST_CHAIN_1, Step:CHAIN_STEP_7, Program:TEST_PROGRAM_7
TEST_CHAIN_1, Step:CHAIN_STEP_6, Condition:TRUE
TEST_CHAIN_1, Step:CHAIN_STEP_6, Program:TEST_PROGRAM_6
TEST_CHAIN_1, Step:CHAIN_STEP_5, Condition:TRUE
TEST_CHAIN_1, Step:CHAIN_STEP_5, Program:TEST_PROGRAM_5
TEST_CHAIN_1, Step:CHAIN_STEP_4, Condition:TRUE
TEST_CHAIN_1, Step:CHAIN_STEP_4, Program:TEST_PROGRAM_4
TEST_CHAIN_1, Step:CHAIN_STEP_3, Condition:TRUE
TEST_CHAIN_1, Step:CHAIN_STEP_3, Program:TEST_PROGRAM_3
TEST_CHAIN_1, Step:CHAIN_STEP_2, Condition:TRUE
TEST_CHAIN_1, Step:CHAIN_STEP_2, Program:TEST_PROGRAM_2
TEST_CHAIN_1, Step:CHAIN_STEP_1, Condition:TRUE
TEST_CHAIN_1, Step:CHAIN_STEP_1, Program:TEST_PROGRAM_1
TEST_CHAIN_1, Step:CHAIN_STEP_LAST, Condition::CHAIN_STEP_1.state='SUCCEEDED' AND :CHAIN_STEP_10.state='SUCCEEDED' AND :CHAIN_STEP_2.state='SUCCEEDED'
AND :CHAIN_STEP_3.state='SUCCEEDED' AND :CHAIN_STEP_4.state='SUCCEEDED' AND :CHAIN_STEP_5.state='SUCCEEDED' AND :CHAIN_STEP_6.state='SUCCEEDED'
AND :CHAIN_STEP_7.state='SUCCEEDED' AND :CHAIN_STEP_8.state='SUCCEEDED' AND :CHAIN_STEP_9.state='SUCCEEDED'
TEST_CHAIN_1, Step:CHAIN_STEP_LAST, Program:TEST_PROGRAM_LAST
TEST_CHAIN_1, Step:CHAIN_STEP_END, Condition::CHAIN_STEP_LAST.state='SUCCEEDED'</span><span style="font-size: x-small;">
PL/SQL procedure successfully completed.
</span></code></span></pre><h4 style="text-align: left;">
Exploring the Chain
</h4><div>Various views are available to see how the chain is defined.</div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>select * from <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/refrn/ALL_SCHEDULER_RESOURCES.html#GUID-B0928B55-6A55-473C-AEB5-B8977E5D77DF" target="_blank">all_scheduler_resources</a> WHERE resource_name like 'TEST_CHAIN%'
Jobs
Resource Run
OWNER RESOURCE_NAME STATUS Units UNITS_USED Count COMMENTS
---------- -------------------------------- -------------------- -------- ---------- ----- --------------------
SYSADM TEST_CHAIN_1_RESOURCE ENFORCE_CONSTRAINTS 10 0 0
</code></span></pre>
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span><code><span style="font-size: x-small;">SELECT owner,chain_name,rule_set_owner,rule_set_name,number_of_rules,number_of_steps,enabled,comments
FROM <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/refrn/ALL_SCHEDULER_CHAINS.html#GUID-284E02A0-2AFC-449B-8406-4DE996282EAE" target="_blank">all_scheduler_chains</a>
WHERE chain_name like 'TEST_CHAIN%';
</span><span style="font-size: xx-small;"> Rule Set
OWNER CHAIN_NAME Owner RULE_SET_NAME NUMBER_OF_RULES NUMBER_OF_STEPS ENABLED COMMENTS
---------- -------------------- ---------- --------------- --------------- --------------- ------- ----------------------------------------
SYSADM TEST_CHAIN_1 SYSADM SCHED_RULESET$7 12 11 TRUE</span><span style="font-size: x-small;">
</span></code></span></pre>
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span><code><span style="font-size: xx-small;">SELECT owner, program_name, program_type, program_action, number_of_arguments, enabled, priority, weight, has_Constraints, comments
FROM <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/refrn/ALL_SCHEDULER_PROGRAMS.html#GUID-9D8EFDE6-EC90-4822-86EE-6C074A6EFC6C" target="_blank">all_SCHEDULER_PROGRAMS</a>
WHERE PROGRAM_NAME LIKE 'TEST_PROGRAM%';
</span><span style="font-size: 70%;"> Num Has
OWNER PROGRAM_NAME PROGRAM_TYPE PROGRAM_ACTION Args ENABLED Prio Wgt Const. COMMENTS
---------- -------------------- ---------------- --------------- ---- ------- ---- --- ------ ------------------------------------------------------------
SYSADM TEST_PROGRAM_10 STORED_PROCEDURE TEST_PROCEDURE 1 TRUE 1 1 TRUE Program for chain:TEST_CHAIN_1, step:CHAIN_STEP_10
SYSADM TEST_PROGRAM_9 STORED_PROCEDURE TEST_PROCEDURE 1 TRUE 1 1 TRUE Program for chain:TEST_CHAIN_1, step:CHAIN_STEP_9
SYSADM TEST_PROGRAM_8 STORED_PROCEDURE TEST_PROCEDURE 1 TRUE 2 1 TRUE Program for chain:TEST_CHAIN_1, step:CHAIN_STEP_8
SYSADM TEST_PROGRAM_7 STORED_PROCEDURE TEST_PROCEDURE 1 TRUE 2 1 TRUE Program for chain:TEST_CHAIN_1, step:CHAIN_STEP_7
SYSADM TEST_PROGRAM_6 STORED_PROCEDURE TEST_PROCEDURE 1 TRUE 3 1 TRUE Program for chain:TEST_CHAIN_1, step:CHAIN_STEP_6
SYSADM TEST_PROGRAM_5 STORED_PROCEDURE TEST_PROCEDURE 1 TRUE 3 1 TRUE Program for chain:TEST_CHAIN_1, step:CHAIN_STEP_5
SYSADM TEST_PROGRAM_4 STORED_PROCEDURE TEST_PROCEDURE 1 TRUE 4 1 TRUE Program for chain:TEST_CHAIN_1, step:CHAIN_STEP_4
SYSADM TEST_PROGRAM_3 STORED_PROCEDURE TEST_PROCEDURE 1 TRUE 4 1 TRUE Program for chain:TEST_CHAIN_1, step:CHAIN_STEP_3
SYSADM TEST_PROGRAM_2 STORED_PROCEDURE TEST_PROCEDURE 1 TRUE 5 1 TRUE Program for chain:TEST_CHAIN_1, step:CHAIN_STEP_2
SYSADM TEST_PROGRAM_1 STORED_PROCEDURE TEST_PROCEDURE 1 TRUE 5 1 TRUE Program for chain:TEST_CHAIN_1, step:CHAIN_STEP_1
SYSADM TEST_PROGRAM_LAST STORED_PROCEDURE TEST_PROCEDURE 1 TRUE 3 1 FALSE Program for chain:TEST_CHAIN_1, step:CHAIN_STEP_LAST</span><span style="font-size: x-small;">
</span></code></span></pre>
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>SELECT owner, chain_name, step_name, program_owner, program_name, step_type
FROM <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/refrn/ALL_SCHEDULER_CHAIN_STEPS.html#GUID-F8CF01C8-C1AE-45A6-95D9-25F0A5E9C583" target="_blank">all_scheduler_chain_steps</a>
WHERE chain_name like 'TEST_CHAIN%'
ORDER BY owner, chain_name, step_name;
Program
OWNER CHAIN_NAME STEP_NAME Owner PROGRAM_NAME STEP_TYPE
---------- -------------------- ------------------------- ---------- -------------------- ----------
SYSADM TEST_CHAIN_1 CHAIN_STEP_1 SYSADM TEST_PROGRAM_1 PROGRAM
SYSADM TEST_CHAIN_1 CHAIN_STEP_10 SYSADM TEST_PROGRAM_10 PROGRAM
SYSADM TEST_CHAIN_1 CHAIN_STEP_2 SYSADM TEST_PROGRAM_2 PROGRAM
SYSADM TEST_CHAIN_1 CHAIN_STEP_3 SYSADM TEST_PROGRAM_3 PROGRAM
SYSADM TEST_CHAIN_1 CHAIN_STEP_4 SYSADM TEST_PROGRAM_4 PROGRAM
SYSADM TEST_CHAIN_1 CHAIN_STEP_5 SYSADM TEST_PROGRAM_5 PROGRAM
SYSADM TEST_CHAIN_1 CHAIN_STEP_6 SYSADM TEST_PROGRAM_6 PROGRAM
SYSADM TEST_CHAIN_1 CHAIN_STEP_7 SYSADM TEST_PROGRAM_7 PROGRAM
SYSADM TEST_CHAIN_1 CHAIN_STEP_8 SYSADM TEST_PROGRAM_8 PROGRAM
SYSADM TEST_CHAIN_1 CHAIN_STEP_9 SYSADM TEST_PROGRAM_9 PROGRAM
SYSADM TEST_CHAIN_1 CHAIN_STEP_LAST SYSADM TEST_PROGRAM_LAST PROGRAM
</code></span></pre>
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span><code><span style="font-size: x-small;">SELECT owner,chain_name,rule_owner,rule_name,condition,action,comments
FROM <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/refrn/ALL_SCHEDULER_CHAIN_RULES.html#GUID-5024EF0D-02F6-4352-BA5F-D32D92A3C6CC" target="_blank">all_scheduler_chain_rules</a>
WHERE chain_name like 'TEST_CHAIN%'
ORDER BY owner, chain_name, rule_owner, rule_name;
</span><span style="font-size: 70%;"><span>
</span><span> Rule
OWNER CHAIN_NAME Owner RULE_NAME CONDITION ACTION COMMENTS
---------- ------------- ---------- ------------------ -------------------------------------------------- ------------------------------ ------------------------------
SYSADM TEST_CHAIN_1 SYSADM CHAIN_STEP_1 TRUE START "CHAIN_STEP_1" Sequence 1
SYSADM TEST_CHAIN_1 SYSADM CHAIN_STEP_10 TRUE START "CHAIN_STEP_10" Sequence 1
SYSADM TEST_CHAIN_1 SYSADM CHAIN_STEP_2 TRUE START "CHAIN_STEP_2" Sequence 1
SYSADM TEST_CHAIN_1 SYSADM CHAIN_STEP_3 TRUE START "CHAIN_STEP_3" Sequence 1
SYSADM TEST_CHAIN_1 SYSADM CHAIN_STEP_4 TRUE START "CHAIN_STEP_4" Sequence 1
SYSADM TEST_CHAIN_1 SYSADM CHAIN_STEP_5 TRUE START "CHAIN_STEP_5" Sequence 1
SYSADM TEST_CHAIN_1 SYSADM CHAIN_STEP_6 TRUE START "CHAIN_STEP_6" Sequence 1
SYSADM TEST_CHAIN_1 SYSADM CHAIN_STEP_7 TRUE START "CHAIN_STEP_7" Sequence 1
SYSADM TEST_CHAIN_1 SYSADM CHAIN_STEP_8 TRUE START "CHAIN_STEP_8" Sequence 1
SYSADM TEST_CHAIN_1 SYSADM CHAIN_STEP_9 TRUE START "CHAIN_STEP_9" Sequence 1
SYSADM TEST_CHAIN_1 SYSADM CHAIN_STEP_END :CHAIN_STEP_LAST.state='SUCCEEDED' END End of chain TEST_CHAIN_1
SYSADM TEST_CHAIN_1 SYSADM CHAIN_STEP_LAST :CHAIN_STEP_1.state='SUCCEEDED' AND :CHAIN_STEP_10 START "CHAIN_STEP_LAST" Sequence 2
.state='SUCCEEDED' AND :CHAIN_STEP_2.state='SUCCEE
DED' AND :CHAIN_STEP_3.state='SUCCEEDED' AND :CHAI
N_STEP_4.state='SUCCEEDED' AND :CHAIN_STEP_5.state
='SUCCEEDED' AND :CHAIN_STEP_6.state='SUCCEEDED' A
ND :CHAIN_STEP_7.state='SUCCEEDED' AND :CHAIN_STEP
_8.state='SUCCEEDED' AND :CHAIN_STEP_9.state='SUCC
EEDED'</span></span></code></span></pre>
<h4 style="text-align: left;">Executing the Chain </h4></div>
<div>To execute the chain, simply enable the job. The job created by this PL/SQL will execute the chain only once because, by default, the job is automatically dropped after it completes.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>exec DBMS_SCHEDULER.enable ('test_chain_1_job');</code></span></pre>
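<div>By default, a job's auto_drop attribute is TRUE, so this job disappears once the chain has run. To be able to rerun the chain on demand, the job could instead be created with auto_drop set to FALSE. This is only a sketch of the idea, based on the job and chain names used above; it was not part of this test:</div><pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>BEGIN
 DBMS_SCHEDULER.create_job (
 job_name=> 'test_chain_1_job',
 job_type=> 'CHAIN',
 job_action=> 'TEST_CHAIN_1',
 start_date=> sysdate,
 enabled=> FALSE,
 auto_drop=> FALSE); --retain the job definition after the chain completes
END;
/
exec DBMS_SCHEDULER.run_job('test_chain_1_job'); --run the chain again on demand</code></span></pre>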
<h4 style="text-align: left;">Monitoring the Chain </h4></div>
<div>Oracle also provides views to monitor running jobs and chains. <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/refrn/ALL_SCHEDULER_RUNNING_CHAINS.html#GUID-9009F4D1-FD87-4B80-8A7D-55E6DC17F965" target="_blank">ALL_SCHEDULER_RUNNING_CHAINS</a> reports the current status of each step in the chain.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>SELECT owner,job_name,chain_owner,chain_name,step_name,state
FROM all_scheduler_running_chains ORDER BY owner, job_name, chain_name, step_name;
Chain
OWNER JOB_NAME Owner CHAIN_NAME STEP_NAME STATE
---------- ------------------------ ---------- -------------------- ------------------------- ---------------
SYSADM TEST_CHAIN_1_JOB SYSADM TEST_CHAIN_1 CHAIN_STEP_1 SUCCEEDED
SYSADM TEST_CHAIN_1_JOB SYSADM TEST_CHAIN_1 CHAIN_STEP_10 RUNNING
SYSADM TEST_CHAIN_1_JOB SYSADM TEST_CHAIN_1 CHAIN_STEP_2 SUCCEEDED
SYSADM TEST_CHAIN_1_JOB SYSADM TEST_CHAIN_1 CHAIN_STEP_3 SUCCEEDED
SYSADM TEST_CHAIN_1_JOB SYSADM TEST_CHAIN_1 CHAIN_STEP_4 SUCCEEDED
SYSADM TEST_CHAIN_1_JOB SYSADM TEST_CHAIN_1 CHAIN_STEP_5 RUNNING
SYSADM TEST_CHAIN_1_JOB SYSADM TEST_CHAIN_1 CHAIN_STEP_6 SUCCEEDED
SYSADM TEST_CHAIN_1_JOB SYSADM TEST_CHAIN_1 CHAIN_STEP_7 RUNNING
SYSADM TEST_CHAIN_1_JOB SYSADM TEST_CHAIN_1 CHAIN_STEP_8 RUNNING
SYSADM TEST_CHAIN_1_JOB SYSADM TEST_CHAIN_1 CHAIN_STEP_9 SUCCEEDED
SYSADM TEST_CHAIN_1_JOB SYSADM TEST_CHAIN_1 CHAIN_STEP_LAST NOT_STARTED</code></span></pre>
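<div>If a step fails, the chain can stall because no rule ever evaluates to true, so nothing further starts. The state of a step in a running chain can be changed manually with DBMS_SCHEDULER.ALTER_RUNNING_CHAIN. This sketch marks a hypothetical stalled step as succeeded so that the rules that depend upon it can fire:</div><pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>exec DBMS_SCHEDULER.alter_running_chain('TEST_CHAIN_1_JOB', 'CHAIN_STEP_5.STATE', 'SUCCEEDED');</code></span></pre>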
You can also see each completed job and sub-job in <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/refrn/ALL_SCHEDULER_JOB_RUN_DETAILS.html#GUID-E87CA539-38E8-41A4-B10B-784308A56F02" target="_blank">ALL_SCHEDULER_JOB_RUN_DETAILS</a>.<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span><code><span style="font-size: x-small;">select log_id, job_name, job_subname, req_start_date, actual_start_date, log_date, run_duration, output
from all_scheduler_job_run_details
where job_name like 'TEST_CHAIN%'
AND log_date > sysdate-…
order by actual_start_date;
</span><span style="font-size: 60%;"> LOG_ID JOB_NAME JOB_SUBNAME REQ_START_DATE ACTUAL_START_DATE LOG_DATE RUN_DURATION OUTPUT
---------- ------------------------ ------------------------ ------------------------------ ------------------------------ ------------------------------ --------------- ----------------------------------------
7942016 TEST_CHAIN_1_JOB 26/12/2023 15:54:15.720 +00:00 26/12/2023 15:54:18.970 +00:00 26/12/2023 11:03:05.901 -05:00 +000 00:08:47
7941800 TEST_CHAIN_1_JOB CHAIN_STEP_9 26/12/2023 15:54:19.279 +00:00 26/12/2023 15:54:19.494 +00:00 26/12/2023 10:55:49.599 -05:00 +000 00:01:30 TEST_PROCEDURE.CHAIN_STEP_9:90
7941892 TEST_CHAIN_1_JOB CHAIN_STEP_4 26/12/2023 15:54:19.633 +00:00 26/12/2023 15:56:00.613 +00:00 26/12/2023 10:56:40.671 -05:00 +000 00:00:40 TEST_PROCEDURE.CHAIN_STEP_4:40
7941894 TEST_CHAIN_1_JOB CHAIN_STEP_2 26/12/2023 15:54:19.651 +00:00 26/12/2023 15:56:00.615 +00:00 26/12/2023 10:56:20.832 -05:00 +000 00:00:20 TEST_PROCEDURE.CHAIN_STEP_2:20
7941906 TEST_CHAIN_1_JOB CHAIN_STEP_3 26/12/2023 15:54:19.639 +00:00 26/12/2023 15:56:05.946 +00:00 26/12/2023 10:56:36.206 -05:00 +000 00:00:30 TEST_PROCEDURE.CHAIN_STEP_3:30
7941952 TEST_CHAIN_1_JOB CHAIN_STEP_6 26/12/2023 15:54:19.620 +00:00 26/12/2023 15:56:36.438 +00:00 26/12/2023 10:57:36.608 -05:00 +000 00:01:00 TEST_PROCEDURE.CHAIN_STEP_6:60
7941940 TEST_CHAIN_1_JOB CHAIN_STEP_10 26/12/2023 15:54:19.261 +00:00 26/12/2023 15:57:37.626 +00:00 26/12/2023 10:59:17.691 -05:00 +000 00:01:40 TEST_PROCEDURE.CHAIN_STEP_10:100
7942000 TEST_CHAIN_1_JOB CHAIN_STEP_8 26/12/2023 15:54:19.388 +00:00 26/12/2023 15:59:18.696 +00:00 26/12/2023 11:00:38.840 -05:00 +000 00:01:20 TEST_PROCEDURE.CHAIN_STEP_8:80
7942032 TEST_CHAIN_1_JOB CHAIN_STEP_5 26/12/2023 15:54:19.628 +00:00 26/12/2023 16:00:48.614 +00:00 26/12/2023 11:01:38.783 -05:00 +000 00:00:50 TEST_PROCEDURE.CHAIN_STEP_5:50
7942036 TEST_CHAIN_1_JOB CHAIN_STEP_7 26/12/2023 15:54:19.500 +00:00 26/12/2023 16:01:44.374 +00:00 26/12/2023 11:02:54.432 -05:00 +000 00:01:10 TEST_PROCEDURE.CHAIN_STEP_7:70
7942014 TEST_CHAIN_1_JOB CHAIN_STEP_LAST 26/12/2023 16:02:54.564 +00:00 26/12/2023 16:02:59.714 +00:00 26/12/2023 11:03:00.729 -05:00 +000 00:00:01 TEST_PROCEDURE.CHAIN_STEP_LAST:1</span></code></span></pre>
The start and end times of each step are also recorded in the parameter table by TEST_PROCEDURE.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span><code><span style="font-size: x-small;">select * from test_chain
where chain_name = 'TEST_CHAIN_1'
order by seq, resource_units;
</span><span style="font-size: 50%;">
SEQ CHAIN_NAME STEP_NAME PROGRAM_NAME PROGRAM_ACTION Units Prio CONDITION END Secs. BEGINDTTM ENDDTTM
--- --------------- -------------------- ----------------- -------------- ----- ---- -------------------------------------------------- --- ----- ------------------------ ------------------------
1 TEST_CHAIN_1 CHAIN_STEP_1 TEST_PROGRAM_1 TEST_PROCEDURE 1 5 TRUE N 10 26/12/2023 10:54:19.737 26/12/2023 10:54:29.791
1 TEST_CHAIN_1 CHAIN_STEP_2 TEST_PROGRAM_2 TEST_PROCEDURE 2 5 TRUE N 20 26/12/2023 10:56:00.636 26/12/2023 10:56:20.828
1 TEST_CHAIN_1 CHAIN_STEP_3 TEST_PROGRAM_3 TEST_PROCEDURE 3 4 TRUE N 30 26/12/2023 10:56:05.960 26/12/2023 10:56:36.188
1 TEST_CHAIN_1 CHAIN_STEP_4 TEST_PROGRAM_4 TEST_PROCEDURE 4 4 TRUE N 40 26/12/2023 10:56:00.626 26/12/2023 10:56:40.667
1 TEST_CHAIN_1 CHAIN_STEP_5 TEST_PROGRAM_5 TEST_PROCEDURE 5 3 TRUE N 50 26/12/2023 11:00:48.621 26/12/2023 11:01:38.779
1 TEST_CHAIN_1 CHAIN_STEP_6 TEST_PROGRAM_6 TEST_PROCEDURE 6 3 TRUE N 60 26/12/2023 10:56:36.443 26/12/2023 10:57:36.604
1 TEST_CHAIN_1 CHAIN_STEP_7 TEST_PROGRAM_7 TEST_PROCEDURE 7 2 TRUE N 70 26/12/2023 11:01:44.378 26/12/2023 11:02:54.428
1 TEST_CHAIN_1 CHAIN_STEP_8 TEST_PROGRAM_8 TEST_PROCEDURE 8 2 TRUE N 80 26/12/2023 10:59:18.702 26/12/2023 11:00:38.837
1 TEST_CHAIN_1 CHAIN_STEP_9 TEST_PROGRAM_9 TEST_PROCEDURE 9 1 TRUE N 90 26/12/2023 10:54:19.546 26/12/2023 10:55:49.596
1 TEST_CHAIN_1 CHAIN_STEP_10 TEST_PROGRAM_10 TEST_PROCEDURE 10 1 TRUE N 100 26/12/2023 10:57:37.640 26/12/2023 10:59:17.687
2 TEST_CHAIN_1 CHAIN_STEP_LAST TEST_PROGRAM_LAST TEST_PROCEDURE 3 :CHAIN_STEP_1.state='SUCCEEDED' AND :CHAIN_STEP_10 N 1 26/12/2023 11:02:59.722 26/12/2023 11:03:00.725
.state='SUCCEEDED' AND :CHAIN_STEP_2.state='SUCCEE
DED' AND :CHAIN_STEP_3.state='SUCCEEDED' AND :CHAI
N_STEP_4.state='SUCCEEDED' AND :CHAIN_STEP_5.state
='SUCCEEDED' AND :CHAIN_STEP_6.state='SUCCEEDED' A
ND :CHAIN_STEP_7.state='SUCCEEDED' AND :CHAIN_STEP
_8.state='SUCCEEDED' AND :CHAIN_STEP_9.state='SUCC
EEDED'
3 TEST_CHAIN_1 CHAIN_STEP_END 3 :CHAIN_STEP_LAST.state='SUCCEEDED' Y 1</span><span style="font-size: x-small;">
</span></code></span></pre>
<h3>Acknowledgments </h3><div>All of this can be worked out from the <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/admin/scheduling-jobs-with-oracle-scheduler.html#GUID-BF3AB6EB-BC19-4303-9E02-6466804BA119" target="_blank">Oracle documentation</a>, but I have found these pages very helpful:<li>Tim Hall's <a href="http://Oracle-Base.com" target="_blank">Oracle-Base.com</a></li></div><blockquote style="border: none; margin: 0px 0px 0px 40px; padding: 0px;"><div>
<li style="text-align: left;"><a href="https://oracle-base.com/articles/10g/scheduler-enhancements-10gr2#job_chains" target="_blank">Scheduler Enhancements in Oracle 10g Database Release 2: Job Chains</a></li>
</div></blockquote><blockquote style="border: none; margin: 0px 0px 0px 40px; padding: 0px;"><div><li style="text-align: left;"><a href="https://oracle-base.com/articles/12c/scheduler-enhancements-12cr2#scheduler-resource-queues" target="_blank">Scheduler (DBMS_SCHEDULER) Enhancements in Oracle Database 12c Release 2 (12.2): Scheduler Resource Queues</a></li></div></blockquote>
<div><li><a href="https://support.oracle.com/epmos/faces/DocContentDisplay?id=1272728.1" target="_blank">Oracle Support Note 1272728.1: Is It Possible To Assign Chain Parameters To A Scheduler Job?</a></li>
<li><a href="https://asktom.oracle.com/" target="_blank">AskTOM</a></li></div><blockquote style="border: none; margin: 0px 0px 0px 40px; padding: 0px;"><div>
<li style="text-align: left;">Connor McDonald: <a href="https://asktom.oracle.com/ords/asktom.search?tag=how-to-start-a-job-on-two-conditions-with-dbms-scheduler-another-job-has-finished-and-we-are-on-monday" target="_blank">How to start a job on two conditions with DBMS_SCHEDULER : another job has finished and we are on monday?</a></li></div></blockquote><div><p></p></div></div><div class="blogger-post-footer"><a href="http://www.go-faster.co.uk/">©David Kurtz</a></div>David Kurtzhttp://www.blogger.com/profile/08139761793598085235noreply@blogger.com0tag:blogger.com,1999:blog-14654018.post-13976781711046182832024-01-02T14:23:00.002+00:002024-01-02T14:23:44.285+00:00Controlling the Number of Database Scheduler (DBMS_SCHEDULER) Jobs That Can Execute Concurrently<div style="text-align: left;">The maximum number of database scheduler jobs that can run concurrently on each Oracle instance is primarily controlled by the parameter JOB_QUEUE_PROCESSES. The default value is the lesser of 20*CPU_COUNT or SESSIONS/4. I think 20 jobs per CPU is usually far too high because it gives the scheduler the potential to swamp the CPU. Therefore, I usually reduce this parameter, often setting it to the same value as CPU_COUNT, so if you have 10 vCPUs per instance, you can run 10 concurrent jobs on each instance. </div><div style="text-align: left;">However, this is a database-wide parameter. </div><div style="text-align: left;"><ul style="text-align: left;"><li>What if you want to restrict different jobs to different numbers of concurrent executions? </li><li>Or what if you have a more complex rule where different jobs have different weights? </li></ul></div><div style="text-align: left;">You can create a named resource with <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_SCHEDULER.html#GUID-3CC944B1-A072-4970-8B27-F68AB7E2D6D9" target="_blank">DBMS_SCHEDULER.CREATE_RESOURCE</a> and give it a certain number of units. 
Then you can specify how many units of a resource a particular job consumes with <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_SCHEDULER.html#GUID-2D8930DD-1042-4FA9-A0C0-2E4C7A7BFE9B" target="_blank">DBMS_SCHEDULER.SET_RESOURCE_CONSTRAINT</a>. This must be done while the job is still disabled; the job can be enabled afterwards. </div><div><div><h4 style="text-align: left;">Test 1: Separate Resources For Each Job</h4><div>In this test: </div><div><ul style="text-align: left;"><li>Each TEST_An job runs for 30 seconds and consumes 2 units of resource A, which has 10 units, so five jobs can run concurrently. </li><li>Each TEST_Bn job runs for 30 seconds and consumes 1 unit of resource B, which has 3 units, so three jobs can run concurrently. </li><li>The constraints on the two types of jobs are independent.</li></ul></div><div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>BEGIN
<a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_SCHEDULER.html#GUID-3CC944B1-A072-4970-8B27-F68AB7E2D6D9" target="_blank">DBMS_SCHEDULER.create_resource</a> (
resource_name => 'TEST_RESOURCE_A',
units => 10,
status => 'ENFORCE_CONSTRAINTS',
constraint_level => 'JOB_LEVEL');
DBMS_SCHEDULER.create_resource (
resource_name => 'TEST_RESOURCE_B',
units => 3,
status => 'ENFORCE_CONSTRAINTS',
constraint_level => 'JOB_LEVEL');
END;
/
BEGIN
FOR i IN 1..10 LOOP
<a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_SCHEDULER.html#GUID-7E744D62-13F6-40E9-91F0-1569E6C38BBC" target="_blank">dbms_scheduler.create_job</a> (
job_name=> 'TEST_A'||i,
job_type=> 'PLSQL_BLOCK',
job_action=> 'BEGIN DBMS_LOCK.SLEEP(30); END;',
start_date=> sysdate,
enabled=> false);
dbms_scheduler.create_job (
job_name=> 'TEST_B'||i,
job_type=> 'PLSQL_BLOCK',
job_action=> 'BEGIN DBMS_LOCK.SLEEP(30); END;',
start_date=> sysdate,
enabled=> false);
<a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_SCHEDULER.html#GUID-2D8930DD-1042-4FA9-A0C0-2E4C7A7BFE9B" target="_blank">DBMS_SCHEDULER.set_resource_constraint</a> (
object_name => 'TEST_A'||i,
resource_name => 'TEST_RESOURCE_A',
units => 2);
DBMS_SCHEDULER.set_resource_constraint (
object_name => 'TEST_B'||i,
resource_name => 'TEST_RESOURCE_B',
units => 1);
dbms_scheduler.enable('TEST_A'||i);
dbms_scheduler.enable('TEST_B'||i);
END LOOP;
END;
/</code></span></pre>You can see when each job started and finished in <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/refrn/ALL_SCHEDULER_JOB_RUN_DETAILS.html" target="_blank">ALL_SCHEDULER_JOB_RUN_DETAILS</a>.<br /><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>set pages 99
column job_name format a8
column status format a10
clear screen
select log_id, log_date, job_name, status, actual_start_date, run_duration
from all_scheduler_job_run_details where job_name like 'TEST%'
and actual_start_date >= TRUNC(SYSDATE)+…/24
order by actual_start_date;</code></span></pre></div><div><ul style="text-align: left;"><li>The first five TEST_A jobs and the first three TEST_B jobs ran. As the groups all finished after exactly 30 seconds, new groups were run. I've added spacing to illustrate the groups of jobs that run together.</li></ul><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span><code><span style="font-size: x-small;">
</span><span style="font-size: xx-small;"> LOG_ID LOG_DATE JOB_NAME STATUS ACTUAL_START_DATE RUN_DURATION
---------- ------------------------------------ -------- ---------- ------------------------------------------- -------------------
7747242 21/12/2023 06:42:42.106062000 -05:00 TEST_A1 SUCCEEDED 21/12/2023 11:42:11.578114000 EUROPE/LONDON +00 00:00:31.000000
7747244 21/12/2023 06:42:42.105168000 -05:00 TEST_A2 SUCCEEDED 21/12/2023 11:42:11.924307000 EUROPE/LONDON +00 00:00:30.000000
7747246 21/12/2023 06:42:42.615880000 -05:00 TEST_A3 SUCCEEDED 21/12/2023 11:42:12.171116000 EUROPE/LONDON +00 00:00:30.000000
7747248 21/12/2023 06:42:42.615938000 -05:00 TEST_A4 SUCCEEDED 21/12/2023 11:42:12.208987000 EUROPE/LONDON +00 00:00:30.000000
7747250 21/12/2023 06:42:42.615895000 -05:00 TEST_A5 SUCCEEDED 21/12/2023 11:42:12.247785000 EUROPE/LONDON +00 00:00:30.000000
7747210 21/12/2023 06:42:43.680210000 -05:00 TEST_B1 SUCCEEDED 21/12/2023 11:42:13.323724000 EUROPE/LONDON +00 00:00:30.000000
7747212 21/12/2023 06:42:43.681465000 -05:00 TEST_B5 SUCCEEDED 21/12/2023 11:42:13.356243000 EUROPE/LONDON +00 00:00:30.000000
7747214 21/12/2023 06:42:43.680210000 -05:00 TEST_B2 SUCCEEDED 21/12/2023 11:42:13.387883000 EUROPE/LONDON +00 00:00:30.000000
7747276 21/12/2023 06:43:17.947304000 -05:00 TEST_A6 SUCCEEDED 21/12/2023 11:42:47.543438000 EUROPE/LONDON +00 00:00:30.000000
7747278 21/12/2023 06:43:17.947331000 -05:00 TEST_B3 SUCCEEDED 21/12/2023 11:42:47.543510000 EUROPE/LONDON +00 00:00:30.000000
7747280 21/12/2023 06:43:17.949158000 -05:00 TEST_B4 SUCCEEDED 21/12/2023 11:42:47.758469000 EUROPE/LONDON +00 00:00:30.000000
7747282 21/12/2023 06:43:17.947824000 -05:00 TEST_A7 SUCCEEDED 21/12/2023 11:42:47.759084000 EUROPE/LONDON +00 00:00:30.000000
7747284 21/12/2023 06:43:18.457503000 -05:00 TEST_A8 SUCCEEDED 21/12/2023 11:42:47.966750000 EUROPE/LONDON +00 00:00:30.000000
7747286 21/12/2023 06:43:18.457438000 -05:00 TEST_B6 SUCCEEDED 21/12/2023 11:42:48.063658000 EUROPE/LONDON +00 00:00:30.000000
7747320 21/12/2023 06:43:19.008041000 -05:00 TEST_A10 SUCCEEDED 21/12/2023 11:42:48.846141000 EUROPE/LONDON +00 00:00:30.000000
7747322 21/12/2023 06:43:19.008081000 -05:00 TEST_A9 SUCCEEDED 21/12/2023 11:42:48.846239000 EUROPE/LONDON +00 00:00:30.000000
7747332 21/12/2023 06:43:49.215439000 -05:00 TEST_B9 SUCCEEDED 21/12/2023 11:43:19.165493000 EUROPE/LONDON +00 00:00:30.000000
7747334 21/12/2023 06:43:49.729057000 -05:00 TEST_B7 SUCCEEDED 21/12/2023 11:43:19.262625000 EUROPE/LONDON +00 00:00:30.000000
7747336 21/12/2023 06:43:49.726501000 -05:00 TEST_B8 SUCCEEDED 21/12/2023 11:43:19.262675000 EUROPE/LONDON +00 00:00:30.000000
7747290 21/12/2023 06:44:23.992734000 -05:00 TEST_B10 SUCCEEDED 21/12/2023 11:43:53.567006000 EUROPE/LONDON +00 00:00:30.000000</span></code></span></pre><h4 style="text-align: left;">
Test 2: One Resource Used by Two Jobs</h4></div><div>In this second test, both jobs use RESOURCE_A which still has 10 units.<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>BEGIN
FOR i IN 1..10 LOOP
dbms_scheduler.create_job (
job_name=> 'TEST_A'||i,
job_type=> 'PLSQL_BLOCK',
job_action=> 'BEGIN DBMS_LOCK.SLEEP(30); END;',
start_date=> sysdate,
enabled=> false);
dbms_scheduler.create_job (
job_name=> 'TEST_B'||i,
job_type=> 'PLSQL_BLOCK',
job_action=> 'BEGIN DBMS_LOCK.SLEEP(30); END;',
start_date=> sysdate,
enabled=> false);
DBMS_SCHEDULER.set_resource_constraint (
object_name => 'TEST_A'||i,
resource_name => 'TEST_RESOURCE_A',
units => 2);
DBMS_SCHEDULER.set_resource_constraint (
object_name => 'TEST_B'||i,
resource_name => 'TEST_RESOURCE_A',
units => 1);
dbms_scheduler.enable('TEST_A'||i);
dbms_scheduler.enable('TEST_B'||i);
END LOOP;
END;
/</code></span></pre>
Now, we can run 5 TEST_A jobs or 10 TEST_B jobs, or a combination. </div><div><ul style="text-align: left;"><li>So initially we had 2 TEST_A jobs (that consume 4 units) and 6 TEST_B jobs (that consume 6 units). This completely consumed TEST_RESOURCE_A, which only has 10 units. No new jobs that use this resource could start until others had completed.</li><li>Next, we got 4 TEST_A jobs (that consume 8 units) and 2 TEST_B jobs (that consume 2 units), so again the whole of TEST_RESOURCE_A was consumed, and no further jobs requiring this resource could run until others had completed.
</li></ul></div><div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: xx-small;"><code> LOG_ID LOG_DATE JOB_NAME STATUS ACTUAL_START_DATE RUN_DURATION
---------- ------------------------------------ -------- ---------- ------------------------------------------- -------------------
7747540 21/12/2023 13:40:10.170737000 -05:00 TEST_B1 SUCCEEDED 21/12/2023 18:39:39.718608000 EUROPE/LONDON +00 00:00:30.000000
7747542 21/12/2023 13:40:10.169080000 -05:00 TEST_B2 SUCCEEDED 21/12/2023 18:39:39.971882000 EUROPE/LONDON +00 00:00:30.000000
7747544 21/12/2023 13:40:10.169451000 -05:00 TEST_B3 SUCCEEDED 21/12/2023 18:39:40.012029000 EUROPE/LONDON +00 00:00:30.000000
7747546 21/12/2023 13:40:10.680827000 -05:00 TEST_B4 SUCCEEDED 21/12/2023 18:39:40.258398000 EUROPE/LONDON +00 00:00:30.000000
7747548 21/12/2023 13:40:10.680943000 -05:00 TEST_B5 SUCCEEDED 21/12/2023 18:39:40.300213000 EUROPE/LONDON +00 00:00:30.000000
7747590 21/12/2023 13:40:10.680683000 -05:00 TEST_B6 SUCCEEDED 21/12/2023 18:39:40.343663000 EUROPE/LONDON +00 00:00:30.000000
7747574 21/12/2023 13:40:11.231396000 -05:00 TEST_A1 SUCCEEDED 21/12/2023 18:39:40.730872000 EUROPE/LONDON +00 00:00:30.000000
7747576 21/12/2023 13:40:11.231575000 -05:00 TEST_A5 SUCCEEDED 21/12/2023 18:39:40.786089000 EUROPE/LONDON +00 00:00:30.000000
7747594 21/12/2023 13:40:40.376871000 -05:00 TEST_A2 SUCCEEDED 21/12/2023 18:40:10.271696000 EUROPE/LONDON +00 00:00:30.000000
7747598 21/12/2023 13:40:40.888493000 -05:00 TEST_A6 SUCCEEDED 21/12/2023 18:40:10.679917000 EUROPE/LONDON +00 00:00:30.000000
7747600 21/12/2023 13:40:40.889568000 -05:00 TEST_A7 SUCCEEDED 21/12/2023 18:40:10.680655000 EUROPE/LONDON +00 00:00:30.000000
7747614 21/12/2023 13:40:41.401080000 -05:00 TEST_B10 SUCCEEDED 21/12/2023 18:40:11.304750000 EUROPE/LONDON +00 00:00:30.000000
7747656 21/12/2023 13:40:42.975067000 -05:00 TEST_A3 SUCCEEDED 21/12/2023 18:40:12.598174000 EUROPE/LONDON +00 00:00:30.000000
7747672 21/12/2023 13:40:51.168059000 -05:00 TEST_B9 SUCCEEDED 21/12/2023 18:40:21.061653000 EUROPE/LONDON +00 00:00:30.000000
7747678 21/12/2023 13:41:13.183397000 -05:00 TEST_A4 SUCCEEDED 21/12/2023 18:40:42.789196000 EUROPE/LONDON +00 00:00:30.000000
7747680 21/12/2023 13:41:13.182999000 -05:00 TEST_A8 SUCCEEDED 21/12/2023 18:40:43.093332000 EUROPE/LONDON +00 00:00:30.000000
7747682 21/12/2023 13:41:13.183455000 -05:00 TEST_A9 SUCCEEDED 21/12/2023 18:40:43.093346000 EUROPE/LONDON +00 00:00:30.000000
7747616 21/12/2023 13:41:16.729148000 -05:00 TEST_B7 SUCCEEDED 21/12/2023 18:40:46.287305000 EUROPE/LONDON +00 00:00:30.000000
7747618 21/12/2023 13:41:16.729207000 -05:00 TEST_B8 SUCCEEDED 21/12/2023 18:40:46.287313000 EUROPE/LONDON +00 00:00:30.000000
7747684 21/12/2023 13:41:23.423360000 -05:00 TEST_A10 SUCCEEDED 21/12/2023 18:40:53.317328000 EUROPE/LONDON +00 00:00:30.000000</code></span></pre>
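<div>The test jobs drop themselves after they complete (auto_drop defaults to TRUE), but the named resources persist. Between tests, they can be removed with DBMS_SCHEDULER.DROP_RESOURCE. This is a sketch; as I understand it, force=>TRUE drops the resource even if job constraints still reference it:</div><pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>BEGIN
 DBMS_SCHEDULER.drop_resource(resource_name=> 'TEST_RESOURCE_A', force=> TRUE);
 DBMS_SCHEDULER.drop_resource(resource_name=> 'TEST_RESOURCE_B', force=> TRUE);
END;
/</code></span></pre>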
You can also do this if you have a chain of sub-jobs. You would have a program that would be called for each step in the chain, and the resource constraint is applied to the program instead of the job. I will demonstrate this in <a href="/2024/01/job-chains.html">another blog post</a>. </div><div><br /></div><div><i>My thanks to Tim Hall for his blog post <a href="https://oracle-base.com/articles/12c/scheduler-enhancements-12cr2#scheduler-resource-queues" target="_blank">Scheduler (DBMS_SCHEDULER) Enhancements in Oracle Database 12c Release 2 (12.2)</a>.</i></div></div></div><div class="blogger-post-footer"><a href="http://www.go-faster.co.uk/">©David Kurtz</a></div>David Kurtzhttp://www.blogger.com/profile/08139761793598085235noreply@blogger.com0tag:blogger.com,1999:blog-14654018.post-58682547462045919772023-12-19T10:18:00.003+00:002023-12-19T10:19:17.934+00:00Using Attribute Clustering to Improve Compression, Response Time and CPU Consumption: 2. An Example<p>Attribute Clustering reorders data in a table so that similar data values are clustered together. This can improve both basic and columnar compression, resulting in better response time and lower CPU consumption.</p><p>This is the second of a two-part blog post.</p><p></p><ol><li><a href="https://blog.go-faster.co.uk/2023/12/attribute-clustering1.html">Introduction</a></li><li><a href="https://blog.go-faster.co.uk/2023/12/attribute-clustering2.html">Example and Test Results</a></li></ol><p></p><h3 style="text-align: left;">An Example of Attribute Clustering</h3>This test illustrates the potential benefits of attribute clustering (the scripts are available on <a href="https://github.com/davidkurtz/demoscripts/tree/master/attrib_clustering_example" target="_blank">GitHub</a>). It simulates the fact table in a data warehouse, or in my use case the General Ledger table in a Financials system. The table will have 20 million rows. 
Each dimension column will randomly have one of 256 distinct values, padded to 8 characters. In this case, the distribution of data values is skewed by the square root function. The alternative commented section produces uniform data. <div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>create table t0(a varchar2(8 char), b varchar2(8 char), c varchar2(8 char), x number);
truncate table t0;
BEGIN
FOR i IN 1..2 LOOP
insert /*+APPEND PARALLEL*/ into t0
select /*+PARALLEL*/
/*--------------------------------------------------------------------------------------------------------------
<i> rPAD(LPAD(LTRIM(TO_CHAR(FLOOR(dbms_random.value(0,256)),'XX')),2,'0'),8,'X') a
, rPAD(LPAD(LTRIM(TO_CHAR(FLOOR(dbms_random.value(0,256)),'XX')),2,'0'),8,'X') b
, rPAD(LPAD(LTRIM(TO_CHAR(FLOOR(dbms_random.value(0,256)),'XX')),2,'0'),8,'X') c
</i>--------------------------------------------------------------------------------------------------------------*/
rPAD(LPAD(LTRIM(TO_CHAR(FLOOR(SQRT(dbms_random.value(0,65535))),'XX')),2,'0'),8,'X') a
, rPAD(LPAD(LTRIM(TO_CHAR(FLOOR(SQRT(dbms_random.value(0,65535))),'XX')),2,'0'),8,'X') b
, rPAD(LPAD(LTRIM(TO_CHAR(FLOOR(SQRT(dbms_random.value(0,65535))),'XX')),2,'0'),8,'X') c
--------------------------------------------------------------------------------------------------------------*/
, dbms_random.value(1,1e6)
from dual connect by level <= 1e7;
COMMIT;
end loop;
end;
/
exec dbms_stats.gather_table_stats(user,'T0');</code></span></pre>
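As a quick sanity check of the generated data (my addition; it is not part of the original scripts on GitHub), you can count the rows for each value of one dimension column. With the square-root expression, the frequency of successive values should grow roughly in proportion to 2k+1, so later values appear more often.<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>-- count the rows for each distinct value of dimension column A
select a, count(*) freq
from   t0
group  by a
order  by a;</code></span></pre>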
I will create a materialized view on a prebuilt table with the same structure as T0.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>create table mv(a varchar2(8 char), b varchar2(8 char), c varchar2(8 char), x number);
create materialized view mv on prebuilt table enable query rewrite as select * from t0;</code></span></pre>
For each test, I can set different attributes and then fully refresh the materialized view in non-atomic mode. The various attributes take effect as the materialized view is truncated and repopulated in direct-path mode.<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>truncate table MV drop storage;
--------------------------------------------------
rem <i>set compression</i>
--------------------------------------------------
--alter materialized view MV nocompress;
--alter materialized view MV compress;
<b>alter materialized view MV compress for query low;</b>
--------------------------------------------------
rem <i>set in memory</i>
--------------------------------------------------
alter table mv inmemory;
--------------------------------------------------
rem <i>set clustering and number of clustering columns</i>
--------------------------------------------------
<b>alter table mv drop clustering;</b>
--alter table mv add clustering by interleaved order (b);
<b>alter table mv add clustering by interleaved order (b, c);</b>
--alter table mv add clustering by interleaved order (b, c, a);
--------------------------------------------------
exec dbms_mview.refresh('MV',atomic_refresh=>FALSE);
exec dbms_inmemory.repopulate(user,'MV');</code></span></pre>
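Before measuring anything, it is worth confirming that the attributes have actually been applied. A couple of checks (the CLUSTERING flag in USER_TABLES has been available since 12.1; the full clustering clause can be extracted with DBMS_METADATA):<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>-- YES/NO flag showing whether attribute clustering is defined
select table_name, clustering, compression, compress_for
from   user_tables
where  table_name = 'MV';

-- the full CLUSTERING ... BY INTERLEAVED ORDER clause appears in the DDL
select dbms_metadata.get_ddl('TABLE','MV') from dual;</code></span></pre>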
Then I can see how large the physical and In-Memory segments are.<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>select * from user_mviews where mview_name = 'MV';
select table_name, tablespace_name, num_rows, blocks, compression, compress_for, inmemory, inmemory_compression
from user_tables where table_name IN('MV','T0');
select segment_name, segment_type, tablespace_name, bytes/1024/1024 table_MB, blocks, extents, inmemory, inmemory_compression
from user_Segments where segment_name IN('MV','T0');
with x as (
select segment_type, owner, segment_name, inmemory_compression, inmemory_priority
, count(distinct inst_id) instances
, count(distinct segment_type||':'||owner||'.'||segment_name||'.'||partition_name) segments
, sum(inmemory_size)/1024/1024 inmemory_mb, sum(bytes)/1024/1024 tablespace_Mb
from gv$im_segments i
where segment_name = 'MV'
group by segment_type, owner, segment_name, inmemory_compression, inmemory_priority)
select x.*, inmemory_mb/tablespace_mb*100-100 pct from x
order by owner, segment_type, segment_name
/</code></span></pre>
I will use a simple test query to see how the performance changes.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>select b,c, count(a), sum(x) from t0 where b='2AXXXXXX' group by b,c fetch first 10 rows only;
</code></span></pre>
I tested: <div><ul style="text-align: left;"><li>Uniformly distributed data -v- skewed data</li><li>Without table compression -v- basic compression -v- Hybrid Columnar Compression (HCC)</li><li>No attribute clustering -v- interleaved clustering on 1, 2, and 3 columns
</li></ul><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>Table      Tablespace
Name       Name         NUM_ROWS     BLOCKS COMPRESS COMPRESS_FOR                   INMEMORY INMEMORY_COMPRESS
---------- ---------- ---------- ---------- -------- ------------------------------ -------- -----------------
MV         PSDEFAULT    20000000      60280 ENABLED  QUERY LOW                      ENABLED  FOR QUERY LOW
T0         PSDEFAULT    20000000     150183 DISABLED                                DISABLED

                      Tablespace      Table
Segment Na Segment Ty Name               MB     BLOCKS    EXTENTS INMEMORY INMEMORY_COMPRESS
---------- ---------- ---------- ---------- ---------- ---------- -------- -----------------
MV         TABLE      PSDEFAULT       472.0      60416        130 ENABLED  FOR QUERY LOW
T0         TABLE      PSDEFAULT     1,220.0     156160        203 DISABLED

                                                                                 In Memory Tablespace
Segment Ty OWNER    Segment Na INMEMORY_COMPRESS INMEMORY  INSTANCES   SEGMENTS         MB         MB        PCT
---------- -------- ---------- ----------------- -------- ---------- ---------- ---------- ---------- ----------
TABLE      SYSADM   MV         FOR QUERY LOW     NONE              2          1      829.2      936.6 -11.4662752</code></span></pre>
With query rewrite enabled on the materialized view, and the materialized view populated into the In-Memory store, we see Oracle rewrite the query on the underlying table to use the materialized view, and then satisfy it from the In-Memory store.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>select b,c, sum(x) from t0 where b='2AXXXXXX' group by b,c;
Plan hash value: 389206685

-----------------------------------------------------------------------------------------------
| Id  | Operation                              | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |      |   182 |  7280 |   876  (31)| 00:00:01 |
|   1 |  HASH GROUP BY                         |      |   182 |  7280 |   876  (31)| 00:00:01 |
|*  2 |   MAT_VIEW REWRITE ACCESS INMEMORY FULL| MV   | 78125 |  3051K|   875  (31)| 00:00:01 |
-----------------------------------------------------------------------------------------------</code></span></pre>
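A plan like this can be captured after running the statement by displaying the last cursor with DBMS_XPLAN (a sketch; the exact format options vary, and SERVEROUTPUT must be off for the last cursor to be the test query):<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>set serveroutput off
select b,c, sum(x) from t0 where b='2AXXXXXX' group by b,c;

-- display the execution plan of the last statement executed in this session
select * from table(dbms_xplan.display_cursor);</code></span></pre>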
<h3 style="text-align: left;">Test Results</h3><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhNPZa3Y7mCvBAgYG6RtxAJpmSPu46OwwUN8nob5-h_wmRHbRjz-x9v_Utbmedry5009XWYr_RBCYSTVNQpnUo8SDLArLpi2ukY6eZohz2UYZUMvbnYfNmMWEJSd_DDNeELyYgZakVgJA7FJV5WOZBptbgjo5GX0NUgbidRXhyzHUz3g9eE3kHoXw/s1007/attribute_clustering_example.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="627" data-original-width="1007" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhNPZa3Y7mCvBAgYG6RtxAJpmSPu46OwwUN8nob5-h_wmRHbRjz-x9v_Utbmedry5009XWYr_RBCYSTVNQpnUo8SDLArLpi2ukY6eZohz2UYZUMvbnYfNmMWEJSd_DDNeELyYgZakVgJA7FJV5WOZBptbgjo5GX0NUgbidRXhyzHUz3g9eE3kHoXw/w640-h398/attribute_clustering_example.png" width="640" /></a></div><div><h3 style="text-align: left;">Conclusions</h3></div><div><ul style="text-align: left;"><li>Without any table compression, attribute clustering does not affect the size of the table in the tablespace, but the size of the table in the In-Memory store is reduced, and query performance is improved.</li><li>With either basic or Hybrid Columnar compression, attribute clustering reduces the size of the table both in the tablespace and in the in-memory store.</li><li>All forms of compression and attribute clustering increase the duration of the materialized view refresh. Degradation of the refresh due to clustering was less severe with HCC than with either no compression or simple compression.</li><li>Query performance degraded when interleaved clustering was combined with simple compression, even though this produced a smaller in-memory segment than HCC; with HCC, query performance improved.</li><li>Uniform data compressed marginally better than skewed data. Otherwise, they produced very similar results. 
</li><li>You do not have to compress the physical segment in order to benefit from In-Memory compression, but you may get better performance if you do.</li><li>With this test data set, optimal performance was achieved when clustering on 2 dimension columns. When clustering on all three columns, I obtained worse compression and query performance. This varies with the data. With real-world data, I have had examples of better compression and performance with the maximum of 4 clustering column groups. Generally, the best performance corresponds to the attribute clustering that gives the best columnar compression. This is not always the case for simple compression.
</li></ul></div></div></div><div class="blogger-post-footer"><a href="http://www.go-faster.co.uk/">©David Kurtz</a></div>David Kurtzhttp://www.blogger.com/profile/08139761793598085235noreply@blogger.com0tag:blogger.com,1999:blog-14654018.post-70775789092754842652023-12-19T10:10:00.000+00:002023-12-19T10:10:16.244+00:00Using Attribute Clustering to Improve Compression, Response Time and CPU Consumption: 1. IntroductionAttribute Clustering reorders data in a table so that similar data values are clustered together. This can improve both basic and columnar compression, resulting in better response time and lower CPU consumption.<p>This is the first of a two-part blog post.</p><p></p><ol style="text-align: left;"><li><a href="https://blog.go-faster.co.uk/2023/12/attribute-clustering1.html">Introduction</a></li><li><a href="https://blog.go-faster.co.uk/2023/12/attribute-clustering2.html">Example and Test Results</a></li></ol><p></p><h3 style="text-align: left;">Use Case</h3><p>I am working on a Financials system running on an engineered system. It runs a daily batch of GL reports on summary ledgers that have unindexed materialized views. The materialized views are also hybrid column compressed (HCC) to reduce their size and improve reporting performance. </p><p>We also put the materialized views into the <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/inmem/intro-to-in-memory-column-store.html#GUID-BFA53515-7643-41E5-A296-654AB4A9F9E7" rel="nofollow" target="_blank">In-Memory</a> store. Initially, we used 'free' base-level In-Memory and worked within the 16Gb/instance limit. 
Having moved to <a href="https://docs.oracle.com/en/engineered-systems/exadata-cloud-at-customer/" rel="nofollow" target="_blank">Exadata Cloud@Customer</a>, we can use the fully licensed version of In-Memory.</p><p>Now I have introduced Attribute Clustering for the materialized views.</p><h3 style="text-align: left;"><a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/dwhsg/attribute-clustering.html#GUID-7B007A3C-53C2-4437-9E71-9ECECF8B4FAB" rel="nofollow" target="_blank">Attribute Clustering</a></h3><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/dwhsg/img/dwhsg136.png" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img alt="Attribute Clustering" border="0" data-original-height="386" data-original-width="602" height="205" src="https://docs.oracle.com/en/database/oracle/oracle-database/19/dwhsg/img/dwhsg136.png" title="Oracle 19c Data Warehousing Guide: 13.1.3. An Attribute Clustered Table" width="320" /></a></div>Attribute Clustering has been available on <a href="https://docs.oracle.com/cd/E55822_01/DBLIC/editions.htm#DBLIC116" target="_blank">Enterprise Edition since Oracle 12.1.0.2</a>. Data is clustered in close physical proximity according to certain columns. <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/dwhsg/attribute-clustering.html#GUID-F22430E9-5A2E-4128-B91A-63414121BF88" rel="nofollow" target="_blank">Linear Ordering</a> stores the data according to the order of the specified clustering columns. 
<a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/dwhsg/attribute-clustering.html#GUID-C10C10DF-DB77-4F40-B4CA-7F5612DA5CE4" rel="nofollow" target="_blank">Interleaved Ordering</a> uses a <a href="https://en.wikipedia.org/wiki/Z-order_curve" rel="nofollow" target="_blank">Z-order</a> curve to cluster data in multiple dimensions (this graphic is from <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/dwhsg/attribute-clustering.html#GUID-CFA30358-183D-4770-9A79-C6720BF9D753" rel="nofollow" target="_blank">Oracle's documentation</a>).<p></p><p>The GL reports have multiple combinations of different predicates. Therefore, as recommended by Oracle, we used interleaved ordering. Linear ordering is not suitable in this case because there is no single suitable order for each table. Linear ordering also caused the runtime of the materialized view refresh to extend much more than interleaved ordering as it has to sort the data.</p><p>We have not introduced Zone Maps. That is to say that after testing, we removed them. Zone maps can be thought of as a coarse index of the zones in the attribute clustering, and would normally be expected to improve the access of the data. You can see them being used in the execution plans to access the table both in the tablespace and in the In-Memory store. However, our application dynamically generates a lot of SQL and therefore performs a lot of SQL parse. We found that the additional work to process the zone map significantly degraded performance.</p><p>Attribute Clustering is not enforced for every DML operation. It only affects direct-path insert operations, data movement, or table creation. It is easy to implement it for segments that are already HCC, which also relies on direct-path operations. The materialized views were created to introduce HCC, hence they are refreshed in non-atomic mode which truncates and repopulates them in direct-path mode. 
Thus attribute clustering specified on the materialized views will be implemented as they refresh.</p><p>Historical, and therefore static, partitions in the ledger tables are marked for HCC, and we schedule an online rebuild to compress them. Now, that will also apply attribute clustering. This process could be automated with <a href="https://blogs.oracle.com/dbstorage/post/implementing-an-automated-compression-tiering-and-storage-tiering-solution-using-automatic-data-optimization" target="_blank">Automatic Storage Compression</a>.</p><h3 style="text-align: left;">Compression</h3><p>Simply by storing similar data values together, we obtained better compression from HCC. The tables underlying the materialized views were smaller. </p><p>In-Memory also uses columnar compression. Attribute clustering produced a reduction in the size of segments in the In-Memory store. If we were still working within the constraints of Base-Level In-Memory, we would have been able to store more segments in In-Memory.</p><p>We are using attribute clustering not to directly improve data access, but to harness a secondary effect, that of improved compression. We are seeing a reduction in the runtime of the reports. Most of the database time is already spent on the CPU (as most of the fact tables are in In-Memory), so this translates to a reduction in CPU consumption. We can consider running more reports concurrently to complete the batch earlier. 
We can also consider reducing the number of CPUs and therefore reduce cloud subscription costs.</p><div>The <a href="https://blog.go-faster.co.uk/2023/12/attribute-clustering2.html">second part of this blog</a> will show an example test script and results.</div><div class="blogger-post-footer"><a href="http://www.go-faster.co.uk/">©David Kurtz</a></div>David Kurtzhttp://www.blogger.com/profile/08139761793598085235noreply@blogger.com0tag:blogger.com,1999:blog-14654018.post-67246327359643725602023-11-27T16:58:00.000+00:002023-11-27T16:58:04.769+00:00Database Constraints Enforced but not Validated for New Data Now, but Not Existing DataRecently, while discussing a problem, somebody said to me '<i>I would like to make this column NOT NULL to stop [this problem] from occurring, but first I would need to go back and fix all the historical data</i>'. <div>In Oracle, a constraint that is enabled will apply during DML, so as you insert or update the data, the constraint is applied to the rows that are being updated. Only when a constraint is validated does the database check that all the data in the table conforms to the constraint. </div><div><span style="font-size: x-small;">See also <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/constraint.html#GUID-1055EA97-BA6F-4764-A15F-1024FD5B6DFE__I1010237" rel="nofollow" target="_blank">Oracle 19c SQL Language Reference: Common SQL DDL Clauses: constraint enable clause</a>.</span></div><div>If you create a constraint that is enforced, but not validated, you may be able to prevent a problem from getting worse while you fix the data you already have.</div><h3 style="text-align: left;">A Demonstration </h3><div>I will create a table with a unique constraint and two other nullable columns. I put some data in the table. 
Column B is null for some of the rows, but not for others.<div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>create table t (a number, b number, c number, constraint t_pk primary key(A));
insert into t (a,b)
select level, CASE WHEN MOD(level,2)=1 then 1 end --B is null on alternate rows
from dual connect by level<=10;
select * from t;
         A          B          C
---------- ---------- ----------
         1          1
         2
         3          1
         4
         5          1
         6
         7          1
         8
         9          1
        10

10 rows selected.</code></span></pre>
I would like to make B a NOT NULL column (to stop the application from writing an invalid value to the database), but cannot because I already have some invalid values in the database.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>SQL> alter table t modify b not null ;
ORA-02296: cannot enable (SCOTT.) - null values found
02296. 00000 - "cannot enable (%s.%s) - null values found"
*Cause:    an alter table enable constraint failed because the table
           contains values that do not satisfy the constraint.
*Action:   Obvious</code></span></pre>
However, I can create the constraint with the NOVALIDATE option.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>SQL> alter table t modify b not null novalidate;
Table T altered.
SQL> select constraint_name, search_condition_vc from user_constraints where table_name = 'T';
CONSTRAINT_NAME SEARCH_CONDITION_VC
--------------- ------------------------------------------------------------
T_PK
SYS_C00221684   "B" IS NOT NULL</code></span></pre>
Note that at the moment column B is not described as NOT NULL because although the constraint is enforced, it has not been validated.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>SQL> desc t
Name Null?    Type
---- -------- ------
A    NOT NULL NUMBER
B             NUMBER
C             NUMBER</code></span></pre>
If I try to add more rows where some of the data in column B is null, the constraint prevents it.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>SQL> insert into t (a,b)
2 select level+10, CASE WHEN MOD(level,2)=1 then 1 end
3 from dual connect by level<=10;
ORA-01400: cannot insert NULL into ("SCOTT"."T"."B")</code></span></pre>
I can set column B to a non-null value, but I cannot set it back to NULL.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>SQL> update t set b = 2 where a=2;
1 row updated.
SQL> update t set b = NULL where a=2;
ORA-01407: cannot update ("SCOTT"."T"."B") to NULL</code></span></pre>
I can successfully update a different column on a row where B is null and therefore does not meet the constraint. I do not get an error because I have not updated column B. <pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>SQL> update t set c = a;
10 rows updated.</code></span></pre>
Eventually, I will want to validate the constraint so that I know that B has a non-null value in every row. However, I cannot do that while there are still some null values.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span><code><span style="font-size: x-small;">SQL> select * from t;
         A          B          C
---------- ---------- ----------
         1          1          1
         2          2          2
         3          1          3
         4                     4
         5          1          5
         6                     6
         7          1          7
         8                     8
         9          1          9
        10                    10
</span><span style="font-size: xx-small;">SQL> DECLARE
  2    l_sql CLOB;
  3  BEGIN
  4    FOR i IN (select * from user_constraints where table_name = 'T' AND constraint_type ='C'
  5              and validated != 'VALIDATED' and search_condition_vc = '"B" IS NOT NULL') LOOP
  6      l_sql := 'alter table '||i.table_name||' modify constraint '||i.constraint_name||' VALIDATE';
  7      dbms_output.put_line(l_sql);
  8      EXECUTE IMMEDIATE l_sql;
  9    END LOOP;
 10  END;
 11  /</span><span style="font-size: x-small;">
alter table T modify constraint SYS_C00221684 VALIDATE
ORA-02293: cannot validate (SCOTT.SYS_C00221684) - check constraint violated
ORA-06512: at line 7
ORA-06512: at line 7
02293. 00000 - "cannot validate (%s.%s) - check constraint violated"
*Cause:    an alter table operation tried to validate a check constraint to
           populated table that had nocomplying values.
*Action:   Obvious</span></code></span></pre>First I have to fix the data, and then I can validate the constraint.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>SQL> UPDATE t set b=a where b is null;
4 rows updated.
SQL> REM now validate the constraint
SQL> BEGIN
  2    FOR i IN (select * from user_constraints where table_name = 'T' AND constraint_type ='C'
  3              and validated != 'VALIDATED' and search_condition_vc = '"B" IS NOT NULL') LOOP
  4      EXECUTE IMMEDIATE 'alter table '||i.table_name||' modify constraint '||i.constraint_name||' VALIDATE';
  5    END LOOP;
  6  END;
  7  /
PL/SQL procedure successfully completed.</code></span></pre>
Only now that the new constraint has been validated is the column described as NOT NULL.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>SQL> desc t
Name Null?    Type
---- -------- ------
A    NOT NULL NUMBER
B    NOT NULL NUMBER
C             NUMBER</code></span></pre>
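The NULLABLE flag in the data dictionary tells the same story as DESCRIBE (a quick check against the demonstration table above):<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>-- NULLABLE = 'N' once the NOT NULL constraint has been validated
select column_name, nullable
from   user_tab_columns
where  table_name = 'T'
order  by column_id;</code></span></pre>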
</div></div><div class="blogger-post-footer"><a href="http://www.go-faster.co.uk/">©David Kurtz</a></div>David Kurtzhttp://www.blogger.com/profile/00468908370233805717noreply@blogger.com0tag:blogger.com,1999:blog-14654018.post-45669230207381575732023-06-21T12:17:00.000+01:002023-06-21T12:17:40.781+01:00The Goal<div><table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right;"><tbody><tr><td style="text-align: center;"><a href="https://upload.wikimedia.org/wikipedia/en/0/0e/The-goal-bookcover.jpg" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="346" data-original-width="237" height="200" src="https://upload.wikimedia.org/wikipedia/en/0/0e/The-goal-bookcover.jpg" width="137" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;"><br /></td></tr></tbody></table></div><div>One of my favourite books on Oracle performance, "<a href="https://www.oreilly.com/library/view/optimizing-oracle-performance/059600527X/" target="_blank">Optimizing Oracle Performance</a>" by Cary Millsap & Jeff Holt, introduced me to "<a href="https://en.wikipedia.org/wiki/The_Goal_(novel)" target="_blank">The Goal</a>" by Eli Goldratt and Jeff Cox. </div><div>The Goal is all about performance, without being anything to do with computers. It is a story of a man who has to save his manufacturing plant from closure by making it profitable. The language is about manufacturing, but it applies to any system of processes, including any software application and your Oracle (or any other) database! </div><div>Recently, I was checking a quote from it, and I ended up reading it again. It is 20 years since I first read these two books. They completely changed how I thought about performance. Both remain as valid today as they were then. 
</div><div>It is good to be reminded of these fundamental principles every now and then.</div><div><blockquote><i>"So this is the goal: To make money by increasing net profit, while simultaneously increasing return on investment, and simultaneously increasing cash flow." </i></blockquote></div><div><blockquote><i>"There are three measurements which express the goal of making money ... throughput, inventory and operational expense"</i></blockquote></div><blockquote style="border: none; margin: 0px 0px 0px 40px; padding: 0px; text-align: left;"><div><blockquote><i>"Throughput is the rate at which the system generates money through sales.</i></blockquote></div><div><blockquote><i>Inventory is all the money that the system has invested in purchasing things which it intends to sell.</i></blockquote></div><div><blockquote><i>Operational expense is all the money the system spends in order to turn inventory into throughput."</i></blockquote></div></blockquote><div><blockquote><i>"A plant in which everyone is working all the time is very inefficient."</i></blockquote></div><div><blockquote><i>"A bottleneck is any resource whose capacity is equal to or less than the demand placed upon it. And a non-bottleneck is any resource whose capacity is greater than the demand placed on it."</i></blockquote></div><div><blockquote><i>"What does lost time on a bottleneck mean? It means you have lost throughput."</i></blockquote></div><div><blockquote><i>"The capacity of a plant is equal to the capacity of its bottlenecks."</i></blockquote></div><div><blockquote><i>"A system of local optimums is not an optimum system at all; it is a very inefficient system."</i></blockquote></div><div></div><blockquote><div><i>"An hour lost at a bottleneck is an hour lost for the entire system.<br />An hour saved at a non-bottleneck is worthless."</i></div></blockquote><div></div><blockquote><div><i>"1. IDENTIFY the system's constraint(s).</i></div><div><i>2. 
Decide how to EXPLOIT the system's constraint(s).</i></div><div><i>3. SUBORDINATE everything else to the above decision.</i></div><div><i>4. ELEVATE the system's constraint(s).</i></div><div><i>5. WARNING!!!! If in the previous steps, a constraint has been broken, go back to step 1, but do not allow INERTIA to cause a system's constraint."</i></div></blockquote><div><blockquote><i>"I started to have a very good guideline; if it comes from cost accounting it must be wrong."</i></blockquote><p>Performance optimisation is sometimes viewed as a black art. It is not. Instead, like detection, it <i>"is, or ought to be, an exact science, and should be treated in the same cold and unemotional manner"</i>. </p><p></p><ul style="text-align: left;"><li>"<a href="https://en.wikipedia.org/wiki/The_Goal_(novel)" target="_blank">The Goal</a>" explains the general principles.</li><li>"<a href="https://www.oreilly.com/library/view/optimizing-oracle-performance/059600527X/" target="_blank">Optimizing Oracle Performance</a>" applies them to the Oracle database.</li></ul><p></p></div><div class="blogger-post-footer"><a href="http://www.go-faster.co.uk/">©David Kurtz</a></div>David Kurtzhttp://www.blogger.com/profile/08139761793598085235noreply@blogger.com0tag:blogger.com,1999:blog-14654018.post-69742660204707231352023-06-15T13:28:00.001+01:002023-11-03T17:36:23.759+00:00More Bang for your Buck in the Cloud with Resource Manager<p>Much of the cost in database IT is tied to the number of CPUs. Oracle database licencing is priced per CPU. The dominant factor in determining your cloud subscription cost is also CPU, although, disk, memory, and network can also be a cost factor. </p><p>That incentivises you to minimise your CPU. I believe it is inevitable that cloud systems will be configured with fewer CPUs and it will become more common to see them running either close to or beyond the point of having 0% idle CPU. 
In fact, I'll go further: </p><p><i><b>In the cloud, if your system is not constrained by CPU, at least some of the time, you are probably spending too much money on renting too many CPUs.</b></i></p><h3 style="text-align: left;">What happens to an Oracle database when it runs out of CPU?</h3><p>The resource manager has been part of the Oracle database since 8i, but in my experience, it is rarely used.</p>
<p>Every process has to demand CPU and, if necessary, wait on the CPU run queue. If you don't have a resource manager plan, then all the Oracle processes will have equal priority on that queue. The resource manager will not intervene. </p><p>However, not all processes are created equal. Instead, the users of an application will consider some things more important or urgent than others. Some processes are on a critical path to delivering something by a deadline, while others can wait. That implies a hierarchy of priority. A resource manager plan allocates CPU to higher-priority processes in preference to lower-priority ones, within the constraint of a minimum guaranteed CPU allocation for each consumer group, and can also restrict the degree of parallelism. </p>
<p>Note that "By default, all predefined maintenance windows use the resource plan DEFAULT_MAINTENANCE_PLAN". When you introduce your own resource manager plan you don't need to alter the predefined windows.</p><p>A resource manager plan that reflects the business priorities can enable a system to meet its objectives with fewer resources, particularly CPU resources. In a cloud system, using fewer resources, particularly CPU resources, will tend to save money on cloud subscription costs.</p><h3 style="text-align: left;">A User Story</h3><p>Let me tell you about a PeopleSoft Financials system at an insurance company. Like all insurance companies, they like to slice and dice their General Ledger in lots of different ways and produce lots of reports every night.</p><p>Data flows through the system from GL transaction processing via summary ledgers on which materialized views are built and then reports are run</p><p><i>Transactions -> Post to Ledger -> Summary Ledgers -> Materialised Views -> Reports</i></p><p>A fundamentally important thing this company did was to provide a quantitative definition of acceptable performance. </p><p></p><ul style="text-align: left;"><li>"GL reports must be finished by the time continental Europe starts work at 8am CET / 2am EST"</li><li>"Without making the system unavailable to Asia/Pac users" </li>
<li>"At night (in the US), some other things can wait, but need to be available at the start of the US working day."</li></ul><p></p><p>They were running on a two-node RAC database on an engineered system, on-premises. When the overnight GL batch was designed and configured on the old hardware, parallelism was increased until it consumed the entire box.</p><p>The system has now moved to an Exadata cloud-at-customer machine. It is still a two-node RAC cluster. We have a choice of up to 10 OCPUs (20 virtual CPUs) per node. During testing, we progressively reduced the CPU count until we could only just meet that target. Every time we reduced the CPU by 1 OCPU on each of the two nodes, we reduced the cost of the cloud subscription by approximately US$2000/month.</p><p>Implicit in that statement of adequate performance is also a statement of what is important to the business. We started to create a hierarchy of processes. </p><p></p><ul style="text-align: left;"><li>If the business is waiting on the output of a process then that is a high-priority process that is guaranteed a high proportion of available CPU. </li><li>If a process is finished before the business needs it then it has a lower priority. For example, a set of processes was building reporting tables that were not needed until the start of the US working day, so their start time was pushed back, and they were put in a lower-priority consumer group that also restricted their degree of parallelism. </li></ul><p></p><p>Sometimes, it can be hard to determine whether the users are waiting and whether the performance is adequate, but usually, they will tell you! However, with an overnight batch process, it is straightforward. If it is outside office hours, then the users aren't waiting for it, but it needs to be there when they come into the office in the morning.</p>
Most computer systems are chains of inbound and outbound queues. Along the way, other requests for resources may be invoked that also have to be queued. Ultimately, every system is bound by its resources. On a computer, those are CPU, memory, disk, and network. A critical process whose performance is degraded, because it is not getting enough of the right kind of resource, becomes a bottleneck.</p><h4 style="text-align: left;">"Time lost at a bottleneck is lost across the system." </h4><p>One of my favourite books on Oracle performance is <a href="https://www.oreilly.com/library/view/optimizing-oracle-performance/059600527X/" target="_blank">Optimizing Oracle Performance</a> by Cary Millsap & Jeff Holt. It introduced me to another book, <a href="https://en.wikipedia.org/wiki/The_Goal_(novel)" target="_blank">The Goal</a> by Eli Goldratt and Jeff Cox. Its central theme is the nature of bottlenecks, otherwise called constraints: "A bottleneck is any resource whose capacity is equal to or less than the demand placed upon it."</p><p>It is all about performance, without being anything to do with computers. It is a Socratic case study of how to implement the 5-step strategy dubbed "The Theory of Constraints" to improve the performance of a system. The five steps are set out plainly and then again in another book by Goldratt, "<a href="https://www.amazon.co.uk/Constraints-Implemented-Eliyahu-Goldratt-1990-06-02/dp/B01K2ERNG2" target="_blank">What is This Thing Called Theory of Constraints and How Should It Be Implemented?</a>"</p>
<ol style="text-align: left;">
<li>IDENTIFY the system's constraint(s).</li>
<li>Decide how to EXPLOIT the system's constraint(s).</li>
<li>SUBORDINATE everything else to the above decision.</li>
<li>ELEVATE the system's constraint(s).</li>
<li>WARNING!!!! If in the previous steps, a constraint has been broken, go back to step 1, but do not allow INERTIA to cause a system's constraint.</li>
</ol>
<p></p><p>In the factory in The Goal, the goal is to increase throughput while simultaneously reducing inventory and operating expense.</p><p>In the cloud, the goal is to increase system throughput while simultaneously reducing response time and the cost of resources.</p><h3 style="text-align: left;">The Resource Manager Plan</h3><p>The hierarchy of processes then determines who should get access to the CPU in preference to whom. It translates into a database resource manager plan. This is the 4<sup><span style="font-size: xx-small;">th</span></sup> of Goldratt's 5 steps. The higher-priority processes on the critical processing path get precedence for CPU so that they can make progress. The lower-priority processes may have to wait for CPU so they don't impede higher-priority processes (this is the 3<sup><span style="font-size: xx-small;">rd</span></sup> step).</p><p>The resource plan also manages the degree of parallelism that can be used within each consumer group, so that we don't run out of parallel query servers. Higher-priority processes may not have a high PQ limit because more of them run concurrently. Processes are mostly allocated to consumer groups through mappings of module, action, and program name; some are mapped explicitly using triggers.</p><p>Over the years, the resource manager plan for this particular system has gone through three main design iterations. The 4 lowest-priority consumer groups were added to restrict their consumption while the higher-priority groups were active.</p>
<p></p>
<table border="1" bordercolor="#808080" cellspacing="0" style="width: 100%;" valign="top">
<tbody><tr><th style="width: 8%;">Priority</th><th style="width: 16%;">1<sup><span style="font-size: xx-small;">st</span></sup> Iteration</th><th style="width: 17%;">2<sup><span style="font-size: xx-small;">nd</span></sup> Iteration</th><th style="width: 14%;">3<sup><span style="font-size: xx-small;">rd</span></sup> Iteration</th><th>Description of Consumer Group</th></tr>
<tr><td>1</td><td>PSFT_GROUP</td><td></td><td></td><td>General group for PeopleSoft application and batch processes.<br />PQ limit = ½ of CPU_COUNT (3<sup><span style="font-size: xx-small;">rd</span></sup> iteration)</td></tr>
<tr><td>2</td><td></td><td></td><td>HIGH_GROUP</td><td>For weekly stats collection process of 2 multi-billion row tables (LEDGER and JRNL_LN).<br />PQ limit = 2x CPU_COUNT</td></tr>
<tr><td>3</td><td>SUML_GROUP</td><td></td><td></td><td>Processes that refresh summary ledger tables and MVs on summary ledgers.<br />PQ limit = ¾ of CPU_COUNT</td></tr>
<tr><td>4</td><td>NVISION<br />_GROUP</td><td></td><td></td><td>nVision General Ledger reporting processes.<br />PQ limit ≈ 3/8 of CPU_COUNT</td></tr>
<tr><td>5</td><td></td><td></td><td>GLXX_GROUP</td><td>Processes that build GLXX reporting tables and do some reporting.
Run concurrently with nVision, but it is more important to complete GL reporting.<br />PQ limit = 1. No parallelism</td></tr>
<tr><td>6</td><td></td><td>PSQUERY<br />_GROUP</td><td>NVSRUN<br />_GROUP</td><td>Other queries submitted via PeopleSoft ad-hoc query tool and ad-hoc nVision<br />PQ limit = 3 - 4
</td></tr>
<tr><td>7</td><td></td><td>ESSBASE<br />_GROUP</td><td></td><td>Essbase processes.<br />PQ limit = 2 - 4</td></tr>
<tr><td>8</td><td></td><td></td><td>LOW_GROUP,<br />LOW_LIMITED<br />_GROUP</td><td>Other Processes.<br />Also deals with an Oracle bug that causes AQ$_PLSQL_NTFN% jobs to run continuously consuming CPU.<br />Actual/Estimated Time Limit</td></tr>
</tbody></table>
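<p><i>The consumer groups and directives above translate into calls to the <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_RESOURCE_MANAGER.html" target="_blank">DBMS_RESOURCE_MANAGER</a> package. The following is only an illustrative sketch: the group and plan names echo the table above, but the percentage, limits and module mapping value are invented for the example, not the values used on this system.</i></p>
<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>BEGIN
  DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP(
    consumer_group => 'NVISION_GROUP',
    comment        => 'nVision GL reporting processes');
  DBMS_RESOURCE_MANAGER.CREATE_PLAN(
    plan    => 'PSFT_PLAN_CPU8',
    comment => 'Example plan for 8 OCPUs');
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
    plan                     => 'PSFT_PLAN_CPU8',
    group_or_subplan         => 'NVISION_GROUP',
    comment                  => 'GL reporting',
    mgmt_p1                  => 30,   -- 30% of CPU at priority level 1 (illustrative)
    parallel_degree_limit_p1 => 6);   -- cap the parallel degree for this group
  -- Sessions can be mapped to a consumer group by module name (illustrative value)
  DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING(
    attribute      => DBMS_RESOURCE_MANAGER.MODULE_NAME,
    value          => 'RPTBOOK',
    consumer_group => 'NVISION_GROUP');
  DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
END;
/</code></span></pre>
<p><i>All changes are made inside a pending area, so the whole plan is validated and published atomically.</i></p>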
<p>This approach has certainly prevented the processes in GLXX_GROUP, ad-hoc queries in the PSQUERY_GROUP, and other processes in the LOW_GROUP from taking CPU away from critical processes in PSFT_GROUP, NVISION_GROUP and SUML_GROUP. We also adjusted the configuration of the application to reduce the number of processes that can run concurrently.</p><h4 style="text-align: left;">What if we decide to change the number of CPUs?</h4><p>When this system ran on an on-premises machine, we had a single resource plan because the number of CPUs was fixed. </p><p>Now that it has moved to the cloud, we can choose how many CPUs to pay for. Performance was tested with various configurations. Consequently, we have created several different resource plans, with different PQ limits, for different numbers of CPUs. When we change the number of CPUs, we just specify the corresponding resource manager plan. </p><p>Some other database parameters have been set to lower, non-default values to restrict overall SQL parallelism and the number of concurrent processes on the database job scheduler. These are also changed in line with the number of CPUs.</p>
<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>alter system set RESOURCE_MANAGER_PLAN=PSFT_PLAN_CPU8 scope=both sid='*';
alter system set JOB_QUEUE_PROCESSES=8 scope=both sid='*';
alter system set PARALLEL_MAX_SERVERS=40 scope=both sid='*';
alter system set PARALLEL_SERVERS_TARGET=40 scope=both sid='*';</code></span></pre><p>It is possible that in the future we might automate changing the number of CPUs by schedule. It is then easy to switch resource manager plans by simply setting an initialisation parameter. </p><p>At the moment, we have one plan in force at all times. It is also possible to change plans on a schedule using scheduler windows, and you can still intervene manually by opening a window.</p><h3 style="text-align: left;">TL;DR In the Cloud, Performance is Instrumented as Cost</h3><p><i><b>You can have as much CPU and performance as you are willing to pay for.</b></i></p><p>By configuring the resource manager to prioritise CPU allocation to high-priority processes, ones for which users are waiting, over lower-priority ones, a system can achieve its performance objectives while consuming fewer resources.</p><div class="blogger-post-footer"><a href="http://www.go-faster.co.uk/">©David Kurtz</a></div>David Kurtzhttp://www.blogger.com/profile/08139761793598085235noreply@blogger.com0tag:blogger.com,1999:blog-14654018.post-51128631223364580152023-04-17T08:46:00.001+01:002023-06-02T14:52:20.869+01:00Investigating Unfamiliar PL/SQL with the Hierarchical ProfilerThe PL/SQL profiler can tell you how much time you spend where in your PL/SQL code (see my presentation <a href="https://www.go-faster.co.uk/p/tuning-with-plsql-performance-profiler.html" target="_blank">Performance Tuning with the PL/SQL Performance Profilers</a>). Therefore it also tells you which code blocks were executed, and in which test run. If you are debugging code with which you are unfamiliar, this can provide insight into where to focus attention and determine what is going on.<h4 style="text-align: left;">Example Problem</h4><p>I was looking at a third-party application that uses the database job scheduler to run multiple concurrent batch jobs. 
I needed to work out why they were not balancing properly between the database instances. This application has its own scheduler package. It is driven by metadata to define the jobs to be submitted. This package then calls the delivered DBMS_SCHEDULER package. However, this third-party package is quite complicated: there are lots of similar calls, and it is difficult to work out what was executed just by reading the code.</p><p>I ran the application having enabled the hierarchical profiler, <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_HPROF.html#GUID-D4E1F35B-25D9-45A2-8CB8-4E174931380A" target="_blank">DBMS_HPROF</a>. I was able to query the profiler tables to find the calls to DBMS_SCHEDULER that were executed.</p><h4 style="text-align: left;">Querying the Profiler Tables</h4><p>Each time DBMS_HPROF is run, the data is tagged with a separate run ID, so if I do different tests I can easily separate them. However, in this example, I have run only one test.</p>
<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>SELECT *
FROM dbmshp_runs
ORDER BY runid;
Run
ID RUN_TIMESTAMP TOTAL_ELAPSED_TIME RUN_COMMENT TRACE_ID
---- ------------------------------ ------------------ -------------------------------------------------- ----------
3 28-MAR-23 19.23.20.891595000 78254498 2</code></span></pre>
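For context, this is broadly how the profiler data above can be captured, using the file-based flow of DBMS_HPROF. <i>This is an illustrative sketch: the directory object, file name and run comment are my own, and RUN_THE_TEST stands for the application code being profiled. From Oracle 18c the profiler can also write directly to database tables, which is where the TRACE_ID column above comes from.</i>
<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>DECLARE
  l_runid NUMBER;
BEGIN
  DBMS_HPROF.START_PROFILING(location => 'HPROF_DIR', filename => 'sched_test.trc');
  run_the_test;  -- placeholder for the code being profiled
  DBMS_HPROF.STOP_PROFILING;
  -- Load the raw trace into the DBMSHP_% tables, returning a new run ID
  l_runid := DBMS_HPROF.ANALYZE(location    => 'HPROF_DIR',
                                filename    => 'sched_test.trc',
                                run_comment => 'scheduler test');
  DBMS_OUTPUT.PUT_LINE('runid = '||l_runid);
END;
/</code></span></pre>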
Usually, I am interested in improving performance, so I look for the code that took the most time and profile the code blocks by elapsed time. However, this time, I have sorted them by module and line number so I can see which code blocks were executed.
<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: 70%;"><code>BREAK ON OWNER ON TYPE ON module skip 1
SELECT fi.symbolid, fi.owner, fi.type, fi.module, fi.function, fi.line#, fi.namespace, fi.calls, fi.function_elapsed_time, fi.sql_id
FROM dbmshp_function_info fi
WHERE fi.runid = 3
ORDER BY fi.owner, fi.module, fi.line#;<br /></code></span></pre>
The profile includes the application code and also the Oracle packages owned by SYS.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: 70%;"><code> Symbol Line Name Elapsed
ID OWNER TYPE MODULE FUNCTION # Space CALLS Time SQL_ID
------- ------------------ --------------- ------------------------- ---------------------------------------- ----- ----- ------- ---------- -------------
8 XXXXX_CUST PACKAGE BODY CUST_PARALLEL_JOBS ISJOBSRUNNING 6 PLSQL 38 708
9 ISJOBSRUNNING.C_RUNNING_JOBS_CNT 12 PLSQL 38 266
137 __static_sql_exec_line13 13 SQL 38 9681 2y7y7t8bf4ykw
133 __sql_fetch_line23 23 SQL 38 2026632 2y7y7t8bf4ykw
6 GETJOBSSTATUS 42 PLSQL 7 150
7 GETJOBSSTATUS.C_JOB_STATUS 48 PLSQL 7 59
138 __static_sql_exec_line49 49 SQL 7 565 d5g73bnmxjuqd
134 __sql_fetch_line59 59 SQL 7 238 d5g73bnmxjuqd
3 <u>CUST_SIMULATE_SURRENDER</u> 105 PLSQL 1 1232
5 CUST_SIMULATE_SURRENDER.C_JOB_GROUPS 110 PLSQL 1 16
135 __static_sql_exec_line111 111 SQL 1 159 1xrrajz8mgbhs
4 CUST_SIMULATE_SURRENDER.C_CFG 119 PLSQL 1 12
136 __static_sql_exec_line120 120 SQL 1 90 9ytv0rhjjp3mr
131 __sql_fetch_line160 160 SQL 1 118 1xrrajz8mgbhs
132 __sql_fetch_line165 165 SQL 1 135 9ytv0rhjjp3mr
…
54 XXXXX_SCHEDULER PACKAGE BODY SCHEDULER_ENGINE __pkg_init 0 PLSQL 1 5
55 XXXXX_SCHEDULER PACKAGE SPEC SCHEDULER_ENGINE __pkg_init 0 PLSQL 1 5
52 XXXXX_SCHEDULER PACKAGE BODY SCHEDULER_ENGINE <u>RUN_JOB</u> 770 PLSQL 7 176
53 SET_JOB_ARGUMENT 1317 PLSQL 21 202
178 __static_sql_exec_line1355 1355 SQL 21 4733 3h8uatusjv84c
…
118 SYS PACKAGE BODY DBMS_SCHEDULER CREATE_PROGRAM 15 PLSQL 1 24
121 DROP_PROGRAM 43 PLSQL 2 98
119 DEFINE_PROGRAM_ARGUMENT 112 PLSQL 3 186
122 DROP_PROGRAM_ARGUMENT 211 PLSQL 6 363
117 CREATE_JOB 432 PLSQL 7 428
124 <u>RUN_JOB</u> 546 PLSQL 7 239
120 DROP_JOB 696 PLSQL 14 7484
123 <u>ENABLE</u> 2992 PLSQL 1 87
125 SET_ATTRIBUTE 3063 PLSQL 14 957
126 SET_ATTRIBUTE 3157 PLSQL 14 2923
127 SET_ATTRIBUTE_NULL 3274 PLSQL 7 42
116 CHECK_SYS_PRIVS 3641 PLSQL 69 153470
…
</code></span></pre>
The hierarchical profiler tracks which code blocks call which code blocks, so I can perform a hierarchical query starting where the parent is null.
<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>SELECT symbolid, parentsymid,
RPAD(' ', (level-1)*2, ' ') || a.name AS name,
a.line#, a.calls,
a.subtree_elapsed_time,
a.function_elapsed_time
FROM (SELECT fi.symbolid,
pci.parentsymid,
RTRIM(fi.owner || '.' || fi.module || '.' || NULLIF(fi.function, fi.module), '.') AS name,
fi.line#,
NVL(pci.subtree_elapsed_time, fi.subtree_elapsed_time) AS subtree_elapsed_time,
NVL(pci.function_elapsed_time, fi.function_elapsed_time) AS function_elapsed_time,
NVL(pci.calls, fi.calls) AS calls
FROM dbmshp_function_info fi
LEFT JOIN dbmshp_parent_child_info pci ON fi.runid = pci.runid AND fi.symbolid = pci.childsymid
WHERE fi.runid = 3
AND NOT fi.module LIKE 'DBMS_HPROF%'
) a
CONNECT BY a.parentsymid = PRIOR a.symbolid
START WITH a.parentsymid IS NULL;
</code></span></pre>
I can see that CUST_PARALLEL_JOBS.CUST_SIMULATE_SURRENDER calls XXXXX_SCHEDULER.SCHEDULER_ENGINE.RUN_JOB and that calls DBMS_SCHEDULER.RUN_JOB.
<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: 70%;"><code>
Symbol Parent Line Elapsed Elapsed
ID Sym ID NAME # CALLS Time Time
------- ------- ---------------------------------------------------------------------------------------------------- ----- ------- ---------- ----------
18 XXXXX_CUST_ADDON.CUST_SCHED_SIMSURRENDERS 1 1 78254334 570
3 18 <u>XXXXX_CUST.CUST_PARALLEL_JOBS.CUST_SIMULATE_SURRENDER</u> 105 1 77139478 1232
4 3 XXXXX_CUST.CUST_PARALLEL_JOBS.CUST_SIMULATE_SURRENDER.C_CFG 119 1 102 12
136 4 XXXXX_CUST.CUST_PARALLEL_JOBS.__static_sql_exec_line120 120 1 90 90
…
52 3 <u>XXXXX_SCHEDULER.SCHEDULER_ENGINE.RUN_JOB</u> 770 7 58708 176
56 52 XXXXX_SCHEDULER.SCHEDULER_UTILS.LOG_AUDIT_EVENT 173 7 45 40
115 56 SYS.DBMS_OUTPUT.PUT_LINE 109 41 43 43
57 52 XXXXX_SCHEDULER.SCHEDULER_UTILS.SCHEMA_OWNER 238 7 24 24
124 52 SYS.<u>DBMS_SCHEDULER.RUN_JOB</u> 546 7 58463 239
104 124 SYS.DBMS_ISCHED.CHECK_COMPAT 3509 7 11 11
112 124 SYS.DBMS_ISCHED.RUN_JOB 242 7 44391 44391
…
</code></span></pre>
Now I know which code to examine. This query outer joins the profiler data to the source code.
N.B. Any wrapped code will not be available in the <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/refrn/ALL_SOURCE.html" target="_blank">ALL_SOURCE</a> view. You might want to unwrap it, at least in a test environment (see <a href="https://www.salvis.com/blog/2015/05/17/introducing-plsql-unwrapper-for-sql-developer/" target="_blank">Philipp Salvisberg's PL/SQL Unwrapper for SQL Developer</a>).
<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>break on owner on name skip 1 on type
SELECT s.owner, s.type, s.name, h.function, s.line,
h.function_elapsed_time/1e6 function_elapsed_time, h.calls, s.text
FROM all_source s
LEFT OUTER JOIN dbmshp_function_info h
ON s.owner = h.owner and s.name = h.module and s.type = h.type and s.line = h.line# and h.runid = 3
WHERE (( s.owner = 'XXXXX_CUST'
AND s.name = 'CUST_PARALLEL_JOBS'
AND s.type = 'PACKAGE BODY'
AND s.line between 100 and 300
) OR ( s.owner = 'XXXXX_SCHEDULER'
AND s.name = 'SCHEDULER_ENGINE'
AND s.type = 'PACKAGE BODY'
AND s.line between 770 and 858
))
ORDER BY s.owner, s.name, s.type, s.line
/
</code></span></pre>
Now, I can scan through the code and see how the code blocks were called.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: 50%;"><code> Function
Elapsed
OWNER TYPE NAME FUNCTION LINE Time CALLS TEXT
--------------- ------------ -------------------- ------------------------- ----- ----------- ------- -------------------------------------------------------------------------------------------------------------
XXXXX_CUST PACKAGE BODY CUST_PARALLEL_JOBS CUST_SIMULATE_SURRENDER 105 .001232 1 PROCEDURE Cust_Simulate_Surrender (pi_bus_in IN SrvContext, pio_err IN OUT SrvErr)
106 IS
…
213 -- run current job when it is not started yet
214 IF l_cfg_tbl(indx_job).allowed = 'Y' -- flag Y - to be started
215 THEN
216 -- run current job
217 <u>XXXXX_scheduler.scheduler_engine.Run_Job (</u>l_cfg_tbl(indx_job).XXXXX_job_name);
218 --<u>XXXXX_scheduler.scheduler_engine.enable_Job (</u>l_cfg_tbl(indx_job).XXXXX_job_name);
…
XXXXX_SCHEDULER PACKAGE BODY SCHEDULER_ENGINE RUN_JOB 770 .000176 7 PROCEDURE RUN_JOB( PI_JOB_NAME SCHEDULER_JOBS.JOB_NAME%TYPE )
…
778 IS
779 BEGIN
780 <u>DBMS_SCHEDULER.RUN_JOB(</u>
781 SCHEDULER_UTILS.SCHEMA_OWNER || '."' || PI_JOB_NAME || '"', USE_CURRENT_SESSION=>FALSE );
782
783
784
785 SCHEDULER_UTILS.LOG_AUDIT_EVENT( 'RunJob', TRUE, PI_OBJECT_NAME => PI_JOB_NAME );
786 EXCEPTION
787 WHEN OTHERS THEN
…
798 END;
799
800
801 --dmk 29.3.2023 added
802 PROCEDURE ENABLE_JOB( PI_JOB_NAME SCHEDULER_JOBS.JOB_NAME%TYPE )
…
810 IS
811 BEGIN
812 <u>DBMS_SCHEDULER.Enable(</u>
813 SCHEDULER_UTILS.SCHEMA_OWNER || '."' || PI_JOB_NAME || '"');
814
815
816
817 SCHEDULER_UTILS.LOG_AUDIT_EVENT( 'Enable_Job', TRUE, PI_OBJECT_NAME => PI_JOB_NAME );
818 EXCEPTION
819 WHEN OTHERS THEN
…
830 END;
</code></span></pre>By following the profiler data, I have found that <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_SCHEDULER.html#GUID-796DC891-31D3-4DEC-9C18-AAA40A51C67E" target="_blank">DBMS_SCHEDULER.RUN_JOB</a> was used. I was then able to add an alternative procedure that calls <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_SCHEDULER.html#GUID-33CD9F19-8448-4BA8-AAB3-3B82A670085D" target="_blank">DBMS_SCHEDULER.ENABLE</a> and call that from the custom application code.<div class="blogger-post-footer"><a href="http://www.go-faster.co.uk/">©David Kurtz</a></div>David Kurtzhttp://www.blogger.com/profile/08139761793598085235noreply@blogger.com0tag:blogger.com,1999:blog-14654018.post-46801075584903615022023-04-13T08:55:00.002+01:002023-04-13T12:57:06.760+01:00Using SQL Profiles to Tackle High Parse Time and CPU Consumption<h4>The Challenge of Dynamic SQL with Literals </h4>
<div>The following example is taken from a PeopleSoft General Ledger system. The SQL was generated by the nVision reporting tool (some literal values have been obfuscated).</div>
<div><pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>SELECT L4.TREE_NODE_NUM,SUM(A.POSTED_TOTAL_AMT)
FROM PS_XX_SUM_XXXXX_VW A, PSTREESELECT10 L4, PSTREESELECT10 L2
WHERE A.LEDGER='X_UKMGT'
AND A.FISCAL_YEAR=2022 AND A.ACCOUNTING_PERIOD=1
AND L4.SELECTOR_NUM=415 AND A.CHARTFIELD3=L4.RANGE_FROM_10
AND L2.SELECTOR_NUM=416 AND A.ACCOUNT=L2.RANGE_FROM_10
AND (A.DEPTID BETWEEN '10000' AND '18999' OR
A.DEPTID BETWEEN '20000' AND '29149' OR A.DEPTID='29156' OR
A.DEPTID='29158' OR A.DEPTID BETWEEN '29165' AND '29999' OR A.DEPTID
BETWEEN '30000' AND '39022' OR A.DEPTID BETWEEN '39023' AND '39999' OR
A.DEPTID BETWEEN '40000' AND '49999' OR A.DEPTID BETWEEN '50000' AND
'59999' OR A.DEPTID BETWEEN '60000' AND '69999' OR A.DEPTID BETWEEN
'70000' AND '79999' OR A.DEPTID BETWEEN '80000' AND '89999' OR
A.DEPTID='29150' OR A.DEPTID=' ')
AND A.CHARTFIELD1='0120413'
AND A.CURRENCY_CD='GBP'
GROUP BY L4.TREE_NODE_NUM
Plan hash value: 1653134809
</code></span><span style="font-size: 60%;">
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 27 (100)| | | | | | |
| 1 | PX COORDINATOR | | | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10006 | 1 | 29 | 27 (63)| 00:00:01 | | | Q1,06 | P->S | QC (RAND) |
| 3 | HASH GROUP BY | | 1 | 29 | 27 (63)| 00:00:01 | | | Q1,06 | PCWP | |
| 4 | PX RECEIVE | | 1 | 29 | 27 (63)| 00:00:01 | | | Q1,06 | PCWP | |
| 5 | PX SEND HASH | :TQ10005 | 1 | 29 | 27 (63)| 00:00:01 | | | Q1,05 | P->P | HASH |
| 6 | HASH GROUP BY | | 1 | 29 | 27 (63)| 00:00:01 | | | Q1,05 | PCWP | |
| 7 | HASH JOIN | | 1 | 29 | 27 (63)| 00:00:01 | | | Q1,05 | PCWP | |
| 8 | JOIN FILTER CREATE | :BF0000 | 1 | 16 | 25 (68)| 00:00:01 | | | Q1,05 | PCWP | |
| 9 | PX RECEIVE | | 1 | 16 | 25 (68)| 00:00:01 | | | Q1,05 | PCWP | |
| 10 | PX SEND HYBRID HASH | :TQ10003 | 1 | 16 | 25 (68)| 00:00:01 | | | Q1,03 | P->P | HYBRID HASH|
| 11 | STATISTICS COLLECTOR | | | | | | | | Q1,03 | PCWC | |
| 12 | VIEW | VW_GBC_10 | 1 | 16 | 25 (68)| 00:00:01 | | | Q1,03 | PCWP | |
| 13 | HASH GROUP BY | | 1 | 67 | 25 (68)| 00:00:01 | | | Q1,03 | PCWP | |
| 14 | PX RECEIVE | | 1 | 67 | 25 (68)| 00:00:01 | | | Q1,03 | PCWP | |
| 15 | PX SEND HASH | :TQ10002 | 1 | 67 | 25 (68)| 00:00:01 | | | Q1,02 | P->P | HASH |
| 16 | HASH GROUP BY | | 1 | 67 | 25 (68)| 00:00:01 | | | Q1,02 | PCWP | |
| 17 | HASH JOIN | | 60 | 4020 | 24 (67)| 00:00:01 | | | Q1,02 | PCWP | |
| 18 | JOIN FILTER CREATE | :BF0001 | 60 | 3120 | 22 (73)| 00:00:01 | | | Q1,02 | PCWP | |
| 19 | PX RECEIVE | | 60 | 3120 | 22 (73)| 00:00:01 | | | Q1,02 | PCWP | |
| 20 | PX SEND HYBRID HASH | :TQ10000 | 60 | 3120 | 22 (73)| 00:00:01 | | | Q1,00 | P->P | HYBRID HASH|
| 21 | STATISTICS COLLECTOR | | | | | | | | Q1,00 | PCWC | |
| 22 | PX BLOCK ITERATOR | | 60 | 3120 | 22 (73)| 00:00:01 | 29 | 29 | Q1,00 | PCWC | |
| 23 | MAT_VIEW REWRITE ACCESS INMEMORY FULL| PS_XX_SUM_XXXXX_MV | 60 | 3120 | 22 (73)| 00:00:01 | 29 | 29 | Q1,00 | PCWP | |
| 24 | PX RECEIVE | | 306 | 4590 | 2 (0)| 00:00:01 | | | Q1,02 | PCWP | |
| 25 | PX SEND HYBRID HASH | :TQ10001 | 306 | 4590 | 2 (0)| 00:00:01 | | | Q1,01 | P->P | HYBRID HASH|
| 26 | JOIN FILTER USE | :BF0001 | 306 | 4590 | 2 (0)| 00:00:01 | | | Q1,01 | PCWP | |
| 27 | PX BLOCK ITERATOR | | 306 | 4590 | 2 (0)| 00:00:01 | 416 | 416 | Q1,01 | PCWC | |
| 28 | TABLE ACCESS STORAGE FULL | PSTREESELECT10 | 306 | 4590 | 2 (0)| 00:00:01 | 416 | 416 | Q1,01 | PCWP | |
| 29 | PX RECEIVE | | 202 | 2626 | 2 (0)| 00:00:01 | | | Q1,05 | PCWP | |
| 30 | PX SEND HYBRID HASH | :TQ10004 | 202 | 2626 | 2 (0)| 00:00:01 | | | Q1,04 | P->P | HYBRID HASH|
| 31 | JOIN FILTER USE | :BF0000 | 202 | 2626 | 2 (0)| 00:00:01 | | | Q1,04 | PCWP | |
| 32 | PX BLOCK ITERATOR | | 202 | 2626 | 2 (0)| 00:00:01 | 415 | 415 | Q1,04 | PCWC | |
| 33 | TABLE ACCESS STORAGE FULL | PSTREESELECT10 | 202 | 2626 | 2 (0)| 00:00:01 | 415 | 415 | Q1,04 | PCWP | |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
</span><span style="font-size: x-small;">
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$6240F0FF
12 - SEL$B80655F7 / VW_GBC_10@SEL$9C8D6CC0
13 - SEL$B80655F7
23 - SEL$B80655F7 / PS_XX_SUM_XXXXX_MV@SEL$CAD4EEF6
28 - SEL$B80655F7 / L2@SEL$1
33 - SEL$6240F0FF / L4@SEL$1
… </span></pre></div>In my example, ASH sampled 276 different SQL IDs. Each one was only executed once. There may have been more statements, but ASH only persists one sample every 10s. Cumulatively, they consumed 2843 seconds of DB time in SQL hard parse.<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: 70%;"><code> Plan
SQL Plan Force Matching SQL Plan Parse
# OPRID RUNCNTLID ACTION SQL_ID Hash Value Signature IDs Execs Secs Table Name
-- ------------ ---------------------- -------------------------------- ------------- ------------ --------------------- ------ ------ ------- ------------------
1 NVISION NVS_RPTBOOK_99 PI=9984520:UKGL999I:12345 01g5hvs91k4hn 1653134809 1995330195085985689 276 276 2843 PS_XX_SUM_XXXXX_MV
…
</code></span></pre>This is one of at least 276 different SQL statements that all have the same force-matching signature. The statements are essentially the same but differ in some of their literal values. That means that the database has to treat each one as a different SQL statement that must be fully parsed separately. <div>SQL Parse involves checking the statement is syntactically correct, and that the user has permission to access the objects, then during the SQL optimization stage the optimizer decides how to execute the statement before it moves to row source generation. </div><div>If the statement has been parsed previously and is still in the shared pool, Oracle can skip the optimization and row source generation stages. This is often called soft parse. </div><div><h4>SQL Optimization </h4></div><div>During the optimization stage, the optimizer calculates the 'cost' of different possible execution plans. Depending upon the SQL, the optimizer considers different table join orders, different table join methods, and different SQL transformations. The optimizer cost is an estimation of the time that it will take to execute a particular plan. The unit of cost is roughly equivalent to the duration of a single block read. More expensive plans are abandoned as they become more expensive than the cheapest known plan so far. Thus the 'cost-based' optimizer produces the cheapest plan. However, the process of optimization consumes time and CPU. </div><div>If I write SQL that is executed many times with bind variables rather than literals, then I should avoid some hard parses and the associated CPU consumption. Oracle has always recommended using bind variables rather than literals to improve performance as well as protect against SQL injection. However, there are many applications that still use literals, particularly in dynamically generated SQL. Every statement has to be hard parsed, and the cumulative CPU consumption can start to become significant. 
PeopleSoft is one such application that does this in some areas of the product, but it is by no means an isolated example. </div><div>Oracle produced a feature called <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/tgsql/improving-rwp-cursor-sharing.html#GUID-1DEE6AD7-C30E-4ABB-9BFF-B5895A6E386B" target="_blank">Cursor Sharing</a>. Literals in statements are automatically converted to bind variables. It can be very effective. It does reduce SQL parse, but can sometimes also produce undesirable side effects where the execution plan may not change as the bind variable values change. </div><h4 style="text-align: left;">Hints </h4><div>Hints are directives to the optimizer. They tell it to do something or, more generally, not to do something else. If I were to add some optimizer hints to a statement that will produce the same, or a similar, execution plan, then the optimizer should do less work, consume less CPU, and take less time coming to the same or similar conclusion. </div><div> For example, if I add a LEADING hint to force the optimizer to start with a particular object, that will reduce the number of join orders to be considered. </div><div><ul style="text-align: left;"><li>A two-table query has 2 possible join orders; a LEADING hint will reduce it to 1. </li><li>A three-table query has 6 possible join orders; a LEADING hint on a single table will reduce it to 2. </li></ul>Often, it is not possible to add hints directly to the code in the application because it is all dynamically generated inside a package, or it may not be desirable to alter third-party code. In my example, the SQL was generated by compiled code within the nVision reporting tool that I cannot alter. I can't use a SQL Patch because I would need a patch for every SQL_ID, and I can't predict the SQL_IDs. Instead, I can create a force-matching SQL profile that will match every statement with the same force-matching signature. </div><div><i>N.B. 
SQL Profiles require the SQL Tuning pack licence. </i></div><h4 style="text-align: left;">Example SQL Profile </h4><div>I don't have to use the full outline of hints from the execution plan; I have chosen to apply just a few. </div><div><ul style="text-align: left;"><li>LEADING(L2): I want the query to start with the dimension table PSTREESELECT10. This will result in a change to the execution plan. </li><li>REWRITE: PS_XX_SUM_XXXXX_MV is a materialized view built on the view PS_XX_SUM_XXXXX_VW of an underlying summary ledger. Rewriting the SQL to use the materialized view is a cost-based decision. Oracle usually decides to rewrite it to use the materialized view, but with this hint I want to ensure that it always happens. </li><li>NO_PARALLEL: This query selects only a single accounting period, so it is only scanning a single partition; therefore I don't want to invoke a parallel query. </li><li>PX_JOIN_FILTER(PS_XX_SUM_XXXXX_MV@SEL$CAD4EEF6): The dimension table is equijoined to the fact table. Therefore, it is a good candidate for using a Bloom filter on the look-up of the fact table. This doesn't always happen naturally on this statement. I have had to use the query block name taken from the execution plan of the rewritten statement. The query block name is stable: it is a hash value based on the object name and the operation.<br /></li></ul>The profile is then created with DBMS_SQLTUNE.IMPORT_SQL_PROFILE.</div><div><pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>set serveroutput on
DECLARE
l_sql_text CLOB;
l_signature NUMBER;
h SYS.SQLPROF_ATTR;
…
BEGIN
…
h := SYS.SQLPROF_ATTR(
q'[BEGIN_OUTLINE_DATA]',
q'[NO_PARALLEL]',
q'[LEADING(L2)]',
q'[PX_JOIN_FILTER(PS_XX_SUM_XXXXX_MV@SEL$CAD4EEF6)]',
q'[REWRITE]',
q'[END_OUTLINE_DATA]');
l_signature := DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(l_sql_text);
DBMS_SQLTUNE.IMPORT_SQL_PROFILE (
sql_text => l_sql_text,
profile => h,
name => 'NVS_UKGL999I_FUNC_ACEXP1',
category => 'DEFAULT',
validate => TRUE,
replace => TRUE,
force_match => TRUE);
…
END;
/</code></span></pre>
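Having created the profile, I can confirm that it exists, is enabled, and is force matching by querying the DBA_SQL_PROFILES view (the profile name matches the one created above):
<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>SELECT name, category, type, status, force_matching, signature
FROM   dba_sql_profiles
WHERE  name = 'NVS_UKGL999I_FUNC_ACEXP1';</code></span></pre>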
This is the execution plan with the SQL Profile. The note confirms that a SQL profile was used. The hint report shows the hints from the SQL Profile. </div><div>Note that the SELECTOR_NUM and CHARTFIELD1 predicates have changed.<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>SELECT L4.TREE_NODE_NUM,SUM(A.POSTED_TOTAL_AMT)
FROM PS_XX_SUM_XXXXX_VW A, PSTREESELECT10 L4, PSTREESELECT10 L2
WHERE A.LEDGER='X_UKMGT'
AND A.FISCAL_YEAR=2023 AND A.ACCOUNTING_PERIOD=1
AND L4.SELECTOR_NUM=433 AND A.CHARTFIELD3=L4.RANGE_FROM_10
AND L2.SELECTOR_NUM=434 AND A.ACCOUNT=L2.RANGE_FROM_10
AND (A.DEPTID BETWEEN '10000' AND '18999' OR
A.DEPTID BETWEEN '20000' AND '29149' OR A.DEPTID='29156' OR
A.DEPTID='29158' OR A.DEPTID BETWEEN '29165' AND '29999' OR A.DEPTID
BETWEEN '30000' AND '39022' OR A.DEPTID BETWEEN '39023' AND '39999' OR
A.DEPTID BETWEEN '40000' AND '49999' OR A.DEPTID BETWEEN '50000' AND
'59999' OR A.DEPTID BETWEEN '60000' AND '69999' OR A.DEPTID BETWEEN
'70000' AND '79999' OR A.DEPTID BETWEEN '80000' AND '89999' OR
A.DEPTID='29150' OR A.DEPTID=' ')
AND A.CHARTFIELD1='0051001'
AND A.CURRENCY_CD='GBP'
GROUP BY L4.TREE_NODE_NUM
Plan hash value: 3033847137
</code></span><span style="font-size: 80%;">
---------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 214 (100)| | | |
| 1 | SORT GROUP BY | | 5 | 400 | 214 (62)| 00:00:01 | | |
| 2 | HASH JOIN | | 2347 | 183K| 213 (62)| 00:00:01 | | |
| 3 | HASH JOIN | | 2347 | 153K| 210 (63)| 00:00:01 | | |
| 4 | JOIN FILTER CREATE | :BF0000 | 306 | 4590 | 3 (0)| 00:00:01 | | |
| 5 | PARTITION RANGE SINGLE | | 306 | 4590 | 3 (0)| 00:00:01 | 434 | 434 |
| 6 | TABLE ACCESS STORAGE FULL | PSTREESELECT10 | 306 | 4590 | 3 (0)| 00:00:01 | 434 | 434 |
| 7 | JOIN FILTER USE | :BF0000 | 26468 | 1344K| 206 (64)| 00:00:01 | | |
| 8 | PARTITION RANGE SINGLE | | 26468 | 1344K| 206 (64)| 00:00:01 | 42 | 42 |
| 9 | MAT_VIEW REWRITE ACCESS INMEMORY FULL| PS_XX_SUM_XXXXX_MV | 26468 | 1344K| 206 (64)| 00:00:01 | 42 | 42 |
| 10 | PARTITION RANGE SINGLE | | 202 | 2626 | 3 (0)| 00:00:01 | 433 | 433 |
| 11 | TABLE ACCESS STORAGE FULL | PSTREESELECT10 | 202 | 2626 | 3 (0)| 00:00:01 | 433 | 433 |
---------------------------------------------------------------------------------------------------------------------------------
</span><span style="font-size: x-small;">
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$38F8C49D
6 - SEL$38F8C49D / L2@SEL$1
9 - SEL$38F8C49D / PS_XX_SUM_XXXXX_MV@SEL$CAD4EEF6
11 - SEL$38F8C49D / L4@SEL$1
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 4 (U - Unused (1))
---------------------------------------------------------------------------
0 - STATEMENT
U - NO_PARALLEL
…
1 - SEL$38F8C49D
- LEADING(L2)
- REWRITE
9 - SEL$38F8C49D / PS_XX_SUM_XXXXX_MV@SEL$CAD4EEF6
- PX_JOIN_FILTER(PS_XX_SUM_XXXXX_MV@SEL$CAD4EEF6)
Note
-----
…
- SQL profile "NVS_UKGL999I_FUNC_ACEXP1" used for this statement
</span></pre></div><ul style="text-align: left;"><li>The new execution plan does indeed start with the dimension table </li><li>The query was rewritten to use the materialized view </li><li>A Bloom filter was used on the materialized view that is now the fact table</li><li>The NO_PARALLEL hint wasn't used because Oracle chose not to parallelise this statement anyway.
</li></ul><div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: 70%;"><code>
Plan
SQL Plan Force Matching SQL Plan Parse
# OPRID RUNCNTLID ACTION SQL_ID Hash Value Signature IDs Execs Secs Table Name
-- ------------ ---------------------- -------------------------------- ------------- ------------ --------------------- ------ ------ ------- ------------------
…
1 NVISION NVS_RPTBOOK_99 PI=9984933:UKGL278I:12345 03nwc4yy1r1r7 3033847137 1995330195085985689 138 138 1428 PS_XX_SUM_XXXXX_MV</code></span><span style="font-size: x-small;">
</span></pre></div>
Now just 1428s is spent on parse time. We only found 138 SQL IDs, but that is simply because there are fewer ASH samples now that the processing takes less time. <div>In this case, adding these hints with a SQL Profile has halved the time spent parsing this set of SQL statements.
</div><div class="blogger-post-footer"><a href="http://www.go-faster.co.uk/">©David Kurtz</a></div>David Kurtzhttp://www.blogger.com/profile/08139761793598085235noreply@blogger.com0tag:blogger.com,1999:blog-14654018.post-45677059314475502522023-04-11T14:51:00.001+01:002023-04-28T10:38:13.683+01:00Reading Trace files with SQLOracle 12.2 provided some new views that enable trace files to be read via SQL. Previously, it had been possible to do this by creating external tables, but the new views make it much easier.
You can simply query what trace files exist with SQL, and then access them without needing server access. <div><br /></div><div>This is particularly useful on some cloud platforms such as Autonomous Database, where there is no server access, even for the DBA. However, this technique is applicable to all Oracle databases. Now developers, not just DBAs, can easily obtain trace files.</div><div><br /></div><div>Lots of other people have blogged about this, but Chris Antognini makes the point extremely well:<div><div><ul style="text-align: left;"><li><a href="https://antognini.ch/2018/03/tkprofs-argument-pdbtrace/" target="_blank">TKPROF’s Argument PDBTRACE</a></li><li><a href="https://antognini.ch/2016/09/sql-trace-in-oracle-database-exadata-express-cloud-service/" target="_blank">SQL Trace in Oracle Database Exadata Express Cloud Service</a> </li></ul></div><div><a href="https://blog.psftdba.com/2023/04/oracle-sql-tracing-processes-from.html">In a post on my PeopleSoft blog</a>, I demonstrated enabling trace on an application server process. I also specified a trace file identifier. Now I can query the trace files that exist, and restrict the query by filename or date.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span><code><span style="font-size: x-small;">set pages 99
select * from gv$diag_trace_file f
where 1=1
and f.modify_time > trunc(sysdate)-1
and f.trace_filename like 'finprod%ora%.trc'
order by modify_time desc
/
</span><span style="font-size: xx-small;"> INST_ID ADR_HOME TRACE_FILENAME CHANGE_TIME MODIFY_TIME CON_ID
---------- ------------------------------------------------------------ ------------------------------ ------------------------------------ ------------------------------------ ----------
1 /u02/app/oracle/diag/rdbms/finprod/finprod1 finprod1_ora_306641.trc 23/03/2023 21.25.41.000000000 -05:00 23/03/2023 21.25.41.000000000 -05:00 0
</span></code></span></pre>
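Where server access is available, the same filename-pattern and modification-time filters can of course be applied directly to the diagnostic directory. A minimal Python sketch of the equivalent filesystem search (the directory path, default pattern, and the use of a rolling 24-hour window rather than `trunc(sysdate)-1` are illustrative assumptions):

```python
import time
from pathlib import Path

def recent_trace_files(trace_dir, pattern="finprod*ora*.trc", max_age_days=1):
    """List trace files matching a name pattern and modified recently,
    newest first - mirroring the predicates in the gv$diag_trace_file query."""
    cutoff = time.time() - max_age_days * 86400
    files = [f for f in Path(trace_dir).glob(pattern)
             if f.stat().st_mtime > cutoff]
    return sorted(files, key=lambda f: f.stat().st_mtime, reverse=True)
```

The point of the database views, of course, is that you get the same result without any filesystem access at all.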
Then I can also query the trace file contents, and even just spool it to a local file.<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>clear screen
set head off pages 0 feedback off
with x as (
select /*+LEADING(F)*/ f.trace_filename, c.line_number, c.payload
--, max(c.line_number) over (partition by c.trace_filename) max_line_number
from gv$diag_trace_file f, gv$diag_trace_file_contents c
where c.adr_home = f.adr_home
and c.trace_filename = f.trace_filename
and f.modify_time > trunc(sysdate)-1
and f.trace_filename like 'finprod%ora%306641.trc'
)
select payload from x
ORDER BY line_number
/
</code></span></pre>The contents of the spool file look just like the trace file. I can profile it with <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/tgsql/performing-application-tracing.html#GUID-31EF2BD5-28DB-488F-A855-8DA324F6970B" rel="nofollow" target="_blank">tkprof</a> or another trace profiler.<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>Trace file /u02/app/oracle/diag/rdbms/finprod/finprod1/trace/finprod1_ora_306641.trc
Oracle Database 19c EE Extreme Perf Release 19.0.0.0.0 - Production
Version 19.16.0.0.0
Build label: RDBMS_19.16.0.0.0DBRU_LINUX.X64_220701
ORACLE_HOME: /u02/app/oracle/product/19.0.0.0/dbhome_1
System name: Linux
Node name: naukp-aora101
Release: 4.14.35-2047.514.5.1.2.el7uek.x86_64
Version: #2 SMP Thu Jul 28 15:33:31 PDT 2022
Machine: x86_64
Storage: Exadata
Instance name: finprod1
Redo thread mounted by this instance: 1
Oracle process number: 225
Unix process pid: 306641, image: oracle@xxxxp-aora102
*** 2023-03-23T21:46:34.632063-04:00
*** SESSION ID:(2337.13457) 2023-03-23T21:46:34.632080-04:00
*** CLIENT ID:(NVRUNCNTL) 2023-03-23T21:46:34.632086-04:00
*** SERVICE NAME:(finprod.acme.com) 2023-03-23T21:46:34.632161-04:00
*** MODULE NAME:(RPTBOOK) 2023-03-23T21:46:34.632166-04:00
*** ACTION NAME:(PI=9980346:NVGL0042:42001) 2023-03-23T21:46:34.632171-04:00
*** CLIENT DRIVER:() 2023-03-23T21:46:34.632177-04:00
IPCLW:[0.0]{-}[RDMA]:RC: [1679622394631549]Connection 0x7f83ee131550 not formed (2). Returning retry.
IPCLW:[0.1]{E}[RDMA]:PUB: [1679622394631549]RDMA lport 0x400012c62778 dst 100.107.2.7:40056 bid 0x1805ea7b58 rval 2
</code></span></pre>
</div></div></div><div class="blogger-post-footer"><a href="http://www.go-faster.co.uk/">©David Kurtz</a></div>David Kurtzhttp://www.blogger.com/profile/08139761793598085235noreply@blogger.com0tag:blogger.com,1999:blog-14654018.post-2417304510886646252023-03-06T16:07:00.015+00:002023-05-22T14:54:58.372+01:00"In the Cloud, Performance is Instrumented as Cost"<div class="separator" style="clear: both; text-align: left;"><img align="left" border="1" data-original-height="2172" data-original-width="2156" height="88" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj0i5DWrj-d16dsUJmDx1S8Pq5eowayQvhEdIGX_2xPpclk4V4qReLVX6iDC6k5Q8InBhSw11oQnm-qrSI8qwpUxoWS_ASYwiXF4J9k4Px9qJChz5dqTcVyCfdjFWrU5k8Z8NrEvNRC-FP9XKYlevrvi8chBVRpKfzevk9aP2i6ukVfhbjmuio/w199-h200/cloud.jpg" style="border: 0px; padding: 0px,5px,0px,0px;" width="88" />About 5 years ago, I was at a conference where someone put this statement up in a PowerPoint slide. (I would like to be able to correctly credit the author, but I can't remember who it was). We all looked at it, thought about it, and said 'yes, of course' to ourselves. However, as a consultant who specialises in performance optimisation, it is only recently that I have started to have conversations with clients that reflect that idea.<p></p>
<h3 style="text-align: left;">In the good old/bad old days of 'on premises'</h3>
<p>It is not that long ago that the only option for procuring new hardware was to go through a sizing exercise that involved guessing how much you needed, allowing for future growth in data and processing volumes, and then deciding how much you were actually willing to afford, purchase it, and finally wheel it into your data centre and hope for the best.</p><p>It was then normal to want to get the best possible performance out of whatever system was installed on that hardware. It would inevitably slow down over time. Eventually, after the hardware purchase had been fully depreciated, you would have to start the whole cycle again and replace the hardware with newer hardware.</p><p>Similarly, Oracle licencing. You would have to licence Oracle for all your CPUs (there are a few exceptions where you can associate specific CPUs to specific VMs and only licence Oracle for the CPUs in those VMs). You would also have to decide how many Oracle features you licenced. Standard or Enterprise Edition? Diagnostics? Tuning? RAC? Partitioning? Compression? In-Memory?</p>
<h3 style="text-align: left;">"You are gonna need a bigger boat"</h3>
<p>Then when you encountered performance problems you did the best you could with what you had. As a consultant, there was rarely any point in saying to a customer that they had run out of resource and they needed more. The answer was usually along the lines of 'we have spent our money on that, and it has to last for five years, we have no additional budget and it has to work'. So you got on with finding the rabbit in the hat.</p><p>In the cloud, instead of purchasing hardware as a capital expense, you rent hardware as an operational expense.</p><p>You can bring your own Oracle licence (BYOL), and then you have exactly what you were previously licenced for. "<a href="https://www.oracle.com/uk/cloud/bring-your-own-license/faq/" target="_blank">At a high level, one Oracle Processor License maps to two OCPUs.</a>"</p><p>With Oracle's cloud licencing there are still lots of choices to make, not just how many CPUs and how much memory. You can choose Infrastructure as a Service (IaaS), where you rent the server and install and licence Oracle on it just as you did on-premises. You can choose different storage systems with different I/O profiles. There are different levels of PaaS that have different database features. You can go all the way up to Extreme Performance on Exadata. All of these choices have a cost consequence. Oracle provides a <a href="https://www.oracle.com/cloud/costestimator.html" target="_blank">Cloud cost estimator tool</a> (other consultancies have produced their own versions). 
These tools make the link between these choices and their costs very clear.</p><h3 style="text-align: left;">You can have as much performance as you are willing to pay for</h3><p>I have been working with a customer who is moving a PeopleSoft system from Supercluster on-premises to Exadata Cloud-at-Customer (so it is physically on-site, but in all other respects it is in the cloud). They are not bringing their own licence (BYOL). Instead, they are on a tariff of US$1.3441/OCPU/hr; we have found it easier to talk about US$1000/OCPU/month.</p>
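That round monthly figure is easy to sanity-check from the hourly tariff. A minimal Python sketch (the 31-day month is an illustrative assumption):

```python
# Convert the hourly OCPU tariff quoted in the post to an approximate
# monthly figure.  A 31-day month is assumed for illustration.
hourly_rate = 1.3441           # US$ per OCPU per hour
hours_per_month = 24 * 31      # hours in a long month

monthly_rate = hourly_rate * hours_per_month
print(f"US${monthly_rate:.2f}/OCPU/month")  # approximately US$1000
```

Hence talking in round numbers of US$1000/OCPU/month is accurate to within a few cents for a 31-day month.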
<p>Just as you would with an on-premises system, they went through a sizing exercise that predicted they needed 6 OCPUs on each of 2 RAC nodes during the day, and 10 at night. </p>
<p>It has been very helpful to have a clear quantitative definition of acceptable performance for the critical part of the system, the overnight reporting batch. "The reports need to be available to users by the start of the working day in continental Europe, at 8am CET", which is 2am EST. There is no benefit in providing additional resources to allow the batch to finish any earlier. Instead, we only need to provide as much as is necessary to reliably meet the target.</p>
<p>A performance tuning/testing exercise quickly showed that fewer than the predicted number of CPUs were actually needed: 2-4 OCPUs/node during the day is looking comfortable. The new Exadata has fewer but much faster CPUs. As we adjusted the application configuration to match, we found we were able to reduce the number of OCPUs. </p>
<p>If we hadn't already been using the base-level In Memory feature on Supercluster, then to complete the overnight batch in time for the start of the European working day, we would probably have needed 10 OCPUs/node. The base-level In Memory option brought that down to around 7. This shows the huge value of the careful use of database features and techniques to reduce CPU overhead.</p>
<p>We are not using BYOL, so we can use fully featured In Memory with a larger store. Increasing the In Memory store from 16GB to 40GB per node saved another OCPU, but cost nothing. If we had been using BYOL we would have had to pay additionally for fully featured In Memory. I doubt the marginal benefit would have justified the cost.</p>
<p>The customer has been considering switching on the extra OCPUs overnight to facilitate the batch. Doing so costs $1.33/hour, and at the end of the month, they get an invoice from Oracle. That has concentrated minds and changed behaviours. The customer understands that there is a real $ cost/saving to their business decisions.</p>
<p>One day I was asked: "What happens if we reduce the number of CPUs from 6 to 4?"</p>
<p>Essentially the batch will take longer. We are already using the database resource manager to prioritise processes when all the CPU is in use. The resource manager plan has been built to reflect the business priorities, and so keeps it fair for all users. For example, it ensures that users of the online part of the application get CPU in preference to batch processes; this is important for users in Asia who are online when the batch runs overnight in North America. We also use the resource plan to impose different parallel query limits on different groups of processes. If we are going to vary the number of CPUs, we will have to switch between different resource manager plans with different limits. We will also have to reduce the number of reports that can be concurrently executed by the application, so some application configuration has to go hand in hand with the database configuration.</p>
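The first-order effect of removing CPUs can be sketched with a simple constant-work model: if the batch consumes a roughly fixed amount of DB CPU time and keeps the CPUs busy, elapsed time scales inversely with the number of OCPUs. The 60 processor-hours total below is a purely hypothetical figure, not taken from this system:

```python
# Constant-work model: total DB CPU time for the batch is roughly fixed,
# so elapsed time scales inversely with the number of OCPUs (assuming the
# concurrent processes keep all CPUs busy).  60 processor-hours is a
# hypothetical illustrative total.
total_cpu_hours = 60.0

def batch_elapsed_hours(ocpus):
    return total_cpu_hours / ocpus

for n in (6, 4):
    print(f"{n} OCPUs -> about {batch_elapsed_hours(n):.1f} hours elapsed")
```

Under this model, dropping from 6 to 4 OCPUs stretches the batch by a factor of 1.5, which is why the batch completion target, not raw throughput, drives the CPU decision.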
<p>Effective caching by the database meant we already did relatively little physical I/O during the reporting. Most of the time was already spent on CPU. Use of In Memory further reduced physical I/O, and now nearly all the time is spent on CPU, but it also reduced the overall CPU consumption and therefore response time.</p>
<p>When we did vary the number of CPUs, we were not surprised to observe, from the Active Session History (ASH), that the total amount of database time spent on CPU by the nVision reporting processes is roughly constant (indicated by the blue area in the below charts). If we reduce the number of concurrent processes, then the batch simply runs for longer.</p><p><span style="text-align: center;"></span></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifPAxVFFafBuFISX7d4tP1WGNrdVIKcLGWxPEt0SUd6I-lokjDPrbdrZ1saUUANJtHehGgZ7rimOb0kLSTdqurYDLbB3IPWUGKsTMaU-4AcozuNbZuj2U-o0x7xTQiEMxVGyVv5VPcPzDiJc-mQlU9yivHQHEPaGfNK9l97rPsB6EFZ-FnqmY/s6103/4ocpu.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="3987" data-original-width="6103" height="196" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifPAxVFFafBuFISX7d4tP1WGNrdVIKcLGWxPEt0SUd6I-lokjDPrbdrZ1saUUANJtHehGgZ7rimOb0kLSTdqurYDLbB3IPWUGKsTMaU-4AcozuNbZuj2U-o0x7xTQiEMxVGyVv5VPcPzDiJc-mQlU9yivHQHEPaGfNK9l97rPsB6EFZ-FnqmY/s320/4ocpu.png" width="300" /></a><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhLlhXWp0ftszZTH4d_2SpBq5sAPyYh_IiHLDkpTnwdnm72J8u6TuPcqtAlv5w9P4SDtXzU1Obu45mrU8OseqWGsB6kjdB0ODmY_gWiNeec9Q9wZL1S34_yJYy0MwyDNEDuzXwFLQ0fC_DdDy2sWWegc4dc6ByrRJnlGegGPWs-YEWdwQjGERs/s6103/6ocpu.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="3987" data-original-width="6103" height="196" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhLlhXWp0ftszZTH4d_2SpBq5sAPyYh_IiHLDkpTnwdnm72J8u6TuPcqtAlv5w9P4SDtXzU1Obu45mrU8OseqWGsB6kjdB0ODmY_gWiNeec9Q9wZL1S34_yJYy0MwyDNEDuzXwFLQ0fC_DdDy2sWWegc4dc6ByrRJnlGegGPWs-YEWdwQjGERs/s320/6ocpu.png" width="300" /></a></div><br />There is no question that effective design and tuning are as important as they ever were. 
The laws of physics are the same in the cloud as they are in your own data centre. We worked hard to get the reporting to this level of performance and down to this CPU usage. </div><div class="separator" style="clear: both; text-align: left;">The difference is that now you can measure exactly how much that effort is saving you on your cloud subscription, and you can choose to spend more or less on that cloud subscription in order to achieve your business objectives.
<p>Determining the benefit to the business, in terms of the quantity and cost of users' time, remains as difficult as ever. However, it was not a major consideration in this example because this all happens before the users are at work.</p></div><div class="blogger-post-footer"><a href="http://www.go-faster.co.uk/">©David Kurtz</a></div>David Kurtzhttp://www.blogger.com/profile/08139761793598085235noreply@blogger.com0London, UK51.5072178 -0.127586223.196983963821154 -35.2838362 79.817451636178845 35.0286638tag:blogger.com,1999:blog-14654018.post-57192929747584125932022-10-14T16:36:00.003+01:002022-10-17T12:43:25.808+01:00There is no BITOR() in Oracle SQLIn Oracle SQL, I can do a bitwise AND of two numbers, but there is no equivalent function to do a bitwise OR. However, it turns out to be really easy to do using <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/BITAND.html" target="_blank">BITAND()</a>.
<br />
I was manipulating some trace values where each binary digit, or bit, corresponds to a different function (see <a href="https://blog.psftdba.com/2022/10/add-flags-to-trace-level-overrides-in.html">PeopleSoft DBA Blog: Add Flags to Trace Level Overrides in Process Definitions</a>). I wanted to ensure certain attributes were set. So, I wanted to do a bitwise OR between the current flag value and the value of the bits I wanted to set. <br />In bitwise OR, if either or both bits are set, then the answer is 1. It is like addition, except that when both bits are 1, the answer is 1 rather than 2. So I can add the two values and then subtract BITAND(). Thus:
<br />
<blockquote><table border="1" cellspacing="0" style="border-width: 0px; text-align: left;">
<tbody><tr>
<td style="border-bottom-style: solid; border-bottom-width: 2px; border-left-style: none; border-left-width: medium; border-right-style: solid; border-right-width: 2px; border-style: none solid solid none; border-top-style: none; border-top-width: medium; border-width: medium 2px 2px medium; text-align: center;"><span style="font-size: medium;">BITOR</span></td>
<td style="border-bottom-style: solid; border-bottom-width: 2px; border-left-style: solid; border-left-width: 1px; border-right-style: solid; border-right-width: 1px; border-style: none solid solid; border-top-style: none; border-top-width: medium; border-width: medium 1px 2px; text-align: center;"><span style="font-size: medium;">0</span></td>
<td style="border-bottom-style: solid; border-bottom-width: 2px; border-left-style: solid; border-left-width: 1px; border-right-style: none; border-right-width: medium; border-style: none none solid solid; border-top-style: none; border-top-width: medium; border-width: medium medium 2px 1px; text-align: center;"><span style="font-size: medium;">1</span></td>
<td style="border-style: none; border-width: medium; text-align: center;"><span style="font-size: medium;"> </span></td>
<td style="border-style: none; border-width: medium; text-align: center;"><span style="font-size: medium;"> </span></td>
<td style="border-style: none; border-width: medium; text-align: center;"><span style="font-size: medium;"> </span></td>
<td style="border-bottom-style: solid; border-bottom-width: 2px; border-left-style: none; border-left-width: medium; border-right-style: solid; border-right-width: 2px; border-style: none solid solid none; border-top-style: none; border-top-width: medium; border-width: medium 2px 2px medium; text-align: center;"><span style="font-size: medium;">+</span></td>
<td style="border-bottom-style: solid; border-bottom-width: 2px; border-left-style: solid; border-left-width: 1px; border-right-style: solid; border-right-width: 1px; border-style: none solid solid; border-top-style: none; border-top-width: medium; border-width: medium 1px 2px; text-align: center;"><span style="font-size: medium;">0</span></td>
<td style="border-bottom-style: solid; border-bottom-width: 2px; border-left-style: solid; border-left-width: 1px; border-right-style: none; border-right-width: medium; border-style: none none solid solid; border-top-style: none; border-top-width: medium; border-width: medium medium 2px 1px; text-align: center;"><span style="font-size: medium;">1</span></td>
<td style="border-style: none; border-width: medium; text-align: center;"><span style="font-size: medium;"> </span></td>
<td style="border-style: none; border-width: medium; text-align: center;"><span style="font-size: medium;"> </span></td>
<td style="border-style: none; border-width: medium; text-align: center;"><span style="font-size: medium;"> </span></td>
<td style="border-bottom-style: solid; border-bottom-width: 2px; border-left-style: none; border-left-width: medium; border-right-style: solid; border-right-width: 2px; border-style: none solid solid none; border-top-style: none; border-top-width: medium; border-width: medium 2px 2px medium; text-align: center;"><span style="font-size: medium;">BITAND</span></td>
<td style="border-bottom-style: solid; border-bottom-width: 2px; border-left-style: solid; border-left-width: 1px; border-right-style: solid; border-right-width: 1px; border-style: none solid solid; border-top-style: none; border-top-width: medium; border-width: medium 1px 2px; text-align: center;"><span style="font-size: medium;">0</span></td>
<td style="border-bottom-style: solid; border-bottom-width: 2px; border-left-style: solid; border-left-width: 1px; border-right-style: none; border-right-width: medium; border-style: none none solid solid; border-top-style: none; border-top-width: medium; border-width: medium medium 2px 1px; text-align: center;"><span style="font-size: medium;">1</span></td>
</tr>
<tr>
<td style="border-bottom-style: solid; border-bottom-width: 1px; border-left-style: none; border-left-width: medium; border-right-style: solid; border-right-width: 2px; border-style: solid solid solid none; border-top-style: solid; border-top-width: 1px; border-width: 1px 2px 1px medium; text-align: right;"><span style="font-size: medium;">0</span></td>
<td style="border-style: solid; border-width: 1px; text-align: right;"><span style="font-size: medium;">0</span></td>
<td style="border-bottom-style: solid; border-bottom-width: 1px; border-left-style: solid; border-left-width: 1px; border-right-style: none; border-right-width: medium; border-style: solid none solid solid; border-top-style: solid; border-top-width: 1px; border-width: 1px medium 1px 1px; text-align: right;"><span style="font-size: medium;">1</span></td>
<td style="border-style: none; border-width: medium; text-align: center;"><span style="font-size: medium;"> </span></td>
<td style="border-style: none; border-width: medium; text-align: center;"><span style="font-size: medium;">=</span></td>
<td style="border-style: none; border-width: medium; text-align: center;"><span style="font-size: medium;"> </span></td>
<td style="border-bottom-style: solid; border-bottom-width: 1px; border-left-style: none; border-left-width: medium; border-right-style: solid; border-right-width: 2px; border-style: solid solid solid none; border-top-style: solid; border-top-width: 1px; border-width: 1px 2px 1px medium; text-align: center;"><span style="font-size: medium;">0</span></td>
<td style="border-style: solid; border-width: 1px; text-align: center;"><span style="font-size: medium;">0</span></td>
<td style="border-bottom-style: solid; border-bottom-width: 1px; border-left-style: solid; border-left-width: 1px; border-right-style: none; border-right-width: medium; border-style: solid none solid solid; border-top-style: solid; border-top-width: 1px; border-width: 1px medium 1px 1px; text-align: center;"><span style="font-size: medium;">1</span></td>
<td style="border-style: none; border-width: medium; text-align: center;"><span style="font-size: medium;"> </span></td>
<td style="border-style: none; border-width: medium; text-align: center;"><span style="font-size: medium;">-</span></td>
<td style="border-style: none; border-width: medium; text-align: center;"><span style="font-size: medium;"> </span></td>
<td style="border-bottom-style: solid; border-bottom-width: 1px; border-left-style: none; border-left-width: medium; border-right-style: solid; border-right-width: 2px; border-style: solid solid solid none; border-top-style: solid; border-top-width: 1px; border-width: 1px 2px 1px medium; text-align: right;"><span style="font-size: medium;">0</span></td>
<td style="border-style: solid; border-width: 1px; text-align: right;"><span style="font-size: medium;">0</span></td>
<td style="border-bottom-style: solid; border-bottom-width: 1px; border-left-style: solid; border-left-width: 1px; border-right-style: none; border-right-width: medium; border-style: solid none solid solid; border-top-style: solid; border-top-width: 1px; border-width: 1px medium 1px 1px; text-align: right;"><span style="font-size: medium;">0</span></td>
</tr>
<tr>
<td style="border-bottom-style: none; border-bottom-width: medium; border-left-style: none; border-left-width: medium; border-right-style: solid; border-right-width: 2px; border-style: solid solid none none; border-top-style: solid; border-top-width: 1px; border-width: 1px 2px medium medium; text-align: right;"><span style="font-size: medium;">1</span></td>
<td style="border-bottom-style: none; border-bottom-width: medium; border-left-style: solid; border-left-width: 1px; border-right-style: solid; border-right-width: 1px; border-style: solid solid none; border-top-style: solid; border-top-width: 1px; border-width: 1px 1px medium; text-align: right;"><span style="font-size: medium;">1</span></td>
<td style="border-bottom-style: none; border-bottom-width: medium; border-left-style: solid; border-left-width: 1px; border-right-style: none; border-right-width: medium; border-style: solid none none solid; border-top-style: solid; border-top-width: 1px; border-width: 1px medium medium 1px; text-align: right;"><span style="font-size: medium;">1</span></td>
<td style="border-style: none; border-width: medium; text-align: center;"><span style="font-size: medium;"> </span></td>
<td style="border-style: none; border-width: medium; text-align: center;"><span style="font-size: medium;"> </span></td>
<td style="border-style: none; border-width: medium; text-align: center;"><span style="font-size: medium;"> </span></td>
<td style="border-bottom-style: none; border-bottom-width: medium; border-left-style: none; border-left-width: medium; border-right-style: solid; border-right-width: 2px; border-style: solid solid none none; border-top-style: solid; border-top-width: 1px; border-width: 1px 2px medium medium; text-align: center;"><span style="font-size: medium;">1</span></td>
<td style="border-bottom-style: none; border-bottom-width: medium; border-left-style: solid; border-left-width: 1px; border-right-style: solid; border-right-width: 1px; border-style: solid solid none; border-top-style: solid; border-top-width: 1px; border-width: 1px 1px medium; text-align: center;"><span style="font-size: medium;">1</span></td>
<td style="border-bottom-style: none; border-bottom-width: medium; border-left-style: solid; border-left-width: 1px; border-right-style: none; border-right-width: medium; border-style: solid none none solid; border-top-style: solid; border-top-width: 1px; border-width: 1px medium medium 1px; text-align: center;"><span style="font-size: medium;">2</span></td>
<td style="border-style: none; border-width: medium; text-align: center;"><span style="font-size: medium;"> </span></td>
<td style="border-style: none; border-width: medium; text-align: center;"><span style="font-size: medium;"> </span></td>
<td style="border-style: none; border-width: medium; text-align: center;"><span style="font-size: medium;"> </span></td>
<td style="border-bottom-style: none; border-bottom-width: medium; border-left-style: none; border-left-width: medium; border-right-style: solid; border-right-width: 2px; border-style: solid solid none none; border-top-style: solid; border-top-width: 1px; border-width: 1px 2px medium medium; text-align: right;"><span style="font-size: medium;">1</span></td>
<td style="border-bottom-style: none; border-bottom-width: medium; border-left-style: solid; border-left-width: 1px; border-right-style: solid; border-right-width: 1px; border-style: solid solid none; border-top-style: solid; border-top-width: 1px; border-width: 1px 1px medium; text-align: right;"><span style="font-size: medium;">0</span></td>
<td style="border-bottom-style: none; border-bottom-width: medium; border-left-style: solid; border-left-width: 1px; border-right-style: none; border-right-width: medium; border-style: solid none none solid; border-top-style: solid; border-top-width: 1px; border-width: 1px medium medium 1px; text-align: right;"><span style="font-size: medium;">1</span></td>
</tr>
</tbody></table></blockquote><div>
Or I could write it as </div><div><blockquote><b>BITOR(x,y) = x + y - BITAND(x,y)</b></blockquote>
<p>Here is a simple example with two decimal numbers expressed in binary. The results of AND and OR operations are below, with their decimal values.
</p><pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: small;"><code> 27 = 00011011
42 = 00101010
AND = 00001010 = 10
OR = 00111011 = 59
</code></span></pre>
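The identity works because x + y counts the shared bits twice, and BITAND(x,y) subtracts exactly that overlap. It can be sanity-checked outside the database with a short Python sketch, where & stands in for BITAND and | is the native bitwise OR:

```python
def bitor_via_bitand(x: int, y: int) -> int:
    # BITOR(x,y) = x + y - BITAND(x,y)
    return x + y - (x & y)

# The worked example above: 27 = 00011011, 42 = 00101010
assert 27 & 42 == 10                    # AND = 00001010
assert bitor_via_bitand(27, 42) == 59   # OR  = 00111011

# The identity agrees with the native OR for any pair of non-negative integers
for x in range(64):
    for y in range(64):
        assert bitor_via_bitand(x, y) == x | y
```

Note that the identity only holds for non-negative integers, which is all that BITAND is being applied to here.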
I can then write a simple SQL expression to calculate this, and perhaps put it into a PL/SQL function thus:<br />
<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: small;"><code>WITH FUNCTION bitor(p1 INTEGER, p2 INTEGER) RETURN INTEGER IS
BEGIN
RETURN <b>p1+p2-bitand(p1,p2);</b>
END;
SELECT BITAND(27,42)
, 27+42-BITAND(27,42)
, bitor(27,42)
FROM DUAL
/
BITAND(27,42) 27+42-BITAND(27,42) BITOR(27,42)
------------- ------------------- ------------
10 59 59
</code></span></pre><p></p></div><div class="blogger-post-footer"><a href="http://www.go-faster.co.uk/">©David Kurtz</a></div>David Kurtzhttp://www.blogger.com/profile/08139761793598085235noreply@blogger.com0tag:blogger.com,1999:blog-14654018.post-34199107041517743282022-09-20T09:00:00.012+01:002022-09-27T15:13:53.329+01:00No Execution Plan Survives Contact with the Optimizer UntransformedOne of the benefits of attending Oracle conferences is that by listening and talking to other people I get a different perspective on things. Sometimes, something gives me an idea or reminds me of the importance of something that I don't use often enough.
I was talking with <a href="https://chandlerdba.com/" target="_blank">Neil Chandler</a> about SQL Query Transformation. We came up with a variation of a <a href="https://www.oxfordreference.com/view/10.1093/acref/9780191826719.001.0001/q-oro-ed4-00007547" target="_blank">well known quote</a>:<div><blockquote><blockquote class="twitter-tweet"><p dir="ltr" lang="en">Chatting with <a href="https://twitter.com/ChandlerDBA?ref_src=twsrc%5Etfw">@ChandlerDBA</a> <a href="https://twitter.com/hashtag/aced?src=hash&ref_src=twsrc%5Etfw">#aced</a> at <a href="https://twitter.com/hashtag/OUGIreland?src=hash&ref_src=twsrc%5Etfw">#OUGIreland</a> <a href="https://twitter.com/UKOUG?ref_src=twsrc%5Etfw">@UKOUG</a> today we came to the conclusion that <br />"No plan of execution survives contact with the optimizer untransformed!"</p>— David Kurtz - /*+Go-Faster*/ Consultancy (@davidmkurtz) <a href="https://twitter.com/davidmkurtz/status/1566913472167874560?ref_src=twsrc%5Etfw">September 5, 2022</a></blockquote> <script async="" charset="utf-8" src="https://platform.twitter.com/widgets.js"></script></blockquote>
<div>It isn't completely accurate. Not every query gets transformed, but transformation occurs commonly; it made a good title, and you are reading this blog!</div><div>During SQL parse, the optimizer can transform a SQL query into another SQL query that is functionally identical but that results in an execution plan with a lower cost (and therefore should execute more quickly). Sometimes, multiple transformations are applied to a single statement. </div>
<div>The Oracle documentation describes <a href="https://docs.oracle.com/en/database/oracle/oracle-database/21/tgsql/query-transformations.html#GUID-B2914447-CD6D-411C-8467-6E10E78F3DE0" target="_blank">various forms of transformation</a>. You can see in the execution plan that something has happened, but you can't see the transformed SQL statement directly. However, it can be obtained from the optimizer trace that can be enabled by setting event 10053.
</div>
<h4 style="text-align: left;">Demonstration </h4><div>I am going to take a simple SQL query and execute it twice.</div><div><ul style="text-align: left;"><li>For the first execution, a <a href="https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/Comments.html#GUID-AD766C93-F601-48E3-A339-BCA7604B10D3" target="_blank">NO_UNNEST</a> hint is used to prevent the subquery from being unnested. </li><li>Optimizer trace is enabled and disabled by setting and resetting event 10053. </li><li>Trace file names are enhanced with <a href="https://docs.oracle.com/en/database/oracle/oracle-database/21/refrn/TRACEFILE_IDENTIFIER.html" target="_blank">TRACEFILE_IDENTIFIER</a>, so I know which trace file relates to which test. </li><li>Finally, I use my <a href="https://github.com/davidkurtz/orascripts/blob/master/spooltrc.sql" target="_blank">spooltrc</a> script to spool the trace file locally from <a href="https://docs.oracle.com/en/database/oracle/oracle-database/21/refrn/V-DIAG_TRACE_FILE_CONTENTS.html#GUID-D5750193-4789-4D39-B57C-250A38961605" target="_blank">V$DIAG_TRACE_FILE_CONTENTS</a> (see previous blog post <a href="https://blog.go-faster.co.uk/2022/09/obtaining-database-trace-files.html" target="_blank">Obtaining Trace Files without Access to the Database Server</a>).
</li></ul></div><div><pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: small;"><code>set pages 99 lines 200 autotrace off
alter session set tracefile_identifier='no_unnest';
alter session set events '10053 trace name context forever, level 1';
select emplid, name, effdt, last_name
from ps_names x
where x.last_name = 'Smith'
and x.name_type = 'PRI'
and x.effdt = (
SELECT /*+NO_UNNEST*/ MAX(x1.effdt)
FROM ps_names x1
WHERE x1.emplid = x.emplid
AND x1.name_type = x.name_type
AND x1.effdt <= SYSDATE)
/
alter session set events '10053 trace name context off';
@spooltrc
</code></span></pre><ul style="text-align: left;"><li>For the second execution, an <a href="https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/Comments.html#GUID-9F03EB3B-382E-4B11-97E9-D7FC14CF92E7" target="_blank">UNNEST</a> hint is used to force the optimizer to unnest the sub-query.
</li></ul><pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: small;"><code>alter session set tracefile_identifier='unnest';
alter session set events '10053 trace name context forever, level 1';
select emplid, name, effdt, last_name
from ps_names x
where x.last_name = 'Smith'
and x.name_type = 'PRI'
and x.effdt = (
SELECT /*+UNNEST*/ MAX(x1.effdt)
FROM ps_names x1
WHERE x1.emplid = x.emplid
AND x1.name_type = x.name_type
AND x1.effdt <= SYSDATE)
/
alter session set events '10053 trace name context off';
@spooltrc
</code></span></pre>
This is the execution plan from the first trace file for the statement with the NO_UNNEST hint. The select query blocks are simply numbered sequentially and thus are called SEL$1 and SEL$2. SEL$2 is the sub-query that references PS_NAMES with the row source alias X1. No query transformation has occurred.
<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: small;"><code>-------------------------------------------------------+-----------------------------------+
| Id | Operation | Name | Rows | Bytes | Cost | Time |
-------------------------------------------------------+-----------------------------------+
| 0 | SELECT STATEMENT | | | | 122 | |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED | PS_NAMES| 1 | 44 | 120 | 00:00:02 |
| 2 | INDEX SKIP SCAN | PS_NAMES| 11 | | 112 | 00:00:02 |
| 3 | SORT AGGREGATE | | 1 | 21 | | |
| 4 | FIRST ROW | | 1 | 21 | 2 | 00:00:01 |
| 5 | INDEX RANGE SCAN (MIN/MAX) | PS_NAMES| 1 | 21 | 2 | 00:00:01 |
-------------------------------------------------------+-----------------------------------+
Query Block Name / Object Alias (identified by operation id):
------------------------------------------------------------
1 - SEL$1 / "X"@"SEL$1"
2 - SEL$1 / "X"@"SEL$1"
3 - SEL$2
5 - SEL$2 / "X1"@"SEL$2"
------------------------------------------------------------
Predicate Information:
----------------------
1 - filter("X"."LAST_NAME"='Smith')
2 - access("X"."NAME_TYPE"='PRI')
2 - filter(("X"."NAME_TYPE"='PRI' AND "X"."EFFDT"=))
5 - access("X1"."EMPLID"=:B1 AND "X1"."NAME_TYPE"=:B2 AND "X1"."EFFDT"<=SYSDATE@!)
</code></span></pre>
Now, let's look at the optimizer trace file for the statement with the UNNEST hint. First, we can see the statement as submitted with its SQL_ID.
<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: small;"><code>Trace file /opt/oracle/psft/db/oracle-server/diag/rdbms/cdbhcm/CDBHCM/trace/CDBHCM_ora_21909_unnest.trc
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.7.0.0.0
…
----- Current SQL Statement for this session (sql_id=7r3mwa86fma5t) -----
select emplid, name, effdt, last_name
from ps_names x
where x.last_name = 'Smith'
and x.name_type = 'PRI'
and x.effdt = (
SELECT /*+UNNEST*/ MAX(x1.effdt)
FROM ps_names x1
WHERE x1.emplid = x.emplid
AND x1.name_type = x.name_type
AND x1.effdt <= SYSDATE)
…
</code></span></pre>
Later in the trace, we can see the fully expanded SQL statement preceded by the 'UNPARSED QUERY IS' message. </div><div><ul style="text-align: left;"><li>All the SQL language keywords have been forced into upper case.</li><li>All the object and column names have been made upper case to match the objects.</li><li>Every column and table name is double-quoted, which makes them case-sensitive. </li><li>The columns all have row source aliases. </li><li>The row sources (tables in this case) are fully qualified.</li><li>Only the literal 'Smith' is in mixed case.</li></ul></div><div>Various unparsed queries may appear in the trace as the optimizer tries and costs different transformations. These are not nicely formatted; each expanded statement is just a long string of text. The first one is the expanded form of the untransformed statement.</div><div><pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>Stmt: ******* UNPARSED QUERY IS *******
SELECT "X"."EMPLID" "EMPLID","X"."NAME" "NAME","X"."EFFDT" "EFFDT","X"."LAST_NAME" "LAST_NAME" FROM "SYSADM"."PS_NAMES"
"X" WHERE "X"."LAST_NAME"='Smith' AND "X"."NAME_TYPE"='PRI' AND "X"."EFFDT"= (SELECT /*+ UNNEST */ MAX("X1"."EFFDT")
"MAX(X1.EFFDT)" FROM "SYSADM"."PS_NAMES" "X1" WHERE "X1"."EMPLID"="X"."EMPLID" AND "X1"."NAME_TYPE"="X"."NAME_TYPE" AND
"X1"."EFFDT"<=SYSDATE@!)</code></span></pre>
Here the sub-query has been transformed into an in-line view. I have reformatted it to make it easier to read.<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>CVM: Merging complex view SEL$683B0107 (#2) into SEL$C772B8D1 (#1).
qbcp:******* UNPARSED QUERY IS *******
SELECT "X"."EMPLID" "EMPLID","X"."NAME" "NAME","X"."EFFDT" "EFFDT","X"."LAST_NAME" "LAST_NAME"
FROM (SELECT /*+ UNNEST */ MAX("X1"."EFFDT") "MAX(X1.EFFDT)","X1"."EMPLID" "ITEM_0","X1"."NAME_TYPE" "ITEM_1"
FROM "SYSADM"."PS_NAMES" "X1"
WHERE "X1"."EFFDT"<=SYSDATE@!
GROUP BY "X1"."EMPLID","X1"."NAME_TYPE") "VW_SQ_1"
,"SYSADM"."PS_NAMES" "X"
WHERE "X"."LAST_NAME"='Smith'
AND "X"."NAME_TYPE"='PRI'
AND "X"."EFFDT"="VW_SQ_1"."MAX(X1.EFFDT)"
AND "VW_SQ_1"."ITEM_0"="X"."EMPLID"
AND "VW_SQ_1"."ITEM_1"="X"."NAME_TYPE"</code></span></pre>This is the final form of the statement that was executed and that produced the execution plan. The in-line view has been merged into the parent query. There will only be a final query section if any transformations have occurred. Again, I have reformatted it to make it easier to read. <pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>Final query after transformations:******* UNPARSED QUERY IS *******
SELECT /*+ UNNEST */ "X"."EMPLID" "EMPLID","X"."NAME" "NAME","X"."EFFDT" "EFFDT",'Smith' "LAST_NAME"
FROM "SYSADM"."PS_NAMES" "X1"
,"SYSADM"."PS_NAMES" "X"
WHERE "X"."LAST_NAME"='Smith'
AND "X"."NAME_TYPE"='PRI'
AND "X1"."EMPLID"="X"."EMPLID"
AND "X1"."NAME_TYPE"="X"."NAME_TYPE"
AND "X1"."EFFDT"<=SYSDATE@!
AND "X1"."NAME_TYPE"='PRI'
GROUP BY "X1"."NAME_TYPE","X".ROWID,"X"."EFFDT","X"."NAME","X"."EMPLID"
HAVING "X"."EFFDT"=MAX("X1"."EFFDT")
…
</code></span></pre><ul style="text-align: left;"><li>PS_NAMES X1 has been moved from the subquery into the main from clause. Instead of a correlated subquery, we now have a two-table join. </li><li>The query is grouped by the ROWID on row source X and the other selected columns. </li><li>Instead of joining the tables on NAME_TYPE, the literal criterion has been duplicated in X1. </li><li>A having clause is used to join X.EFFDT to the maximum value of X1.EFFDT. </li><li>Instead of selecting LAST_NAME from X, the literal value in the predicate has been put in the select clause. </li></ul>If we look at the execution plan for the unnested statement, we can see that X and X1 are now in query block SEL$841DDE77, which has been unnested and merged.</div>
<div><pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: small;"><code>…
----- Explain Plan Dump -----
…
----------------------------------------+-----------------------------------+
| Id | Operation | Name | Rows | Bytes | Cost | Time |
----------------------------------------+-----------------------------------+
| 0 | SELECT STATEMENT | | | | 139 | |
| 1 | FILTER | | | | | |
| 2 | SORT GROUP BY | | 1 | 77 | 139 | 00:00:02 |
| 3 | NESTED LOOPS | | 3 | 231 | 138 | 00:00:02 |
| 4 | TABLE ACCESS FULL | PS_NAMES| 2 | 112 | 136 | 00:00:02 |
| 5 | INDEX RANGE SCAN | PS_NAMES| 1 | 21 | 1 | 00:00:01 |
----------------------------------------+-----------------------------------+
Query Block Name / Object Alias (identified by operation id):
------------------------------------------------------------
1 - SEL$841DDE77
4 - SEL$841DDE77 / "X"@"SEL$1"
5 - SEL$841DDE77 / "X1"@"SEL$2"
------------------------------------------------------------
Predicate Information:
----------------------
1 - filter("EFFDT"=MAX("X1"."EFFDT"))
4 - filter(("X"."LAST_NAME"='Smith' AND "X"."NAME_TYPE"='PRI'))
5 - access("X1"."EMPLID"="X"."EMPLID" AND "X1"."NAME_TYPE"='PRI' AND "X1"."EFFDT"<=SYSDATE@!)
…
</code></span></pre>
The new query block name is a hash value based on the names of other blocks. The presence of such a block name is an indication of query transformation occurring. The query block name is stable and it is referenced in the outline of hints. </div><div><i>"A question that we could ask about the incomprehensible query block names that Oracle generates is: 'are they deterministic?' – is it possible for the same query to give you the same plan while generating different query block names on different versions of Oracle (or different days of the week). The answer is (or should be) no; when Oracle generates a query block name (after supplying the initial defaults of sel$1, sel$2 etc.) it applies a hashing function to the query block names that have gone INTO a transformation to generate the name that it will use for the block that comes OUT of the transformation."</i> - <a href="https://www.red-gate.com/simple-talk/databases/oracle-databases/execution-plans-part-7-query-blocks-and-inline-views/" target="_blank">Jonathan Lewis: Query Blocks and Inline Views</a> </div><div>As Jonathan points out <i>"the 'Outline Data' section of the report tells us that query block"</i> in my example SEL$841DDE77 <i>"is an 'outline_leaf', in other words, it is a 'final' query block that has actually been subject to independent optimization".</i> We can also see other query block names referenced in OUTLINE hints.
<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code> Outline Data:
/*+
BEGIN_OUTLINE_DATA
…
OUTLINE_LEAF(@"SEL$841DDE77")
MERGE(@"SEL$683B0107" >"SEL$C772B8D1")
OUTLINE(@"SEL$C772B8D1")
UNNEST(@"SEL$2")
OUTLINE(@"SEL$683B0107")
OUTLINE(@"SEL$7511BFD2")
OUTLINE(@"SEL$2")
OUTLINE(@"SEL$1")
FULL(@"SEL$841DDE77" "X"@"SEL$1")
INDEX(@"SEL$841DDE77" "X1"@"SEL$2" ("PS_NAMES"."EMPLID" "PS_NAMES"."NAME_TYPE" "PS_NAMES"."EFFDT"))
LEADING(@"SEL$841DDE77" "X"@"SEL$1" "X1"@"SEL$2")
USE_NL(@"SEL$841DDE77" "X1"@"SEL$2")
END_OUTLINE_DATA
*/
</code></span></pre>
We can see these query block names being registered in the trace as the various transformations are applied, each with a brief description of the transformation.
<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: small;"><code>Registered qb: SEL$683B0107 0xfc6e3030 (SUBQ INTO VIEW FOR COMPLEX UNNEST SEL$2)
Registered qb: SEL$7511BFD2 0xfc6c5c68 (VIEW ADDED SEL$1)
Registered qb: SEL$C772B8D1 0xfc6c5c68 (SUBQUERY UNNEST SEL$7511BFD2; SEL$2)
Registered qb: SEL$841DDE77 0xfc6d91e0 (VIEW MERGE SEL$C772B8D1; SEL$683B0107; SEL$C772B8D1)
</code></span></pre>
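The optimizer is only allowed to make this rewrite because the untransformed and transformed forms return the same rows. That equivalence can be illustrated away from the database with a toy Python sketch over invented sample rows (the EFFDT &lt;= SYSDATE predicate is omitted for brevity, and the data is hypothetical):

```python
# Invented stand-in rows for PS_NAMES: (emplid, name_type, effdt, name, last_name)
rows = [
    (1, 'PRI', 1, 'J Smith',  'Smith'),
    (1, 'PRI', 2, 'Jo Smith', 'Smith'),
    (2, 'PRI', 1, 'A Jones',  'Jones'),
    (3, 'PRI', 1, 'B Smith',  'Smith'),
]

def correlated(rows):
    """Original form: the MAX(effdt) subquery is evaluated per outer row."""
    out = []
    for x in rows:
        if x[4] == 'Smith' and x[1] == 'PRI':
            max_effdt = max(r[2] for r in rows if r[0] == x[0] and r[1] == x[1])
            if x[2] == max_effdt:
                out.append(x)
    return out

def unnested(rows):
    """Transformed form: join to a grouped in-line view (like VW_SQ_1)."""
    vw = {}  # (emplid, name_type) -> MAX(effdt)
    for r in rows:
        key = (r[0], r[1])
        vw[key] = max(vw.get(key, r[2]), r[2])
    return [x for x in rows
            if x[4] == 'Smith' and x[1] == 'PRI' and x[2] == vw[(x[0], x[1])]]

assert correlated(rows) == unnested(rows)   # both forms return the same rows
```

The second form does the grouping once up front instead of re-probing per outer row, which is exactly the trade-off the optimizer is costing when it decides whether to unnest.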
</div></div><div class="blogger-post-footer"><a href="http://www.go-faster.co.uk/">©David Kurtz</a></div>David Kurtzhttp://www.blogger.com/profile/08139761793598085235noreply@blogger.com0tag:blogger.com,1999:blog-14654018.post-26073772764687469842022-09-14T12:28:00.005+01:002022-09-15T16:17:52.771+01:00Obtaining Trace Files without Access to the Database Server<h4 style="text-align: left;">
Why Trace? </h4><div>For many years, I used database SQL Trace to investigate SQL performance problems. I would trace a process, obtain the trace file, profile it (with Oracle's <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/tgsql/performing-application-tracing.html#GUID-A1F41137-03E2-43AD-98E4-AD49760C4C35" target="_blank">TKPROF</a> or another profiling tool such as the <a href="https://method-r.com/software/workbench/" target="_blank">Method R profiler</a>, <a href="https://antognini.ch/2008/10/introduce-tvdxtat/" target="_blank">TVD$XTAT</a>, or <a href="http://oracledba.ru/orasrp/" target="_blank">OraSRP</a>), and analyse the profile. </div><div>Active Session History (ASH) was introduced in Oracle 10g. Today, it is usually where I start to investigate performance problems. It has the advantage that it is always on, and I can just query ASH data from the Automatic Workload Repository (AWR). However, ASH is only available on Enterprise Edition and requires the Diagnostics Pack licence. </div><div>Sometimes, even if available, ASH isn't enough. ASH is based on sampling database activity, while trace is a record of all the SQL activity in a session. Some short-lived behaviour that doesn't generate many samples is difficult to investigate with ASH. Sometimes, it is necessary to dig deeper and use SQL trace. </div>
<div>On occasion, you might want to generate other forms of trace: for example, an optimizer trace (event 10053) to understand how an execution plan was arrived at.</div>
<h4 style="text-align: left;">Where is my Trace File? </h4>
<div>A trend that I have observed over the years is that it is becoming ever more difficult to get hold of the trace files. If you are not the production DBA, you are unlikely to get access to the database server. Frequently, I find that pre-production performance test databases, which are often clones of the production database, are treated as production systems. After all, they contain production data. The move to the cloud has accelerated that trend. On some cloud services, you have no access to the database server at all! </div><div>In the past, I have blogged about using an <a href="https://blog.psftdba.com/2006/12/retrieving-oracle-trace-files-via.html" target="_blank">external table</a> from which the trace file can be queried, a variation on a theme others had also written about. It required certain privileges, a new external table was required for each trace file, and you had to know the name of the trace file, and on which RAC instance it was located. </div>
<div>However, from version 12.2, it is much easier. Oracle has provided new views that report which trace files are available and from which their contents can be queried. </div>
<h4 style="text-align: left;">Where Is This Session Writing Its Trace File?</h4><div>The Automatic Diagnostic Repository (ADR) was first documented in 11g. The view <a href="https://docs.oracle.com/en/database/oracle/oracle-database/21/refrn/V-DIAG_INFO.html" target="_blank">V$DIAG_INFO</a> was introduced in 12c, from which you can query the state of the ADR. This includes the various directory paths to which files are written and the name of the current trace file.
<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: x-small;"><code>select dbid, con_dbid, name from v$database;
column inst_id format 99 heading 'Inst|ID'
column con_id format 99 heading 'Con|ID'
column name format a22
column value format a95
select * from v$diag_info;
</code></span><span style="font-size: 80%;">
Inst Con
ID NAME VALUE ID
---- ---------------------- ----------------------------------------------------------------------------------------------- ---
1 Diag Enabled TRUE 0
1 ADR Base /opt/oracle/psft/db/oracle-server 0
1 ADR Home /opt/oracle/psft/db/oracle-server/diag/rdbms/cdbhcm/CDBHCM 0
1 Diag Trace /opt/oracle/psft/db/oracle-server/diag/rdbms/cdbhcm/CDBHCM/trace 0
1 Diag Alert /opt/oracle/psft/db/oracle-server/diag/rdbms/cdbhcm/CDBHCM/alert 0
1 Diag Incident /opt/oracle/psft/db/oracle-server/diag/rdbms/cdbhcm/CDBHCM/incident 0
1 Diag Cdump /opt/oracle/psft/db/oracle-server/diag/rdbms/cdbhcm/CDBHCM/cdump 0
1 Health Monitor /opt/oracle/psft/db/oracle-server/diag/rdbms/cdbhcm/CDBHCM/hm 0
1 Default Trace File /opt/oracle/psft/db/oracle-server/diag/rdbms/cdbhcm/CDBHCM/trace/CDBHCM_ora_27009_unnest.trc 0
1 Active Problem Count 0 0
1 Active Incident Count 0 0
1 ORACLE_HOME /opt/oracle/psft/db/oracle-server/19.3.0.0 0
</span></pre></div><h4 style="text-align: left;">
What files have been written? </h4><div>The available files are reported by <a href="https://docs.oracle.com/en/database/oracle/oracle-database/21/refrn/V-DIAG_TRACE_FILE.html#GUID-368F8ECA-33CA-4972-8535-B8F536046F67" target="_blank">V$DIAG_TRACE_FILE</a>.<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; text-align: left; width: 95%;"><span style="font-size: small;"><code>column adr_home format a60
column trace_filename format a40
column change_time format a32
column modify_time format a32
column con_id format 999
select *
from v$DIAG_TRACE_FILE
where adr_home = '&adr_Home'
order by modify_time
/
</code></span><span style="font-size: 63%;">
ADR_HOME TRACE_FILENAME CHANGE_TIME MODIFY_TIME CON_ID
------------------------------------------------------------ ---------------------------------------- -------------------------------- -------------------------------- ------
…
/opt/oracle/psft/db/oracle-server/diag/rdbms/cdbhcm/CDBHCM CDBHCM_ora_27674_no_unnest.trc 13-SEP-22 02.06.10.000 PM +00:00 13-SEP-22 02.06.10.000 PM +00:00 3
/opt/oracle/psft/db/oracle-server/diag/rdbms/cdbhcm/CDBHCM CDBHCM_ora_27674_unnest.trc 13-SEP-22 02.06.11.000 PM +00:00 13-SEP-22 02.06.11.000 PM +00:00 3
</span></pre></div><h4 style="text-align: left;">
What is in the file? </h4>
<div>I can then extract the contents of the file from <a href="https://docs.oracle.com/en/database/oracle/oracle-database/21/refrn/V-DIAG_TRACE_FILE_CONTENTS.html#GUID-D5750193-4789-4D39-B57C-250A38961605" target="_blank">V$DIAG_TRACE_FILE_CONTENTS</a>. Each line of the trace is returned in a different row. </div>
<div>This script spools the contents of the current trace file from SQL*Plus locally to a file of the same name. It stores the ADR home path, the trace directory path, and the trace file name in SQL*Plus substitution variables, and then uses these to query the trace file contents. </div>
<div>I can generate a trace and then run this script to extract it locally.
<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: small;"><code>REM <a href="https://github.com/davidkurtz/psscripts/blob/master/spooltrc.sql" target="_blank">spooltrc.sql</a>
clear screen
set heading on pages 99 lines 180 verify off echo off trimspool on termout on feedback off
column value format a95
column value new_value adr_home heading 'ADR Home'
select value from v$diag_info where name = 'ADR Home';
column value new_value diag_trace heading 'Diag Trace'
select value from v$diag_info where name = 'Diag Trace';
column value new_value trace_filename heading 'Trace File'
select SUBSTR(value,2+LENGTH('&diag_trace')) value from v$diag_info where name = 'Default Trace File'
/
column adr_home format a60
column trace_filename format a40
column change_time format a32
column modify_time format a32
column con_id format 999
select *
from v$DIAG_TRACE_FILE
where adr_home = '&adr_home'
and trace_filename = '&trace_filename'
/
set head off pages 0 lines 5000 verify off echo off timi off termout off feedback off long 5000
spool &trace_filename
select payload
from v$diag_trace_file_contents
where adr_home = '&adr_home'
and trace_filename = '&trace_filename'
order by line_number
/
spool off
set head on pages 99 lines 180 verify on echo on termout on feedback on</code></span></pre>
The <a href="https://github.com/davidkurtz/psscripts/blob/master/spooltrc.sql" target="_blank">spooltrc.sql</a> script is available on <a href="https://github.com/davidkurtz/psscripts" target="_blank">Github</a>. In a subsequent blog, I will demonstrate how to use it.</div>
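The only fiddly part of the script is deriving the bare trace file name: 'Default Trace File' is a full path, and 'Diag Trace' is its directory, so the SUBSTR(value, 2+LENGTH('&diag_trace')) expression keeps everything after the directory and the '/' that follows it. The same arithmetic, sketched in Python using the paths from the V$DIAG_INFO output above:

```python
def trace_filename(default_trace_file: str, diag_trace: str) -> str:
    # Keep everything after the Diag Trace directory and its trailing '/'
    # -- the SUBSTR(value, 2 + LENGTH(diag_trace)) logic from the script
    assert default_trace_file.startswith(diag_trace + '/')
    return default_trace_file[len(diag_trace) + 1:]

path = '/opt/oracle/psft/db/oracle-server/diag/rdbms/cdbhcm/CDBHCM/trace/CDBHCM_ora_27009_unnest.trc'
trace_dir = '/opt/oracle/psft/db/oracle-server/diag/rdbms/cdbhcm/CDBHCM/trace'
print(trace_filename(path, trace_dir))
```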
<div>The payload is a VARCHAR2 column, so it is easy to search one or several trace files for specific text. This is useful if you are having trouble identifying the trace file of interest. </div>
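Because PAYLOAD is plain text, the search amounts to a LIKE predicate over every line of every file. A Python sketch of the equivalent filter, with an in-memory stand-in for V$DIAG_TRACE_FILE_CONTENTS (in practice, the rows would come from a database query; the sample data here is invented):

```python
# Each tuple mimics a row of V$DIAG_TRACE_FILE_CONTENTS:
# (trace_filename, line_number, payload)
contents = [
    ('CDBHCM_ora_27674_no_unnest.trc', 1, 'Registered qb: SEL$1'),
    ('CDBHCM_ora_27674_unnest.trc',    1, 'Registered qb: SEL$683B0107'),
    ('CDBHCM_ora_27674_unnest.trc',    2, 'Final query after transformations'),
]

def grep_traces(contents, text):
    # Like: WHERE payload LIKE '%'||:text||'%' ORDER BY trace_filename, line_number
    return sorted((f, n) for f, n, p in contents if text in p)

print(grep_traces(contents, 'Final query'))
```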
<div>See also:</div><div><ul style="text-align: left;"><li>Franck Pachot's Blog: <a href="https://www.dbi-services.com/blog/exadata-express-cloud-service-sql-and-optimizer-trace/">Exadata Express Cloud Service: SQL and Optimizer trace</a>.</li></ul></div><div class="blogger-post-footer"><a href="http://www.go-faster.co.uk/">©David Kurtz</a></div>David Kurtzhttp://www.blogger.com/profile/08139761793598085235noreply@blogger.com0tag:blogger.com,1999:blog-14654018.post-43376746090591406102021-08-03T10:03:00.000+01:002021-08-03T10:03:23.964+01:00Alter SQL Profiles from Exact to Force Matching<p>You can use DBMS_SQLTUNE.ALTER_SQL_PROFILE to change the status, name, description, or category of a SQL profile, but you can't alter it from exact to force matching. Instead, you would have to recreate it. That is easy if you have the script that you used to create it in the first place. There is another way.</p><p>Oracle support note <a href="https://support.oracle.com/epmos/faces/DocContentDisplay?id=457531.1" target="_blank">How to Move SQL Profiles from One Database to Another (Including to Higher Versions) (Doc ID 457531.1)</a> describes a process to export SQL profiles to a staging table that can be imported into another database. This provides an opportunity to alter a profile by updating the data in the staging table. There are two columns in the staging table that have to be updated.</p><p></p><ul style="text-align: left;"><li><i>SQLFLAGS </i>must be updated from 0 (indicating an exact match profile) to 1 (indicating a force match profile).</li><li><i>SIGNATURE </i>must be recalculated as a force matching signature using DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE.</li></ul><p></p>
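The reason the signature must be recalculated is that a force matching signature is computed over a text in which literals have been normalized away. A toy Python sketch of the idea follows; this is only an illustration of the concept, not Oracle's actual DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE algorithm:

```python
import hashlib
import re

def toy_signature(sql: str, force_match: bool = False) -> int:
    # Illustration only: normalize case and whitespace, optionally replace
    # literals with a placeholder, then hash the result.
    text = ' '.join(sql.upper().split())
    if force_match:
        text = re.sub(r"'[^']*'", ':L', text)   # string literals
        text = re.sub(r'\b\d+\b', ':L', text)   # numeric literals
    return int(hashlib.sha256(text.encode()).hexdigest()[:16], 16)

exact_54 = toy_signature("SELECT * FROM t WHERE a = 54")
exact_42 = toy_signature("SELECT * FROM t WHERE a = 42")
force_54 = toy_signature("SELECT * FROM t WHERE a = 54", force_match=True)
force_42 = toy_signature("SELECT * FROM t WHERE a = 42", force_match=True)

assert exact_54 != exact_42   # exact matching: different literals, different signatures
assert force_54 == force_42   # force matching: literals normalized away
```

This is why flipping SQLFLAGS alone is not enough: a force match profile is looked up by the normalized-text signature, so the stored SIGNATURE value must change with it.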
<h3 style="text-align: left;">Demonstration</h3>
<p>I am going to create a small table with a unique index. </p>
<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: small;"><code>CREATE TABLE t (a not null, b) AS
SELECT rownum, ceil(sqrt(rownum)) FROM dual connect by level <= 100;
create unique index t_idx on t(a);
exec dbms_stats.gather_table_stats(user,'T');
ttitle off
select * from dba_sql_profiles where name like 'my%sql_profile%';
explain plan for SELECT * FROM t WHERE a = 42;
ttitle 'Default Execution plan without profiles (index scan)'
select * from table(dbms_xplan.display(null,null,'ADVANCED +ADAPTIVE -PROJECTION'));</code></span></pre>
<p>Without any SQL profiles, when I query by the unique key I get a unique index scan.</p>
<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: small;"><code>Plan hash value: 2929955852
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 6 | 1 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 6 | 1 (0)| 00:00:01 |
|* 2 | INDEX UNIQUE SCAN | T_IDX | 1 | | 0 (0)| 00:00:01 |
-------------------------------------------------------------------------------------</code></span></pre>
<p>Now I am going to create two SQL profiles. I have deliberately put the same SQL text into both profiles.</p><p></p><ul style="text-align: left;"><li><i>my_sql_profile </i>is exact matching.</li><li><i>my_sql_profile_force</i> is force matching.</li></ul><p></p>
<pre style="background-color: #eeeeee; font-family: 'courier new'; overflow: auto; line-height: 95%; width: 95%;"><span><code><span style="font-size: small;">DECLARE
signature INTEGER;
sql_txt CLOB;
h SYS.SQLPROF_ATTR;
BEGIN
sql_txt := q'[
<b>SELECT * FROM t WHERE a = 54
</b>]';
h := SYS.SQLPROF_ATTR(
q'[BEGIN_OUTLINE_DATA]',
q'[IGNORE_OPTIM_EMBEDDED_HINTS]',
<b>q'[FULL(@"SEL$1" "T"@"SEL$1")]',
</b>q'[END_OUTLINE_DATA]');
signature := DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(sql_txt);
DBMS_SQLTUNE.IMPORT_SQL_PROFILE (
sql_text => sql_txt,
profile => h,
<b>name => 'my_sql_profile',
</b>category => 'DEFAULT',
validate => TRUE,
replace => TRUE,
<b>force_match => FALSE
</b>);
END;
/
DECLARE
signature INTEGER;
sql_txt CLOB;
h SYS.SQLPROF_ATTR;
BEGIN
sql_txt := q'[
<b>SELECT * FROM t WHERE a = 54
</b>]';
h := SYS.SQLPROF_ATTR(
q'[BEGIN_OUTLINE_DATA]',
q'[IGNORE_OPTIM_EMBEDDED_HINTS]',
<b>q'[FULL(@"SEL$1" "T"@"SEL$1")]',
</b>q'[END_OUTLINE_DATA]');
signature := DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(sql_txt);
DBMS_SQLTUNE.IMPORT_SQL_PROFILE (
sql_text => sql_txt,
profile => h,
<b>name => 'my_sql_profile_force',
</b>category => 'DEFAULT',
validate => TRUE,
replace => TRUE,
<b>force_match => TRUE
</b>);
END;
/
ttitle off
select * from dba_sql_profiles where name like 'my%sql_profile%';</span><span style="font-size: 60%;">
NAME CATEGORY SIGNATURE SQL_TEXT CREATED
------------------------------ ---------- --------------------- -------------------------------------------------------------------------------- ------------------------------
LAST_MODIFIED DESCRIPTION TYPE STATUS FOR TASK_ID TASK_EXEC_NAME TASK_OBJ_ID TASK_FND_ID TASK_REC_ID TASK_CON_DBID
------------------------------ -------------------- ------- -------- --- ---------- -------------------- ----------- ----------- ----------- -------------
my_sql_profile DEFAULT 9394869341287877934 31-JUL-21 10.47.34.243454
SELECT * FROM t WHERE a = 54
31-JUL-21 10.47.34.000000 MANUAL ENABLED NO
my_sql_profile_force DEFAULT 11431056000319719221 31-JUL-21 10.47.34.502721
SELECT * FROM t WHERE a = 54
31-JUL-21 10.47.34.000000 MANUAL ENABLED YES</span></code></span></pre>
<p>The force matching profile is applied even though the literal value in the query (42) differs from the literal in the profile's SQL text (54).</p>
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: small;"><code>explain plan for SELECT * FROM t WHERE a = 42;
select * from table(dbms_xplan.display(null,null,'ADVANCED +ADAPTIVE -PROJECTION'));
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 6 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T | 1 | 6 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
…
Predicate Information (identified by operation id):
---------------------------------------------------
<b> 1 - filter("A"=42)
</b>
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2
---------------------------------------------------------------------------
0 - STATEMENT
- IGNORE_OPTIM_EMBEDDED_HINTS
1 - SEL$1 / T@SEL$1
- FULL(@"SEL$1" "T"@"SEL$1")
Note
-----
<b> - SQL profile "my_sql_profile_force" used for this statement</b></code></span></pre>
<p>The exact matching profile takes precedence over the force matching profile.</p>
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: small;"><code>explain plan for SELECT * FROM t WHERE a = 54;
select * from table(dbms_xplan.display(null,null,'ADVANCED +ADAPTIVE -PROJECTION'));
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 6 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T | 1 | 6 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
…
Predicate Information (identified by operation id):
---------------------------------------------------
<b> 1 - filter("A"=54)
</b>
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2
---------------------------------------------------------------------------
0 - STATEMENT
- IGNORE_OPTIM_EMBEDDED_HINTS
1 - SEL$1 / T@SEL$1
- FULL(@"SEL$1" "T"@"SEL$1")
Note
-----
<b> - SQL profile "my_sql_profile" used for this statement</b></code></span></pre>
<p>I am now going to follow the process to export the SQL Profiles to a staging table, and subsequently reimport them.</p>
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: xx-small;"><code>exec DBMS_SQLTUNE.CREATE_STGTAB_SQLPROF(table_name=>'STAGE',schema_name=>user);
exec DBMS_SQLTUNE.PACK_STGTAB_SQLPROF (staging_table_name =>'STAGE',profile_name=>'my_sql_profile');
exec DBMS_SQLTUNE.PACK_STGTAB_SQLPROF (staging_table_name =>'STAGE',profile_name=>'my_sql_profile_force');</code></span></pre>
<p>There is a row in the staging table for each profile, and you can see the differences between them: the signatures differ, and SQLFLAGS is 0 for the exact matching profile but 1 for the force matching one.</p>
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: xx-small;"><code>select signature, sql_handle, obj_name, obj_type, sql_text, sqlflags from STAGE;
SIGNATURE SQL_HANDLE OBJ_NAME
--------------------- ------------------------------ ---------------------------------------------------------------------
OBJ_TYPE SQL_TEXT SQLFLAGS
------------------------------ -------------------------------------------------------------------------------- ----------
9394869341287877934 SQL_826147e3c6ac0d2e my_sql_profile
SQL_PROFILE <b>0</b>
<b> SELECT * FROM t WHERE a = 54
</b>
11431056000319719221 SQL_9ea344de32a78735 my_sql_profile_force
SQL_PROFILE <b>1</b>
<b> SELECT * FROM t WHERE a = 54</b></code></span></pre>
<p>I will update the staging table using this PL/SQL loop, recalculating each signature as a force matching signature and setting SQLFLAGS to 1 (I have to use PL/SQL because SQL doesn't recognise TRUE as a Boolean constant).</p>
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: small;"><code>DECLARE
l_sig INTEGER;
BEGIN
FOR i IN (
SELECT rowid, stage.* FROM stage WHERE sqlflags = 0 FOR UPDATE
) LOOP
l_sig := dbms_sqltune.sqltext_to_signature(i.sql_text,TRUE);
UPDATE stage
SET signature = l_sig
, sqlflags = 1
WHERE sqlflags = 0
AND rowid = i.rowid;
END LOOP;
END;
/</code></span></pre>
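<p>Conceptually, this loop re-keys each staged profile by its force matching signature: a hash of the SQL text after normalisation, with the literal values also stripped out. Purely as an illustration (this is <i>not</i> Oracle's algorithm, and the hash values will not match those produced by DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE), the idea can be sketched like this:</p>

```python
import hashlib
import re

def signature(sql: str, force_match: bool = False) -> int:
    # Toy stand-in for DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE -- illustrative only.
    # Crude normalisation: collapse whitespace and uppercase the whole text
    # (Oracle uppercases only non-literal text; good enough for a sketch).
    text = re.sub(r"\s+", " ", sql).strip().upper()
    if force_match:
        # Replace string and numeric literals with a bind placeholder so that
        # statements differing only in literal values hash to the same value.
        text = re.sub(r"'[^']*'|\b\d+\b", ":B", text)
    return int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")

# Statements that differ only in a literal share a force matching signature...
assert signature("SELECT * FROM t WHERE a = 54", True) == signature("SELECT * FROM t\nWHERE a = 42", True)
# ...but their exact matching signatures differ.
assert signature("SELECT * FROM t WHERE a = 54") != signature("SELECT * FROM t WHERE a = 42")
```

<p>Overwriting SIGNATURE with the force matching value, and setting SQLFLAGS to 1, is what turns the staged exact matching profile into a force matching one when it is unpacked.</p>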
<p>And now both staged profiles have the same force matching signature and SQLFLAGS value.</p>
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: xx-small;"><code>select signature, sql_handle, obj_name, obj_type, sql_text, sqlflags from STAGE;
SIGNATURE SQL_HANDLE OBJ_NAME
--------------------- ------------------------------ ---------------------------------------------------------------------
OBJ_TYPE SQL_TEXT SQLFLAGS
------------------------------ -------------------------------------------------------------------------------- ----------
11431056000319719221 SQL_826147e3c6ac0d2e my_sql_profile
SQL_PROFILE 1
SELECT * FROM t WHERE a = 54
11431056000319719221 SQL_9ea344de32a78735 my_sql_profile_force
SQL_PROFILE 1
SELECT * FROM t WHERE a = 54</code></span></pre>
<p>But I can't just reimport my_sql_profile from the staging table, replacing the one in the database, because I will get <i>ORA-13841: SQL profile named my_sql_profile already exists for a different signature/category pair</i>. To avoid this error, I must either drop the existing profile or rename it.</p><p>I am going to rename the existing exact matching profile, and also disable it and move it to another category so that it no longer matches my statement in preference to the force matching profile (see the previous post <a href="https://blog.go-faster.co.uk/2021/07/clashing-sql-profiles-exact-matching.html">Clashing SQL Profiles - Exact Matching Profiles Take Precedence Over Force Matching Profiles</a>). That way I can go back to it later if needed.</p><p>I will drop my example force matching profile, as I no longer need it.</p><p>Then I can reimport the profile from the staging table.</p>
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span><code>e<span style="font-size: x-small;">xec dbms_sqltune.alter_sql_profile(name=>'my_sql_profile', attribute_name=>'NAME',value=>'my_old_sql_profile');
exec dbms_sqltune.alter_sql_profile(name=>'my_old_sql_profile', attribute_name=>'CATEGORY',value=>'DO_NOT_USE');
exec dbms_sqltune.alter_sql_profile(name=>'my_old_sql_profile', attribute_name=>'STATUS',value=>'DISABLED');
exec dbms_sqltune.drop_sql_profile('my_sql_profile_force',TRUE);
EXEC DBMS_SQLTUNE.UNPACK_STGTAB_SQLPROF(profile_name => 'my_sql_profile', replace => TRUE, staging_table_name => 'STAGE');</span></code></span></pre>
<p>I can see in the SQL profile table that my SQL profile is now force matching, and it has a different signature to the old one that is exact matching.</p>
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: 60%;"><code>ttitle off
select * from dba_sql_profiles where name like 'my%sql_profile%';
NAME CATEGORY SIGNATURE SQL_TEXT CREATED
------------------------------ ---------- --------------------- -------------------------------------------------------------------------------- ------------------------------
LAST_MODIFIED DESCRIPTION TYPE STATUS FOR TASK_ID TASK_EXEC_NAME TASK_OBJ_ID TASK_FND_ID TASK_REC_ID TASK_CON_DBID
------------------------------ -------------------- ------- -------- --- ---------- -------------------- ----------- ----------- ----------- -------------
my_old_sql_profile DO_NOT_USE 9394869341287877934 31-JUL-21 10.54.58.694037
SELECT * FROM t WHERE a = 54
31-JUL-21 10.55.00.000000 MANUAL DISABLED NO
my_sql_profile DEFAULT 11431056000319719221 31-JUL-21 10.55.01.005377
SELECT * FROM t WHERE a = 54
31-JUL-21 10.55.01.000000 MANUAL <b>ENABLED YES</b></code></span></pre>
<p>Both my queries now match the new force matching version of the profile.</p>
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: small;"><code>explain plan for SELECT * FROM t WHERE a = 42;
ttitle 'Execution plan with force match profile (full scan)'
select * from table(dbms_xplan.display(null,null,'ADVANCED +ADAPTIVE -PROJECTION'));
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 6 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T | 1 | 6 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
…
Predicate Information (identified by operation id):
---------------------------------------------------
<b> 1 - filter("A"=42)</b>
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2
---------------------------------------------------------------------------
0 - STATEMENT
- IGNORE_OPTIM_EMBEDDED_HINTS
1 - SEL$1 / T@SEL$1
- FULL(@"SEL$1" "T"@"SEL$1")
Note
-----
<b> - SQL profile "my_sql_profile" used for this statement</b>
explain plan for SELECT * FROM t WHERE a = 54;
select * from table(dbms_xplan.display(null,null,'ADVANCED +ADAPTIVE -PROJECTION'));
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 6 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T | 1 | 6 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
…
Predicate Information (identified by operation id):
---------------------------------------------------
<b> 1 - filter("A"=54)</b>
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2
---------------------------------------------------------------------------
0 - STATEMENT
- IGNORE_OPTIM_EMBEDDED_HINTS
1 - SEL$1 / T@SEL$1
- FULL(@"SEL$1" "T"@"SEL$1")
Note
-----
<b> - SQL profile "my_sql_profile" used for this statement</b></code></span></pre>The script used for this demonstration is available on <a href="https://github.com/davidkurtz/demoscripts/blob/master/sql_profiles/convert_to_force_match.sql" target="_blank">GitHub</a><div class="blogger-post-footer"><a href="http://www.go-faster.co.uk/">©David Kurtz</a></div>David Kurtzhttp://www.blogger.com/profile/08139761793598085235noreply@blogger.com0tag:blogger.com,1999:blog-14654018.post-271326796971287872021-08-02T15:14:00.006+01:002021-08-02T15:46:58.136+01:00Detecting Clashing SQL Profiles<p>In my <a href="https://blog.go-faster.co.uk/2021/07/clashing-sql-profiles-exact-matching.html">last post</a>, I discussed the possible undesirable consequences of force and exact matching SQL profiles on statements with the same force matching signature. The question is how do you detect such profiles?</p>
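<p>In outline, the detection logic is simple: compute a force matching signature for every profile, group the profiles by that signature, and flag any group containing more than one profile. As a minimal sketch of that grouping step (illustrative Python; the names and signatures below follow this post's examples, except <i>unrelated_profile</i>, which is invented):</p>

```python
from collections import defaultdict

def find_clashing_profiles(profiles):
    """Group profiles by force matching signature; return only clashing groups.

    profiles: iterable of (name, force_matching_signature) pairs.
    """
    groups = defaultdict(list)
    for name, force_sig in profiles:
        groups[force_sig].append(name)
    # Keep only signatures shared by more than one profile.
    return {sig: names for sig, names in groups.items() if len(names) > 1}

clashes = find_clashing_profiles([
    ("my_sql_profile_force", 11431056000319719221),
    ("my_sql_profile_24",    11431056000319719221),  # exact matching, same force signature
    ("my_sql_profile_42",    11431056000319719221),  # exact matching, same force signature
    ("unrelated_profile",    12345678901234567890),  # hypothetical, no clash
])
assert list(clashes.values()) == [["my_sql_profile_force", "my_sql_profile_24", "my_sql_profile_42"]]
```

<p>The queries that follow implement the same grouping in SQL against DBA_SQL_PROFILES.</p>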
<p>I have created three profiles on very similar SQL statements that only differ in the literal value of a predicate. One of them is force matching, the others are exact matching. The signature reported by DBA_SQL_PROFILES is the force matching signature for force matching profiles, and the exact matching signature for exact matching profiles.</p><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: 66%;"><code>select * from dba_sql_profiles;
NAME CATEGORY SIGNATURE SQL_TEXT CREATED
------------------------------ ---------- --------------------- -------------------------------------------------- ------------------------------
LAST_MODIFIED DESCRIPTION TYPE STATUS FOR TASK_ID TASK_EXEC_NAME TASK_OBJ_ID TASK_FND_ID TASK_REC_ID TASK_CON_DBID
------------------------------ -------------------- ------- -------- --- ---------- -------------------- ----------- ----------- ----------- -------------
my_sql_profile_force DEFAULT 11431056000319719221 16:09:33 01/08/2021
SELECT * FROM t WHERE a = 54
16:09:33 01/08/2021 MANUAL ENABLED YES
my_sql_profile_24 DEFAULT 12140764948557749245 16:09:33 01/08/2021
SELECT * FROM t
WHERE a = 24
16:09:33 01/08/2021 MANUAL ENABLED NO
my_sql_profile_42 DEFAULT 14843900676141266266 16:09:33 01/08/2021
SELECT * FROM t WHERE a = 42
16:09:33 01/08/2021 MANUAL ENABLED NO</code></span></pre>In order to be able to compare the profiles, I need to calculate the force matching signature for the exact matching profiles using DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE. I can't use the Boolean constant TRUE as a parameter in SQL. Instead, I have used a PL/SQL function in a WITH clause.<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: x-small;"><code>REM <a href="https://github.com/davidkurtz/demoscripts/blob/master/sql_profiles/dup_sql_profiles1.sql" target="_blank">dup_sql_profiles1.sql</a>
WITH function sig(p_sql_text CLOB, p_number INTEGER) RETURN NUMBER IS
l_sig NUMBER;
BEGIN
IF p_number > 0 THEN
l_sig := dbms_sqltune.sqltext_to_signature(p_sql_text,TRUE);
ELSIF p_number = 0 THEN
l_sig := dbms_sqltune.sqltext_to_signature(p_sql_text,FALSE);
END IF;
RETURN l_sig;
END;
x as (
select CASE WHEN force_matching = 'NO' THEN signature ELSE sig(sql_text, 0) END exact_sig
, CASE WHEN force_matching = 'YES' THEN signature ELSE sig(sql_text, 1) END force_sig
, p.*
from dba_sql_profiles p
where (status = 'ENABLED' or force_matching = 'NO')
), y as (
select x.*
, row_number() over (partition by category, force_sig order by force_matching desc, exact_sig nulls first) profile#
, count(*) over (partition by category, force_sig) num_profiles
from x
)
select profile#, num_profiles, force_sig, exact_sig, name, created, category, status, force_matching, sql_text
from y
where num_profiles > 1
order by force_sig, force_matching desc, exact_sig
/</code></span></pre> We can see these three profiles are grouped together. The force matching signature calculated on the exact matching profiles is the same as the signature on the force matching profile. Now I can start to make some decisions about whether I should retain the exact matching profiles or remove them and just use the force matching profile.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: 66%;"><code>Prof Num Force Matching Exact Matching
# Profs Signature Signature NAME CREATED CATEGORY STATUS FOR
---- ----- --------------------- --------------------- ------------------------------ ---------------------------- -------------------- -------- ---
SQL_TEXT
----------------------------------------------------------------------------------------------------------------------------------------------------
1 3 11431056000319719221 my_sql_profile_force 16:35:36 01/08/2021 DEFAULT ENABLED YES
SELECT * FROM t WHERE a = 54
2 3 12140764948557749245 my_sql_profile_24 16:35:36 01/08/2021 DEFAULT ENABLED NO
SELECT * FROM t
WHERE a = 24
3 3 14843900676141266266 my_sql_profile_42 16:35:36 01/08/2021 DEFAULT ENABLED NO
SELECT * FROM t WHERE a = 42</code></span></pre>The SQL statements in this example are absurdly simple. In real life that is rarely the case. Sometimes it can be a struggle to see where two complex statements differ.<div>In the next query, I compare enabled force matching SQL profiles to any exact matching profiles in the same category with the same force matching signature. The full query is on <a href="https://github.com/davidkurtz/demoscripts/blob/master/sql_profiles/dup_sql_profiles2.sql" target="_blank">GitHub</a>.</div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: x-small;"><code>REM <a href="https://github.com/davidkurtz/demoscripts/blob/master/sql_profiles/dup_sql_profiles2.sql" target="_blank">dup_sql_profiles2.sql</a>
WITH function sig(p_sql_text CLOB, p_number INTEGER) RETURN NUMBER IS
…
END sig;
function norm(p_queryin CLOB) RETURN CLOB IS
…
END norm;
function str_diff(p_str1 CLOB, p_str2 CLOB) RETURN NUMBER IS
…
END str_diff;
x as (
select CASE WHEN force_matching = 'NO' THEN signature ELSE sig(sql_text, 0) END exact_sig
, CASE WHEN force_matching = 'YES' THEN signature ELSE sig(sql_text, 1) END force_sig
, p.*
from dba_sql_profiles p
), y as (
select f.force_matching, f.force_sig, f.name force_name, f.created force_created, f.status force_status
, e.force_matching exact_matching, e.exact_sig, e.name exact_name
, e.created exact_created, e.status exact_status, e.category
, norm(e.sql_text) esql_text, norm(f.sql_text) fsql_text
from x e
, x f
where f.force_matching = 'YES'
and e.force_matching = 'NO'
and e.force_sig = f.force_sig
and e.category = f.category
and e.name != f.name
and f.status = 'ENABLED'
), z as (
select y.*
, str_diff(fsql_Text, esql_text) diff_len
from y
)
select force_matching, force_Sig, force_name, force_created, force_status
, exact_matching, exact_sig, exact_name, exact_Created, exact_status
, substr(fsql_text,1,diff_len) common_text
, substr(fsql_text,diff_len+1) fdiff_text, substr(esql_text,diff_len+1) ediff_text
from z
order by force_sig
/</code></span></pre>I have shown the common part of both statements, from the start to the first difference, and then also how the rest of each statement continues.<div>It is not enough to simply compare two statements character by character. Both the force and exact matching signatures are "<i><a href="https://docs.oracle.com/en/database/oracle/oracle-database/21/refrn/V-SQL.html#GUID-2B9340D7-4AA8-4894-94C0-D5990F67BE75" target="_blank">calculated on the normalized SQL text. The normalization includes the removal of white space and the uppercasing of all non-literal strings</a></i>". However, neither the normalised SQL nor the normalisation mechanism is exposed by Oracle. Therefore, in this query, I have included my own rudimentary normalisation function (based on an idea from AskTOM), which I apply first, and a string comparison function. You can see that normalisation has eliminated the line feed from the statement in <i>my_sql_profile_24</i>.</div><div>Now I can see that my two exact matching profiles match my force matching profile. I can see the common part of the SQL up to the literal value, and the parts of the text that differ are just the literal values.</div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: 60%;"><code> Force Matching Force Force Force Exact Matching Exact Exact Exact
FOR Signature Name Created Date Status EXA Signature Name Created Date Status
--- --------------------- ------------------------------ ---------------------------- -------- --- --------------------- ------------------------------ ---------------------------- --------
Common Text
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Force Text Exact Text
--------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------
YES 11431056000319719221 my_sql_profile_force 16:35:36 01/08/2021 ENABLED NO 12140764948557749245 my_sql_profile_24 16:35:36 01/08/2021 ENABLED
SELECT * FROM T WHERE A =
54 24
ENABLED NO 14843900676141266266 my_sql_profile_42 16:35:36 01/08/2021 ENABLED
SELECT * FROM T WHERE A =
54</code></span></pre>Both the queries mentioned in this blog are available on <a href="https://github.com/davidkurtz/demoscripts/tree/master/sql_profiles" target="_blank">GitHub</a>.<div class="blogger-post-footer"><a href="http://www.go-faster.co.uk/">©David Kurtz</a></div>David Kurtzhttp://www.blogger.com/profile/08139761793598085235noreply@blogger.com0tag:blogger.com,1999:blog-14654018.post-8686138834933116042021-07-31T13:53:00.003+01:002021-08-01T18:42:55.412+01:00Clashing SQL Profiles - Exact Matching Profiles Take Precedence Over Force Matching Profiles<p>Sometimes, you reach a point in performance tuning, where you use a SQL Baseline, or SQL Patch, or SQL Profile to stabilise an execution plan. These methods all effectively inject a hint or set of hints into a statement to produce the desired execution plan. Baselines and Patches will only exactly match a SQL ID and therefore a SQL statement. However, a SQL Profile can optionally do force matching so that it applies to <a href="https://docs.oracle.com/en/database/oracle/oracle-database/12.2/tgsql/managing-sql-profiles.html#GUID-5EF6DC38-6118-48B4-8162-56E7C4570C1B" target="_blank"><i>"all SQL statements that have the same text after the literal values in the WHERE clause have been replaced by bind variables. </i></a></p><p><i>This setting may be useful for applications that use only literal values because it enables SQL with text differing only in its literal values to share a SQL profile. If both literal values and bind variables are in the SQL text, or if force_match is set to false (default), then the literal values in the WHERE clause are not replaced by bind variables.</i>" <span style="font-size: xx-small;">[Oracle Database SQL Tuning Guide]</span></p>
<div><div>I often work with PeopleSoft, whose batch processes often dynamically generate SQL with literal values. Therefore, I usually create force matching profiles when I need to control an execution plan. However, sometimes I come across situations where some exact matching (i.e. not force matching) profiles have been created (often by production DBAs using the tuning advisor) on different statements that have the same force matching signature, and then maybe a force matching profile has also been applied.</div><div><i><br /></i></div><div><i>Note: SQL Profiles require the Tuning Pack licence.</i></div></div><div><div style="text-align: left;"><b>Where both exact and force matching profiles apply to a SQL statement, the exact matching profile will take precedence over the force matching profile, and even if disabled it will prevent the force matching profile from being applied.</b></div><div>I will demonstrate this with a simple test. I will create a table with a couple of indexes, collect statistics, and generate an execution plan for a query. I am using <i>explain plan for</i> command to force a parse of the statement every time.</div></div>
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: x-small;"><code>CREATE TABLE t (a not null, b) AS
SELECT rownum, ceil(sqrt(rownum)) FROM dual CONNECT BY LEVEL <= 100;
CREATE UNIQUE INDEX t_idx on t(a);
CREATE INDEX t_idx2 on t(b,a);
EXEC dbms_stats.gather_table_stats(user,'T');</code></span></pre><h3 style="text-align: left;">Without Any SQL Profiles</h3><div>Without any profiles in place, I get a skip scan of T_IDX2, and there is no note in the execution plan.</div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: xx-small;"><code>EXPLAIN PLAN FOR SELECT * FROM t WHERE a = 42;
SELECT * FROM table(dbms_xplan.display(null,null,'ADVANCED +ADAPTIVE -PROJECTION'));
Plan hash value: 3418618943
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 6 | 1 (0)| 00:00:01 |
<b>|* 1 | INDEX SKIP SCAN | T_IDX2 | 1 | 6 | 1 (0)| 00:00:01 |</b>
---------------------------------------------------------------------------
…
Outline Data
-------------
/*+
BEGIN_OUTLINE_DATA
INDEX_SS(@"SEL$1" "T"@"SEL$1" ("T"."B" "T"."A"))
OUTLINE_LEAF(@"SEL$1")
ALL_ROWS
DB_VERSION('19.1.0')
OPTIMIZER_FEATURES_ENABLE('19.1.0')
IGNORE_OPTIM_EMBEDDED_HINTS
END_OUTLINE_DATA
*/
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("A"=42)
filter("A"=42)
…</code></span></pre><h3 style="text-align: left;">Force Matching Profile</h3><div>Now I will create a force matching SQL profile that forces a full scan of the table. The profile's SQL text is the same query except that the literal value is different (it is 54 instead of 42).</div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: xx-small;"><code>DECLARE
signature INTEGER;
sql_txt CLOB;
h SYS.SQLPROF_ATTR;
BEGIN
sql_txt := q'[
<b>SELECT * FROM t WHERE a = 54</b>
]';
h := SYS.SQLPROF_ATTR(
q'[BEGIN_OUTLINE_DATA]',
q'[IGNORE_OPTIM_EMBEDDED_HINTS]',
<b>q'[FULL(@"SEL$1" "T"@"SEL$1")]',</b>
q'[END_OUTLINE_DATA]');
signature := DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(sql_txt);
DBMS_SQLTUNE.IMPORT_SQL_PROFILE (
sql_text => sql_txt,
profile => h,
name => 'clashing_profile_test_force',
category => 'DEFAULT',
validate => TRUE,
replace => TRUE,
<b>force_match => TRUE </b>
);
END;
/</code></span></pre>At this point, I have only the force matching profile.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: 60%;"><code> Execution plan with force matching profile (full scan)
NAME CATEGORY SIGNATURE SQL_TEXT CREATED
------------------------------ ---------- --------------------- -------------------------------------------------------------------------------- ------------------------------
LAST_MODIFIED DESCRIPTION TYPE STATUS FOR TASK_ID TASK_EXEC_NAME TASK_OBJ_ID TASK_FND_ID TASK_REC_ID TASK_CON_DBID
------------------------------ -------------------- ------- -------- --- ---------- -------------------- ----------- ----------- ----------- -------------
clashing_profile_test_force DEFAULT 11431056000319719221 27-JUL-21 01.35.43.854691 PM
SELECT * FROM t WHERE a = 54
27-JUL-21 01.35.43.000000 PM MANUAL ENABLED YES</code></span></pre>
The execution plan uses the full scan specified by the profile; there is a note confirming that the profile was matched and used, and the FULL hint is listed in the hint report.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: xx-small;"><code>EXPLAIN PLAN FOR SELECT * FROM t WHERE a = 42;
SELECT * FROM table(dbms_xplan.display(null,null,'ADVANCED +ADAPTIVE -PROJECTION'));
Plan hash value: 1601196873
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 6 | 3 (0)| 00:00:01 |
<b>|* 1 | TABLE ACCESS STORAGE FULL| T | 1 | 6 | 3 (0)| 00:00:01 |</b>
----------------------------------------------------------------------------------
…
Outline Data
-------------
/*+
BEGIN_OUTLINE_DATA
<b> FULL(@"SEL$1" "T"@"SEL$1")</b>
OUTLINE_LEAF(@"SEL$1")
ALL_ROWS
DB_VERSION('19.1.0')
OPTIMIZER_FEATURES_ENABLE('19.1.0')
IGNORE_OPTIM_EMBEDDED_HINTS
END_OUTLINE_DATA
*/
Predicate Information (identified by operation id):
---------------------------------------------------
1 - storage("A"=42)
filter("A"=42)
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2
---------------------------------------------------------------------------
0 - STATEMENT
- IGNORE_OPTIM_EMBEDDED_HINTS
1 - SEL$1 / T@SEL$1
<b> - FULL(@"SEL$1" "T"@"SEL$1")</b>
Note
-----
<b> - SQL profile "clashing_profile_test_force" used for this statement</b></code></span></pre><h3 style="text-align: left;">
Exact Matching Profile</h3><div>I will now add an exact matching profile that forces use of the unique index.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: xx-small;"><code>DECLARE
signature INTEGER;
sql_txt CLOB;
h SYS.SQLPROF_ATTR;
BEGIN
sql_txt := q'[
SELECT * FROM t WHERE a = 42
]';
h := SYS.SQLPROF_ATTR(
q'[BEGIN_OUTLINE_DATA]',
q'[IGNORE_OPTIM_EMBEDDED_HINTS]',
q'[INDEX(@"SEL$1" "T"@"SEL$1" ("T"."A"))]',
q'[END_OUTLINE_DATA]');
signature := DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(sql_txt);
DBMS_SQLTUNE.IMPORT_SQL_PROFILE (
sql_text => sql_txt,
profile => h,
name => 'clashing_profile_test_exact',
category => 'DEFAULT',
validate => TRUE,
replace => TRUE,
force_match => FALSE
);
END;
/</code></span></pre>I can see I now have two SQL Profiles: one force matching and one exact matching.<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: 60%;"><code> Execution plan with force matching profile (unique index lookup)
NAME CATEGORY SIGNATURE SQL_TEXT CREATED
------------------------------ ---------- --------------------- -------------------------------------------------------------------------------- ------------------------------
LAST_MODIFIED DESCRIPTION TYPE STATUS FOR TASK_ID TASK_EXEC_NAME TASK_OBJ_ID TASK_FND_ID TASK_REC_ID TASK_CON_DBID
------------------------------ -------------------- ------- -------- --- ---------- -------------------- ----------- ----------- ----------- -------------
clashing_profile_test_exact DEFAULT 14843900676141266266 27-JUL-21 01.35.46.825697 PM
SELECT * FROM t WHERE a = 42
27-JUL-21 01.35.46.000000 PM MANUAL ENABLED <b>NO</b>
clashing_profile_test_force DEFAULT 11431056000319719221 27-JUL-21 01.35.43.854691 PM
SELECT * FROM t WHERE a = 54
27-JUL-21 01.35.43.000000 PM MANUAL ENABLED <b>YES</b></code></span></pre>
The execution plan has changed to the unique index scan. The index hint from the profile appears in the hint report. The note at the bottom of the plan shows that the exact matching profile has been used, taking precedence over the force matching profile.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: xx-small;"><code>EXPLAIN PLAN FOR SELECT * FROM t WHERE a = 42;
SELECT * FROM table(dbms_xplan.display(null,null,'ADVANCED +ADAPTIVE -PROJECTION'));
Plan hash value: 2929955852
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 6 | 1 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 6 | 1 (0)| 00:00:01 |
<b>|* 2 | INDEX UNIQUE SCAN | T_IDX | 1 | | 0 (0)| 00:00:01 |</b>
-------------------------------------------------------------------------------------
…
Outline Data
-------------
/*+
BEGIN_OUTLINE_DATA
<b> INDEX_RS_ASC(@"SEL$1" "T"@"SEL$1" ("T"."A"))</b>
OUTLINE_LEAF(@"SEL$1")
ALL_ROWS
DB_VERSION('19.1.0')
OPTIMIZER_FEATURES_ENABLE('19.1.0')
IGNORE_OPTIM_EMBEDDED_HINTS
END_OUTLINE_DATA
*/
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("A"=42)
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2
---------------------------------------------------------------------------
0 - STATEMENT
- IGNORE_OPTIM_EMBEDDED_HINTS
1 - SEL$1 / T@SEL$1
<b> - INDEX(@"SEL$1" "T"@"SEL$1" ("T"."A"))</b>
Note
-----
<b> - SQL profile "clashing_profile_test_exact" used for this statement</b></code></span></pre>
<h3 style="text-align: left;">Different Query</h3>
If I run the query with a different literal value, the plan changes back to the full scan, and the note reports that the force matching profile was used.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: xx-small;"><code>EXPLAIN PLAN FOR SELECT * FROM t WHERE a = 54;
SELECT * FROM table(dbms_xplan.display(null,null,'ADVANCED +ADAPTIVE -PROJECTION'));
Plan hash value: 1601196873
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 6 | 3 (0)| 00:00:01 |
<b>|* 1 | TABLE ACCESS STORAGE FULL| T | 1 | 6 | 3 (0)| 00:00:01 |</b>
----------------------------------------------------------------------------------
…
Outline Data
-------------
/*+
BEGIN_OUTLINE_DATA
<b> FULL(@"SEL$1" "T"@"SEL$1")</b>
OUTLINE_LEAF(@"SEL$1")
ALL_ROWS
DB_VERSION('19.1.0')
OPTIMIZER_FEATURES_ENABLE('19.1.0')
IGNORE_OPTIM_EMBEDDED_HINTS
END_OUTLINE_DATA
*/
Predicate Information (identified by operation id):
---------------------------------------------------
1 - storage("A"=54)
filter("A"=54)
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2
---------------------------------------------------------------------------
0 - STATEMENT
- IGNORE_OPTIM_EMBEDDED_HINTS
1 - SEL$1 / T@SEL$1
<b> - FULL(@"SEL$1" "T"@"SEL$1")
</b>
Note
-----
<b> - SQL profile "clashing_profile_test_force" used for this statement</b></code></span></pre>
<h3>Disable Exact Matching SQL Profile</h3>
I will now disable the exact matching profile.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: 60%;"><code><b>exec dbms_sqltune.alter_sql_profile(name=>'clashing_profile_test_exact', attribute_name=>'STATUS',value=>'DISABLED');</b>
SELECT * FROM dba_sql_profiles where name like 'clashing%';
Disable Exact Profile - Execution plan with no profile (skip scan) - Odd
NAME CATEGORY SIGNATURE SQL_TEXT CREATED
------------------------------ ---------- --------------------- -------------------------------------------------------------------------------- ------------------------------
LAST_MODIFIED DESCRIPTION TYPE STATUS FOR TASK_ID TASK_EXEC_NAME TASK_OBJ_ID TASK_FND_ID TASK_REC_ID TASK_CON_DBID
------------------------------ -------------------- ------- -------- --- ---------- -------------------- ----------- ----------- ----------- -------------
clashing_profile_test_exact DEFAULT 14843900676141266266 27-JUL-21 01.35.46.825697 PM
SELECT * FROM t WHERE a = 42
27-JUL-21 01.35.52.000000 PM MANUAL <b>DISABLED </b>NO
clashing_profile_test_force DEFAULT 11431056000319719221 27-JUL-21 01.35.43.854691 PM
SELECT * FROM t WHERE a = 54
27-JUL-21 01.35.43.000000 PM MANUAL ENABLED YES</code></span></pre>
I expected the statement to fall back to the force matching profile, but instead it goes back to the original skip scan plan with no profile at all. So the disabled exact matching profile prevents the force matching profile from matching the statement, yet is not applied to the statement itself! There is no note in the execution plan and no hint report.<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: xx-small;"><code>EXPLAIN PLAN FOR SELECT * FROM t WHERE a = 42;
SELECT * FROM table(dbms_xplan.display(null,null,'ADVANCED +ADAPTIVE -PROJECTION'));
Plan hash value: 3418618943
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 6 | 1 (0)| 00:00:01 |
<b>|* 1 | INDEX SKIP SCAN | T_IDX2 | 1 | 6 | 1 (0)| 00:00:01 |</b>
---------------------------------------------------------------------------
…
Outline Data
-------------
/*+
BEGIN_OUTLINE_DATA
INDEX_SS(@"SEL$1" "T"@"SEL$1" ("T"."B" "T"."A"))
OUTLINE_LEAF(@"SEL$1")
ALL_ROWS
DB_VERSION('19.1.0')
OPTIMIZER_FEATURES_ENABLE('19.1.0')
IGNORE_OPTIM_EMBEDDED_HINTS
END_OUTLINE_DATA
*/
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("A"=42)
filter("A"=42)</code></span></pre>
<h3>Alter Category of Exact Matching SQL Profile</h3>
I could have dropped the SQL Profile, but I might want to retain it for documentation and in case I need to reinstate it. So instead I will move it to a different category.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: 60%;"><code><b>exec dbms_sqltune.alter_sql_profile(name=>'clashing_profile_test_exact', attribute_name=>'CATEGORY',value=>'DO_NOT_USE');</b>
SELECT * FROM dba_sql_profiles where name like 'clashing%';
Change Category of Exact Profile - Execution plan with force matching profile (full scan)
NAME CATEGORY SIGNATURE SQL_TEXT CREATED
------------------------------ ---------- --------------------- -------------------------------------------------------------------------------- ------------------------------
LAST_MODIFIED DESCRIPTION TYPE STATUS FOR TASK_ID TASK_EXEC_NAME TASK_OBJ_ID TASK_FND_ID TASK_REC_ID TASK_CON_DBID
------------------------------ -------------------- ------- -------- --- ---------- -------------------- ----------- ----------- ----------- -------------
clashing_profile_test_exact <b>DO_NOT_USE</b> 14843900676141266266 27-JUL-21 02.57.11.343291 PM
SELECT * FROM t WHERE a = 42
27-JUL-21 02.57.19.000000 PM MANUAL <b>DISABLED NO</b>
clashing_profile_test_force DEFAULT 11431056000319719221 27-JUL-21 02.57.08.390801 PM
SELECT * FROM t WHERE a = 54
27-JUL-21 02.57.08.000000 PM MANUAL ENABLED YES</code></span></pre>
And now the execution plan goes back to the force matching profile and the unique index lookup.
<pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: xx-small;"><code>EXPLAIN PLAN FOR SELECT * FROM t WHERE a = 42;
SELECT * FROM table(dbms_xplan.display(null,null,'ADVANCED +ADAPTIVE -PROJECTION'));
Plan hash value: 1601196873
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 6 | 3 (0)| 00:00:01 |
<b>|* 1 | TABLE ACCESS STORAGE FULL| T | 1 | 6 | 3 (0)| 00:00:01 |</b>
----------------------------------------------------------------------------------
…
Outline Data
-------------
/*+
BEGIN_OUTLINE_DATA
<b> FULL(@"SEL$1" "T"@"SEL$1")
</b> OUTLINE_LEAF(@"SEL$1")
ALL_ROWS
DB_VERSION('19.1.0')
OPTIMIZER_FEATURES_ENABLE('19.1.0')
IGNORE_OPTIM_EMBEDDED_HINTS
END_OUTLINE_DATA
*/
Predicate Information (identified by operation id):
---------------------------------------------------
1 - storage("A"=42)
filter("A"=42)
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2
---------------------------------------------------------------------------
0 - STATEMENT
- IGNORE_OPTIM_EMBEDDED_HINTS
1 - SEL$1 / T@SEL$1
<b> - FULL(@"SEL$1" "T"@"SEL$1")</b>
Note
-----
<b> - SQL profile "clashing_profile_test_force" used for this statement</b></code></span></pre>
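<p>The behaviour demonstrated above can be summarised as a simple precedence rule: the exact matching signature is checked first, and if a profile exists for it in the current category, the force matching profile is never consulted, even when that exact matching profile is disabled. The following Python sketch is an illustrative model of the observed behaviour only, not Oracle's actual implementation:</p>

```python
# Illustrative model of the SQL profile matching behaviour observed above.
# This is NOT Oracle code - just a summary of the demonstrated precedence.
def pick_profile(profiles, exact_sig, force_sig, category="DEFAULT"):
    """profiles: list of dicts with name, signature, force_matching, status, category."""
    in_category = [p for p in profiles if p["category"] == category]
    # The exact matching signature is checked first...
    exact = [p for p in in_category
             if p["signature"] == exact_sig and not p["force_matching"]]
    if exact:
        # ...and even a DISABLED exact match blocks the force matching
        # profile, while not being applied itself.
        return exact[0]["name"] if exact[0]["status"] == "ENABLED" else None
    force = [p for p in in_category
             if p["signature"] == force_sig and p["force_matching"]]
    if force and force[0]["status"] == "ENABLED":
        return force[0]["name"]
    return None

profiles = [
    {"name": "clashing_profile_test_exact", "signature": 1,
     "force_matching": False, "status": "ENABLED", "category": "DEFAULT"},
    {"name": "clashing_profile_test_force", "signature": 2,
     "force_matching": True, "status": "ENABLED", "category": "DEFAULT"},
]
print(pick_profile(profiles, 1, 2))  # clashing_profile_test_exact
profiles[0]["status"] = "DISABLED"
print(pick_profile(profiles, 1, 2))  # None: neither profile is applied
profiles[0]["category"] = "DO_NOT_USE"
print(pick_profile(profiles, 1, 2))  # clashing_profile_test_force
```

<p>The final call shows why altering the category, rather than disabling the profile, restores the force matching behaviour: the category filter removes the exact matching profile from consideration altogether.</p>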
<h3>Conclusion</h3>
An exact matching profile will be matched to a SQL statement before a force matching SQL profile, <b><u>even if it is disabled</u></b>, in which case neither profile will be applied.</div><div>If you have exact matching SQL profiles that provide the same hints to produce the same execution plan on various similar SQL statements that have the same force matching signature (i.e. they only differ in their literal values), and you wish to replace them with a single force matching profile, then rather than disable the exact matching profiles you should either drop them, or, if you prefer to retain them for documentation, alter them to a different category. </div><div><ul style="text-align: left;"><li>The scripts used in this blog to demonstrate this behaviour are available on <a href="https://github.com/davidkurtz/demoscripts/tree/master/sql_profiles" target="_blank">GitHub</a>. They were run on Oracle 19.9 for this post.</li><li>The script <i><a href="http://disabled_profiles_category.sql" target="_blank">disabled_profiles_category.sql</a></i> moves all disabled profiles from the category <i>DEFAULT </i>to <i>DO_NOT_USE</i>.</li></ul></div><div>In a <a href="https://blog.go-faster.co.uk/2021/08/detecting-clashing-sql-profiles.html">subsequent post</a>, I will show how to detect conflicting SQL profiles.</div><div class="blogger-post-footer"><a href="http://www.go-faster.co.uk/">©David Kurtz</a></div>David Kurtzhttp://www.blogger.com/profile/08139761793598085235noreply@blogger.com0tag:blogger.com,1999:blog-14654018.post-3812104516520214832021-04-06T19:45:00.001+01:002021-10-15T08:34:39.563+01:00Spatial Data 6: Text Searching Areas by their Name, and the Names of Parent Areas<p><i>This blog is part of a <a href="https://blog.go-faster.co.uk/2021/02/spatialindex.html" target="_blank">series about my first steps in using Spatial Data</a> in the Oracle database. 
I am using the GPS data from my cycling activities collected by Strava. All of my files are available on <a href="https://github.com/davidkurtz/strava" target="_blank">GitHub</a>.</i></p><div style="text-align: left;">Now that I have loaded all the areas, I want to be able to search for them by name. I am going to create an Oracle Text index, but I need to index more than just the name of each area. I must index the full hierarchy of each area so that I can search on combinations of names in different types of areas. For example, I might search for a village and a county (e.g. Streatley and Berkshire) to distinguish it from a village of the same name in a different county (e.g. Streatley in Bedfordshire).</div><p>I can generate the full hierarchy of an area with a PL/SQL function (<i><a href="https://github.com/davidkurtz/strava/blob/764f262fe51f52d0a5a8ae35c2734fae1aa6cfd3/strava_pkg.sql#L379">strava_pkg.name_heirarchy_fn</a></i>) by navigating up the linked list and discarding repeated names. I could make that available in a virtual column. However, I cannot build a text index on a function or a virtual column.</p><h4 style="text-align: left;">Text Index Option 1: Store Hierarchy on Table, and Create a Multi-Column Text Index</h4><p>I could store the hierarchy of an area on the <i>my_areas </i>table, and generate it with the PL/SQL function <i><a href="https://github.com/davidkurtz/strava/blob/764f262fe51f52d0a5a8ae35c2734fae1aa6cfd3/strava_pkg.sql#L379" target="_blank">strava_pkg.name_heirarchy_fn</a></i>.</p>
<div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: x-small;"><code>DECLARE
l_clob CLOB;
l_my_areas my_areas%ROWTYPE;
BEGIN
select m.*
into l_my_areas
FROM my_areas m
WHERE area_code = 'CPC'
AND area_number = '40307';
dbms_output.put_line(strava_pkg.name_heirarchy_fn(l_my_areas.area_code,l_my_areas.area_number));
dbms_output.put_line(strava_pkg.name_heirarchy_fn(l_my_areas.parent_area_code,l_my_areas.parent_area_number));
END;
/</code></span></pre></div>
<p>If I pass the code and number of a particular area, I get its full hierarchy, including its own name. I can see that the parish of Streatley is in the Unitary Authority of West Berkshire, which is in England, and England is in the United Kingdom. If I pass the code and number of its parent, I just get the hierarchy from the parent upwards. </p>
<div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: x-small;"><code>Streatley, West Berkshire, England, United Kingdom
West Berkshire, England, United Kingdom</code></span></pre></div>
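<p>The logic inside <i>strava_pkg.name_heirarchy_fn</i> (walk up the linked list of parents, discarding repeated names) can be sketched outside the database. The following Python sketch uses a hypothetical in-memory parent map for illustration; it is not the actual PL/SQL implementation:</p>

```python
# Hypothetical in-memory version of the name-hierarchy walk: follow
# parent links upwards, skipping a name that repeats the previous
# level. The data below is illustrative, not the real my_areas content.
areas = {
    ("CPC", 40307):  ("Streatley",      ("UTA", 101685)),
    ("UTA", 101685): ("West Berkshire", ("CTRY", 92)),
    ("CTRY", 92):    ("England",        ("CTRY", 1)),
    ("CTRY", 1):     ("United Kingdom", None),
}

def name_hierarchy(key):
    names = []
    while key is not None:
        name, parent = areas[key]
        if not names or names[-1] != name:  # discard repeated names
            names.append(name)
        key = parent
    return ", ".join(names)

print(name_hierarchy(("CPC", 40307)))
# Streatley, West Berkshire, England, United Kingdom
print(name_hierarchy(("UTA", 101685)))
# West Berkshire, England, United Kingdom
```

<p>As in the PL/SQL version, starting from the parent key yields the hierarchy without the child's own name.</p>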
<p>I can store the hierarchy on <i>my_areas</i>, though I have to write the results to a temporary table and merge them back, rather than update the table directly; otherwise, I get a mutation error.</p>
<div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: x-small;"><code>ALTER TABLE my_areas add name_heirarchy VARCHAR(4000)
/
CREATE GLOBAL TEMPORARY TABLE my_areas_temp ON COMMIT PRESERVE ROWS AS
SELECT area_code, area_number, strava_pkg.name_heirarchy_fn(parent_area_code,parent_area_number) name_heirarchy
FROM my_areas WHERE parent_area_code IS NOT NULL AND parent_area_number IS NOT NULL
/
MERGE INTO my_areas u
USING (SELECT * FROM my_areas_temp) s
ON (u.area_code = s.area_code AND u.area_number = s.area_number)
WHEN MATCHED THEN UPDATE
SET u.name_heirarchy = s.name_heirarchy
/</code></span></pre></div>
<p>Then I can create a multi-column text index on the name and hierarchy columns.</p>
<div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: x-small;"><code>begin
ctx_ddl.create_preference('my_areas_lexer', 'BASIC_LEXER');
ctx_ddl.set_attribute('my_areas_lexer', 'mixed_case', 'NO');
ctx_ddl.create_preference('my_areas_datastore', 'MULTI_COLUMN_DATASTORE');
ctx_ddl.set_attribute('my_areas_datastore', 'columns', 'name, name_heirarchy');
end;
/
CREATE INDEX my_areas_name_txtidx ON my_areas (name) INDEXTYPE IS ctxsys.context
PARAMETERS ('datastore my_areas_datastore lexer my_areas_lexer sync(on commit)');</code></span></pre></div>
<p>The index will sync if I have cause to update the hierarchy.</p><h4 style="text-align: left;">Text Index Option 2: Index a<i> user_datastore</i> based on the result of a PL/SQL function</h4><p>Alternatively, I can build a text index on a combination of data from various sources by creating a PL/SQL procedure that combines the data and returns the string to be indexed. </p>
<p>I have created a procedure (<i><a href="http://strava_pkg.name_heirarchy_txtidx" target="_blank">strava_pkg.name_heirarchy_txtidx</a></i>) that returns a string containing the hierarchy of a given area, and then I will create a text index on that. The format of the parameters must be exactly as follows: </p><p></p><ul style="text-align: left;"><li>The rowid of the row being indexed is passed to the procedure; </li><li>The string to be indexed is passed back as a CLOB parameter.</li></ul>See also: Oracle Text Indexing Elements: <a href="https://docs.oracle.com/database/121/CCREF/cdatadic.htm#GUID-F9BE863D-91E9-4515-92A9-084776279F71" target="_blank">USER_DATASTORE Attributes</a><br /><p></p>
<div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: x-small;"><code>…
PROCEDURE name_heirarchy_txtidx
(p_rowid in rowid
,p_dataout IN OUT NOCOPY CLOB
) IS
l_count INTEGER := 0;
BEGIN
FOR i IN (
SELECT area_code, area_number, name, matchable
FROM my_areas m
START WITH rowid = p_rowid
CONNECT BY NOCYCLE prior m.parent_area_code = m.area_code
AND prior m.parent_area_number = m.area_number
) LOOP
IF i.matchable >= 1 THEN
l_count := l_count + 1;
IF l_count > 1 THEN
p_dataout := p_dataout ||', '|| i.name;
ELSE
p_dataout := i.name;
END IF;
END IF;
END LOOP;
END name_heirarchy_txtidx;
…</code></span></pre></div>
<p>As an example, if I pass a particular <i>rowid</i> to the procedure, I obtain the full hierarchy of areas as before.</p>
<div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: x-small;"><code>set serveroutput on
DECLARE
l_rowid ROWID;
l_clob CLOB;
BEGIN
select rowid
into l_rowid
FROM my_areas m
WHERE area_code = 'CPC'
AND area_number = '40307';
strava_pkg.name_heirarchy_txtidx(l_rowid, l_clob);
dbms_output.put_line(l_clob);
END;
/
<b>Streatley, West Berkshire, England, United Kingdom</b>
PL/SQL procedure successfully completed.</code></span></pre></div>
<p>The procedure is referenced as an attribute of a user datastore; I can then build a text index on that datastore.</p>
<div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: x-small;"><code>BEGIN
ctx_ddl.create_preference('my_areas_lexer', 'BASIC_LEXER');
ctx_ddl.set_attribute('my_areas_lexer', 'mixed_case', 'NO');
ctx_ddl.create_preference(<b>'my_areas_datastore', 'user_datastore'</b>);
ctx_ddl.set_attribute(<b>'my_areas_datastore', 'procedure', 'strava_pkg.name_heirarchy_txtidx'</b>);
ctx_ddl.set_attribute('my_areas_datastore', 'output_type', 'CLOB');
END;
/
CREATE INDEX my_areas_name_txtidx on my_areas (name) INDEXTYPE IS ctxsys.context
PARAMETERS ('datastore my_areas_datastore lexer my_areas_lexer');</code></span></pre></div>
<div style="text-align: left;">I have not been able to combine a multi-column datastore with a user datastore.</div><h4 style="text-align: left;">Text Search examples</h4><p>Both options produce an index that I can use in the same way. I can search for a particular name, for example, the village of Streatley.</p>
<div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: x-small;"><code>SELECT score(1), area_Code, area_number, name, suffix, name_heirarchy
FROM my_areas m
WHERE <b>CONTAINS(name,'streatley',1)>0</b>
/</code></span></pre></div>
<p>I get the two Streatleys, one in Berkshire, and the other in Bedfordshire. </p>
<div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: x-small;"><code> SCORE(1) AREA AREA_NUMBER NAME SUFFIX NAME_HEIRARCHY
---------- ---- ----------- -------------------- ---------- ------------------------------------------------------------
16 CPC 41076 Streatley CP Streatley, Central Bedfordshire, England, United Kingdom
16 CPC 40307 Streatley CP Streatley, West Berkshire, England, United Kingdom</code></span></pre></div>
<p>As I have indexed the full hierarchy, I can be more precise and search for both the village and the county, even though they are two different columns in the <i>my_areas</i> table.</p>
<div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: xx-small;"><code>SELECT score(1), area_Code, area_number, name, suffix, name_heirarchy
FROM my_areas m
WHERE <b>CONTAINS(name,'streatley and berks%',1)>0</b>
/</code></span></pre></div>
<p>Now I just get one result. The Streatley in Berkshire.</p>
<div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: x-small;"><code> SCORE(1) AREA AREA_NUMBER NAME SUFFIX NAME_HEIRARCHY
---------- ---- ----------- -------------------- ---------- ------------------------------------------------------------
11 CPC 40307 Streatley CP Streatley, West Berkshire, England, United Kingdom</code></span></pre></div>
<h4 style="text-align: left;">Searching For the Top of Hierarchies</h4><p>My search query works satisfactorily if it identifies areas with no children, but suppose I search for something higher up the hierarchy, such as Berkshire? </p>
<div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: x-small;"><code>SELECT score(1), area_Code, area_number, name, suffix, name_heirarchy
FROM my_areas m
WHERE <b>CONTAINS(name,'berkshire',1)>0</b>
/</code></span></pre></div>
<p>I get 184 areas of different types within the areas called Berkshire, because the name of the parent area appears in the hierarchy of all its children and so is returned by the text index.</p>
<div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: xx-small;"><code> Area Area
SCORE(1) Code Number NAME SUFFIX NAME_HEIRARCHY
---------- ---- ----------- ------------------------- ---------- -----------------------------------------------------------
11 UTA 101678 Windsor and Maidenhead (B) Windsor and Maidenhead, Berkshire, England, United Kingdom
11 UTA 101680 Wokingham (B) Wokingham, Berkshire, England, United Kingdom
11 UTA 101681 Reading (B) Reading, Berkshire, England, United Kingdom
11 UTA 101685 West Berkshire West Berkshire, England, United Kingdom
11 UTW 40258 Norreys Ward Norreys, Wokingham, Berkshire, England, United Kingdom
11 UTW 40261 Barkham Ward Barkham, Wokingham, Berkshire, England, United Kingdom
…</code></span></pre></div>
<p>However, I am only interested in the highest point of each branch of the hierarchy that I have identified, so I exclude any result whose parent is also in the result set.</p>
<div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: x-small;"><code>WITH x AS (
SELECT area_code, area_number, parent_area_code, parent_area_number, name, name_heirarchy
FROM my_areas m
WHERE <b>CONTAINS(name,'berkshire',1)>0</b>
) SELECT * FROM x WHERE NOT EXISTS (
SELECT 'x' FROM x x1
WHERE x1.area_code = x.parent_area_code
AND x1.area_number = x.parent_area_number
)
/</code></span></pre></div>
<p>In this case, I still get two results because the boundaries of the unitary authority of West Berkshire are not entirely within the ceremonial county of Berkshire (<a href="https://en.wikipedia.org/wiki/List_of_Berkshire_boundary_changes#cite_note-SI89-6" target="_blank">some parts of Hungerford and Lambourne were exchanged with Wiltshire</a> in 1990), hence I could not make Berkshire the parent of West Berkshire.</p>
<div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: x-small;"><code>Area Area
Code Number SCORE NAME_HEIRARCHY
---- ----------- ---------- ------------------------------------------------------------
UTA 101685 11 West Berkshire, England, United Kingdom
CCTY 7 11 Berkshire, England, United Kingdom</code></span></pre></div><p></p>
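<p>The NOT EXISTS clause implements a simple set filter: from the set of matched areas, keep only those whose parent is not itself in the set. The same logic can be sketched in Python, using hypothetical keys rather than the real <i>my_areas</i> rows:</p>

```python
# From a set of text-search hits, keep only the top of each branch:
# rows whose parent is not also a hit. Keys are hypothetical examples.
# Map: (area_code, area_number) -> (parent_area_code, parent_area_number)
matches = {
    ("CCTY", 7):     ("CTRY", 92),     # Berkshire -> England (England not matched)
    ("UTA", 101685): ("CTRY", 92),     # West Berkshire -> England (not matched)
    ("UTA", 101680): ("CCTY", 7),      # Wokingham -> Berkshire (matched)
    ("UTW", 40258):  ("UTA", 101680),  # Norreys Ward -> Wokingham (matched)
}

tops = {area for area, parent in matches.items() if parent not in matches}
print(sorted(tops))
# [('CCTY', 7), ('UTA', 101685)]
```

<p>Only Berkshire and West Berkshire survive the filter, matching the two rows returned by the query above.</p>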
<p></p>
<h4 style="text-align: left;">Text Searching for Activities that pass through Areas</h4>
<p>It is a simple extension to join the pre-processed areas through which activities pass to the areas found by the text search, and then exclude areas whose parent was also found in the same activity.</p>
<div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: x-small;"><code>WITH x AS (
SELECT aa.activity_id, m.area_code, m.area_number, m.parent_area_code, m.parent_area_number, m.name, m.name_heirarchy
FROM my_areas m, activity_areas aa
WHERE m.area_Code = aa.area_code
AND m.area_number = aa.area_number
AND <b>CONTAINS(name,'berkshire',1)>0</b>
)
SELECT a.activity_id, a.activity_date, a.activity_name, a.activity_type, a.distance_km
, x.area_Code, x.area_number, x.name, x.name_heirarchy
FROM x, activities a
WHERE x.activity_id = a.activity_id
AND a.activity_date between <b>TO_DATE('01022019','DDMMYYYY') and TO_DATE('28022019','DDMMYYYY')</b>
AND NOT EXISTS (
SELECT 'x' FROM x x1
WHERE x1.area_code = x.parent_area_code
AND x1.area_number = x.parent_area_number
AND x1.activity_id = x.activity_id)
ORDER BY a.activity_date
/</code></span></pre></div>
<p>Now I can see the rides in Berkshire in February 2019. I get two rows returned for the ride that was in both Berkshire and West Berkshire. </p>
<div><pre style="background-color: #eeeeee; font-family: "courier new"overflow: auto; line-height: 95%; width: 95%;"><span style="font-size: xx-small;"><code> Activity Activity Activity Distance Area Area
ID Date ACTIVITY_NAME Type (km) Code Number NAME NAME_HEIRARCHY
---------- --------- --------------------------------------------- -------- -------- ---- ------ --------------- -------------------------
2156308823 17-FEB-19 MV - Aldworth, CLCTC Aldworth-Reading Ride 120.86 CCTY 7 Berkshire England, United Kingdom
2156308823 17-FEB-19 MV - Aldworth, CLCTC Aldworth-Reading Ride 120.86 UTA 101685 West Berkshire England, United Kingdom
2172794879 24-FEB-19 MV - Maidenhead Ride 48.14 CCTY 7 Berkshire England, United Kingdom
2173048214 24-FEB-19 CLCTC: Maidenhead - Turville Heath Ride 53.15 CCTY 7 Berkshire England, United Kingdom
2173048406 24-FEB-19 Maidenhead - Burnham Beeches - West Drayton Ride 27.92 CCTY 7 Berkshire England, United Kingdom
…</code></span></pre></div><h4 style="text-align: left;">References</h4>I found these references useful while creating the Text index:<p></p><ul><li>Boyko Dimitrov: <a href="https://dreamix.eu/blog/frontpage/full-text-search-across-multiple-database-columns-with-oracle-text" target="_blank">Full text search across multiple database columns with Oracle Text</a></li><li>Oracle Blog about Oracle Text: <a href="https://blogs.oracle.com/searchtech/getting-started-part-3-index-maintenance" target="_blank">Getting started Part 3 - Index maintenance</a></li><li>Jonathan Lewis (at Redgate): <a href="https://www.red-gate.com/simple-talk/sql/oracle/text-indexes/" target="_blank">Text Indexes</a></li></ul><div class="blogger-post-footer"><a href="http://www.go-faster.co.uk/">©David Kurtz</a></div>David Kurtzhttp://www.blogger.com/profile/08139761793598085235noreply@blogger.com0