Thursday, February 15, 2024

Table Clusters: 3. Populating the Cluster with DBMS_PARALLEL_EXECUTE

This post is part of a series that discusses table clustering in Oracle.

The result tables being clustered are also large, containing hundreds of millions of rows.  Normally, when I have to rebuild these as non-clustered tables, I would do so in direct-path mode and with both parallel insert and parallel query. However, this is not effective for table clusters, particularly if you put multiple tables in one cluster, as rows with the same cluster key have to go into the same data blocks.

Instead, for each result table in the cluster, I have used DBMS_PARALLEL_EXECUTE to take a simple INSERT…SELECT statement, and break it into pieces that can be run concurrently on the database job scheduler.  I get the parallelism, though I also have to accept the redo on the insert.

exec DBMS_PARALLEL_EXECUTE.DROP_TASK('CLUSTER_GP_RSLT_ABS');

DECLARE
  l_recname VARCHAR2(15) := 'GP_RSLT_ABS';
  l_src_prefix VARCHAR2(10) := 'ORIG_';
  l_task VARCHAR2(30);
  l_sql_stmt CLOB;
  l_col_list CLOB;
BEGIN
  l_task := 'CLUSTER_'||l_recname;
  
  SELECT LISTAGG(column_name,',') WITHIN GROUP(ORDER BY column_id) 
  INTO l_col_list 
  FROM user_tab_cols WHERE table_name = l_src_prefix||l_recname;
  
  l_sql_stmt := 'insert into PSY'||l_recname||' ('||l_col_list||') SELECT '||l_col_list
              ||' FROM '||l_src_prefix||l_recname||' WHERE rowid BETWEEN :start_id AND :end_id';
  
  DBMS_PARALLEL_EXECUTE.CREATE_TASK (l_task);
  DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID(l_task, 'SYSADM', l_src_prefix||l_recname, true, 2e6);
  DBMS_PARALLEL_EXECUTE.RUN_TASK(l_task, l_sql_stmt, DBMS_SQL.NATIVE, parallel_level => 24);
END;
/

The performance of this process is the first indication as to whether the cluster key is correct.  Too few columns and the population of the table will be much slower because rows have to go in the block already allocated to that cluster key, or if full a new block must be allocated.  

NB: Chunking the data by ROWID only works where the source table is a regular table.  It does not work for clustered or index-organised tables.  The alternative is to chunk by the value of a numeric column, and that doesn't work well in this case because most of the key columns are strings or dates.

Monitoring DBMS_PARALLEL_EXECUTE

There are several views provided by Oracle that can be used to monitor tasks created by DBMS_PARALLEL_EXECUTE.
SELECT * FROM user_parallel_execute_tasks;
                                                                            Number                                                
TASK_NAME            CHUNK_TYPE   STATUS     TABLE_OWNER TABLE_NAME         Column     TASK_COMMENT                   JOB_PREFIX  
-------------------- ------------ ---------- ----------- ------------------ ---------- ------------------------------ ------------
                                                                                               Apply                                           
                                                                                 Lang          X Ed    Fire_ Parallel                     
SQL_STMT                                                                         Flag EDITION  Trigger Apply    Level JOB_CLASS           
-------------------------------------------------------------------------------- ---- -------- ------- ----- -------- -----------------
CLUSTER_GP_RSLT_ABS  ROWID_RANGE  FINISHED   SYSADM      PS_GP_RSLT_ABS                                               TASK$_38380  
insert into PSYGP_RSLT_ABS (EMPLID,CAL_RUN_ID,EMPL_RCD,GP_PAYGROUP,CAL_ID,ORIG_C    1 ORA$BASE         TRUE        24 DEFAULT_JOB_CLASS   

CLUSTER_GP_RSLT_ACUM ROWID_RANGE  FINISHED   SYSADM      PS_GP_RSLT_ACUM                                              TASK$_38382  
insert into PSYGP_RSLT_ACUM (EMPLID,CAL_RUN_ID,EMPL_RCD,GP_PAYGROUP,CAL_ID,ORIG_    1 ORA$BASE         TRUE        32 DEFAULT_JOB_CLASS
Each task is broken into chunks.
SELECT task_name, status, count(*) chunks
, min(start_ts) min_start_ts, max(end_ts) max_end_ts
, max(end_ts)-min(start_ts) duration
FROM user_parallel_execute_chunks 
group by task_name, status
order by min_start_ts nulls last
/TASK_NAME            STATUS         CHUNKS MIN_START_TS            MAX_END_TS              DURATION           
-------------------- ---------- ---------- ----------------------- ----------------------- -------------------
CLUSTER_GP_RSLT_ABS  PROCESSED          80 22/12/2023 09.58.37.712 22/12/2023 10.06.32.264 +00 00:07:54.551373
CLUSTER_GP_RSLT_ACUM PROCESSED         402 22/12/2023 10.08.58.257 22/12/2023 10.38.36.820 +00 00:29:38.562700
In this case, each chunk processes a range of ROWIDs.  Each chunk is allocated to a database scheduler job.
SELECT chunk_id, task_name, status, start_rowid, end_rowid, job_name, start_ts, end_ts, error_code, error_message
FROM user_parallel_execute_chunks 
WHERE task_name = 'CLUSTER_GP_RSLT_ABS'
ORDER BY chunk_id
/

Chunk                                                                                                                                                                                          
   ID TASK_NAME            STATUS     START_ROWID        END_ROWID          JOB_NAME        START_TS                END_TS                  ERROR_CODE ERROR_MESSAGE                           
----- -------------------- ---------- ------------------ ------------------ --------------- ----------------------- ----------------------- ---------- -----------------
    1 CLUSTER_GP_RSLT_ABS  PROCESSED  AAAUzUAAmAAAZgAAAA AAAUzUAAmAADp7VH// TASK$_38380_1   22/12/2023 09:58:37.712 22/12/2023 10:00:21.622                                                    
    2 CLUSTER_GP_RSLT_ABS  PROCESSED  AAAUzUAAmAADp7WAAA AAAUzUAAmAAGwkrH// TASK$_38380_3   22/12/2023 09:58:37.713 22/12/2023 10:00:20.107                                                    
    3 CLUSTER_GP_RSLT_ABS  PROCESSED  AAAUzUAAmAAGwksAAA AAAUzUAAmAAHvwBH// TASK$_38380_2   22/12/2023 09:58:37.713 22/12/2023 10:00:14.939                                                    
    4 CLUSTER_GP_RSLT_ABS  PROCESSED  AAAUzUAAmAAHvwCAAA AAAUzUAAmAAIn5XH// TASK$_38380_9   22/12/2023 09:58:37.864 22/12/2023 10:00:28.963                                                    
    5 CLUSTER_GP_RSLT_ABS  PROCESSED  AAAUzUAAmAAIn5YAAA AAAUzUAAmAAJ58tH// TASK$_38380_12  22/12/2023 09:58:37.865 22/12/2023 10:00:30.494                                                    
    6 CLUSTER_GP_RSLT_ABS  PROCESSED  AAAUzUAAmAAJ58uAAA AAAUzUAAmAAKzADH// TASK$_38380_8   22/12/2023 09:58:37.865 22/12/2023 10:00:26.049                                                    
    7 CLUSTER_GP_RSLT_ABS  PROCESSED  AAAUzUAAmAAKzAEAAA AAAUzUAAmAALf7ZH// TASK$_38380_4   22/12/2023 09:58:37.865 22/12/2023 10:00:28.017                                                    
    8 CLUSTER_GP_RSLT_ABS  PROCESSED  AAAUzUAAmAALf7aAAA AAAUzUAAmAAMHGvH// TASK$_38380_10  22/12/2023 09:58:37.885 22/12/2023 10:00:23.326                                                    
    9 CLUSTER_GP_RSLT_ABS  PROCESSED  AAAUzUAAmAAMHGwAAA AAAUzUAAmAAP5aFH// TASK$_38380_13  22/12/2023 09:58:37.907 22/12/2023 10:00:22.660                                                    
   10 CLUSTER_GP_RSLT_ABS  PROCESSED  AAAUzUAAmAAP5aGAAA AAAUzUAAnAACr1bH// TASK$_38380_5   22/12/2023 09:58:37.929 22/12/2023 10:00:21.959
…
However, one job may process many chunks.
SELECT t.task_name, t.chunk_type, t.table_name, c.chunk_id, c.job_name, c.start_ts, c.end_ts
, d.actual_start_date, d.run_duration, d.instance_id, d.session_id
FROM user_parallel_execute_tasks t
JOIN user_parallel_execute_chunks c ON c.task_name = t.task_name
JOIN user_scheduler_job_run_details d ON d.job_name = c.job_name
WHERE t.task_name = 'CLUSTER_GP_RSLT_ABS'
ORDER BY t.task_name, c.job_name, c.start_ts
/
                                                  Chunk                                                                                                             Inst             
TASK_NAME            CHUNK_TYPE   TABLE_NAME         ID JOB_NAME        START_TS                END_TS                  ACTUAL_START_DATE       RUN_DURATION          ID SESSION_ID  
-------------------- ------------ --------------- ----- --------------- ----------------------- ----------------------- ----------------------- ------------------- ---- ------------
CLUSTER_GP_RSLT_ABS  ROWID_RANGE  PS_GP_RSLT_ABS      1 TASK$_38380_1   22/12/2023 09:58:37.712 22/12/2023 10:00:21.622 22/12/2023 09:58:37.660 +00 00:07:52.000000    1 3406,24003  
CLUSTER_GP_RSLT_ABS  ROWID_RANGE  PS_GP_RSLT_ABS     23 TASK$_38380_1   22/12/2023 10:00:21.710 22/12/2023 10:02:01.916 22/12/2023 09:58:37.660 +00 00:07:52.000000    1 3406,24003  
CLUSTER_GP_RSLT_ABS  ROWID_RANGE  PS_GP_RSLT_ABS     44 TASK$_38380_1   22/12/2023 10:02:02.008 22/12/2023 10:03:31.546 22/12/2023 09:58:37.660 +00 00:07:52.000000    1 3406,24003  
CLUSTER_GP_RSLT_ABS  ROWID_RANGE  PS_GP_RSLT_ABS     57 TASK$_38380_1   22/12/2023 10:03:31.640 22/12/2023 10:05:05.398 22/12/2023 09:58:37.660 +00 00:07:52.000000    1 3406,24003  
CLUSTER_GP_RSLT_ABS  ROWID_RANGE  PS_GP_RSLT_ABS     73 TASK$_38380_1   22/12/2023 10:05:05.494 22/12/2023 10:06:29.262 22/12/2023 09:58:37.660 +00 00:07:52.000000    1 3406,24003  

CLUSTER_GP_RSLT_ABS  ROWID_RANGE  PS_GP_RSLT_ABS      8 TASK$_38380_10  22/12/2023 09:58:37.885 22/12/2023 10:00:23.326 22/12/2023 09:58:37.877 +00 00:07:54.000000    1 4904,44975  
CLUSTER_GP_RSLT_ABS  ROWID_RANGE  PS_GP_RSLT_ABS     27 TASK$_38380_10  22/12/2023 10:00:23.394 22/12/2023 10:01:59.096 22/12/2023 09:58:37.877 +00 00:07:54.000000    1 4904,44975  
CLUSTER_GP_RSLT_ABS  ROWID_RANGE  PS_GP_RSLT_ABS     42 TASK$_38380_10  22/12/2023 10:01:59.185 22/12/2023 10:03:37.657 22/12/2023 09:58:37.877 +00 00:07:54.000000    1 4904,44975  
CLUSTER_GP_RSLT_ABS  ROWID_RANGE  PS_GP_RSLT_ABS     61 TASK$_38380_10  22/12/2023 10:03:37.742 22/12/2023 10:05:12.680 22/12/2023 09:58:37.877 +00 00:07:54.000000    1 4904,44975  
CLUSTER_GP_RSLT_ABS  ROWID_RANGE  PS_GP_RSLT_ABS     79 TASK$_38380_10  22/12/2023 10:05:12.776 22/12/2023 10:06:32.142 22/12/2023 09:58:37.877 +00 00:07:54.000000    1 4904,44975  
…
You can also judge how well the clustering is working by looking at how much database time was consumed by the various events.  PS_GP_RSLT_ABS was inserted first, then PS_GP_RSLT_ACUM.  We can see that more time was spent on the second table that was inserted, and more time spent on physical read operations as rows have to go into specific blocks with the same cluster keys.
select c.task_name, c.status, count(distinct c.chunk_id) chunks, h.module, h.event
, sum(usecs_per_Row)/1e6 ash_secs
from gv$active_session_history h
, user_parallel_execute_chunks c
, user_parallel_execute_tasks t
where h.sample_time BETWEEN c.start_ts AND NVL(c.end_ts,SYSDATE)
and t.task_name = c.task_name
and h.action like c.job_name
group by c.task_name, c.status, h.module, h.event
order by task_name, ash_Secs desc
/
TASK_NAME            STATUS     CHUNKS MODULE          EVENT                                                            ASH_SECS
-------------------- ---------- ------ --------------- ---------------------------------------------------------------- --------
CLUSTER_GP_RSLT_ABS  PROCESSED      80 DBMS_SCHEDULER                                                                       3534
                     PROCESSED      78 DBMS_SCHEDULER  enq: FB - contention                                                 1184
                     PROCESSED      80 DBMS_SCHEDULER  db file parallel read                                                1161
                     PROCESSED      80 DBMS_SCHEDULER  buffer busy waits                                                     674
                     PROCESSED      79 DBMS_SCHEDULER  db file scattered read                                                490
…

CLUSTER_GP_RSLT_ACUM PROCESSED     401 DBMS_SCHEDULER                                                                      10174
                     PROCESSED     401 DBMS_SCHEDULER  db file sequential read                                              8813
                     PROCESSED      32 DBMS_SCHEDULER  log file switch (archiving needed)                                   4623
                     PROCESSED     389 DBMS_SCHEDULER  db file parallel read                                                1396
                     PROCESSED     383 DBMS_SCHEDULER  db file scattered read                                               1346
                     PROCESSED     295 DBMS_SCHEDULER  buffer busy waits                                                     769
                     PROCESSED     287 DBMS_SCHEDULER  enq: FB - contention                                                  715
…

No comments :