Wednesday, October 21, 2009

Partition Maintenance with Global Indexes

(updated 3.12.2009 to deal with UPDATE GLOBAL INDEXES option)
When I decide to partition a table, I also need to consider whether to partition the indexes, and if so how. The easiest option is to partition the indexes locally.  I try to avoid global indexes because they can become unusable when you do partition maintenance. However, where an index leads on a column other than the partitioning key, a query that does not filter on the partitioning column may have to probe every partition of a locally partitioned index.  So sometimes it is necessary to partition an index differently from the table, or not at all. However, I have found that there are some partition management operations that do not invalidate global indexes.

In the case of PeopleSoft Global Payroll, I range-partition tables to match the payroll 'stream' definitions. Each concurrent payroll process then updates only one partition and, more importantly, each partition is updated by one, and only one, payroll process, which eliminates consistent-read overhead on these objects. The number of payroll processes is usually a function of the hardware configuration. This doesn't change often, so I don't expect to do partition maintenance. If I need to change the payroll stream definitions, then I will completely rebuild the partitioned tables.
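For illustration, here is a minimal sketch of stream-based range partitioning (the table structure and stream boundaries are illustrative assumptions, not actual PeopleSoft definitions):

CREATE TABLE gp_rslt_example
(emplid     VARCHAR2(11) NOT NULL
,cal_run_id VARCHAR2(18) NOT NULL
,value      NUMBER
)
PARTITION BY RANGE (emplid) /*streams are defined as ranges of employee ID*/
(PARTITION stream1 VALUES LESS THAN ('20000')
,PARTITION stream2 VALUES LESS THAN ('40000')
,PARTITION stream3 VALUES LESS THAN (MAXVALUE)
);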

In the case of Time & Labor, I create periodic partitions (weekly, fortnightly, or monthly) for the reported time table, partitioning on the 'date under reporting' column. This column appears in many of the queries in T&L processing, so they benefit from partition elimination. However, it is also necessary to periodically add new partitions, as sketched below.
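For example, a minimal sketch of a weekly-partitioned reported time table (the structure and dates are illustrative; in PeopleSoft T&L the reported time table is PS_TL_RPTD_TIME and the 'date under reporting' column is DUR):

CREATE TABLE rptd_time_example
(emplid      VARCHAR2(11) NOT NULL
,dur         DATE NOT NULL
,tl_quantity NUMBER
)
PARTITION BY RANGE (dur)
(PARTITION w20091019 VALUES LESS THAN (TO_DATE('20091026','YYYYMMDD'))
,PARTITION w20091026 VALUES LESS THAN (TO_DATE('20091102','YYYYMMDD'))
);

Each period, a new partition must be added ahead of the incoming data:

ALTER TABLE rptd_time_example
ADD PARTITION w20091102 VALUES LESS THAN (TO_DATE('20091109','YYYYMMDD'));

As the demonstration below shows, adding a partition does not invalidate global indexes.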

The Oracle documentation says this under "Management of Global Partitioned Indexes":

"Global partitioned indexes are harder to manage than local indexes: When the data in an underlying table partition is moved or removed (SPLIT, MOVE, DROP, or TRUNCATE), all partitions of a global index are affected. Consequently, global indexes do not support partition independence."

The key point is that a global index is invalidated only if rows of data in the partitioned table are affected by the DDL on a partition. Here is a little demonstration. I will create a table with a locally partitioned unique index, and a second non-partitioned (and therefore global) index.

DROP TABLE t PURGE;

CREATE TABLE t
(a NUMBER NOT NULL
,b VARCHAR2(100) NOT NULL
)
PARTITION BY RANGE(a)
(PARTITION T1 VALUES LESS THAN (100)
,PARTITION T2 VALUES LESS THAN (200)
);

CREATE UNIQUE INDEX ia ON t (a)
LOCAL
(PARTITION i1
,PARTITION i2
);

CREATE INDEX ib ON t (b);

INSERT INTO t
SELECT ROWNUM, TO_CHAR(TO_DATE(ROWNUM,'j'),'Jsp')
FROM dba_objects
WHERE ROWNUM < 200;
COMMIT;

Now I will add and populate a third partition.

ALTER TABLE t
ADD PARTITION t3 VALUES LESS THAN (300);

INSERT INTO t
SELECT ROWNUM+199, TO_CHAR(TO_DATE(ROWNUM+199,'j'),'Jsp')
FROM dba_objects
WHERE ROWNUM+199 < 300;
COMMIT;

And now I will check on the indexes.

column table_name     format a5 heading 'Table|Name'
column index_name     format a5 heading 'Index|Name'
column partition_name format a4 heading 'Part|Name'
column partitioned              heading 'Part?#'
column status                   heading 'Index|Status'
SELECT table_name, index_name, partitioned, status
FROM   user_indexes
WHERE  table_name = 'T'
ORDER BY 1,2;

SELECT index_name, partition_name, status
FROM   user_ind_partitions
WHERE  index_name IN (SELECT index_name FROM user_indexes WHERE table_name = 'T')
ORDER BY 1,2,3;

Note that the global index IB is still valid, and that a partition T3 has been automatically added to the locally partitioned index IA.

Table Index        Index
Name  Name  Part?# Status
----- ----- ------ ------
T     IA    YES    N/A
T     IB    NO     VALID

Index Part Index
Name  Name Status
----- ---- ------
IA    I1   USABLE
IA    I2   USABLE
IA    T3   USABLE

The interesting thing is that if I delete all the rows from a partition then I can drop it, and the index does not go unusable.

DELETE FROM t WHERE a>=200;
ALTER TABLE t DROP PARTITION t3;

Table Index        Index
Name  Name  Part?# Status
----- ----- ------ ------
T     IA    YES    N/A
T     IB    NO     VALID

Index Part Index
Name  Name Status
----- ---- ------
IA    I1   USABLE
IA    I2   USABLE

But if I recreate and repopulate partition t3, and this time drop it without deleting the rows first, then the global index goes unusable.

ALTER TABLE t DROP PARTITION t3;

Table Index        Index
Name  Name  Part?# Status
----- ----- ------ --------
T     IA    YES    N/A
T     IB    NO     UNUSABLE

Index Part Index
Name  Name Status
----- ---- --------
IA    I1   USABLE
IA    I2   USABLE

If I truncate a partition that still contains rows (again, having recreated and repopulated partition t3), then the global index immediately becomes unusable.

ALTER TABLE t TRUNCATE PARTITION t3;

Table Index        Index
Name  Name  Part?# Status
----- ----- ------ --------
T     IA    YES    N/A
T     IB    NO     UNUSABLE

Index Part Index
Name  Name Status
----- ---- --------
IA    I1   USABLE
IA    I2   USABLE

However, if I had deleted those rows first, the truncate would not have invalidated the index.

Addendum: As Randolf points out in his comment, it is possible to prevent the global index from being invalidated by specifying the UPDATE GLOBAL INDEXES clause.

ALTER TABLE t DROP PARTITION t1 UPDATE GLOBAL INDEXES;

However, this comes at a cost: Oracle now full-scans the partition being dropped and removes its entries from the global index. This recursive SQL statement can be found in an Oracle trace.

insert /*+ RELATIONAL(T) PARALLEL(T,1) APPEND */ into "SYSADM"."T" partition ("T1")
select /*+ RELATIONAL(T) PARALLEL(T,1) */ * 
from "SYSADM"."T" partition ("T1") delete global indexes


Conclusion

There are some things that I can do without invalidating global indexes:
  • I can add a new partition
  • I can delete all rows from an existing partition and drop it.
  • I can truncate a partition from which all the rows have been deleted.
  • I can drop a partition with rows if I specify the UPDATE GLOBAL INDEXES clause, but this could take time.
But
  • If I truncate a partition that still contains rows, then I will immediately invalidate the index.
  • If I do any DDL that removes rows from a partitioned table (without specifying UPDATE GLOBAL INDEXES), then I will invalidate its global indexes, because they contain rowids that are no longer valid.
Ironically, one of the advantages of partitioning is that you can archive a whole partition with a quick DDL operation, rather than working through each row. But this would invalidate a global index. However, if I use something like the generic PeopleSoft archiving tool, which purges data with DML, then when the partition is empty I can drop the table partition without invalidating any global indexes!
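As a side-by-side sketch against the demonstration table above (t_archive is a hypothetical empty table of the same structure):

/*archive by DDL: a single fast operation, but it invalidates global
  indexes unless UPDATE GLOBAL INDEXES is specified, with the cost
  described in the addendum*/
ALTER TABLE t EXCHANGE PARTITION t1 WITH TABLE t_archive;

/*archive by DML: slower, but once the partition is empty it can be
  dropped without invalidating global indexes*/
DELETE FROM t WHERE a < 100;
ALTER TABLE t DROP PARTITION t1;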

Tuesday, October 13, 2009

Global Temporary Tables Share Statistics Across Sessions

In another blog posting, I asserted that statistics collected by one session on a Global Temporary Table (GTT) would be used by other sessions that reference that table, even though each session has its own physical instance of the table. I thought I should demonstrate that behaviour, so here is a simple test.

We will need two database sessions. I will create a test Global Temporary table with a unique index.

TRUNCATE TABLE t; /*a GTT with rows bound to a session cannot be dropped, so truncate first*/
DROP TABLE t PURGE;

CREATE GLOBAL TEMPORARY TABLE t
(a NUMBER,b VARCHAR2(1000))
ON COMMIT PRESERVE ROWS;

CREATE UNIQUE INDEX t ON t(a);

In my first session, I’ll populate the table with 100 rows. The values in column A have the range 1 to 100.

INSERT INTO t SELECT rownum, RPAD(TO_CHAR(TO_DATE(rownum,'J'),'Jsp') ,500,'.')
FROM dba_objects
WHERE rownum <= 100;

COMMIT;

And I’ll collect statistics on it.

begin sys.dbms_stats.gather_table_stats(ownname=>user,tabname=>'T'); end;
/

The following settings are just to make the queries easy to read, and so I can use dbms_xplan to generate a plan.

alter session set nls_date_format = 'hh24:mi:ss dd.mm.yyyy';
alter session set statistics_level = ALL;
set autotrace off pages 40 lines 100
column table_name  format a1
column column_name format a1
column low_value   format a32
column high_value  format a32
column a           format 999
column b           format a30

So now let’s check the contents of the table. There are 100 rows in the range 1 to 100.

select count(*), min(a), max(a) from t;

  COUNT(*)     MIN(A)     MAX(A)
---------- ---------- ----------
       100          1        100

And the statistics agree with this.

select table_name, num_rows, last_analyzed
from user_tables 
where table_name = 'T';

T   NUM_ROWS LAST_ANALYZED
- ---------- -------------------
T        100 18:13:17 13.10.2009

select table_name, column_name, low_value, high_value
from user_tab_columns
where table_name = 'T';

T C LOW_VALUE                        HIGH_VALUE
- - -------------------------------- --------------------------------
T A C102                             C202
T B 45696768742E2E2E2E2E2E2E2E2E2E2E 54776F2E2E2E2E2E2E2E2E2E2E2E2E2E
    2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E 2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E

So now let's try a test query. The database returns 42 rows using a full scan. The execution plan (see Footnote 1) also predicts that there will be 42 rows. Perfectly reasonable.

select * from t where a<=42;

   A B
---- ------------------------------
   1 One
…
  42 Forty-Two

42 rows selected.

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------
SQL_ID  6g743z6c2m2ap, child number 0
-------------------------------------
select * from t where a<=42

Plan hash value: 1601196873
-------------------------------------------------------------------------------------------------
| Id  | Operation         | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------
|*  1 |  TABLE ACCESS FULL| T    |      1 |     42 |     4   (0)|     42 |00:00:00.01 |      15 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("A"<=42)

Now, let’s start a second session, and insert some slightly different data into the same GTT. There are still 100 rows, but this time column A is in the range 43 to 142.

INSERT INTO t SELECT rownum+42
, RPAD(TO_CHAR(TO_DATE(rownum+42,'J'),'Jsp') ,500,'.')
FROM dba_objects
WHERE rownum <= 100;

COMMIT;

I'll collect statistics in the same way as before. But first, let's check that the data in the table is correct.

select count(*), min(a), max(a) from t;

  COUNT(*)     MIN(A)     MAX(A)
---------- ---------- ----------
       100         43        142

And I can also see that the statistics have changed.

select table_name, num_rows, last_analyzed
from user_tables 
where table_name = 'T';

T   NUM_ROWS LAST_ANALYZED
- ---------- -------------------
T        100 18:18:22 13.10.2009

select table_name, column_name, low_value, high_value
from user_tab_columns
where table_name = 'T';

T C LOW_VALUE                        HIGH_VALUE
- - -------------------------------- --------------------------------
T A C12C                             C2022B
T B 4569676874792D45696768742E2E2E2E 53697874792E2E2E2E2E2E2E2E2E2E2E
    2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E 2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E

If I run the same query in this second session, it correctly returns no rows, and uses an index scan to do so. The statistics predict one row, but Oracle actually doesn't find any. Again, perfectly reasonable.

select * from t where a<=42;

no rows selected

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------
SQL_ID  6g743z6c2m2ap, child number 0
-------------------------------------
select * from t where a<=42

Plan hash value: 2795797496

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------------------
|   1 |  TABLE ACCESS BY INDEX ROWID| T    |      1 |      1 |     2   (0)|      0 |00:00:00.01 |       1 |
|*  2 |   INDEX RANGE SCAN          | T    |      1 |      1 |     1   (0)|      0 |00:00:00.01 |       1 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("A"<=42)

But now let's go back to the first session and run the query again (having flushed the shared pool). Last time we ran it, we got 42 rows with a full scan. We still get 42 rows, but now Oracle uses the index range scan, with exactly the same plan and costs as the other session. It expected 1 row from the index, but this time it actually got 42. So collecting statistics in the other session has changed the plan in this session, and possibly not for the better: the two sessions are using the same execution plan for different sets of data.

   A B
---- ------------------------------
…
  42 Forty-Two

42 rows selected.

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------
SQL_ID  6g743z6c2m2ap, child number 0
-------------------------------------
select * from t where a<=42

Plan hash value: 2795797496
-----------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------------------
|   1 |  TABLE ACCESS BY INDEX ROWID| T    |      1 |      1 |     2   (0)|     42 |00:00:00.01 |      10 |
|*  2 |   INDEX RANGE SCAN          | T    |      1 |      1 |     1   (0)|     42 |00:00:00.01 |       4 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("A"<=42)

If I now delete the statistics, the optimizer will fall back on dynamic sampling.

begin sys.dbms_stats.delete_table_stats(ownname=>user,tabname=>'T'); end;
/

And the plan changes back to a full scan.

(correct version reposted 1.11.2009)
   A B
---- ------------------------------
…
  42 Forty-Two

42 rows selected.

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------
SQL_ID  6g743z6c2m2ap, child number 0
-------------------------------------
select * from t where a<=42

Plan hash value: 1601196873

-------------------------------------------------------------------------------------------------
| Id  | Operation         | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------
|*  1 |  TABLE ACCESS FULL| T    |      1 |     42 |     4   (0)|     42 |00:00:00.01 |      15 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("A"<=42)

Note
-----
   - dynamic sampling used for this statement

Conclusion 

Different instances of the same GTT do share statistics, because there is only one place in the data dictionary to store statistics for each table. Therefore, collecting statistics on a GTT in one session will affect every other session using that table, possibly adversely!

(Added 1.11.2009) Optimizer Dynamic Sampling may be a better option for GTTs. It is enabled by default from Oracle 9i, although the behaviour changes slightly in 10g (the default level rises from 1 to 2, so unanalysed tables are sampled automatically). However, as Cokan points out in his comments below, if a query on a GTT parsed by one session is still in the shared pool when it is issued by a different session, then Oracle will not re-parse the statement and will not choose a different execution plan.
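A minimal sketch of that approach (assuming Oracle 10g or later, where dbms_stats.lock_table_stats is available): delete the shared statistics and then lock them, so that no session can re-introduce them and the optimizer dynamically samples each session's own data instead.

begin
 sys.dbms_stats.delete_table_stats(ownname=>user,tabname=>'T');
 sys.dbms_stats.lock_table_stats(ownname=>user,tabname=>'T');
end;
/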

Footnote 1: The execution plans in this posting have been obtained using dbms_xplan thus:
select * from table(dbms_xplan.display_cursor(null,null,'IOSTATS +COST'));
I have also flushed the shared pool between each statement; otherwise the numbers are aggregated across multiple executions of the same statement.

Tuesday, January 20, 2009

Detecting and Fixing Row Migration

In my previous posting, I discussed how migrated rows led to latch contention problems on a system. In this entry I will explain how I identified and removed the migrated rows, and correctly set PCTFREE on each table so that the problem will not recur.

DBMS_STATS only collects the statistics required by the cost-based optimizer; it doesn't populate the other statistics, including CHAIN_CNT and AVG_SPACE. Instead, you must use the ANALYZE command if you want to know about chained and migrated rows. So I wrote a simple script (available on my website) to work through a set of tables and list their chained or migrated rows into the CHAINED_ROWS table. I sometimes choose to analyse only certain tables; the criteria in the comment are examples of criteria that I add to the driving query to restrict the list of tables to be analysed.
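The CHAINED_ROWS table referenced by the script is normally created with the Oracle-supplied script $ORACLE_HOME/rdbms/admin/utlchain.sql, essentially:

create table CHAINED_ROWS (
  owner_name         varchar2(30),
  table_name         varchar2(30),
  cluster_name       varchar2(30),
  partition_name     varchar2(30),
  subpartition_name  varchar2(30),
  head_rowid         rowid,
  analyze_timestamp  date
);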

BEGIN
 FOR x IN (
  SELECT owner, table_name, num_rows
  FROM   all_tables t
  WHERE  1=1
  /*-------------------------------------------------------
  AND    NOT table_name IN(SELECT DISTINCT table_name FROM chained_rows)
  AND    num_rows >= 1000000
  AND    num_rows BETWEEN 100000 AND 1000000
  /*--------------------------------------------------------*/
  AND    temporary = 'N'
 ) LOOP
  /*clear out any previous results for this table*/
  DELETE FROM chained_rows
  WHERE  owner_name = x.owner
  AND    table_name = x.table_name;

  EXECUTE IMMEDIATE 'ANALYZE TABLE '||x.owner||'.'||x.table_name
                  ||' LIST CHAINED ROWS INTO chained_rows';
 END LOOP;
END;
/

Having identified the migrated rows, I produced a simple report.

SELECT /*+LEADING(c)*/ c.*, t.num_rows
, c.chained_rows/t.num_rows*100 pct_chained
, t.pct_free, t.pct_used
, 100-FLOOR((100-t.pct_free)*(1-c.chained_rows/t.num_rows)) new_pct_free
FROM (
SELECT owner_name, table_name, COUNT(*) chained_rows
FROM chained_rows c
GROUP BY owner_name, table_name) c
, all_tables t
WHERE t.owner = c.owner_name
AND   t.table_name = c.table_name
AND   t.num_rows > 0
ORDER BY chained_rows desc, 1,2
/

The last column of the report is a calculated new value for PCTFREE. The amount of space in a block that can be used to insert new rows (100-PCTFREE) is scaled back by the proportion of migrated rows. For example, with PCTFREE of 10 and 20.1% of rows migrated, the new value is 100 - FLOOR(90 * (1 - 0.201)) = 29, as in the first row of the report below.

NB: This approach doesn't take genuinely chained rows into account. Chained rows are too big to fit in a single block under any circumstances, whereas migrated rows could have fitted into a block had there been enough free space. However, Oracle counts both types together and stores the total in the column CHAIN_CNT.

                                 Chained   Number             %    %   New
Owner    Table Name                 Rows  of Rows %Chained Free Used %Free
-------- -------------------- ---------- -------- -------- ---- ---- -----
MY_APP   MY_TABLE                 239667  1193233     20.1   10   40    29
…
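Since CHAIN_CNT counts chained and migrated rows together, one rough heuristic (an approximation, not an exact test) is to compare the average row length with the block size: where AVG_ROW_LEN is well below the block size, the counted rows are more likely migrated than genuinely chained. Both columns are populated by ANALYZE ... COMPUTE STATISTICS.

SELECT table_name, num_rows, chain_cnt, avg_row_len
FROM   user_tables
WHERE  chain_cnt > 0
ORDER BY chain_cnt DESC;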

The script then generates the commands to fix each table. First, the table is rebuilt by moving it into the same tablespace.

I specify PCTFREE as 1 for the move. Usually, the whole table does not need to be rebuilt with the new, higher value of PCTFREE: most of the rows are already fully populated and generally will not grow further, precisely because they have already migrated. Rebuilding with a high PCTFREE would simply leave a lot of free space and force Oracle to read more blocks for the same data. Instead, I set a low value for PCTFREE in order to pack the existing data into the minimum number of blocks.

ALTER TABLE MY_APP.MY_TABLE MOVE TABLESPACE MYAPP_TABLE PCTFREE 1;
…

The new value for PCTFREE is then applied to the table, so that it takes effect for subsequently inserted rows. This approach is not foolproof: it is possible for existing rows, now packed into the minimum number of blocks, to grow and be migrated. However, I have found that this happens relatively rarely.

If necessary, I also reduce PCTUSED so that the sum of the new PCTFREE and the old PCTUSED does not exceed 90. The old value of PCTFREE is shown in the comment for reference.

ALTER TABLE MY_APP.MY_TABLE PCTFREE 29 /*10*/ PCTUSED 40;
…

Finally, all the indexes on the rebuilt tables need to be rebuilt, because they are invalidated by the table move.

ALTER INDEX IDX_MY_TABLE_BH25150A REBUILD TABLESPACE MYAPP_INDEX;
ALTER INDEX IDX_MY_TABLE_BH25150B REBUILD TABLESPACE MYAPP_INDEX;
ALTER INDEX IDX_MY_TABLE_BH25150C REBUILD TABLESPACE MYAPP_INDEX;
ALTER INDEX IDX_MY_TABLE_BH25150D REBUILD TABLESPACE MYAPP_INDEX;
ALTER INDEX IDX_MY_TABLE_BH25150E REBUILD TABLESPACE MYAPP_INDEX;
ALTER INDEX IDX_MY_TABLE_BH25150F REBUILD TABLESPACE MYAPP_INDEX;
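Statements like these can themselves be generated dynamically. A minimal sketch (the UNUSABLE filter and the tablespace, owner and table names are illustrative assumptions):

SELECT 'ALTER INDEX '||i.owner||'.'||i.index_name
     ||' REBUILD TABLESPACE MYAPP_INDEX;' cmd
FROM   all_indexes i
WHERE  i.table_owner = 'MY_APP'
AND    i.table_name  = 'MY_TABLE'
AND    i.status      = 'UNUSABLE';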

Conclusion
  • The table no longer has migrated rows.
  • The existing data is packed with little free space, minimising the size of the table.
  • Free space will be left in new blocks to allow new rows to expand without migrating.
  • All of the DDL to fix the problem has been built dynamically.
(5.11.2009) See also Tanel Poder's blog: Core IT for Geeks and Pros: Detecting and Fixing Row Migration