Demonstration
I will create two identical tables, but on the first table, I will specify a table statistic preference to collect a histogram on column C.set serveroutput on verify on autotrace off
CREATE TABLE t1(a number, b varchar2(1000), c number);
CREATE TABLE t2(a number, b varchar2(1000), c number);
exec dbms_stats.set_table_prefs(user,'t1','METHOD_OPT','FOR ALL COLUMNS SIZE AUTO FOR COLUMNS SIZE 254 C');
To make the test rerunnable, I will truncate each table, delete any statistics (because truncate does not delete statistics) and then populate the table again in direct-path mode.
TRUNCATE TABLE t1;
EXEC dbms_stats.delete_table_stats(user,'T1');
INSERT /*+APPEND*/ INTO t1
SELECT ROWNUM a, TO_CHAR(TO_DATE(rownum,'J'),'Jsp') b, CEIL(SQRT(rownum)) c
FROM dual CONNECT BY level <= 1e5;
TRUNCATE TABLE t2;
EXEC dbms_stats.delete_table_stats(user,'T2');
INSERT /*+APPEND*/ INTO t2
SELECT ROWNUM a, TO_CHAR(TO_DATE(rownum,'J'),'Jsp') b, CEIL(SQRT(rownum)) c
FROM dual CONNECT BY level <= 1e5;
COMMIT;
I expect to get statistics on both tables.
alter session set nls_date_Format = 'hh24:mi:ss dd/mm/yy';
column table_name format a10
column column_name format a11
SELECT table_name, num_rows, last_analyzed FROM user_tables WHERE table_name LIKE 'T_' ORDER BY 1;
SELECT table_name, column_name, num_distinct, histogram, num_buckets FROM user_tab_columns WHERE table_name LIKE 'T_' ORDER BY 1,2;
But I only get table and column statistics on T2, the one without the statistics preference.
TABLE_NAME NUM_ROWS LAST_ANALYZED
---------- ---------- -----------------
T1
T2 100000 10:08:30 15/01/20
Table Column
Name Name NUM_DISTINCT HISTOGRAM NUM_BUCKETS
----- ------ ------------ --------------- -----------
T1 A NONE
T1 B NONE
T1 C NONE
T2 A 100000 NONE 1
T2 B 98928 NONE 1
T2 C 317 NONE 1
It appears that I don't get statistics on T1 because I have specified a METHOD_OPT table statistic preference that is specific to some named columns. It doesn't have to specify creating a histogram, it might be preventing a histogram from being created.For example, this preference does not disable online statistics collection.
EXEC dbms_stats.set_table_prefs(user,'t2','METHOD_OPT','FOR ALL COLUMNS SIZE 1');
But these preferences do disable online statistics collection.
EXEC dbms_stats.set_table_prefs(user,'t2','METHOD_OPT','FOR COLUMNS SIZE 1 B C');
EXEC dbms_stats.set_table_prefs(user,'t2','METHOD_OPT','FOR COLUMNS SIZE 1 A B C');
I have not found any other statistics preferences (for other DBMS_STATS parameters) that cause this behaviour.Conclusion
Table preferences are recommended as a method of controlling statistics collection declaratively and consistently. You don't have to specify parameters to DBMS_STATS into scripts that collect statistics ad-hoc. The table statistics preferences provide a method that every time statistics are collected on a particular table, they are collected consistently, albeit in a way that may be different from the default.However, take the example of an ETL process loading data into a data warehouse. If you rely on online statistics gathering to collect table statistics as a part of a data load process, you must now be careful not to disable statistics collection during the load with METHOD_OPT statistics preferences.
Autonomous & 19c Addendum
(Updated 21st January 2020) In the Oracle Autonomous Data Warehouse, in order to make statistics collection as self-managing as possible, Oracle sets two undocumented parameters.- _optimizer_gather_stats_on_load_hist=TRUE - so that histograms are created on every column when online statistics are gathered
- _optimizer_gather_stats_on_load_all=TRUE - so that online statistics are collected for every direct-path insert, not just the first one.
From 19c, on Engineered systems, Oracle collects Real-Time statistics. The METHOD_OPT preference does not disable this behaviour, but you only get Real-Time statistics if you have conventional statistics in the first place. If you don't get statistics after a bulk load, then you won't get Real-Time statistics until conventional statistics have been explicitly gathered, for example, by the maintenance job.
(Update 22nd January 2020) This has been filed as bug 30787109.
Oh! Interesting quirk!
ReplyDeleteOf course, a data load only gives you partial stats (e.g. no histograms) with the online gather anyway (you're supposed to perform a gather_table_stats with option=>'gather auto' following the load) so it seems strange to disable all of the stats given method_opt restrictions wouldn't be used anyway.