DOAG 2019, Nuremberg

Friday, September 06, 2019

Working with AWR: Old Statistics in Current Execution Plans

This is a blog about a little trap, into which it is easy to fall when analysing performance problems using ASH data and execution plans from AWR.
I will illustrate it with a real example taken from a production PeopleSoft system. A number of sessions were waiting on blocking locks.  I used ASH data to find statements running while holding a blocking lock.  That led me to a particular SQL ID and plan hash value.
SQL_ID        SQL_PLAN_HASH_VALUE   ASH_SECS
------------- ------------------- ----------
                                       29210
5st32un4a2y92          2494504609      10670
652mx4tffq415          1888029394       7030
artqgxug4z0f1             8450529        580
7qxdrwcn4yzhh          3723363341        270
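A profile like the one above can be obtained with a query along these lines (a sketch; the time window and any additional predicates identifying the blocking sessions are assumptions, and each AWR ASH sample represents approximately 10 seconds of activity):

```sql
SELECT   sql_id, sql_plan_hash_value
,        SUM(10) ash_secs  -- DBA_HIST ASH samples are taken every 10 seconds
FROM     dba_hist_active_sess_history
WHERE    sample_time >= SYSDATE - 1  -- period of interest (assumed)
GROUP BY sql_id, sql_plan_hash_value
ORDER BY ash_secs DESC
/
```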
So, I used DBMS_XPLAN.DISPLAY_AWR to extract the execution plan from the AWR repository.
SELECT * FROM table(dbms_xplan.display_awr('5st32un4a2y92',2494504609,NULL,'ADVANCED'));
I can see the statement, the execution plan, the costs of each operation in the plan, and the value of the bind variable. This is a statement that has consumed a lot of execution time, yet it isn't executing that frequently, and the cost of the statement is quite low (only 22). That, in itself, does not mean there is a problem here, but it is perhaps a suggestion of an inconsistency.
SQL_ID 5st32un4a2y92
--------------------
SELECT 'X' FROM PS_CDM_LIST  WHERE CONTENTID = :1

Plan hash value: 2494504609

------------------------------------------------------------------------------------
| Id  | Operation            | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |       |       |    22 (100)|          |
|   1 |  INDEX FAST FULL SCAN| PS_CDM_LIST |     1 |     5 |    22  (10)| 00:00:01 |
------------------------------------------------------------------------------------
…
Peeked Binds (identified by position):
--------------------------------------
   1 - :1 (NUMBER): 17776
Familiarity with the application is an advantage here. The value of CONTENTID comes from a sequence number (albeit generated from a table, not an Oracle sequence, because this is PeopleSoft). The system has been running for a while, and Process Instances and Content IDs were well in excess of 1,000,000, so why is the bind variable only 17776? That is another alarm bell.
I can extract the current execution plan from the library cache, if the statement is in memory, using DBMS_XPLAN.DISPLAY_CURSOR().
SELECT * FROM table(dbms_xplan.display_cursor('5st32un4a2y92',NULL,'ADVANCED')); 

Plan hash value: 2494504609

------------------------------------------------------------------------------------
| Id  | Operation            | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |     1 |     6 |  3178   (9)| 00:00:05 |
|*  1 |  INDEX FAST FULL SCAN| PS_CDM_LIST |     1 |     6 |  3178   (9)| 00:00:05 |
------------------------------------------------------------------------------------
…
Peeked Binds (identified by position):
--------------------------------------
   1 - :1 (NUMBER): 10897657

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("CONTENTID"=TO_NUMBER(:1))
DBMS_XPLAN.DISPLAY_CURSOR() shows the current bind variable value. It is immediately obvious that it has moved on since the plan was captured by AWR. If the statement is not in memory, I could generate a new execution plan using EXPLAIN PLAN FOR, which reparses the statement, and then extract the execution plan with DBMS_XPLAN.DISPLAY(). However, this will not give me the bind variable value.
explain plan for SELECT 'X' FROM PS_CDM_LIST WHERE CONTENTID = :1
/

Explained.

select * from table(dbms_xplan.display(null,null,'ADVANCED'));
…
The execution plan has the same plan hash value, so it is the same plan. However, the cost has increased significantly (from 22 to 3178), and that reflects the problem: the table has grown over time. The full scan of the index was fast when the table was small, but as the table has grown, the scan has taken progressively longer. By the time I came to look at it, the time taken was quite significant.
In this case, we created a new index on PS_CDM_LIST, leading on CONTENTID, to satisfy this query. The result was to improve the performance of this statement and to reduce the time spent holding blocking locks, which has further benefits.

What is going on here? 

This SQL statement is a regular part of the application. The SQL ID doesn't change because it uses a bind variable. The execution plan hasn't changed either, because the optimizer can't find a better option. The execution plan we obtained from AWR was the first such plan captured for that combination of SQL ID and plan hash value. AWR may (in fact, is very likely to) have captured that plan again subsequently. However, it did not update the plan in the repository, and old plans are never purged.
The original cost (22) was the cost then, not now. The value of the bind variable (17776) was the value then, not now!
So how do you know you are looking at a plan with old costs and bind variables? You don't.  There is nothing explicit to warn you to go and check.
However, if things are not making sense, reparse the statement and check whether the current cost of the same plan is close to the cost stored in AWR.
If you use EXPLAIN PLAN FOR you may get a different plan, in which case you will need to use the hints in the outline section of the execution plan (not shown above) to force it to produce the same execution plan before you can compare costs.
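It can also help to see when AWR captured the plan. A sketch, using the SQL ID and plan hash value from this example: DBA_HIST_SQL_PLAN has a TIMESTAMP column recording when the captured plan was generated, so an old timestamp is a clue that the costs and peeked binds are old too.

```sql
SELECT   sql_id, plan_hash_value, MIN(timestamp)
FROM     dba_hist_sql_plan
WHERE    sql_id = '5st32un4a2y92'
AND      plan_hash_value = 2494504609
GROUP BY sql_id, plan_hash_value
/
```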

Monday, June 24, 2019

Importing and Working with Exported AWR/ASH data in an Oracle database in a VirtualBox VM

A lot of my performance tuning work involves analysis of ASH and AWR data.  Frequently, I do not have direct access to the databases in question.  Sometimes, I ask clients to run EDB360 on their databases and send me the results, but sometimes I also want to work directly with ASH or AWR metrics.  So, I ask for an export of their AWR repository.
Oracle distributes a pair of scripts in $ORACLE_HOME/rdbms/admin.
  • awrextr.sql exports the AWR repository (including the ASH data), optionally for a range of AWR snapshot IDs. 
  • awrload.sql loads the export into a staging schema and then moves it into the AWR repository in the current database, from where you can reference it via the usual DBA_HIST% views and work with it in the usual way.
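Both scripts are interactive. A sketch of running the export on the source database (the prompts are summarised in the comments; the exact dialogue varies by version):

```sql
-- On the source database, as a DBA user.  awrextr.sql prompts for the
-- DBID, the number of days of snapshots to list, the start and end
-- snapshot IDs, a directory object, and a dump file name.
@?/rdbms/admin/awrextr.sql
```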
I import this into an Oracle database in a virtual machine in VirtualBox and work on it there.  This note explains how I create and set up that machine and then work with the data.

Getting Started

1. Build Oracle VirtualBox Linux VM

I use Frits Hoogland's vagrant builder to create a standard Oracle database on a Linux VM in VirtualBox.  I always build the latest version of Oracle because I can import an AWR export into a higher version of the database, but not a lower one.

2. Make Life Easier For Yourself

I sometimes install other utilities into the virtual machine, such as git, using yum.
yum install git
The virtual machine is only going to run a single Oracle database, and I am very lazy.  So, I set up the environment in the shell profile by adding the following commands to ~oracle/.bash_profile.
export ORACLE_SID=oracle
export ORAENV_ASK=NO
. oraenv
export ORAENV_ASK=YES

3. Share a Folder on the Host Machine

I usually leave the AWR export file on my host machine rather than copy it into the virtual machine.  I create a VirtualBox shared folder so that the VirtualBox VM can reference a folder on the host machine.  In this case, I have shared c:\temp on my host Windows laptop and called the filesystem temp.  Note that it is auto-mounted and permanent.
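The shared folder can be created in the VirtualBox GUI, or from the host's command line with VBoxManage (a sketch; the VM name "oracle-database" is an assumption, and --automount makes the mount automatic and permanent):

```shell
# Share c:\temp on the Windows host as a folder named "temp",
# auto-mounted inside the guest as /media/sf_temp
VBoxManage sharedfolder add "oracle-database" --name temp --hostpath "c:\temp" --automount
```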

In the virtual machine, the shared folder has been mounted as /media/sf_temp.
[oracle@oracle-database sql]$ df /media/sf_temp
Filesystem     1K-blocks      Used Available Use% Mounted on
temp           974661916 340278844 634383072  35% /media/sf_temp
The sf_temp directory is owned by root and accessible only to the vboxsf group.  Therefore, you must add any user who needs to reference this directory to that group.  In this case, the oracle user must be made a member of the vboxsf group.  For safety, this should be done with the usermod command rather than by editing /etc/group directly.
usermod -G vboxsf -a oracle
You can see the result of this command in the /etc/group file
grep oracle /etc/group
vboxsf:x:994:oracle
When logged in as oracle, the id command will also report the groups of which the user is a member.
[oracle@oracle-database ~]$ id
uid=54321(oracle) gid=54321(oinstall) groups=54321(oinstall),100(users),994(vboxsf),54322(dba)
Create a directory in the database to reference this directory on the filesystem.
CREATE OR REPLACE DIRECTORY TEMP as '/media/sf_temp';

4. Oracle FILESYSTEMIO_OPTIONS Parameter

Initially, I got the following error when trying to run a data pump import from the shared folder:
ORA-39000: bad dump file specification
ORA-31640: unable to open dump file "/media/sf_temp/awrdat_xxxx_xxxx.dmp" for read
ORA-27041: unable to open file
Linux-x86_64 Error: 22: Invalid argument
Additional information: 2
On VirtualBox, the Oracle parameter FILESYSTEMIO_OPTIONS will be set to setall by default.  To work around the import error, set it to none and restart the database.
ALTER SYSTEM SET FILESYSTEMIO_OPTIONS=none SCOPE=SPFILE;
SHUTDOWN IMMEDIATE
STARTUP

5. Data Pump Import

Now you can import an AWR export (previously exported with awrextr.sql) with the script awrload.sql.
cd $ORACLE_HOME/rdbms/admin
sqlplus / as sysdba
@awrload

Working with Imported Data

The AWR data exported from another database will be visible through all the usual database views.  It can be distinguished by having the database ID of the original database.  If, like me, you are used to mining ASH data with SQL scripts, then they will work if you specify the DBID.  However, I have a couple of tips for working with imported AWR data.
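For example, an ASH profile of the imported data just needs the DBID of the source database in the predicate (a sketch; the substitution variable and the use of event as the grouping column are assumptions, and each sample represents roughly 10 seconds):

```sql
SELECT   event, SUM(10) ash_secs  -- DBA_HIST ASH samples are taken every 10 seconds
FROM     dba_hist_active_sess_history
WHERE    dbid = &dbid  -- the DBID of the imported repository, not the local database
GROUP BY event
ORDER BY ash_secs DESC
/
```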

SQLDB360 (EDB360 and SQLD360)

Sometimes I run an EDB360 report on the imported data.  The reports and graphs of ASH data and the system time model are particularly useful.
I can generate AWR reports for any pair of snapshots in the imported repository, not just the ones generated by the client and provided to me.  EDB360 generates AWR reports for peak periods by various criteria.
Having installed git, I can use it to download the latest version of SQLDB360.
cd ~/
git clone https://github.com/sqldb360/sqldb360.git
cd ~/sqldb360
Alternatively, I could use curl instead.
cd ~/
curl -LO https://github.com/sqldb360/sqldb360/archive/unstable.zip
unzip unstable.zip
Obtain the DBID of the database that you have just imported by looking at one of the AWR views
col max(startup_time) format a26
col min(begin_interval_time) format a26
col max(end_interval_time) format a26
SELECT dbid, db_name, count(*), max(startup_time)
FROM   dba_hist_database_instance
GROUP BY dbid, db_name
/
SELECT dbid, min(begin_interval_time), max(end_interval_time)
FROM   dba_hist_snapshot
GROUP BY dbid
/
In this case, my database is ORACLE, and XXXXXXX is the database that I have just imported.
      DBID DB_NAME     COUNT(*) MAX(STARTUP_TIME)
---------- --------- ---------- --------------------------
1884370030 ORACLE             9 20-JUN-19 10.51.50.000 AM
3743596248 XXXXXXX            4 05-MAY-19 01.03.57.000 AM

      DBID MIN(BEGIN_INTERVAL_TIME)   MAX(END_INTERVAL_TIME)
---------- -------------------------- --------------------------
1884370030 18-JUN-19 07.03.44.000 PM  20-JUN-19 11.02.49.684 AM
3743596248 18-JUN-19 05.00.13.342 PM  19-JUN-19 11.00.20.020 AM
Create a custom configuration file in the sqldb360/sql directory
  • Set the database ID with the DBID from the export.
  • Specify edb360_conf_date_from/to parameters to match the range of dates in the import.
-- this is a sample custom configuration requesting to produce just one section
-- for all possible DEF values please review edb360_00_config.sql

-- section to report. null means all (default)
-- report column, or section, or range of columns or range of sections i.e. 3, 3-4, 3a, 3a-4c, 3-4c, 3c-4
DEF edb360_sections = '';

-- range of dates below supersede history days when values are other than YYYY-MM-DD
-- default values YYYY-MM-DD mean: use edb360_conf_days
-- actual values sample: 2016-04-26
DEF edb360_conf_date_from = '2019-06-18';
DEF edb360_conf_date_to = '2019-06-19';

-- use if you need tool to act on a dbid stored on AWR, but that is not the current v$database.dbid
DEF edb360_config_dbid = '3743596248';
Run EDB360 referencing the custom config file
@edb360 T custom_config_01.sql
@sqld360 5dcbnw1d9yfna T custom_config_01.sql
NB: EDB360 and SQLD360 reports generated in this way will contain a mixture of data from the local database and from the database from which the AWR export was taken.
  • Some reports are based on v$ and x$ views or data dictionary views (DBA_TABLES etc.).  These will report on the database in the virtual machine.
  • Only the reports based on DBA_HIST% views will report on the database from which the AWR data was exported.
It is important to keep this distinction in mind.
Nearly every section of the EDB360 report contains reports based on the AWR data, so it is not worth trying to exclude any sections when running EDB360 on imported data.

Extracting Execution Plans from AWR

We would normally extract the execution plan of a particular statement from the AWR of the local database using the DISPLAY_AWR function in the DBMS_XPLAN package, as follows.
SELECT * FROM table(dbms_xplan.display_awr('burz0q5qjtgmv'));
You can specify the database ID to DISPLAY_AWR(), but if it is not the DBID of the current database, then it returns nothing.
SELECT * FROM table(dbms_xplan.display_awr('5dcbnw1d9yfna',db_id=>3743596248));
The various DISPLAY% functions in DBMS_XPLAN are formatted report generators for data held in a particular data structure.  DISPLAY_CURSOR() references V$SQL_PLAN, DISPLAY_AWR() references DBA_HIST_SQL_PLAN, and you can pass a table name to DISPLAY so that it can reference any table or view.  That is useful if you have your own private plan table.
In this case, instead of DISPLAY_AWR(), we can use the DISPLAY() function directly on the DBA_HIST_SQL_PLAN view, thus:
set long 5000 lines 150 pages 999
column sql_text format a150
SELECT sql_text from dba_hist_sqltext
where dbid=3743596248 AND sql_id = '5dcbnw1d9yfna'
/
SELECT * FROM table(dbms_xplan.display(
  table_name=>'DBA_HIST_SQL_PLAN'
, format=>'ADVANCED +ADAPTIVE'
, filter_preds=>'dbid=3743596248 AND sql_id = ''5dcbnw1d9yfna'''))
/
Note that:
  • The DBID and SQL_ID are passed as quoted literals to the function.
  • DBMS_XPLAN.DISPLAY() does not report the SQL statement, so that must be queried separately.

Monday, February 04, 2019

Regular Expression Functions are Considered To Be Non-Deterministic from Oracle 12.2

You cannot put non-deterministic functions into virtual columns or function-based indexes.  Regular expressions have always been sensitive to National Language Support (NLS) settings because what constitutes a character differs between languages.  However, from Oracle 12.2, regular expressions are considered to be non-deterministic because Oracle has fixed bug 20804063.

Background 

A developer came to me with the following query, complaining it was slow.
SELECT …
FROM   …
AND    LOWER(REGEXP_REPLACE(ASCIISTR(my_column), '[^a-zA-Z0-9]')) ='2732018'
…
Obviously, functions on columns in predicates prevent the use of indexes, so I suggested creating and indexing a virtual column (slightly more elegant and almost the same as a function-based index but without needing the function in the code to match the index).
This works fine in Oracle 12.1, but they came back saying it didn't work in 12.2.  Note the error messages below.
DROP TABLE t PURGE
/
CREATE TABLE t AS 
SELECT TO_CHAR(sysdate-level,'YYYY/DDD') a
FROM   dual
CONNECT BY LEVEL < 1000
/

CREATE INDEX t1 on t (REGEXP_REPLACE(a,'[^a-zA-Z0-9]'))
/
ORA-01743: only pure functions can be indexed

DROP INDEX t1
/
ALTER TABLE t 
ADD b AS (REGEXP_REPLACE(a,'[^a-zA-Z0-9]')) VIRTUAL
/
ORA-54002: only pure functions can be specified in a virtual column expression
I eventually came across a question and answer by Connor McDonald on AskTom, 'Adding constraint with REGEXP_REPLACE fails', which explains that Oracle has fixed bug 20804063 in 12.2.  This fix affects virtual columns, function-based indexes (because they also create virtual columns), and constraints.
There are a number of workarounds, but none are completely satisfactory.

Workaround 1: Deterministic Function 

As Connor suggests, you can work around this unwanted behaviour by creating your own PL/SQL function and telling the database it is deterministic. This is not simply lying to the database.  Not all regular expressions are actually NLS-sensitive; it depends on what you are doing.  However, it does appear that the validation is applied regardless. If the function is NLS-sensitive, that might have unwanted consequences, including incorrect query results. You would have to decide whether you can live with the risk, depending on what is actually in the column and function in question.
CREATE OR REPLACE FUNCTION my_regexp_replace(p1 VARCHAR2, p2 VARCHAR2) 
RETURN VARCHAR2 
DETERMINISTIC IS
BEGIN
  RETURN REGEXP_REPLACE(p1, p2);
END;
/
show error

CREATE INDEX t1 ON t (my_regexp_replace(a,'[^a-zA-Z0-9]'))
/
DROP INDEX t1
/
ALTER TABLE t 
ADD b AS (my_REGEXP_REPLACE(a,'[^a-zA-Z0-9]')) VIRTUAL
/
CREATE INDEX t1 ON t (b)
/

Workaround 2: Refresh on Commit Materialized View 

Another possible workaround would be a materialized view that refreshes on commit, with the expression creating an additional column that is then indexed.
CREATE TABLE t AS 
SELECT rownum n
,      TO_CHAR(sysdate-level,'YYYY/DDD') a
FROM  dual
CONNECT BY LEVEL < 1000
/
ALTER TABLE t ADD CONSTRAINT t_pk PRIMARY KEY (n)
/
CREATE MATERIALIZED VIEW LOG ON t
/

CREATE MATERIALIZED VIEW T1_MV 
REFRESH ON COMMIT 
FAST 
WITH PRIMARY KEY 
ENABLE QUERY REWRITE AS 
SELECT t.*
,REGEXP_REPLACE(a,'[^a-zA-Z0-9]') b
FROM t
/

CREATE INDEX t1_mv_b ON t1_mv(b);

INSERT INTO t
SELECT ROWNUM+1000 n
,      TO_CHAR(sysdate-level-1000,'YYYY/DDD') a
FROM  dual
CONNECT BY LEVEL < 1000
/
commit
/

set autotrace on
SELECT * FROM t
WHERE REGEXP_REPLACE(a,'[^a-zA-Z0-9]') like '201720%';
And Oracle does indeed rewrite the query to use the materialised view and then uses the index on the materialized view.
Plan hash value: 3543552962

----------------------------------------------------------------------------------------------------------
| Id  | Operation                                      | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                               |         |     1 |    21 |     3   (0)| 00:00:01 |
|   1 |  MAT_VIEW REWRITE ACCESS BY INDEX ROWID BATCHED| T1_MV   |     1 |    21 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                             | T1_MV_B |     2 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------
However, this solution has the same weakness of being dependent on NLS settings – it is just that there is no validation to stop you! There is the further threat that the refresh on commit could become a performance problem if there is intense DML on the underlying table.

Workaround 3: Use a Non-NLS Dependent Function 

Ideally, it would be better to use a non-NLS-dependent function instead of a REGEXP% function. However, this may be easier said than done.  Regular expressions solve problems that are hard to handle with basic string functions and may require a PL/SQL function anyway to hold procedural code.  However, in this relatively simple example it is possible to remove the unwanted characters with the TRANSLATE and REPLACE functions, and there is no NLS restriction on those.
SELECT a
,      REGEXP_REPLACE(a,'[^0-9]')
,      REPLACE(TRANSLATE(a,'/',' '),' ','')
FROM   t
WHERE rownum <= 10
/

A        REGEXP_REPLACE(A,'[^0-9]')       REPLACE(TRANSLATE(A,'/',' '),' '
-------- -------------------------------- --------------------------------
2019/031 2019031                          2019031
2019/030 2019030                          2019030
2019/029 2019029                          2019029
2019/028 2019028                          2019028
2019/027 2019027                          2019027
2019/026 2019026                          2019026
2019/025 2019025                          2019025
2019/024 2019024                          2019024
2019/023 2019023                          2019023
2019/022 2019022                          2019022
…
And you can put this into a function-based index or virtual column, thus:
DROP INDEX t1 
/ 
ALTER TABLE t ADD b AS (REPLACE(TRANSLATE(a,'/',' '),' ','')) VIRTUAL 
/ 
CREATE INDEX t1 ON t (b) 
/

Friday, November 30, 2018

A Brief Look Inside Oracle's Autonomous Data Warehouse Cloud

This post is part of a series that discusses some common issues in data warehouses.
There is lots of documentation for Autonomous Data Warehouse Cloud (ADWC), in which I found this bold claim:
  •  'Additionally, Autonomous Data Warehouse does not require any tuning. Autonomous Data Warehouse is designed as a "load and go" service: you start the service, define tables, load data, and then run queries. When you use Autonomous Data Warehouse, no tuning is necessary. You do not need to consider any details about parallelism, partitioning, indexing, or compression. The service automatically configures the database for high-performance queries.' - Getting Started with Autonomous Data Warehouse 
I also found several other references helpful in getting me going.
The documentation describes how to import data with the DBMS_CLOUD package or with data pump. You have to upload files onto cloud storage and import from there.  For data pump import, the various object types in the exclude parameter recommended in the documentation are a consequence of the 'load and go' approach.
impdp admin/password@ADWC1_high \       
     directory=data_pump_dir \       
     credential=def_cred_name \       
     dumpfile= https://swiftobjectstorage.us-phoenix-1.oraclecloud.com/v1/adwc/adwc_user/export%u.dmp \
     parallel=16 \
     partition_options=merge \ 
     transform=segment_attributes:n \
     transform=dwcs_cvt_iots:y \
     transform=constraint_use_default_index:y \
     exclude=index, cluster, indextype, materialized_view, materialized_view_log, materialized_zonemap, db_link
Clearly, Oracle expects you to periodically incrementally bulk load data and then work with essentially static data.

What is Autonomous Data Warehouse Cloud? 

It is built on Exadata, so it is an engineered system.  This is what I found I was running on:
  • 12.2.0.1.0 Single Instance RAC
  • SGA: 3.3Gb, PGA: 5Gb, RAM: 708Gb
  • Intel(R) Xeon(R) CPU E7-8867 v4 @ 2.40GHz 84 CPUs/Core/Threads
  • 12 storage servers
There are two pre-installed locked sample schemas
  • The sales history schema SH, that I have been using so far, is a bit different to the one you would build with the scripts on Github. 
  • SSB – Star Schema Benchmark.

How is the Oracle delivered Sales History demo schema in ADWC built? 

There are some significant differences with the delivered Sales History schema:
  • Primary key constraints are DISABLED, NOVALIDATE, RELY.  There is no corresponding unique index because the constraint was created DISABLED. 
    • Thus, there is nothing to prevent you loading duplicate data!
  • Foreign key constraints are present, but also DISABLED, NOVALIDATE, RELY.
    • Foreign keys are also not indexed, but as they are not enforced there is no risk of TM locking.
    • Foreign key join elimination can still happen because QUERY_REWRITE_INTEGRITY=TRUSTED
It is clearly not intended that referential integrity is going to be enforced by the database in ADWC. That is not an unreasonable decision in a data warehouse because you expect the data to be validated in the source system.  However, while it is one thing not to enforce referential integrity, I think that not enforcing uniqueness is courageous!  Although, part of the cost of enforcing it would be maintaining a unique index.
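The state of these constraints can be confirmed from the data dictionary (a sketch; I query the constraint flags on the delivered SH schema):

```sql
SELECT constraint_name, constraint_type, status, validated, rely
FROM   all_constraints
WHERE  owner = 'SH'
AND    table_name = 'SALES'
/
```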
I am surprised that the SALES table is not partitioned.  Partition pruning is a very effective way of eliminating data from a query, but perhaps Oracle is relying on Bloom filtering and storage indexes instead.
PCTFREE is set to 0, so the data is packed into data blocks without leaving room for updates.  This makes good sense.  In a data warehouse, we don't expect to update the data after it has been loaded, nor to cater for OLTP behaviour.  This fits with the 'load and go' statement.
The tables are Hybrid Columnar Compressed (HCC). This also makes sense, because the data is bulk loaded in direct-path mode, and again we don't expect to update the data after it has been loaded.  However, Oracle has used QUERY HIGH HCC.  As early as 2016, Roger MacNichol pointed out in Compression in a well-balanced system that "since HCC was released …, DBAs have almost always used Query High", but as CPU speeds have increased, "the time has come for a reassessment of role HCC Query Low".  I have also written about just such a case.

Building My Own Sales History Demo Schema 

I also built my own sales history demo per the standard public scripts. The difference between the two is quite telling. I found that:
  • I can build indexes, not just primary keys and foreign keys (though I understand that has only recently become possible).
  • I can specify tablespaces and storage.
  • I can do some alter session commands but not others due to PDB security profiles.
    • I can't ALTER SESSION SET star_transformation_enabled = TRUE;
    • However, I can set ALTER SESSION SET optimizer_ignore_hints =FALSE, and then I get star transformation by using the STAR_TRANSFORMATION hint. 
  • I can also build B-tree or bitmap secondary indexes. 
 For example, you might choose to index attribute columns on your dimensions.
CREATE UNIQUE INDEX "COUNTRIES_ISO" ON "COUNTRIES" ("COUNTRY_ISO_CODE");
And ADWC will use it.
Plan hash value: 2437708077

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name          | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | Pstart| Pstop | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |               |      1 |        |       |   427 (100)|       |          |       |     45 |00:00:00.11 |    1651 |       |       |          |
|   1 |  SORT GROUP BY                    |               |      1 |    102 |  8874 |   427   (6)| 00:00:01 |       |       |     45 |00:00:00.11 |    1651 |  6144 |  6144 | 6144  (0)|
|*  2 |   HASH JOIN                       |               |      1 |   4210 |   357K|   426   (6)| 00:00:01 |       |       |  64818 |00:00:00.16 |    1651 |  5443K|  5443K| 5260K (0)|
|*  3 |    TABLE ACCESS STORAGE FULL      | PRODUCTS      |      1 |     26 |   208 |     2   (0)| 00:00:01 |       |       |     26 |00:00:00.01 |       7 |  1025K|  1025K|          |
|*  4 |    HASH JOIN                      |               |      1 |  11886 |   916K|   424   (6)| 00:00:01 |       |       |    141K|00:00:00.08 |    1644 |  4266K|  4266K| 4262K (0)|
|   5 |     JOIN FILTER CREATE            | :BF0001       |      1 |    364 |  4368 |     2   (0)| 00:00:01 |       |       |    364 |00:00:00.01 |      16 |       |       |          |
|   6 |      PART JOIN FILTER CREATE      | :BF0000       |      1 |    364 |  4368 |     2   (0)| 00:00:01 |       |       |    364 |00:00:00.01 |      16 |       |       |          |
|*  7 |       TABLE ACCESS STORAGE FULL   | TIMES         |      1 |    364 |  4368 |     2   (0)| 00:00:01 |       |       |    364 |00:00:00.01 |      16 |  1025K|  1025K|          |
|*  8 |     HASH JOIN                     |               |      1 |  48360 |  3164K|   422   (6)| 00:00:01 |       |       |    143K|00:00:00.07 |    1628 |  3376K|  3178K| 4759K (0)|
|   9 |      JOIN FILTER CREATE           | :BF0002       |      1 |   2921 |   128K|    21   (5)| 00:00:01 |       |       |  18520 |00:00:00.02 |     755 |       |       |          |
|  10 |       NESTED LOOPS                |               |      1 |   2921 |   128K|    21   (5)| 00:00:01 |       |       |  18520 |00:00:00.01 |     755 |       |       |          |
|  11 |        TABLE ACCESS BY INDEX ROWID| COUNTRIES     |      1 |      1 |    18 |     1   (0)| 00:00:01 |       |       |      1 |00:00:00.01 |       2 |       |       |          |
|* 12 |         INDEX UNIQUE SCAN         | COUNTRIES_ISO |      1 |      1 |       |     0   (0)|       |          |       |      1 |00:00:00.01 |       1 |  1025K|  1025K|          |
|* 13 |        VIEW                       | VW_GBF_25     |      1 |   2921 | 78867 |    20   (5)| 00:00:01 |       |       |  18520 |00:00:00.01 |     753 |       |       |          |
|  14 |         TABLE ACCESS STORAGE FULL | CUSTOMERS     |      1 |  55500 |  1138K|    20   (5)| 00:00:01 |       |       |  55500 |00:00:00.01 |     753 |  1025K|  1025K|          |
|  15 |      JOIN FILTER USE              | :BF0001       |      1 |    918K|    19M|   396   (5)| 00:00:01 |       |       |    158K|00:00:00.04 |     873 |       |       |          |
|  16 |       JOIN FILTER USE             | :BF0002       |      1 |    918K|    19M|   396   (5)| 00:00:01 |       |       |    158K|00:00:00.04 |     873 |       |       |          |
|  17 |        PARTITION RANGE JOIN-FILTER|               |      1 |    918K|    19M|   396   (5)| 00:00:01 |:BF0000|:BF0000|    158K|00:00:00.04 |     873 |       |       |          |
|* 18 |         TABLE ACCESS STORAGE FULL | SALES         |      5 |    918K|    19M|   396   (5)| 00:00:01 |:BF0000|:BF0000|    158K|00:00:00.04 |     873 |  1025K|  1025K|          |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Whether I should do any of this is another matter!

Delivered non-default parameters

There are a number of database initialisation parameters set to non-default values that are noteworthy:
  • QUERY_REWRITE_INTEGRITY=TRUSTED: This ties in with making the foreign key constraints RELY, so that foreign key join elimination will still occur.
  • RESULT_CACHE_MODE=FORCE enables the result cache by default for all queries.  Whether this is going to be a significant benefit depends upon your application.
select /*+ gather_plan_statistics opt_param('result_cache_mode','MANUAL')*/
       c.country_name
,      u.cust_state_province
,      COUNT(*) num_sales
,      SUM(s.amount_sold) total_amount_sold
from   sales s
,      customers u
,      products p
,      times t
,      countries c
WHERE  s.time_id = t.time_id
AND    s.prod_id = p.prod_id
AND    u.cust_id = s.cust_id
AND    u.country_id = c.country_id
AND    c.country_iso_code = '&&iso_country_code'
AND    p.prod_category_id = 205
AND    t.fiscal_year = 1999
GROUP BY c.country_name
,      u.cust_state_province
ORDER BY 1,2
/
Without the result cache, I get the standard full scan/Bloom filter execution plan.
Plan hash value: 2719715383

-------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name      | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |           |      1 |        |       |    38 (100)|          |     45 |00:00:00.16 |     657 |       |       |          |
|   1 |  SORT GROUP BY                   |           |      1 |    102 |  8262 |    38  (48)| 00:00:01 |     45 |00:00:00.16 |     657 |  6144 |  6144 | 6144  (0)|
|*  2 |   HASH JOIN                      |           |      1 |   3478 |   275K|    37  (46)| 00:00:01 |  64818 |00:00:00.03 |     657 |  5443K|  5443K| 5259K (0)|
|*  3 |    TABLE ACCESS STORAGE FULL     | PRODUCTS  |      1 |     26 |   208 |     2   (0)| 00:00:01 |     26 |00:00:00.01 |       3 |  1025K|  1025K|          |
|*  4 |    HASH JOIN                     |           |      1 |   9819 |   699K|    35  (49)| 00:00:01 |    141K|00:00:00.18 |     654 |  4266K|  4266K| 4289K (0)|
|   5 |     JOIN FILTER CREATE           | :BF0000   |      1 |    364 |  4368 |     2   (0)| 00:00:01 |    364 |00:00:00.01 |       3 |       |       |          |
|*  6 |      TABLE ACCESS STORAGE FULL   | TIMES     |      1 |    364 |  4368 |     2   (0)| 00:00:01 |    364 |00:00:00.01 |       3 |  1025K|  1025K|          |
|*  7 |     HASH JOIN                    |           |      1 |  39950 |  2379K|    33  (52)| 00:00:01 |    141K|00:00:00.18 |     651 |  3411K|  3411K| 4776K (0)|
|   8 |      JOIN FILTER CREATE          | :BF0001   |      1 |   2413 | 94107 |     9  (12)| 00:00:01 |  18520 |00:00:00.01 |     215 |       |       |          |
|*  9 |       HASH JOIN                  |           |      1 |   2413 | 94107 |     9  (12)| 00:00:01 |  18520 |00:00:00.01 |     215 |  2642K|  2642K|  681K (0)|
|  10 |        JOIN FILTER CREATE        | :BF0002   |      1 |      1 |    18 |     2   (0)| 00:00:01 |      1 |00:00:00.01 |       3 |       |       |          |
|* 11 |         TABLE ACCESS STORAGE FULL| COUNTRIES |      1 |      1 |    18 |     2   (0)| 00:00:01 |      1 |00:00:00.01 |       3 |  1025K|  1025K|          |
|  12 |        JOIN FILTER USE           | :BF0002   |      1 |  55500 |  1138K|     7  (15)| 00:00:01 |  18520 |00:00:00.01 |     212 |       |       |          |
|* 13 |         TABLE ACCESS STORAGE FULL| CUSTOMERS |      1 |  55500 |  1138K|     7  (15)| 00:00:01 |  18520 |00:00:00.01 |     212 |  1025K|  1025K|          |
|  14 |      JOIN FILTER USE             | :BF0000   |      1 |    918K|    19M|    19  (58)| 00:00:01 |    156K|00:00:00.07 |     436 |       |       |          |
|  15 |       JOIN FILTER USE            | :BF0001   |      1 |    918K|    19M|    19  (58)| 00:00:01 |    156K|00:00:00.07 |     436 |       |       |          |
|* 16 |        TABLE ACCESS STORAGE FULL | SALES     |      1 |    918K|    19M|    19  (58)| 00:00:01 |    156K|00:00:00.07 |     436 |  1025K|  1025K|          |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Normally, the result cache is enabled.  Here, the query was not re-executed because the result was already in the cache; note that the operations below RESULT CACHE all have zero starts.
Plan hash value: 2719715383

------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name                       | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   |
------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |                            |      1 |        |       |    38 (100)|          |     45 |00:00:00.01 |
|   1 |  RESULT CACHE                     | 8992dgrw00p4p9zu2vmq8p3nwg |      1 |        |       |            |          |     45 |00:00:00.01 |
|   2 |   SORT GROUP BY                   |                            |      0 |    102 |  8262 |    38  (48)| 00:00:01 |      0 |00:00:00.01 |
|   3 |    HASH JOIN                      |                            |      0 |   3478 |   275K|    37  (46)| 00:00:01 |      0 |00:00:00.01 |
|   4 |     TABLE ACCESS STORAGE FULL     | PRODUCTS                   |      0 |     26 |   208 |     2   (0)| 00:00:01 |      0 |00:00:00.01 |
|   5 |     HASH JOIN                     |                            |      0 |   9819 |   699K|    35  (49)| 00:00:01 |      0 |00:00:00.01 |
|   6 |      JOIN FILTER CREATE           | :BF0000                    |      0 |    364 |  4368 |     2   (0)| 00:00:01 |      0 |00:00:00.01 |
|   7 |       TABLE ACCESS STORAGE FULL   | TIMES                      |      0 |    364 |  4368 |     2   (0)| 00:00:01 |      0 |00:00:00.01 |
|   8 |      HASH JOIN                    |                            |      0 |  39950 |  2379K|    33  (52)| 00:00:01 |      0 |00:00:00.01 |
|   9 |       JOIN FILTER CREATE          | :BF0001                    |      0 |   2413 | 94107 |     9  (12)| 00:00:01 |      0 |00:00:00.01 |
|  10 |        HASH JOIN                  |                            |      0 |   2413 | 94107 |     9  (12)| 00:00:01 |      0 |00:00:00.01 |
|  11 |         JOIN FILTER CREATE        | :BF0002                    |      0 |      1 |    18 |     2   (0)| 00:00:01 |      0 |00:00:00.01 |
|  12 |          TABLE ACCESS STORAGE FULL| COUNTRIES                  |      0 |      1 |    18 |     2   (0)| 00:00:01 |      0 |00:00:00.01 |
|  13 |         JOIN FILTER USE           | :BF0002                    |      0 |  55500 |  1138K|     7  (15)| 00:00:01 |      0 |00:00:00.01 |
|  14 |          TABLE ACCESS STORAGE FULL| CUSTOMERS                  |      0 |  55500 |  1138K|     7  (15)| 00:00:01 |      0 |00:00:00.01 |
|  15 |       JOIN FILTER USE             | :BF0000                    |      0 |    918K|    19M|    19  (58)| 00:00:01 |      0 |00:00:00.01 |
|  16 |        JOIN FILTER USE            | :BF0001                    |      0 |    918K|    19M|    19  (58)| 00:00:01 |      0 |00:00:00.01 |
|  17 |         TABLE ACCESS STORAGE FULL | SALES                      |      0 |    918K|    19M|    19  (58)| 00:00:01 |      0 |00:00:00.01 |
------------------------------------------------------------------------------------------------------------------------------------------------

The remaining delivered non-default parameters are:
  • _default_pct_free=1: Sets the default PCTFREE to 1% (instead of the default 10%) in order to pack data into blocks.
  • _optimizer_gather_stats_on_load_all=TRUE: Statistics are gathered on-line during direct-path insert into a table that is not empty.  Otherwise, this only happens on the initial load or during a create-table-as-select command.
  • _optimizer_gather_stats_on_load_hist=TRUE: Histograms are also gathered on-line during direct-path load.
  • optimizer_ignore_hints=TRUE: Hints embedded in SQL are ignored.
  • optimizer_ignore_parallel_hints=TRUE: Embedded parallel hints are ignored.
  • result_cache_max_size=100M: Maximum size of the result cache.
  • result_cache_max_result=1: Maximum percentage of the result cache that a single result can use (defaults to 5%).
  • inmemory_size=1G: Size of the In-Memory Column Store, so this feature is enabled.
  • _cell_offload_vector_groupby=FALSE: In-Memory Aggregation optimisation is disabled (see the related white paper).
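Conceptually, the result cache behaves like a memoisation layer: results are keyed on the query and its bind values, and invalidated when DML changes a dependent object.  The following Python sketch is my own illustration of those semantics, not Oracle's implementation; all names and values are invented.

```python
# Sketch of result-cache semantics: results keyed on (query, binds),
# invalidated whenever a dependent table changes.
class ResultCache:
    def __init__(self):
        self.cache = {}   # (query, binds) -> rows
        self.deps = {}    # table -> set of cache keys that read it

    def get_or_execute(self, query, binds, tables, execute):
        key = (query, binds)
        if key in self.cache:      # cache hit: the query is not re-executed
            return self.cache[key]
        rows = execute()           # cache miss: run the query
        self.cache[key] = rows
        for t in tables:
            self.deps.setdefault(t, set()).add(key)
        return rows

    def invalidate(self, table):
        # DML on a dependent table discards every cached result that read it
        for key in self.deps.pop(table, set()):
            self.cache.pop(key, None)

rc = ResultCache()
calls = []
run = lambda: calls.append(1) or [("US", 45)]   # counts real executions
rc.get_or_execute("q1", ("US",), ["SALES"], run)   # executes
rc.get_or_execute("q1", ("US",), ["SALES"], run)   # served from cache
rc.invalidate("SALES")                             # DML invalidates the result
rc.get_or_execute("q1", ("US",), ["SALES"], run)   # executes again
print(len(calls))  # 2 executions despite 3 calls
```

Used incorrectly (highly volatile tables, or results larger than the cache allows), a scheme like this causes invalidation churn rather than savings, which is why result_cache_max_result caps the size of any one result.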

Summary

Everything that I have discussed in this series applies to ADWC just as it does to other data warehouses.
  • Your data warehouse should certainly use a strict star schema data model such that you can define foreign and either primary or unique keys in the database.  
  • You should only have equality joins between dimension and fact tables.
  • Wherever possible these should be single-column joins, and therefore single-column foreign and primary keys.  You can consider multi-column keys from Oracle 12.2 onwards; however, there are bugs.
  • Whether the database enforces the foreign key constraints is a matter of choice, but you need to define them in order to achieve join elimination.
ADWC is built on engineered systems:
  • So the full scan/Bloom filter plan is able to take advantage of engineered system optimisations.
    • Bloom Filters are pushed to storage server during smart scan
    • A Bloom Filter on one table can be used against the storage index on another table to skip I/O.
    • Hybrid Columnar Compression further assists smart scans.
  • I would not like to work without enforced primary keys; I think that is courageous!
  • If the database is not going to enforce referential integrity, then the foreign key constraints should be marked RELY.  As QUERY_REWRITE_INTEGRITY=TRUSTED, you will still get foreign key join elimination.
  • You are expected to load data with periodic, incremental bulk loads.
    • Optimiser statistics are maintained automatically during direct-path insert, even after the initial load.  The regular statistics maintenance window is disabled.  If you are going to further transform the data after the initial load, then you will need to think about whether and when to collect statistics.
  • RESULT_CACHE_MODE=FORCE, so the result cache is used by default across the board, with a maximum result size to keep things reasonable.
    • This raises the question: should this also be considered for non-autonomous engineered-system data warehouses?
    • However, used incorrectly, the result cache can cause contention problems.

Will Anything Run on ADWC?

At the end of the day, ADWC is an Oracle database on an engineered system, although it has been set up and configured in a very particular way.  It is possible to alter ADWC settings to get traditional Star Transformation behaviour and do all the things that you usually do to implement a data warehouse.  However, as I have demonstrated earlier in this series, you probably wouldn't want to do that.
If your legacy data warehouse doesn't follow the principles of good practice, including but not limited to those set out above, then you probably should not be trying to force it into ADWC.  Other platforms, where you have more discretion, are available.

Wednesday, November 28, 2018

Data Warehouse Design: Engineered Systems Considerations

This post is part of a series that discusses some common issues in data warehouses.

On an engineered system, a key feature is that Bloom filters are pushed to the storage cells during smart scan.  Additionally, a Bloom filter computed from the join column of one table can be used against another table: a storage index can skip I/O against the large fact table based on a Bloom filter calculated from a small dimension table (see Tanel Poder's blog: Combining Bloom Filter Offloading and Storage Indexes on Exadata).
This shifts the balance away from Star Transformation, so you are far less likely to want to add bitmap indexes.
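The mechanism can be sketched in a few lines of Python.  This is my own simplified illustration, not Exadata code, and the key values are invented: a Bloom filter built from the dimension table's join keys cheaply rules out most fact rows before the join, with occasional false positives (which the join itself later removes) and never any false negatives.

```python
import hashlib

# Simplified Bloom filter: built from dimension-table join keys, then
# probed to skip fact rows before the more expensive hash join.
class BloomFilter:
    def __init__(self, nbits=1024, nhashes=3):
        self.nbits, self.nhashes = nbits, nhashes
        self.bits = 0

    def _positions(self, key):
        for i in range(self.nhashes):
            h = hashlib.sha256(f"{i}:{key}".encode()).hexdigest()
            yield int(h, 16) % self.nbits

    def add(self, key):
        for p in self._positions(key):
            self.bits |= 1 << p

    def might_contain(self, key):
        # May return True for a non-member (false positive),
        # but never False for a member (no false negatives).
        return all(self.bits & (1 << p) for p in self._positions(key))

# Dimension side: customers in the selected country (hypothetical keys)
bf = BloomFilter()
dim_cust_ids = {101, 102, 103}
for cust_id in dim_cust_ids:
    bf.add(cust_id)

# Fact side: probe the filter row by row; most non-matches are skipped
fact_rows = [(cust_id, 10) for cust_id in range(100, 200)]
survivors = [r for r in fact_rows if bf.might_contain(r[0])]
assert all(bf.might_contain(c) for c in dim_cust_ids)
print(len(survivors), "of", len(fact_rows), "rows reach the join")
```

On Exadata the probe happens in the storage cell against the storage index, so the skipped rows never leave the storage layer at all.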
I repeated the same test from my previous blog post (US by state for sales in 1999), on exactly the same data, on Oracle 12.1 on Exadata.  The cost of the star transformation is lower, at 575, because db_file_multiblock_read_count=128; that makes sense on Exadata because, with 8KB blocks, it allows 1MB reads from disk.
Plan hash value: 472227253

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name                        | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | Pstart| Pstop | A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                             |      1 |        |       |   575 (100)|          |       |       |     45 |00:00:00.37 |   98059 |     98 |     98 |       |       |          |
|   1 |  TEMP TABLE TRANSFORMATION           |                             |      1 |        |       |            |          |       |       |     45 |00:00:00.37 |   98059 |     98 |     98 |       |       |          |
|   2 |   LOAD AS SELECT                     |                             |      1 |        |       |            |          |       |       |      0 |00:00:00.03 |    1647 |      0 |     98 |  1040K|  1040K|          |
|   3 |    HASH JOIN                         |                             |      1 |   2921 |   111K|    23   (5)| 00:00:01 |       |       |  18520 |00:00:00.02 |    1546 |      0 |      0 |  2408K|  2408K|  281K (0)|
|   4 |     JOIN FILTER CREATE               | :BF0000                     |      1 |      1 |    18 |     2   (0)| 00:00:01 |       |       |      1 |00:00:00.01 |       9 |      0 |      0 |       |       |          |
|   5 |      TABLE ACCESS STORAGE FULL       | COUNTRIES                   |      1 |      1 |    18 |     2   (0)| 00:00:01 |       |       |      1 |00:00:00.01 |       9 |      0 |      0 |  1025K|  1025K|          |
|   6 |     JOIN FILTER USE                  | :BF0000                     |      1 |  55500 |  1138K|    21   (5)| 00:00:01 |       |       |  18520 |00:00:00.01 |    1537 |      0 |      0 |       |       |          |
|   7 |      TABLE ACCESS STORAGE FULL       | CUSTOMERS                   |      1 |  55500 |  1138K|    21   (5)| 00:00:01 |       |       |  18520 |00:00:00.01 |    1537 |      0 |      0 |  1025K|  1025K|          |
|   8 |   SORT GROUP BY                      |                             |      1 |   2344 |   100K|   551   (1)| 00:00:01 |       |       |     45 |00:00:00.35 |   96406 |     98 |      0 |  6144 |  6144 | 6144  (0)|
|   9 |    HASH JOIN                         |                             |      1 |   2344 |   100K|   550   (1)| 00:00:01 |       |       |  64818 |00:00:00.32 |   96406 |     98 |      0 |  5827K|  5180K| 7295K (0)|
|  10 |     VIEW                             | VW_ST_C525CEF3              |      1 |   2344 | 42192 |   548   (1)| 00:00:01 |       |       |  64818 |00:00:00.29 |   96300 |     98 |      0 |       |       |          |
|  11 |      NESTED LOOPS                    |                             |      1 |   2344 |   116K|   542   (1)| 00:00:01 |       |       |  64818 |00:00:00.28 |   96300 |     98 |      0 |       |       |          |
|  12 |       PARTITION RANGE SUBQUERY       |                             |      1 |   2344 | 67986 |   313   (1)| 00:00:01 |KEY(SQ)|KEY(SQ)|  64818 |00:00:00.21 |   96041 |     98 |      0 |       |       |          |
|  13 |        BITMAP CONVERSION TO ROWIDS   |                             |      5 |   2344 | 67986 |   313   (1)| 00:00:01 |       |       |  64818 |00:00:00.20 |   95974 |     98 |      0 |       |       |          |
|  14 |         BITMAP AND                   |                             |      5 |        |       |            |          |       |       |      5 |00:00:00.19 |   95974 |     98 |      0 |       |       |          |
|  15 |          BITMAP MERGE                |                             |      5 |        |       |            |          |       |       |      5 |00:00:00.01 |    1921 |      0 |      0 |  1024K|   512K|39936  (0)|
|  16 |           BITMAP KEY ITERATION       |                             |      5 |        |       |            |          |       |       |    365 |00:00:00.01 |    1921 |      0 |      0 |       |       |          |
|  17 |            BUFFER SORT               |                             |      5 |        |       |            |          |       |       |   1820 |00:00:00.01 |      65 |      0 |      0 | 73728 | 73728 |          |
|  18 |             TABLE ACCESS STORAGE FULL| TIMES                       |      1 |    364 |  4368 |     2   (0)| 00:00:01 |       |       |    364 |00:00:00.01 |      65 |      0 |      0 |  1025K|  1025K|          |
|  19 |            BITMAP INDEX RANGE SCAN   | SALES_TIME_BIX              |   1820 |        |       |            |          |KEY(SQ)|KEY(SQ)|    365 |00:00:00.01 |    1856 |      0 |      0 |  1025K|  1025K|          |
|  20 |          BITMAP MERGE                |                             |      5 |        |       |            |          |       |       |      5 |00:00:00.01 |     161 |      0 |      0 |  1024K|   512K|10240  (0)|
|  21 |           BITMAP KEY ITERATION       |                             |      5 |        |       |            |          |       |       |    132 |00:00:00.01 |     161 |      0 |      0 |       |       |          |
|  22 |            BUFFER SORT               |                             |      5 |        |       |            |          |       |       |    130 |00:00:00.01 |      10 |      0 |      0 | 73728 | 73728 |          |
|  23 |             TABLE ACCESS STORAGE FULL| PRODUCTS                    |      1 |     14 |   112 |     2   (0)| 00:00:01 |       |       |     26 |00:00:00.01 |      10 |      0 |      0 |  1025K|  1025K|          |
|  24 |            BITMAP INDEX RANGE SCAN   | SALES_PROD_BIX              |    130 |        |       |            |          |KEY(SQ)|KEY(SQ)|    132 |00:00:00.01 |     151 |      0 |      0 |  1025K|  1025K|          |
|  25 |          BITMAP MERGE                |                             |      5 |        |       |            |          |       |       |      5 |00:00:00.19 |   93892 |     98 |      0 |  1024K|   512K|  310K (0)|
|  26 |           BITMAP KEY ITERATION       |                             |      5 |        |       |            |          |       |       |   6504 |00:00:00.18 |   93892 |     98 |      0 |       |       |          |
|  27 |            BUFFER SORT               |                             |      5 |        |       |            |          |       |       |  92600 |00:00:00.03 |     106 |     98 |      0 |    28M|  2978K|  928K (0)|
|  28 |             TABLE ACCESS STORAGE FULL| SYS_TEMP_0FD9D66B4_75DFDD27 |      1 |   2921 | 14605 |     2   (0)| 00:00:01 |       |       |  18520 |00:00:00.01 |     106 |     98 |      0 |  1025K|  1025K|          |
|  29 |            BITMAP INDEX RANGE SCAN   | SALES_CUST_BIX              |  92600 |        |       |            |          |KEY(SQ)|KEY(SQ)|   6504 |00:00:00.13 |   93786 |      0 |      0 |  1025K|  1025K|          |
|  30 |       TABLE ACCESS BY USER ROWID     | SALES                       |  64818 |      1 |    22 |   235   (0)| 00:00:01 | ROWID | ROWID |  64818 |00:00:00.05 |     259 |      0 |      0 |       |       |          |
|  31 |     TABLE ACCESS STORAGE FULL        | SYS_TEMP_0FD9D66B4_75DFDD27 |      1 |   2921 | 75946 |     2   (0)| 00:00:01 |       |       |  18520 |00:00:00.01 |     106 |      0 |      0 |  1025K|  1025K|          |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
The cost of the full scan plan is just 66.  In fact, I could only get the star transformation with an explicit hint to force it (so, the cost-based star transformation is working as expected).
Plan hash value: 874291156

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name      | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | Pstart| Pstop | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |           |      1 |        |       |    66 (100)|          |       |       |     45 |00:00:00.14 |    2173 |       |       |          |
|   1 |  SORT GROUP BY                    |           |      1 |    103 |  8343 |    66  (25)| 00:00:01 |       |       |     45 |00:00:00.14 |    2173 |  6144 |  6144 | 6144  (0)|
|   2 |   HASH JOIN                       |           |      1 |   2411 |   190K|    65  (24)| 00:00:01 |       |       |  64818 |00:00:00.11 |    2173 |  4038K|  4038K| 1436K (0)|
|   3 |    PART JOIN FILTER CREATE        | :BF0000   |      1 |    364 |  4368 |     2   (0)| 00:00:01 |       |       |    364 |00:00:00.01 |      65 |       |       |          |
|   4 |     TABLE ACCESS STORAGE FULL     | TIMES     |      1 |    364 |  4368 |     2   (0)| 00:00:01 |       |       |    364 |00:00:00.01 |      65 |  1025K|  1025K|          |
|   5 |    HASH JOIN                      |           |      1 |   9672 |   651K|    63  (24)| 00:00:01 |       |       |  79240 |00:00:00.10 |    2108 |  5219K|  5219K| 1098K (0)|
|   6 |     JOIN FILTER CREATE            | :BF0001   |      1 |     14 |   112 |     2   (0)| 00:00:01 |       |       |     26 |00:00:00.01 |      10 |       |       |          |
|   7 |      TABLE ACCESS STORAGE FULL    | PRODUCTS  |      1 |     14 |   112 |     2   (0)| 00:00:01 |       |       |     26 |00:00:00.01 |      10 |  1025K|  1025K|          |
|   8 |     HASH JOIN                     |           |      1 |  48360 |  2880K|    60  (24)| 00:00:01 |       |       |  79240 |00:00:00.09 |    2098 |  3241K|  3241K| 1869K (0)|
|   9 |      JOIN FILTER CREATE           | :BF0002   |      1 |   2921 |   111K|    23   (5)| 00:00:01 |       |       |  18520 |00:00:00.02 |    1546 |       |       |          |
|  10 |       HASH JOIN                   |           |      1 |   2921 |   111K|    23   (5)| 00:00:01 |       |       |  18520 |00:00:00.01 |    1546 |  2507K|  2507K|  234K (0)|
|  11 |        JOIN FILTER CREATE         | :BF0003   |      1 |      1 |    18 |     2   (0)| 00:00:01 |       |       |      1 |00:00:00.01 |       9 |       |       |          |
|  12 |         TABLE ACCESS STORAGE FULL | COUNTRIES |      1 |      1 |    18 |     2   (0)| 00:00:01 |       |       |      1 |00:00:00.01 |       9 |  1025K|  1025K|          |
|  13 |        JOIN FILTER USE            | :BF0003   |      1 |  55500 |  1138K|    21   (5)| 00:00:01 |       |       |  18520 |00:00:00.01 |    1537 |       |       |          |
|  14 |         TABLE ACCESS STORAGE FULL | CUSTOMERS |      1 |  55500 |  1138K|    21   (5)| 00:00:01 |       |       |  18520 |00:00:00.01 |    1537 |  1025K|  1025K|          |
|  15 |      JOIN FILTER USE              | :BF0001   |      1 |    918K|    19M|    34  (30)| 00:00:01 |       |       |  87209 |00:00:00.06 |     552 |       |       |          |
|  16 |       JOIN FILTER USE             | :BF0002   |      1 |    918K|    19M|    34  (30)| 00:00:01 |       |       |  87209 |00:00:00.06 |     552 |       |       |          |
|  17 |        PARTITION RANGE JOIN-FILTER|           |      1 |    918K|    19M|    34  (30)| 00:00:01 |:BF0000|:BF0000|  87209 |00:00:00.06 |     552 |       |       |          |
|  18 |         TABLE ACCESS STORAGE FULL | SALES     |      5 |    918K|    19M|    34  (30)| 00:00:01 |:BF0000|:BF0000|  87209 |00:00:00.06 |     552 |  1025K|  1025K|          |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
The balance point between star transformation and full scan/Bloom filter is exactly the same, but all of the queries perform about 4x faster.
ISO  Country Name              Number of  Number    Star Transformation   Full Scan/Bloom Filter
Code                           Sales in   of Sales  A-Time    Buffers     A-Time    Buffers
                               1999
---  ------------------------  ---------  --------  ------    -------     ------    -------
US   United States of America       2662    526212     .45      98056       .13       2173
DE   Germany                         561     81978     .14      45538       .07       2173
JP   Japan                           281     60183     .19       7118       .29       2068
GB   United Kingdom                  391     58638     .41       4238       .25       2068
IT   Italy                           257     42570     .12      43473       .07       2173
AU   Australia                       228     33685     .04       8186       .06       2173
FR   France                          161     33078     .07      23434       .07       2173
SG   Singapore                        80     25253     .04       6984       .06       2173
CA   Canada                           90     22858     .05      14156       .06       2173
ES   Spain                            85     17136     .05      14319       .06       2173
DK   Denmark                          89     16651     .03       5195       .06       2173
AR   Argentina                         3       202     .02       5975       .06       2173
BR   Brazil                            9       180     .03       7957       .05       2173
TR   Turkey                            1       168     .02       4200       .05       2173
CN   China                             4        19     .03       7342       .06       2173
PL   Poland                            2        18     .03       7342       .06       2173
SA   Saudi Arabia                      0         7     .02       4093       .05       2173

A full scan is far more likely to deliver acceptable performance on an engineered system, so why would you go to the trouble of bitmap indexing to achieve a star transformation?  You probably wouldn't.