Monday, June 24, 2019

Importing and Working with Exported AWR/ASH data in an Oracle database in a VirtualBox VM

A lot of my performance tuning work involves analysis of ASH and AWR data.  Frequently, I do not have direct access to the databases in question.  Sometimes, I ask clients to run EDB360 on their databases and send me the results, but sometimes I also want to work directly with ASH or AWR metrics.  So, I ask for an export of their AWR repository.
Oracle distributes a pair of scripts in $ORACLE_HOME/rdbms/admin.
  • awrextr.sql exports the AWR repository (including the ASH data), optionally for a range of AWR snapshot IDs. 
  • awrload.sql loads the export into a staging schema and then moves it into the AWR repository of the current database, from where you can reference it via the usual DBA_HIST% views and work with it in the usual way.
I do this on an Oracle database in a virtual machine in VirtualBox.  This note explains how I create and set up that machine and then work with the data.
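For reference, the client takes the export by running awrextr.sql on the source database.  The script prompts for the DBID, the range of snapshot IDs, a directory object, and the name of the dump file:
cd $ORACLE_HOME/rdbms/admin
sqlplus / as sysdba
@awrextr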

Getting Started

1. Build Oracle VirtualBox Linux VM

I use Frits Hoogland's vagrant builder to create a standard Oracle database on a Linux VM in VirtualBox.  I always build the latest version of Oracle because I can import an AWR export into a higher version of the database, but not a lower one.

2. Make Life Easier For Yourself

I sometimes install other utilities into the virtual machine, such as git, using yum.
yum install git
The VirtualBox VM is only going to run a single Oracle database, and I am very lazy.  So, I set up the environment in the shell profile by adding the following commands to ~oracle/.bash_profile.
export ORACLE_SID=oracle
export ORAENV_ASK=NO
. oraenv
export ORAENV_ASK=YES

3. Share a Folder on the Host Machine

I usually leave the AWR export file on my host machine rather than copy it into the virtual machine.  I create a VirtualBox shared folder so that the VirtualBox VM can reference a folder on the host machine.  In this case, I have shared c:\temp on my host Windows laptop and called the filesystem temp.  Note that it is auto-mounted and permanent.
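The shared folder can be created in the VirtualBox GUI, or with a VBoxManage command on the host along the lines of the following sketch (the VM name, oracle-database, is an assumption; substitute the name of your own VM):
VBoxManage sharedfolder add "oracle-database" --name temp --hostpath c:\temp --automount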

In the VirtualBox VM, the shared folder has been mounted as /media/sf_temp.
[oracle@oracle-database sql]$ df /media/sf_temp
Filesystem     1K-blocks      Used Available Use% Mounted on
temp           974661916 340278844 634383072  35% /media/sf_temp
The sf_temp directory is owned by root and accessible only to the vboxsf group.  Therefore, you must add any user who needs to reference this directory to that group.  In this case, the oracle user must be made a member of the vboxsf group.  For safety, this should be done with the usermod command rather than by editing /etc/group directly.
usermod -G vboxsf -a oracle
You can see the result of this command in the /etc/group file:
grep oracle /etc/group
vboxsf:x:994:oracle
When logged in as oracle, the id command will also report the groups of which the user is a member.
[oracle@oracle-database ~]$ id
uid=54321(oracle) gid=54321(oinstall) groups=54321(oinstall),100(users),994(vboxsf),54322(dba)
Create a directory object in the database to reference this directory on the filesystem.
CREATE OR REPLACE DIRECTORY TEMP as '/media/sf_temp';
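You can verify the directory object and the path it references by querying DBA_DIRECTORIES:
SELECT directory_name, directory_path
FROM   dba_directories
WHERE  directory_name = 'TEMP';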

4. Oracle FILESYSTEMIO_OPTIONS Parameter

Initially, I got the following error when trying to run a data pump import from the shared folder:
ORA-39000: bad dump file specification
ORA-31640: unable to open dump file "/media/sf_temp/awrdat_xxxx_xxxx.dmp" for read
ORA-27041: unable to open file
Linux-x86_64 Error: 22: Invalid argument
Additional information: 2
On VirtualBox, the Oracle parameter FILESYSTEMIO_OPTIONS will be set to setall by default.  To work around this error, set it to none and restart the database.
ALTER SYSTEM SET FILESYSTEMIO_OPTIONS=none SCOPE=SPFILE;
SHUTDOWN IMMEDIATE
STARTUP
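After the restart, you can confirm that the new setting has taken effect:
SHOW PARAMETER filesystemio_options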

5. Data Pump Import

Now you can import an AWR export (previously exported with awrextr.sql) using the script awrload.sql.  It prompts for the directory object (TEMP in this case) and the name of the dump file without its .dmp extension.
cd $ORACLE_HOME/rdbms/admin
sqlplus / as sysdba
@awrload

Working with Imported Data

The AWR data exported from another database will be visible through all the usual DBA_HIST% views.  It can be distinguished by the database ID of the original database.  If, like me, you are used to mining ASH data with SQL scripts, they will work provided you specify the DBID, as in the sketch below.  However, I have a couple of tips for working with imported AWR data.
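For example, a minimal ASH mining sketch against the imported repository; the DBID literal is that of the imported database, identified in the next section, and each sample in DBA_HIST_ACTIVE_SESS_HISTORY represents approximately 10 seconds of database time:
-- top 10 wait events in the imported ASH data
SELECT event, COUNT(*) ash_samples
FROM   dba_hist_active_sess_history
WHERE  dbid = 3743596248
AND    session_state = 'WAITING'
GROUP BY event
ORDER BY ash_samples DESC
FETCH FIRST 10 ROWS ONLY
/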

SQLDB360 (EDB360 and SQLD360)

Sometimes I run an EDB360 report on the imported data.  The reports and graphs of ASH data and the system time model are particularly useful.
I can generate AWR reports for any pair of snapshots in the imported repository, not just the ones generated by the client and provided to me.  EDB360 generates AWR reports for peak periods by various criteria.
Having installed git, I can use it to download the latest version of SQLDB360.
cd ~/
git clone https://github.com/sqldb360/sqldb360.git
cd ~/sqldb360
Alternatively, I could have used curl:
cd ~/
curl -LO https://github.com/sqldb360/sqldb360/archive/unstable.zip
unzip unstable.zip
Obtain the DBID of the database that you have just imported by looking at the AWR views:
col max(startup_time) format a26
col min(begin_interval_time) format a26
col max(end_interval_time) format a26
SELECT dbid, db_name, count(*), max(startup_time)
FROM   dba_hist_database_instance
GROUP BY dbid, db_name
/
SELECT dbid, min(begin_interval_time), max(end_interval_time)
FROM   dba_hist_snapshot
GROUP BY dbid
/
In this case, my database is ORACLE, and XXXXXXX is the database that I have just imported.
      DBID DB_NAME     COUNT(*) MAX(STARTUP_TIME)
---------- --------- ---------- --------------------------
1884370030 ORACLE             9 20-JUN-19 10.51.50.000 AM
3743596248 XXXXXXX            4 05-MAY-19 01.03.57.000 AM

      DBID MIN(BEGIN_INTERVAL_TIME)   MAX(END_INTERVAL_TIME)
---------- -------------------------- --------------------------
1884370030 18-JUN-19 07.03.44.000 PM  20-JUN-19 11.02.49.684 AM
3743596248 18-JUN-19 05.00.13.342 PM  19-JUN-19 11.00.20.020 AM
Create a custom configuration file in the sqldb360/sql directory:
  • Set the database ID with the DBID from the export.
  • Specify edb360_conf_date_from/to parameters to match the range of dates in the import.
-- this is a sample custom configuration
-- for all possible DEF values please review edb360_00_config.sql

-- section to report. null means all (default)
-- report column, or section, or range of columns or range of sections i.e. 3, 3-4, 3a, 3a-4c, 3-4c, 3c-4
DEF edb360_sections = '';

-- range of dates below supersedes history days when values are other than YYYY-MM-DD
-- default values YYYY-MM-DD mean: use edb360_conf_days
-- actual values sample: 2016-04-26
DEF edb360_conf_date_from = '2019-06-18';
DEF edb360_conf_date_to = '2019-06-19';

-- use if you need the tool to act on a DBID stored in AWR that is not the current v$database.dbid
DEF edb360_config_dbid = '3743596248';
Run EDB360 (or SQLD360 for a single SQL_ID) referencing the custom config file:
@edb360 T custom_config_01.sql
@sqld360 5dcbnw1d9yfna T custom_config_01.sql
NB: EDB360 and SQLD360 reports generated in this way will contain a mixture of data from the local database and from the database from which the AWR export was taken.
  • Some reports are based on v$ and x$ views or data dictionary views (DBA_TABLES etc.).  These will report on the database in the virtual machine.
  • Only the reports based on DBA_HIST% views will report on the database from which the AWR data was exported.
It is important to keep this distinction in mind.
Nearly every section of the EDB360 report contains reports based on the AWR data, so it is not worth trying to exclude any sections when running EDB360 on imported data.

Extracting Execution Plans from AWR

We would normally extract the execution plan of a particular statement from the AWR of the local database using the DISPLAY_AWR function in the DBMS_XPLAN package, as follows.
SELECT * FROM table(dbms_xplan.display_awr('burz0q5qjtgmv'));
You can specify the database ID to DISPLAY_AWR(), but if it is not the DBID of the current database, then it returns nothing.
SELECT * FROM table(dbms_xplan.display_awr('5dcbnw1d9yfna',db_id=>3743596248));
The various DISPLAY% functions in DBMS_XPLAN are formatted report generators for data held in a particular data structure.  DISPLAY_CURSOR() references V$SQL_PLAN, DISPLAY_AWR() references DBA_HIST_SQL_PLAN, and you can pass a table name to DISPLAY so that it can reference any table or view.  That is useful if you have your own private plan table.
In this case, instead of DISPLAY_AWR(), we can use the DISPLAY() function directly on the DBA_HIST_SQL_PLAN view, thus:
set long 5000 lines 150 pages 999
column sql_text format a150
SELECT sql_text from dba_hist_sqltext
where dbid=3743596248 AND sql_id = '5dcbnw1d9yfna'
/
SELECT * FROM table(dbms_xplan.display(
  table_name=>'DBA_HIST_SQL_PLAN'
, format=>'ADVANCED +ADAPTIVE'
, filter_preds=>'dbid=3743596248 AND sql_id = ''5dcbnw1d9yfna'''))
/
Note that:
  • The DBID and SQL_ID are passed as literals inside the filter_preds string, so the single quotes around the SQL_ID must be doubled.
  • DBMS_XPLAN.DISPLAY() does not report the SQL statement, so that must be queried separately.

Monday, February 04, 2019

Regular Expression Functions are Considered To Be Non-Deterministic from Oracle 12.2

You cannot put non-deterministic functions into virtual columns or function-based indexes.  Regular expressions have always been sensitive to National Language Support (NLS) settings because what constitutes a character differs between languages.  However, from Oracle 12.2, regular expressions are considered to be non-deterministic because Oracle has fixed bug 20804063.

Background 

A developer came to me with the following query, complaining it was slow.
SELECT …
FROM   …
AND    LOWER(REGEXP_REPLACE(ASCIISTR(my_column), '[^a-zA-Z0-9]')) ='2732018'
…
Obviously, functions on columns in predicates prevent the use of indexes on those columns, so I suggested creating and indexing a virtual column (slightly more elegant than, and almost the same as, a function-based index, but without needing the function in the code to match the index).
This works fine in Oracle 12.1, but they came back saying it didn't in 12.2.  Note the error messages below.
DROP TABLE t PURGE
/
CREATE TABLE t AS 
SELECT TO_CHAR(sysdate-level,'YYYY/DDD') a
FROM   dual
CONNECT BY LEVEL < 1000
/

CREATE INDEX t1 on t (REGEXP_REPLACE(a,'[^a-zA-Z0-9]'))
/
ORA-01743: only pure functions can be indexed

DROP INDEX t1
/
ALTER TABLE t 
ADD b AS (REGEXP_REPLACE(a,'[^a-zA-Z0-9]')) VIRTUAL
/
ORA-54002: only pure functions can be specified in a virtual column expression
I eventually came across this question and answer by Connor McDonald on AskTom, Adding constraint with REGEXP_REPLACE fails, which explains that Oracle fixed bug 20804063 in 12.2.  This fix affects virtual columns, function-based indexes (because they also create virtual columns), and constraints.
There are a number of workarounds, but none are completely satisfactory.

Workaround 1: Deterministic Function 

As Connor suggests, you can work around this unwanted behaviour by creating your own PL/SQL function and telling the database it is deterministic.  This is not simply lying to the database: not all regular expressions are actually NLS-sensitive; it depends on what you are doing.  However, it does appear that the validation is applied regardless.  If the function is NLS-sensitive, that might have unwanted consequences, including incorrect query results.  You would have to decide whether you can live with the risk, depending on what is actually in the column and function in question.
CREATE OR REPLACE FUNCTION my_regexp_replace(p1 VARCHAR2, p2 VARCHAR2) 
RETURN varchar2 
DETERMINISTIC IS
BEGIN
RETURN REGEXP_REPLACE(p1, p2);
END;
/
show error

CREATE INDEX t1 ON t (my_regexp_replace(a,'[^a-zA-Z0-9]'))
/
DROP INDEX t1
/
ALTER TABLE t 
ADD b AS (my_REGEXP_REPLACE(a,'[^a-zA-Z0-9]')) VIRTUAL
/
CREATE INDEX t1 ON t (b)
/
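For the index to be used, queries must reference the wrapper function (or the virtual column b) rather than calling REGEXP_REPLACE directly; a minimal sketch (the literal is a sample value from this test data):
-- the expression matches the virtual column, so index t1 is a candidate
SELECT * FROM t
WHERE  my_regexp_replace(a,'[^a-zA-Z0-9]') = '2019031'
/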

Workaround 2: Refresh on Commit Materialized View 

Another possible workaround would be a materialized view that refreshes on commit, with the expression creating an additional column that is then indexed.
CREATE TABLE t AS 
SELECT rownum n
,      TO_CHAR(sysdate-level,'YYYY/DDD') a
FROM  dual
CONNECT BY LEVEL < 1000
/
ALTER TABLE t ADD CONSTRAINT t_pk PRIMARY KEY (n)
/
CREATE MATERIALIZED VIEW LOG ON t
/

CREATE MATERIALIZED VIEW T1_MV 
REFRESH ON COMMIT 
FAST 
WITH PRIMARY KEY 
ENABLE QUERY REWRITE AS 
SELECT t.*
,REGEXP_REPLACE(a,'[^a-zA-Z0-9]') b
FROM t
/

CREATE INDEX t1_mv_b ON t1_mv(b);

INSERT INTO t
SELECT ROWNUM+1000 n
,      TO_CHAR(sysdate-level-1000,'YYYY/DDD') a
FROM  dual
CONNECT BY LEVEL < 1000
/
commit
/

set autotrace on
SELECT * FROM t
WHERE REGEXP_REPLACE(a,'[^a-zA-Z0-9]') like '201720%';
And Oracle does indeed rewrite the query to use the materialized view and then uses the index on the materialized view.
Plan hash value: 3543552962

----------------------------------------------------------------------------------------------------------
| Id  | Operation                                      | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                               |         |     1 |    21 |     3   (0)| 00:00:01 |
|   1 |  MAT_VIEW REWRITE ACCESS BY INDEX ROWID BATCHED| T1_MV   |     1 |    21 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                             | T1_MV_B |     2 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------
However, this solution has the same weakness of being dependent on NLS settings; it is just that there is no validation to stop you!  There is the further threat that the refresh on commit could become a performance problem if there is intense DML on the underlying table.

Workaround 3: Use a Non-NLS Dependent Function 

Ideally, it would be better to use a non-NLS-dependent function instead of a REGEXP% function.  However, this may be easier said than done.  Regular expressions solve problems that are hard to handle with basic string functions and may require a PL/SQL function anyway to hold procedural code.  That said, in this relatively simple example it is possible to remove the unwanted characters with the TRANSLATE and REPLACE functions, and there is no NLS restriction on these.
SELECT a
,      REGEXP_REPLACE(a,'[^0-9]')
,      REPLACE(TRANSLATE(a,'/',' '),' ','')
FROM   t
WHERE rownum <= 10
/

A        REGEXP_REPLACE(A,'[^0-9]')       REPLACE(TRANSLATE(A,'/',' '),' '
-------- -------------------------------- --------------------------------
2019/031 2019031                          2019031
2019/030 2019030                          2019030
2019/029 2019029                          2019029
2019/028 2019028                          2019028
2019/027 2019027                          2019027
2019/026 2019026                          2019026
2019/025 2019025                          2019025
2019/024 2019024                          2019024
2019/023 2019023                          2019023
2019/022 2019022                          2019022
…
And you can put this into a function-based index or virtual column, thus:
DROP INDEX t1 
/ 
ALTER TABLE t ADD b AS (REPLACE(TRANSLATE(a,'/',' '),' ','')) VIRTUAL 
/ 
CREATE INDEX t1 ON t (b) 
/
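Queries that apply the same expression can now use the index; for example:
-- matches virtual column b, so index t1 can be used for the range scan
SELECT * FROM t
WHERE  REPLACE(TRANSLATE(a,'/',' '),' ','') LIKE '2019%'
/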