UKOUG2021

Tuesday, August 03, 2021

Alter SQL Profiles from Exact to Force Matching

You can use DBMS_SQLTUNE.ALTER_SQL_PROFILE to change the status, name, description, or category of a SQL profile, but you can't alter it from exact to force matching.  Instead, you would have to recreate it.  That is easy if you have the script that you used to create it in the first place.  There is another way.

Oracle support note How to Move SQL Profiles from One Database to Another (Including to Higher Versions) (Doc ID 457531.1) describes a process to export SQL profiles to a staging table that can be imported into another database.  This provides an opportunity to alter a profile by updating the data in the staging table.  There are two columns in the staging table that have to be updated.

  • SQLFLAGS must be updated from 0 (indicating an exact match profile) to 1 (indicating a force match profile)
  • SIGNATURE must be recalculated as a force matching signature using DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE.

Demonstration

I am going to create a small table with a unique index.  

CREATE TABLE t (a not null, b) AS 
SELECT rownum, ceil(sqrt(rownum)) FROM dual connect by level <= 100;
create unique index t_idx on t(a);
exec dbms_stats.gather_table_stats(user,'T');

ttitle off
select * from dba_sql_profiles where name like 'my%sql_profile%';
explain plan for SELECT * FROM t WHERE a = 42;
ttitle 'Default Execution plan without profiles (index scan)'
select * from table(dbms_xplan.display(null,null,'ADVANCED +ADAPTIVE -PROJECTION'));

Without any SQL profiles, when I query by the unique key I get a unique index scan.

Plan hash value: 2929955852

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |     6 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     |     1 |     6 |     1   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | T_IDX |     1 |       |     0   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Now I am going to create two SQL profiles.  I have deliberately put the same SQL text into both SQL Profiles.

  • my_sql_profile is exact matching
  • my_sql_profile_force is force matching.

DECLARE
signature INTEGER;
sql_txt CLOB;
h       SYS.SQLPROF_ATTR;
BEGIN
sql_txt := q'[
SELECT * FROM t WHERE a = 54
]';
h := SYS.SQLPROF_ATTR(
q'[BEGIN_OUTLINE_DATA]',
q'[IGNORE_OPTIM_EMBEDDED_HINTS]',
q'[FULL(@"SEL$1" "T"@"SEL$1")]',
q'[END_OUTLINE_DATA]');
signature := DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(sql_txt);
DBMS_SQLTUNE.IMPORT_SQL_PROFILE (
sql_text    => sql_txt,
profile     => h,
name        => 'my_sql_profile',
category    => 'DEFAULT',
validate    => TRUE,
replace     => TRUE,
force_match => FALSE 
);
END;
/

DECLARE
signature INTEGER;
sql_txt CLOB;
h       SYS.SQLPROF_ATTR;
BEGIN
sql_txt := q'[
SELECT * FROM t WHERE a = 54
]';
h := SYS.SQLPROF_ATTR(
q'[BEGIN_OUTLINE_DATA]',
q'[IGNORE_OPTIM_EMBEDDED_HINTS]',
q'[FULL(@"SEL$1" "T"@"SEL$1")]',
q'[END_OUTLINE_DATA]');
signature := DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(sql_txt);
DBMS_SQLTUNE.IMPORT_SQL_PROFILE (
sql_text    => sql_txt,
profile     => h,
name        => 'my_sql_profile_force',
category    => 'DEFAULT',
validate    => TRUE,
replace     => TRUE,
force_match => TRUE 
);
END;
/
ttitle off
select * from dba_sql_profiles where name like 'my%sql_profile%';

NAME                           CATEGORY               SIGNATURE SQL_TEXT                                                                         CREATED
------------------------------ ---------- --------------------- -------------------------------------------------------------------------------- ------------------------------
LAST_MODIFIED                  DESCRIPTION          TYPE    STATUS   FOR    TASK_ID TASK_EXEC_NAME       TASK_OBJ_ID TASK_FND_ID TASK_REC_ID TASK_CON_DBID
------------------------------ -------------------- ------- -------- --- ---------- -------------------- ----------- ----------- ----------- -------------
my_sql_profile                 DEFAULT      9394869341287877934                                                                                  31-JUL-21 10.47.34.243454
                                                                SELECT * FROM t WHERE a = 54
31-JUL-21 10.47.34.000000                           MANUAL  ENABLED  NO

my_sql_profile_force           DEFAULT     11431056000319719221                                                                                  31-JUL-21 10.47.34.502721
                                                                SELECT * FROM t WHERE a = 54
31-JUL-21 10.47.34.000000                           MANUAL  ENABLED  YES

The force match profile works if the literal value is different from that in the profiles.

explain plan for SELECT * FROM t WHERE a = 42;
select * from table(dbms_xplan.display(null,null,'ADVANCED +ADAPTIVE -PROJECTION'));

Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |     6 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |     6 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------
…
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"=42)

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2
---------------------------------------------------------------------------

   0 -  STATEMENT
           -  IGNORE_OPTIM_EMBEDDED_HINTS

   1 -  SEL$1 / T@SEL$1
           -  FULL(@"SEL$1" "T"@"SEL$1")

Note
-----
   - SQL profile "my_sql_profile_force" used for this statement

The exact match profile takes precedence of the force match profile.

explain plan for SELECT * FROM t WHERE a = 54;
select * from table(dbms_xplan.display(null,null,'ADVANCED +ADAPTIVE -PROJECTION'));

Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |     6 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |     6 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------
…
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"=54)

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2
---------------------------------------------------------------------------

   0 -  STATEMENT
           -  IGNORE_OPTIM_EMBEDDED_HINTS

   1 -  SEL$1 / T@SEL$1
           -  FULL(@"SEL$1" "T"@"SEL$1")

Note
-----
   - SQL profile "my_sql_profile" used for this statement

I am now going to follow the process to export the SQL Profiles to a staging table, and subsequently reimport them.

exec DBMS_SQLTUNE.CREATE_STGTAB_SQLPROF(table_name=>'STAGE',schema_name=>user);
exec DBMS_SQLTUNE.PACK_STGTAB_SQLPROF (staging_table_name =>'STAGE',profile_name=>'my_sql_profile');
exec DBMS_SQLTUNE.PACK_STGTAB_SQLPROF (staging_table_name =>'STAGE',profile_name=>'my_sql_profile_force');

There is a row in the staging table for each profile and you can see the differences between them.

select signature, sql_handle, obj_name, obj_type, sql_text, sqlflags from STAGE;

            SIGNATURE SQL_HANDLE                     OBJ_NAME
--------------------- ------------------------------ ---------------------------------------------------------------------
OBJ_TYPE                       SQL_TEXT                                                                           SQLFLAGS
------------------------------ -------------------------------------------------------------------------------- ----------
  9394869341287877934 SQL_826147e3c6ac0d2e           my_sql_profile
SQL_PROFILE                                                                                                              0
                               SELECT * FROM t WHERE a = 54

 11431056000319719221 SQL_9ea344de32a78735           my_sql_profile_force
SQL_PROFILE                                                                                                              1
                               SELECT * FROM t WHERE a = 54

I will update the staging table using this PL/SQL loop (because SQL doesn't recognise TRUE as a boolean constant).

DECLARE
  l_sig INTEGER;
BEGIN
  FOR i IN (
    SELECT rowid, stage.* FROM stage WHERE sqlflags = 0 FOR UPDATE
  ) LOOP
    l_sig := dbms_sqltune.sqltext_to_signature(i.sql_text,TRUE);
    UPDATE stage
    SET    signature = l_sig
    ,      sqlflags = 1
    WHERE  sqlflags = 0
    AND    rowid = i.rowid;
  END LOOP;
END;
/

And now the profiles look the same.

select signature, sql_handle, obj_name, obj_type, sql_text, sqlflags from STAGE;

            SIGNATURE SQL_HANDLE                     OBJ_NAME
--------------------- ------------------------------ ---------------------------------------------------------------------
OBJ_TYPE                       SQL_TEXT                                                                           SQLFLAGS
------------------------------ -------------------------------------------------------------------------------- ----------
 11431056000319719221 SQL_826147e3c6ac0d2e           my_sql_profile
SQL_PROFILE                                                                                                              1
                               SELECT * FROM t WHERE a = 54

 11431056000319719221 SQL_9ea344de32a78735           my_sql_profile_force
SQL_PROFILE                                                                                                              1
                               SELECT * FROM t WHERE a = 54

But I can't just reimport my_sql_profile from the staging replacing the one in the database because I will get ORA-13841: SQL profile named my_sql_profile already exists for a different signature/category pair. To avoid this error I must either drop the profile or rename it.  

I am going to rename the existing exact matching profile, and also disable it and move it to another category to stop it from matching my statement in preference to the force matching profile (see previous post Clashing SQL Profiles - Exact Matching Profiles Take Precedence Over Force Matching Profiles), and thus I can go back to it later if needed.

I will drop my example force matching profile.  I no longer need that.

Then, I can reimport the profile from the staging table.

exec dbms_sqltune.alter_sql_profile(name=>'my_sql_profile', attribute_name=>'NAME',value=>'my_old_sql_profile');
exec dbms_sqltune.alter_sql_profile(name=>'my_old_sql_profile', attribute_name=>'CATEGORY',value=>'DO_NOT_USE');
exec dbms_sqltune.alter_sql_profile(name=>'my_old_sql_profile', attribute_name=>'STATUS',value=>'DISABLED');
exec dbms_sqltune.drop_sql_profile('my_sql_profile_force',TRUE);
EXEC DBMS_SQLTUNE.UNPACK_STGTAB_SQLPROF(profile_name => 'my_sql_profile', replace => TRUE, staging_table_name => 'STAGE');

I can see in the SQL profile table that my SQL profile is now force matching, and it has a different signature to the old one that is exact matching.

ttitle off
select * from dba_sql_profiles where name like 'my%sql_profile%';

NAME                           CATEGORY               SIGNATURE SQL_TEXT                                                                         CREATED
------------------------------ ---------- --------------------- -------------------------------------------------------------------------------- ------------------------------
LAST_MODIFIED                  DESCRIPTION          TYPE    STATUS   FOR    TASK_ID TASK_EXEC_NAME       TASK_OBJ_ID TASK_FND_ID TASK_REC_ID TASK_CON_DBID
------------------------------ -------------------- ------- -------- --- ---------- -------------------- ----------- ----------- ----------- -------------
my_old_sql_profile             DO_NOT_USE   9394869341287877934                                                                                  31-JUL-21 10.54.58.694037
                                                                SELECT * FROM t WHERE a = 54
31-JUL-21 10.55.00.000000                           MANUAL  DISABLED NO

my_sql_profile                 DEFAULT     11431056000319719221                                                                                  31-JUL-21 10.55.01.005377
                                                                SELECT * FROM t WHERE a = 54
31-JUL-21 10.55.01.000000                           MANUAL  ENABLED  YES

Both my queries now match the new force matching version of the profile.

explain plan for SELECT * FROM t WHERE a = 42;
ttitle 'Execution plan with force match profile (full scan)'
select * from table(dbms_xplan.display(null,null,'ADVANCED +ADAPTIVE -PROJECTION'));

Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |     6 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |     6 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------
…
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"=42)

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2
---------------------------------------------------------------------------

   0 -  STATEMENT
           -  IGNORE_OPTIM_EMBEDDED_HINTS

   1 -  SEL$1 / T@SEL$1
           -  FULL(@"SEL$1" "T"@"SEL$1")

Note
-----
   - SQL profile "my_sql_profile" used for this statement

explain plan for SELECT * FROM t WHERE a = 54;
select * from table(dbms_xplan.display(null,null,'ADVANCED +ADAPTIVE -PROJECTION'));

Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |     6 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |     6 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

…
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"=54)

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2
---------------------------------------------------------------------------

   0 -  STATEMENT
           -  IGNORE_OPTIM_EMBEDDED_HINTS

   1 -  SEL$1 / T@SEL$1
           -  FULL(@"SEL$1" "T"@"SEL$1")

Note
-----
   - SQL profile "my_sql_profile" used for this statement
The script used for this demonstration is available on GitHub

Monday, August 02, 2021

Detecting Clashing SQL Profiles

In my last post, I discussed the possible undesirable consequences of force and exact matching SQL profiles on statements with the same force matching signature.  The question is how do you detect such profiles?

I have created three profiles on very similar SQL statements that only differ in the literal value of a predicate.  One of them is force matching, the others are exact matching.  The signature reported by DBA_SQL_PROFILES is the force matching signature for force matching profiles, and the exact matching signature for exact matching profiles.

select * from dba_sql_profiles;

NAME                           CATEGORY               SIGNATURE SQL_TEXT                                           CREATED
------------------------------ ---------- --------------------- -------------------------------------------------- ------------------------------
LAST_MODIFIED                  DESCRIPTION          TYPE    STATUS   FOR    TASK_ID TASK_EXEC_NAME       TASK_OBJ_ID TASK_FND_ID TASK_REC_ID TASK_CON_DBID
------------------------------ -------------------- ------- -------- --- ---------- -------------------- ----------- ----------- ----------- -------------
my_sql_profile_force           DEFAULT     11431056000319719221                                                    16:09:33 01/08/2021
                                                                SELECT * FROM t WHERE a = 54
16:09:33 01/08/2021                                 MANUAL  ENABLED  YES

my_sql_profile_24              DEFAULT     12140764948557749245                                                    16:09:33 01/08/2021
                                                                SELECT * FROM t
                                                                WHERE a = 24
16:09:33 01/08/2021                                 MANUAL  ENABLED  NO

my_sql_profile_42              DEFAULT     14843900676141266266                                                    16:09:33 01/08/2021
                                                                SELECT * FROM t WHERE a = 42
16:09:33 01/08/2021                                 MANUAL  ENABLED  NO
In order to be able to compare the profiles, I need to calculate the force matching signature for the exact matching profiles using DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE.  I can't use the Boolean constant TRUE parameter in SQL.  Instead, I have used a PL/SQL function in a with clause.
REM dup_sql_profiles1.sql 
WITH function sig(p_sql_text CLOB, p_number INTEGER) RETURN NUMBER IS
 l_sig NUMBER;
BEGIN
 IF p_number > 0 THEN 
  l_sig := dbms_sqltune.sqltext_to_signature(p_sql_text,TRUE);
 ELSIF p_number = 0 THEN 
  l_sig := dbms_sqltune.sqltext_to_signature(p_sql_text,FALSE);
 END IF;
 RETURN l_sig;
END;
x as (
select CASE WHEN force_matching = 'NO'  THEN signature ELSE sig(sql_text, 0) END exact_sig
,      CASE WHEN force_matching = 'YES' THEN signature ELSE sig(sql_text, 1) END force_sig
,      p.*
from   dba_sql_profiles p
where  (status = 'ENABLED' or force_matching = 'NO')
), y as (
select x.*
, row_number() over (partition by category, force_sig order by force_matching desc, exact_sig nulls first) profile#
, count(*) over (partition by category, force_sig) num_profiles
from x
)
select profile#, num_profiles, force_sig, exact_sig, name, created, category, status, force_matching, sql_text
from y
where num_profiles > 1
order by force_sig, force_matching desc, exact_sig
/
 We can see these three profiles are grouped together.  The force matching signature calculated on the exact matching profiles is the same as the signature on the force matching profile.  Now I can start to make some decisions about whether I should retain the exact matching profiles or remove them and just use the force matching profile.
Prof   Num        Force Matching        Exact Matching
   # Profs             Signature             Signature NAME                           CREATED                      CATEGORY             STATUS   FOR
---- ----- --------------------- --------------------- ------------------------------ ---------------------------- -------------------- -------- ---
SQL_TEXT
----------------------------------------------------------------------------------------------------------------------------------------------------
   1     3  11431056000319719221                       my_sql_profile_force           16:35:36 01/08/2021          DEFAULT              ENABLED  YES

SELECT * FROM t WHERE a = 54

   2     3                        12140764948557749245 my_sql_profile_24              16:35:36 01/08/2021          DEFAULT              ENABLED  NO

SELECT * FROM t
WHERE a = 24

   3     3                        14843900676141266266 my_sql_profile_42              16:35:36 01/08/2021          DEFAULT              ENABLED  NO

SELECT * FROM t WHERE a = 42
The SQL statements in this example are absurdly simple.  In real life that is rarely the case.  Sometimes it can be a struggle to see where two complex statements differ.
In the next query, I compare enabled force matching SQL profiles to any exact matching profiles in the same category with the same force matching signature.  The full query is on GitHub.
REM dup_sql_profiles2.sql
WITH function sig(p_sql_text CLOB, p_number INTEGER) RETURN NUMBER IS
…
END sig;
function norm(p_queryin CLOB) RETURN CLOB IS
…
END norm;
function str_diff(p_str1 CLOB, p_str2 CLOB) RETURN NUMBER IS
…
END str_diff;
x as (
select CASE WHEN force_matching = 'NO'  THEN signature ELSE sig(sql_text, 0) END exact_sig
,      CASE WHEN force_matching = 'YES' THEN signature ELSE sig(sql_text, 1) END force_sig
,      p.*
from   dba_sql_profiles p
), y as (
select f.force_matching, f.force_sig, f.name force_name, f.created force_created, f.status force_status
,      e.force_matching exact_matching, e.exact_sig, e.name exact_name
,      e.created exact_created, e.status exact_status, e.category
,      norm(e.sql_text) esql_text, norm(f.sql_text) fsql_text
from   x e
,      x f
where  f.force_matching = 'YES'
and    e.force_matching = 'NO'
and    e.force_sig = f.force_sig
and    e.category = f.category
and    e.name != f.name
and    f.status = 'ENABLED'
), z as (
select y.*
,      str_diff(fsql_Text, esql_text) diff_len
from y
)
select force_matching, force_Sig, force_name, force_created, force_status
,      exact_matching, exact_sig, exact_name, exact_Created, exact_status
, substr(fsql_text,1,diff_len) common_text
, substr(fsql_text,diff_len+1) fdiff_text, substr(esql_text,diff_len+1) ediff_text
from z
order by force_sig
/
I have shown the common part of both statements, from the start to the first difference, and then also how the rest of each statement continues.
It is not enough to simply compare two statements character by character.  Both the force and exact matching signatures are "calculated on the normalized SQL text. The normalization includes the removal of white space and the uppercasing of all non-literal strings".  However, neither the normalised SQL, nor the normalisation mechanism is exposed by Oracle.  Therefore, in this query, I have included my own rudimentary normalisation function (based on an idea from AskTOM) that I apply first and a string comparison function.  You can see that normalisation has eliminated the line feed in from the statement in my_sql_profile_24.
Now I can see my two exact matching profiles match my force matching profile.  I can see the common part of the SQL up to the literal value, and the different parts of the text are just the literal value.  
           Force Matching Force                          Force                        Force               Exact Matching Exact                          Exact                        Exact
FOR             Signature Name                           Created Date                 Status   EXA             Signature Name                           Created Date                 Status
--- --------------------- ------------------------------ ---------------------------- -------- --- --------------------- ------------------------------ ---------------------------- --------
Common Text
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Force Text                                                                                          Exact Text
--------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------
YES  11431056000319719221 my_sql_profile_force           16:35:36 01/08/2021          ENABLED  NO   12140764948557749245 my_sql_profile_24              16:35:36 01/08/2021          ENABLED
SELECT * FROM T WHERE A =
54                                                                                                  24

                                                                                      ENABLED  NO   14843900676141266266 my_sql_profile_42              16:35:36 01/08/2021          ENABLED
SELECT * FROM T WHERE A =
54
Both the queries mentioned in this blog are available on GitHub.

Saturday, July 31, 2021

Clashing SQL Profiles - Exact Matching Profiles Take Precedence Over Force Matching Profiles

Sometimes, you reach a point in performance tuning, where you use a SQL Baseline, or SQL Patch, or SQL Profile to stabilise an execution plan.  These methods all effectively inject a hint or set of hints into a statement to produce the desired execution plan.  Baselines and Patches will only exactly match a SQL ID and therefore a SQL statement.  However, a SQL Profile can optionally do force matching so that it applies to "all SQL statements that have the same text after the literal values in the WHERE clause have been replaced by bind variables.  

This setting may be useful for applications that use only literal values because it enables SQL with text differing only in its literal values to share a SQL profile. If both literal values and bind variables are in the SQL text, or if force_match is set to false (default), then the literal values in the WHERE clause are not replaced by bind variables." [Oracle Database SQL Tuning Guide]

I often work with PeopleSoft, whose batch processes often dynamically generate SQL with literal values.  Therefore, I usually create force matching profiles when I need to control an execution plan.  However, sometimes I come across situations where some exact matching (i.e. not force matching) profiles have been created (often by production DBAs using the tuning advisor) on different statements that have the same force matching signature, and then maybe a force matching profile has also been applied.

Note: SQL Profiles require the Tuning Pack licence.
Where both exact and force matching profiles apply to a SQL statement, the exact matching profile will take precedence over the force matching profile, and even if disabled it will prevent the force matching profile from being applied.
I will demonstrate this with a simple test.  I will create a table with a couple of indexes, collect statistics, and generate an execution plan for a query.  I am using explain plan for command to force a parse of the statement every time.
CREATE TABLE t (a not null, b) AS 
SELECT rownum, ceil(sqrt(rownum)) FROM dual CONNECT BY LEVEL <= 100;
CREATE UNIQUE INDEX t_idx on t(a);
CREATE INDEX t_idx2 on t(b,a);
EXEC dbms_stats.gather_table_stats(user,'T');

Without Any SQL Profiles

Without any profiles in place, I get a skip scan of T_IDX2, and there is no note in the execution plan.
EXPLAIN PLAN FOR SELECT * FROM t WHERE a = 42;
SELECT * FROM table(dbms_xplan.display(null,null,'ADVANCED +ADAPTIVE -PROJECTION'));

Plan hash value: 3418618943
---------------------------------------------------------------------------
| Id  | Operation        | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT |        |     1 |     6 |     1   (0)| 00:00:01 |
|*  1 |  INDEX SKIP SCAN | T_IDX2 |     1 |     6 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------
…
Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      INDEX_SS(@"SEL$1" "T"@"SEL$1" ("T"."B" "T"."A"))
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      DB_VERSION('19.1.0')
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("A"=42)
       filter("A"=42)
…

Force Matching Profile

Now I will create an exact matching SQL profile on the query that will force the use of the unique index.  The query is the same except the literal value is different (it is 54 instead of 42).
DECLARE
signature INTEGER;
sql_txt CLOB;
h       SYS.SQLPROF_ATTR;
BEGIN
sql_txt := q'[
SELECT * FROM t WHERE a = 54
]';
h := SYS.SQLPROF_ATTR(
q'[BEGIN_OUTLINE_DATA]',
q'[IGNORE_OPTIM_EMBEDDED_HINTS]',
q'[FULL(@"SEL$1" "T"@"SEL$1")]',
q'[END_OUTLINE_DATA]');
signature := DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(sql_txt);
DBMS_SQLTUNE.IMPORT_SQL_PROFILE (
sql_text    => sql_txt,
profile     => h,
name        => 'clashing_profile_test_force',
category    => 'DEFAULT',
validate    => TRUE,
replace     => TRUE,
force_match => TRUE 
);
END;
/
I only have a force-matching profile. 
                                                                   Execution plan with force matching profile (full scan)

NAME                           CATEGORY               SIGNATURE SQL_TEXT                                                                         CREATED
------------------------------ ---------- --------------------- -------------------------------------------------------------------------------- ------------------------------
LAST_MODIFIED                  DESCRIPTION          TYPE    STATUS   FOR    TASK_ID TASK_EXEC_NAME       TASK_OBJ_ID TASK_FND_ID TASK_REC_ID TASK_CON_DBID
------------------------------ -------------------- ------- -------- --- ---------- -------------------- ----------- ----------- ----------- -------------
clashing_profile_test_force    DEFAULT     11431056000319719221                                                                                  27-JUL-21 01.35.43.854691 PM
                                                                SELECT * FROM t WHERE a = 54
27-JUL-21 01.35.43.000000 PM                        MANUAL  ENABLED  YES
The execution plan uses the full plan as specified by the profile, there is a note confirming that the profile was matched and used, and the full hint was listed in the hint report.
EXPLAIN PLAN FOR SELECT * FROM t WHERE a = 42;
SELECT * FROM table(dbms_xplan.display(null,null,'ADVANCED +ADAPTIVE -PROJECTION'));

Plan hash value: 1601196873

----------------------------------------------------------------------------------
| Id  | Operation                 | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |      |     1 |     6 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS STORAGE FULL| T    |     1 |     6 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------
…
Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      FULL(@"SEL$1" "T"@"SEL$1")
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      DB_VERSION('19.1.0')
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - storage("A"=42)
       filter("A"=42)

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2
---------------------------------------------------------------------------

   0 -  STATEMENT
           -  IGNORE_OPTIM_EMBEDDED_HINTS

   1 -  SEL$1 / T@SEL$1
           -  FULL(@"SEL$1" "T"@"SEL$1")

Note
-----
   - SQL profile "clashing_profile_test_force" used for this statement

Exact Matching Profile 

I will now add an exact matching profile
DECLARE
signature INTEGER;
sql_txt CLOB;
h       SYS.SQLPROF_ATTR;
BEGIN
sql_txt := q'[
SELECT * FROM t WHERE a = 42
]';
h := SYS.SQLPROF_ATTR(
q'[BEGIN_OUTLINE_DATA]',
q'[IGNORE_OPTIM_EMBEDDED_HINTS]',
q'[INDEX(@"SEL$1" "T"@"SEL$1" ("T"."A"))]',
q'[END_OUTLINE_DATA]');
signature := DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(sql_txt);
DBMS_SQLTUNE.IMPORT_SQL_PROFILE (
sql_text    => sql_txt,
profile     => h,
name        => 'clashing_profile_test_exact',
category    => 'DEFAULT',
validate    => TRUE,
replace     => TRUE,
force_match => FALSE 
);
END;
/
I can see I now have two SQL Profiles; one force matched, and one exact matched.
                                                                    Execution plan with force matching profile (unique index lookup)

NAME                           CATEGORY               SIGNATURE SQL_TEXT                                                                         CREATED
------------------------------ ---------- --------------------- -------------------------------------------------------------------------------- ------------------------------
LAST_MODIFIED                  DESCRIPTION          TYPE    STATUS   FOR    TASK_ID TASK_EXEC_NAME       TASK_OBJ_ID TASK_FND_ID TASK_REC_ID TASK_CON_DBID
------------------------------ -------------------- ------- -------- --- ---------- -------------------- ----------- ----------- ----------- -------------
clashing_profile_test_exact    DEFAULT     14843900676141266266                                                                                  27-JUL-21 01.35.46.825697 PM
                                                                SELECT * FROM t WHERE a = 42
27-JUL-21 01.35.46.000000 PM                        MANUAL  ENABLED  NO

clashing_profile_test_force    DEFAULT     11431056000319719221                                                                                  27-JUL-21 01.35.43.854691 PM
                                                                SELECT * FROM t WHERE a = 54
27-JUL-21 01.35.43.000000 PM                        MANUAL  ENABLED  YES
The execution plan has changed to the unique index scan. The index hint from the profile appears hints report. The note at the bottom of the plan shows the exact matching profile has been used, taking precedence over the force matching profile.
EXPLAIN PLAN FOR SELECT * FROM t WHERE a = 42;
SELECT * FROM table(dbms_xplan.display(null,null,'ADVANCED +ADAPTIVE -PROJECTION'));

Plan hash value: 2929955852

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |     6 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     |     1 |     6 |     1   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | T_IDX |     1 |       |     0   (0)| 00:00:01 |
-------------------------------------------------------------------------------------
…
Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      INDEX_RS_ASC(@"SEL$1" "T"@"SEL$1" ("T"."A"))
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      DB_VERSION('19.1.0')
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("A"=42)

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2
---------------------------------------------------------------------------

   0 -  STATEMENT
           -  IGNORE_OPTIM_EMBEDDED_HINTS

   1 -  SEL$1 / T@SEL$1
           -  INDEX(@"SEL$1" "T"@"SEL$1" ("T"."A"))

Note
-----
   - SQL profile "clashing_profile_test_exact" used for this statement

Different Query

If I run the query with a different literal value, the plan changes back to the full scan, and the note reports the force matching profile was used
EXPLAIN PLAN FOR SELECT * FROM t WHERE a = 54;
SELECT * FROM table(dbms_xplan.display(null,null,'ADVANCED +ADAPTIVE -PROJECTION'));

Plan hash value: 1601196873

----------------------------------------------------------------------------------
| Id  | Operation                 | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |      |     1 |     6 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS STORAGE FULL| T    |     1 |     6 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------
…
Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      FULL(@"SEL$1" "T"@"SEL$1")
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      DB_VERSION('19.1.0')
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - storage("A"=54)
       filter("A"=54)

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2
---------------------------------------------------------------------------

   0 -  STATEMENT
           -  IGNORE_OPTIM_EMBEDDED_HINTS

   1 -  SEL$1 / T@SEL$1
           -  FULL(@"SEL$1" "T"@"SEL$1")

Note
-----
   - SQL profile "clashing_profile_test_force" used for this statement

Disable Exact Matching SQL Profile

I will now disable the exact matching profile.
exec dbms_sqltune.alter_sql_profile(name=>'clashing_profile_test_exact', attribute_name=>'STATUS',value=>'DISABLED');
SELECT * FROM dba_sql_profiles where name like 'clashing%';

                                                                Disable Exact Profile - Execution plan with no profile (skip scan) - Odd

NAME                           CATEGORY               SIGNATURE SQL_TEXT                                                                         CREATED
------------------------------ ---------- --------------------- -------------------------------------------------------------------------------- ------------------------------
LAST_MODIFIED                  DESCRIPTION          TYPE    STATUS   FOR    TASK_ID TASK_EXEC_NAME       TASK_OBJ_ID TASK_FND_ID TASK_REC_ID TASK_CON_DBID
------------------------------ -------------------- ------- -------- --- ---------- -------------------- ----------- ----------- ----------- -------------
clashing_profile_test_exact    DEFAULT     14843900676141266266                                                                                  27-JUL-21 01.35.46.825697 PM
                                                                SELECT * FROM t WHERE a = 42
27-JUL-21 01.35.52.000000 PM                        MANUAL  DISABLED NO

clashing_profile_test_force    DEFAULT     11431056000319719221                                                                                  27-JUL-21 01.35.43.854691 PM
                                                                SELECT * FROM t WHERE a = 54
27-JUL-21 01.35.43.000000 PM                        MANUAL  ENABLED  YES
I expected the profile to switch back to the force matching profile, but instead it goes back to the original plan using the skip scan with no profile at all. So the disabled exact matching profile prevents the force matching profile from matching the statement, and then doesn't get applied to the statement either! There is no note in the execution plan and no hint report.
EXPLAIN PLAN FOR SELECT * FROM t WHERE a = 42;
SELECT * FROM table(dbms_xplan.display(null,null,'ADVANCED +ADAPTIVE -PROJECTION'));

Plan hash value: 3418618943

---------------------------------------------------------------------------
| Id  | Operation        | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT |        |     1 |     6 |     1   (0)| 00:00:01 |
|*  1 |  INDEX SKIP SCAN | T_IDX2 |     1 |     6 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------
…
Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      INDEX_SS(@"SEL$1" "T"@"SEL$1" ("T"."B" "T"."A"))
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      DB_VERSION('19.1.0')
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("A"=42)
       filter("A"=42)

Alter Category of Exact Matching SQL Profile

I could have dropped the SQL Profile, but I might want to retain it for documentation and in case I need to reinstate it. So instead I will move it to a different category.
exec dbms_sqltune.alter_sql_profile(name=>'clashing_profile_test_exact', attribute_name=>'CATEGORY',value=>'DO_NOT_USE');
SELECT * FROM dba_sql_profiles where name like 'clashing%';

                                                      Change Category of Exact Profile - Execution plan with force matching profile (full scan)

NAME                           CATEGORY               SIGNATURE SQL_TEXT                                                                         CREATED
------------------------------ ---------- --------------------- -------------------------------------------------------------------------------- ------------------------------
LAST_MODIFIED                  DESCRIPTION          TYPE    STATUS   FOR    TASK_ID TASK_EXEC_NAME       TASK_OBJ_ID TASK_FND_ID TASK_REC_ID TASK_CON_DBID
------------------------------ -------------------- ------- -------- --- ---------- -------------------- ----------- ----------- ----------- -------------
clashing_profile_test_exact    DO_NOT_USE  14843900676141266266                                                                                  27-JUL-21 02.57.11.343291 PM
                                                                SELECT * FROM t WHERE a = 42
27-JUL-21 02.57.19.000000 PM                        MANUAL  DISABLED NO

clashing_profile_test_force    DEFAULT     11431056000319719221                                                                                  27-JUL-21 02.57.08.390801 PM
                                                                SELECT * FROM t WHERE a = 54
27-JUL-21 02.57.08.000000 PM                        MANUAL  ENABLED  YES
And now the execution plan goes back to the force matching profile and the unique index lookup.
EXPLAIN PLAN FOR SELECT * FROM t WHERE a = 42;
SELECT * FROM table(dbms_xplan.display(null,null,'ADVANCED +ADAPTIVE -PROJECTION'));

Plan hash value: 1601196873

----------------------------------------------------------------------------------
| Id  | Operation                 | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |      |     1 |     6 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS STORAGE FULL| T    |     1 |     6 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------
…
Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      FULL(@"SEL$1" "T"@"SEL$1")
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      DB_VERSION('19.1.0')
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - storage("A"=42)
       filter("A"=42)

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2
---------------------------------------------------------------------------

   0 -  STATEMENT
           -  IGNORE_OPTIM_EMBEDDED_HINTS

   1 -  SEL$1 / T@SEL$1
           -  FULL(@"SEL$1" "T"@"SEL$1")

Note
-----
   - SQL profile "clashing_profile_test_force" used for this statement

Conclusion

An exact matching profile will be matched to a SQL statement before a force matching SQL statement, even if it is disabled, in which case neither profile will be applied.
If you have exact matching SQL profiles that provide the same hints to produce the same execution plan on various similar SQL statements that have the same force matching signature (i.e. they only differ in their literal values), and you wish to replace them with a single force matching profile, then rather than disable the exact matching profiles you should either drop them or if you prefer to retain them for documentation then alter them to a different category. 
  • The script GitHub used in this blog to demonstrate this behaviour is available on GitHub.  They were run in Oracle 19.9 for this post.
  • The script disabled_profiles_category.sql moves all disabled profiles from the category DEFAULT to DO_NOT_USE.
In a subsequent post, I will show how to detect conflicting SQL profiles.

Tuesday, April 06, 2021

Spatial Data 6: Text Searching Areas by their Name, and the Names of Parent Areas

This blog is part of a series about my first steps in using Spatial Data in the Oracle database.  I am using the GPS data from my cycling activities collected by Strava. All of my files are available on GitHub.

Now I have loaded all the areas, I want to be able to search for them by name.  I am going to create an Oracle Text Index, but I need to index more than just the name of each area.  I must index the full hierarchy of each area so I can search on combinations of names in different types of areas.  For example, I might search for a village and county (e.g. Streatley and Berkshire), to distinguish it from a village of the same name in a different county (e.g. Streatley in Bedfordshire).

I can generate the full hierarchy of an area with a PL/SQL function (strava_pkg.name_heirarchy_fn) by navigating up the linked list and discarding repeated names.  I could make that available in a virtual column.  However, I cannot build a text index on a function or a virtual column.  

Text Index Option 1: Store Hierarchy on Table, and Create a Multi-Column Text Index

I could store the hierarchy of an area on the my_areas table, and generate the area from PL/SQL function strava_pkg. name_heirarchy_fn.

DECLARE
  l_clob CLOB;
  l_my_areas my_areas%ROWTYPE;
BEGIN
  select m.*
  into   l_my_areas
  FROM   my_areas m
  WHERE  area_code = 'CPC'
  And    area_number = '40307';

  dbms_output.put_line(strava_pkg.name_heirarchy_fn(l_my_areas.area_code,l_my_areas.area_number));
  dbms_output.put_line(strava_pkg.name_heirarchy_fn(l_my_areas.parent_area_code,l_my_areas.parent_area_number));
END;
/

If I pass the code and number for a particular area, I can get its full hierarchy including its name.  I can see the parish of Streatley, is in the Unitary Authority of West Berkshire, which is in England, and England is in the United Kingdom.  If I pass the code and number of its parent, I just get the hierarchy up to its parent.  

Streatley, West Berkshire, England, United Kingdom
West Berkshire, England, United Kingdom

I can store the hierarchy on my_areas, though I have to store results on a temporary table, rather than update it directly.  Otherwise, I get a mutation error.

ALTER TABLE my_areas add name_heirarchy VARCHAR(4000)
/
CREATE GLOBAL TEMPORARY TABLE my_areas_temp ON COMMIT PRESERVE ROWS AS 
SELECT area_code, area_number, strava_pkg.name_heirarchy_fn(parent_area_code,parent_area_number) name_heirarchy
FROM my_areas WHERE parent_area_code IS NOT NULL AND parent_area_number IS NOT NULL
/
MERGE INTO my_areas u 
USING (SELECT * FROM my_areas_temp) s
ON (u.area_code = s.area_code AND u.area_number = s.area_number)
WHEN MATCHED THEN UPDATE
SET u.name_heirarchy = s.name_heirarchy
/

Then I can create a multi-column text index on the name

begin
 ctx_ddl.create_preference('my_areas_lexer', 'BASIC_LEXER');  
 ctx_ddl.set_attribute('my_areas_lexer', 'mixed_case', 'NO'); 
 ctx_ddl.create_preference('my_areas_datastore', 'MULTI_COLUMN_DATASTORE'); 
 ctx_ddl.set_attribute('my_areas_datastore', 'columns', 'name, name_heirarchy'); 
end;
/
CREATE INDEX my_areas_name_txtidx ON my_areas (name) INDEXTYPE IS ctxsys.context 
PARAMETERS ('datastore my_areas_datastore lexer my_areas_lexer sync(on commit)');

The index will sync if I have cause to update the hierarchy.

Text Index Option 2: Index a user_datastore based on the result of a PL/SQL function

Alternatively, I can build a text index on a combination of data from various sources by creating a PL/SQL procedure that combines the data and returns the string to be indexed.  

I have created a procedure (strava_pkg.name_heirarchy_txtidx) that returns a string containing the hierarchy of a given area, and then I will create a text index on that.  The format of the parameters must be exactly as follows: 

  • The rowid of the row being indexed is passed to the procedure; 
  • The string to be indexed is passed back as a CLOB parameter.
See also: Oracle Text Indexing Elements: USER_DATASTORE Attributes

…
PROCEDURE name_heirarchy_txtidx
(p_rowid in rowid
,p_dataout IN OUT NOCOPY CLOB
) IS
  l_count INTEGER := 0;
BEGIN
  FOR i IN (
    SELECT area_code, area_number, name, matchable
    FROM   my_areas m
    START WITH rowid = p_rowid
    CONNECT BY NOCYCLE prior m.parent_area_code   = m.area_code
                   AND prior m.parent_area_number = m.area_number
  ) LOOP
    IF i.matchable >= 1 THEN
      l_count := l_count + 1;
      IF l_count > 1 THEN
        p_dataout := p_dataout ||', '|| i.name;
      ELSE
        p_dataout := i.name;
      END IF;
    END IF;
  END LOOP;
END name_heirarchy_txtidx;
…

As an example, if I pass a particular rowid to the procedure, I obtain the full hierarchy of areas as before.

set serveroutput on
DECLARE
  l_rowid ROWID;
  l_clob CLOB;
BEGIN
  select rowid
  into   l_rowid
  FROM   my_areas m
  WHERE  area_code = 'CPC'
  And    area_number = '40307';

  strava_pkg.name_heirarchy_txtidx(l_rowid, l_clob);
  dbms_output.put_line(l_clob);
END;
/

Streatley, West Berkshire, England, United Kingdom

PL/SQL procedure successfully completed.

The procedure is referenced as an attribute to a user datastore, I can then build a text index on the user datastore.

BEGIN
  ctx_ddl.create_preference('my_areas_lexer', 'BASIC_LEXER');  
  ctx_ddl.set_attribute('my_areas_lexer', 'mixed_case', 'NO'); 
  ctx_ddl.create_preference('my_areas_datastore', 'user_datastore'); 
  ctx_ddl.set_attribute('my_areas_datastore', 'procedure', 'strava_pkg.name_heirarchy_txtidx'); 
  ctx_ddl.set_attribute('my_areas_datastore', 'output_type', 'CLOB');
END;
/

CREATE INDEX my_areas_name_txtidx on my_areas (name) INDEXTYPE IS ctxsys.context 
PARAMETERS ('datastore my_areas_datastore lexer my_areas_lexer');
I have not been able to combine a multi-column datastore with a user datastore.

Text Search examples

Both options produce an index that I can use in the same way.  I can search for a particular name, for example, the village of Streatley.

SELECT score(1), area_Code, area_number, name, suffix, name_heirarchy
FROM   my_areas m
WHERE  CONTAINS(name,'streatley',1)>0
/

I get the two Streatleys, one in Berkshire, and the other in Bedfordshire.  

  SCORE(1) AREA AREA_NUMBER NAME                 SUFFIX     NAME_HEIRARCHY
---------- ---- ----------- -------------------- ---------- ------------------------------------------------------------
        16 CPC        41076 Streatley            CP         Streatley, Central Bedfordshire, England, United Kingdom
        16 CPC        40307 Streatley            CP         Streatley, West Berkshire, England, United Kingdom

As I have indexed the full hierarchy, I can be more precise and search for both the village and the county, even though they are two different columns in the my_areas table.

SELECT score(1), area_Code, area_number, name, suffix, name_heirarchy
FROM   my_areas m
WHERE  CONTAINS(name,'streatley and berks%',1)>0
/

Now I just get one result.  The Streatley in Berkshire.

  SCORE(1) AREA AREA_NUMBER NAME                 SUFFIX     NAME_HEIRARCHY
---------- ---- ----------- -------------------- ---------- ------------------------------------------------------------
        11 CPC        40307 Streatley            CP         Streatley, West Berkshire, England, United Kingdom

Searching For the Top of Hierarchies

My search query works satisfactorily if my search identifies areas with no children, but supposing I search for something higher up the hierarchy, like Berkshire?  

SELECT score(1), area_Code, area_number, name, suffix, name_heirarchy
FROM   my_areas m
WHERE  CONTAINS(name,'berkshire',1)>0
/

I get 184 areas, of different types within the areas called Berkshire, because the name of the parent area appears in the hierarchy of all the children and so is returned by the text index.

           Area        Area
  SCORE(1) Code      Number NAME                      SUFFIX     NAME_HEIRARCHY
---------- ---- ----------- ------------------------- ---------- -----------------------------------------------------------
        11 UTA       101678 Windsor and Maidenhead    (B)        Windsor and Maidenhead, Berkshire, England, United Kingdom
        11 UTA       101680 Wokingham                 (B)        Wokingham, Berkshire, England, United Kingdom
        11 UTA       101681 Reading                   (B)        Reading, Berkshire, England, United Kingdom
        11 UTA       101685 West Berkshire                       West Berkshire, England, United Kingdom
        11 UTW        40258 Norreys                   Ward       Norreys, Wokingham, Berkshire, England, United Kingdom
        11 UTW        40261 Barkham                   Ward       Barkham, Wokingham, Berkshire, England, United Kingdom
…

However, I am just interested in the highest points on each part of the hierarchy I have identified.  So, I exclude any result where its parent is also in the result set.

WITH x AS (
SELECT area_code, area_number, parent_area_code, parent_area_number, name, name_heirarchy
FROM   my_areas m
WHERE  CONTAINS(name,'berkshire',1)>0
) SELECT * FROM x WHERE NOT EXISTS (
  SELECT 'x' FROM x x1
  WHERE  x1.area_code = x.parent_area_code
  AND    x1.area_number = x.parent_area_number
)
/

In this case, I still get two results because the boundaries of the unitary authority of West Berkshire are not entirely within the ceremonial county of Berkshire (some parts of Hungerford and Lambourne were exchanged with Wiltshire in 1990), hence I could not make Berkshire the parent of West Berkshire.

Area        Area
Code      Number      SCORE NAME_HEIRARCHY
---- ----------- ---------- ------------------------------------------------------------
UTA       101685         11 West Berkshire, England, United Kingdom
CCTY           7         11 Berkshire, England, United Kingdom

Text Searching for Activities that pass through Areas

It is a simple extension to join the pre-processed areas through which activities pass to the areas found by the text search, and then exclude areas whose parent was also found in the same activity.

WITH x AS (
SELECT aa.activity_id, m.area_code, m.area_number, m.parent_area_code, m.parent_area_number, m.name, m.name_heirarchy
FROM   my_areas m, activity_areas aa
WHERE  m.area_Code = aa.area_code
AND    m.area_number = aa.area_number
AND    CONTAINS(name,'berkshire',1)>0
) 
SELECT a.activity_id, a.activity_date, a.activity_name, a.activity_type, a.distance_km
,      x.area_Code, x.area_number, x.name, x.name_heirarchy
FROM   x, activities a
WHERE  x.activity_id = a.activity_id
AND    a.activity_date between TO_DATE('01022019','DDMMYYYY') and TO_DATE('28022019','DDMMYYYY')
AND NOT EXISTS (
  SELECT 'x' FROM x x1
  WHERE  x1.area_code = x.parent_area_code
  AND    x1.area_number = x.parent_area_number
  AND    x1.activity_id = x.activity_id)
ORDER BY a.activity_date
/

Now I can see the rides in Berkshire in February 2019.  I get two rows returned for the ride that was in both Berkshire and West Berkshire.  

  Activity Activity                                                Activity Distance Area   Area
        ID Date      ACTIVITY_NAME                                 Type         (km) Code Number NAME            NAME_HEIRARCHY
---------- --------- --------------------------------------------- -------- -------- ---- ------ --------------- -------------------------
2156308823 17-FEB-19 MV - Aldworth, CLCTC Aldworth-Reading         Ride       120.86 CCTY      7 Berkshire       England, United Kingdom
2156308823 17-FEB-19 MV - Aldworth, CLCTC Aldworth-Reading         Ride       120.86 UTA  101685 West Berkshire  England, United Kingdom
2172794879 24-FEB-19 MV - Maidenhead                               Ride        48.14 CCTY      7 Berkshire       England, United Kingdom
2173048214 24-FEB-19 CLCTC: Maidenhead - Turville Heath            Ride        53.15 CCTY      7 Berkshire       England, United Kingdom
2173048406 24-FEB-19 Maidenhead - Burnham Beeches - West Drayton   Ride        27.92 CCTY      7 Berkshire       England, United Kingdom
…

References

I found these references useful while creating the Text index:

Monday, March 29, 2021

Spatial Data 5: Searching For Geometries That Intersect Other Geometries

This blog is part of a series about my first steps in using Spatial Data in the Oracle database.  I am using the GPS data from my cycling activities collected by Strava. All of my files are available on GitHub.

I have loaded basic data for all countries, and detailed data for the UK and other countries where I have recorded activities.  The next step is to determine which activities pass through which areas.  Generically, the question is simply whether one geometry intersects with another.   I can test this in SQL with the sdo_geom.relate() function.

WHERE SDO_GEOM.RELATE(a.geom,'anyinteract',m.geom) = 'TRUE'

However, working out whether an activity, with several thousand points, is within an area defined with several thousand points can be CPU intensive and time-consuming.  Larger areas such as UK counties average over 20,000 points. 

I have 60,000 defined areas, of which, over 20,000 of which are for the UK.  I have 2700 activities recorded on Strava, with an average of 2700 points, but some have over 10,000 points.  It isn't viable to compare every activity with every area.  Comparing these large geometries can take a significant time, too long to do the spatial queries every time I want to interrogate the data, and too long for an on-line application.

Pre-processing Geometry Intersections

However, the data, once loaded is static.  Definitions of areas can change, but it is rare.  Activities do not change.  Therefore, I have decided to pre-process the data to produce a table of matching activities and areas.

CREATE TABLE activity_areas
(activity_id NUMBER NOT NULL
,area_code   VARCHAR2(4) NOT NULL
,area_number NUMBER NOT NULL
,geom_length NUMBER
,CONSTRAINT ACTIVITY_AREAS_PK PRIMARY KEY (activity_id, area_code, area_number)
,CONSTRAINT ACTIVITY_AREAS_FK FOREIGN KEY (activity_id) REFERENCES ACTIVITIES (activity_id)
,CONSTRAINT ACTIVITY_AREAS_FK2 FOREIGN KEY (area_code, area_number) 
                       REFERENCES MY_AREAS (area_code, area_number)
);

Recursive Search

I have written the search as a PL/SQL procedure to search areas that match a particular activity.

  • I pass the ID of the activity to be processed to the procedure.
  • I can specify the area code and number, or the parent area code and number, at which to search through the areas.  I usually leave them to default to null so the search starts with areas at the root of the hierarchy that therefore have no parents (i.e. sovereign countries).  
  • The procedure then calls itself recursively for each area that it finds matches the activity, to search its children.  This way, I limit the total number of comparisons required.  
  • For every area and activity, I have calculated the minimum bounding rectangle using sdo_geom.sdo_mbr() and stored it in another geometry column on the same row.  This geometry contains just 5 points (the last point is the same as the first to close the rectangle).  I can compare two rectangles very quickly, and if they don't intersect overlap then there is no need to see if the actual geometries overlap.  This approach filters out geometries that cannot match, so that fewer geometries then have to be compared in full, thus significantly improving the performance of the search.
AND SDO_GEOM.RELATE(a.mbr,'anyinteract',m.mbr) = 'TRUE'

  • I have found that it is necessary to have the MBR comparison earlier in the predicate clauses than the GEOM comparison.

…
PROCEDURE activity_area_search
(p_activity_id INTEGER
,p_area_code   my_areas.area_code%TYPE DEFAULT NULL
,p_area_number my_areas.area_number%TYPE DEFAULT NULL
,p_query_type VARCHAR2 DEFAULT 'P'
,p_level INTEGER DEFAULT 0
) IS
BEGIN
  FOR i IN(
   SELECT m.*
   ,      CASE WHEN m.geom_27700 IS NOT NULL THEN sdo_geom.sdo_length(SDO_GEOM.sdo_intersection(m.geom_27700,a.geom_27700,5), unit=>'unit=km') 
               WHEN m.geom       IS NOT NULL THEN sdo_geom.sdo_length(SDO_GEOM.sdo_intersection(m.geom,a.geom,5), unit=>'unit=km') 
          END geom_length
   ,      (SELECT MIN(m2.area_level) FROM my_areas m2 
           WHERE  m2.parent_area_code = m.area_code AND m2.parent_area_number = m.area_number) min_child_level
   FROM   my_areas m
   ,      activities a
   WHERE  (  (p_query_type = 'P' AND parent_area_code = p_area_code AND parent_area_number = p_area_number) 
          OR (p_query_type = 'A' AND area_code        = p_area_code AND area_number        = p_area_number)
	  OR (p_query_type = 'A' AND p_area_number IS NULL          AND area_code          =  p_area_code)
          OR (p_area_code IS NULL AND p_area_number IS NULL AND parent_area_code IS NULL AND parent_area_number IS NULL))
   AND    a.activity_id = p_activity_id
   and    SDO_GEOM.RELATE(a.mbr,'anyinteract',m.mbr) = 'TRUE'
   and    SDO_GEOM.RELATE(a.geom,'anyinteract',m.geom) = 'TRUE'
  ) LOOP
    IF i.area_level>0 OR i.num_children IS NULL THEN
      BEGIN
        INSERT INTO activity_areas
        (activity_id, area_code, area_number, geom_length)
        VALUES
        (p_activity_id, i.area_code, i.area_number, i.geom_length);
      EXCEPTION
        WHEN dup_val_on_index THEN
          UPDATE activity_areas
          SET    geom_length = i.geom_length
          WHERE  activity_id = p_activity_id
          AND    area_code = i.area_code
          AND    area_number = i.area_number;
      END;
    END IF;
  
    IF i.num_children > 0 THEN
      strava_pkg.activity_area_search(p_activity_id, i.area_code, i.area_number, 'P', p_level+1);
    END IF;
  END LOOP;

END activity_area_search;
…

The search can process a single activity by calling the procedure.  An activity that found just 10 areas, took just 6 seconds to process.  However, it does not scale linearly.  Activities that have over 100 areas can take at least 6 minutes.
SQL> exec strava_pkg.activity_area_search(4372796838);
Searching 4372796838:-
Found SOV-1159320701:United Kingdom,    2.895 km
.Searching 4372796838:SOV-1159320701
.Found GEOU-1159320743:England,    2.851 km
..Searching 4372796838:GEOU-1159320743
..Found GLA-117537:Greater London,    2.851 km
...Searching 4372796838:GLA-117537
...Found LBO-50724:City of Westminster,    1.732 km
....Searching 4372796838:LBO-50724
....Found LBW-117484:Abbey Road,    1.435 km
....Found LBW-50639:Maida Vale,    0.298 km
....Done 4372796838:LBO-50724:    0.415 secs).
...Found LBO-50632:Camden,    1.119 km
....Searching 4372796838:LBO-50632
....Found LBW-117286:Kilburn,    0.273 km
....Found LBW-117288:Swiss Cottage,    1.033 km
....Found LBW-117287:West Hampstead,    0.084 km
....Done 4372796838:LBO-50632:    0.521 secs).
...Done 4372796838:GLA-117537:    3.368 secs).
..Done 4372796838:GEOU-1159320743:    4.372 secs).
.Done 4372796838:SOV-1159320701:    4.750 secs).
Done 4372796838:-:    5.532 secs).

PL/SQL procedure successfully completed.
Since I load Strava activities from the bulk download, I also process them in bulk.
--process unmatched activities
set pages 99 lines 180 timi on serveroutput on
column activity_name format a60
BEGIN 
  FOR i IN (
    SELECT a.activity_id, activity_date, activity_name
    ,      distance_km, num_pts, ROUND(num_pts/NULLIF(distance_km,0),0) ppkm
    FROM   activities a
    WHERE  activity_id NOT IN (SELECT DISTINCT activity_id FROM activity_areas)
    AND    num_pts>0
  ) LOOP
    strava_pkg.activity_area_search(i.activity_id);
    commit;
  END LOOP;
END;
/
Matching 2,700 activities produced 71,628 rows on activity_areas for 5,620 distinct areas. In the next article, I will demonstrate how to text search the areas for matching activities.