Text Searching can be CPU Intensive?
It started with a requirement to find names that roughly matched a search string. The roughness of the match was determined by calculating the Levenshtein distance between each customer name and the search string. Informally, the Levenshtein distance is a mathematical function that calculates the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other. Oracle has implemented it as a function in a delivered package UTL_MATCH.EDIT_DISTANCE(). This function isn't directly relevant to the rest of the discussion about parallelism, except that it is an example of CPU intensive processing that doesn't alter the database where parallel execution may be a legitimate tactic for improving performance.The examples in this article use the SH schema in the Oracle sample schemas (available on Github).
The following query searches case-insensitively for customer names within a Levenshtein distance of 3 from 'Stephen'. It finds 'Steven' and 'Staphany'.
set autotrace on timi on pages 99 lines 200
with x as (
select c.cust_first_name
, utl_match.edit_distance(upper(cust_first_name),'STEPHEN') edit_distance
from customers c
)
select * from x
where edit_distance <= 3
/
CUST_FIRST_NAME EDIT_DISTANCE
-------------------- -------------
Staphany 3
Steven 2
Steven 2
However, to do so Oracle had to full scan the table and execute the function for every row in the table.Plan hash value: 2008213504
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2775 | 19425 | 429 (4)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| CUSTOMERS | 2775 | 19425 | 429 (4)| 00:00:01 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("UTL_MATCH"."EDIT_DISTANCE"(UPPER("CUST_FIRST_NAME"),'STEPHEN')<=3)
Oracle implemented Levenshtein as a C function that is then called from PL/SQL, so it just consumes CPU and doesn't do anything else to the database. You can start to see why the user complained that the query with UTL_MATCH was slow.However, the first question is how much time do we spend on the full scan and how much time do we spend executing this function?
Follow the Time with Instrumentation
For test purposes, I am going to build my own packaged functions with session instrumentation. Then I can use Active Session History (ASH) to work out where the time went.NB: ASH requires a licence for the Diagnostics Pack
- One function levenshtein() calls to UTL_MATCH.EDIT_DISTANCE().
- The other dolittle() is a control function that does nothing except instrumentation. It is used to measure the intrusion effect of the instrumentation.
CREATE OR REPLACE package dmk AS
function levenshtein(p1 VARCHAR2, p2 VARCHAR2) RETURN INTEGER;
function dolittle(p1 VARCHAR2, p2 VARCHAR2) RETURN INTEGER;
END;
/
CREATE OR REPLACE package body dmk AS
function levenshtein(p1 VARCHAR2, p2 VARCHAR2) RETURN INTEGER IS
l_distance INTEGER;
l_module VARCHAR2(64);
l_action VARCHAR2(64);
BEGIN
dbms_application_info.read_module(l_module,l_action); /*save current module/action*/
dbms_application_info.set_action('levenshtein()'); /*set action for function*/
l_distance := utl_match.edit_distance(UPPER(p1),UPPER(p2));
dbms_application_info.set_action(l_action); /*restore previous action*/
RETURN l_distance;
END levenshtein;
function dolittle(p1 VARCHAR2, p2 VARCHAR2) RETURN INTEGER IS
l_distance INTEGER;
l_module VARCHAR2(64);
l_action VARCHAR2(64);
BEGIN
dbms_application_info.read_module(l_module,l_action);
dbms_application_info.set_action('dolittle()');
l_distance := 1;
dbms_application_info.set_action(l_action);
RETURN l_distance;
END dolittle;
END dmk;
/
Now, I will run a test query that executes UTL_MATCH for each of the 55500 rows in the CUSTOMERS table and executes that all 10 times inside a PL/SQL block.set timi on autotrace on lines 500 serveroutput on
DECLARE
l_counter INTEGER := 0;
BEGIN
dbms_application_info.set_module('DMK LEVENSHTEIN TEST',null);
FOR j IN 1..10 LOOP
FOR i IN (
select sum(dmk.levenshtein(cust_first_name,'STEPHEN')) a
, sum(dmk.dolittle(cust_first_name,'STEPHEN')) b
from customers c
) LOOP
l_counter := l_counter+i.b;
END LOOP;
END LOOP;
dbms_application_info.set_module(null,null);
dbms_output.put_line('Executions: '||l_counter);
END;
/
Executions: 555000
PL/SQL procedure successfully completed.
Elapsed: 00:00:25.08
We can see that the Levenshtein function was executed 555000 times. Now I will query the ASH for this test and group it by ACTION.set timi on autotrace off lines 500
column module format a25
column action format a25
select module, action, sql_id, sum(1) ash_Secs
from v$active_Session_History
where module = 'DMK LEVENSHTEIN TEST'
and sample_time >= SYSDATE-1/1440
group by module, action, sql_id
/
MODULE ACTION SQL_ID ASH_SECS
------------------------- ------------------------- ------------- ----------
DMK LEVENSHTEIN TEST dolittle() 7gp0w6qdvxrd2 2
DMK LEVENSHTEIN TEST levenshtein() 7gp0w6qdvxrd2 3
DMK LEVENSHTEIN TEST 7gp0w6qdvxrd2 20
The runtime of the Levenshtein function took 3 seconds, and the function that does nothing except instrumentation is 2 seconds, so the overhead of UTL_MATCH is only about 1 second, and there are 20 seconds in the SQL. In this test, the overhead of Levenshtein is low, but it would still be worth doing the full scan in parallel.I Can't Get No Parallelism
But there is a problem! I specified parallelism with a hint, but I don't get a parallel plan.select /*+PARALLEL(C 4)*/ avg(utl_match.edit_distance(UPPER(cust_first_name),'STEPHEN'))
from customers c
/
Plan hash value: 1978308596
---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 115 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 7 | | |
| 2 | SORT AGGREGATE | | 1 | 7 | | |
| 3 | TABLE ACCESS FULL| CUSTOMERS | 55500 | 379K| 115 (0)| 00:00:01 |
---------------------------------------------------------------------------------
However, if I use a simple SQL function, then I do get parallelism.select /*+PARALLEL(C 4)*/ max(cust_first_name)
from customers c
/
Plan hash value: 1221513835
-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 115 (0)| 00:00:01 | | | |
| 1 | SORT AGGREGATE | | 1 | 7 | | | | | |
| 2 | PX COORDINATOR | | | | | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10000 | 1 | 7 | | | Q1,00 | P->S | QC (RAND) |
| 4 | SORT AGGREGATE | | 1 | 7 | | | Q1,00 | PCWP | |
| 5 | PX BLOCK ITERATOR | | 55500 | 379K| 115 (0)| 00:00:01 | Q1,00 | PCWC | |
| 6 | TABLE ACCESS FULL| CUSTOMERS | 55500 | 379K| 115 (0)| 00:00:01 | Q1,00 | PCWP | |
-----------------------------------------------------------------------------------------------------------------
Note
-----
- Degree of Parallelism is 4 because of table property
So, UTL_MATCH.EDIT_FUNCTION must be preventing parallelism in some way. There is a closed Oracle bug 169587070 that details exactly this problem with UTL_MATCH.PARALLEL_ENABLE and PRAGMA RESTRICT_REFERENCES
This is also documented default behaviour in PL/SQL. Database VLDB and Partitioning Guide: About Parallel Execution of Functions, Functions in Parallel Queries:"A user-written function may be executed in parallel in any of the following cases:
- If it has been declared with the PARALLEL_ENABLE keyword
- If it is declared in a package or type and has a PRAGMA RESTRICT_REFERENCES clause that indicates all of WNDS, RNPS, and WNPS
- If it is declared with CREATE FUNCTION and the system can analyze the body of the PL/SQL code and determine that the code neither writes to the database nor reads or modifies package variables"
Workarounds
There are various workarounds for this.- I could add the PARALLEL_ENABLE declarations to the functions in the package specification of the delivered UTL_MATCH package. Although it does work, I would certainly not be happy to alter any delivered Oracle package in any serious database, without approval from Oracle support.
- Or, I could add the RESTRICT_REFERENCES pragma to the package specification. Again, although this works, it involves altering a delivered package.
- However, I can wrap the delivered package in my own packaged function with either PARALLEL_ENABLE (my personal preferrance).
CREATE OR REPLACE package dmk AS
FUNCTION levenshtein(p1 VARCHAR2, p2 VARCHAR2) RETURN INTEGER PARALLEL_ENABLE;
FUNCTION dolittle(p1 VARCHAR2, p2 VARCHAR2) RETURN INTEGER PARALLEL_ENABLE;
END;
/
- Or you can use RESTRICT_REFERENCES in the package specification, but you must include the TRUST pragma to over-ride the lack of a pragma definition in the called packaged function.
CREATE OR REPLACE package dmk AS
FUNCTION levenshtein(p1 VARCHAR2, p2 VARCHAR2) RETURN INTEGER;
PRAGMA RESTRICT_REFERENCES (levenshtein,WNDS,RNDS,WNPS,RNPS,TRUST);
FUNCTION dolittle(p1 VARCHAR2, p2 VARCHAR2) RETURN INTEGER;
PRAGMA RESTRICT_REFERENCES (do_little,WNDS,RNDS,WNPS,RNPS,TRUST);
END;
/
Now, I get parallel execution of the unaltered delivered UTL_MATCH.EDIT_DISTANCE() function.select /*+PARALLEL(C 4) FULL(C)*/ sum(dmk.levenshtein(cust_first_name,'STEPHEN')) a
, sum(dmk.dolittle(cust_first_name,'STEPHEN')) b
from customers c
/
Plan hash value: 1221513835
-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 115 (0)| 00:00:01 | | | |
| 1 | SORT AGGREGATE | | 1 | 7 | | | | | |
| 2 | PX COORDINATOR | | | | | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10000 | 1 | 7 | | | Q1,00 | P->S | QC (RAND) |
| 4 | SORT AGGREGATE | | 1 | 7 | | | Q1,00 | PCWP | |
| 5 | PX BLOCK ITERATOR | | 55500 | 379K| 115 (0)| 00:00:01 | Q1,00 | PCWC | |
| 6 | TABLE ACCESS FULL| CUSTOMERS | 55500 | 379K| 115 (0)| 00:00:01 | Q1,00 | PCWP | |
-----------------------------------------------------------------------------------------------------------------
Note
-----
- Degree of Parallelism is 4 because of table property
Conclusion: Parallel PL/SQL
I can now repeat the earlier test, but with a parallel hint, the runtime goes down from 25 to 9 seconds, although about the same amount of database time is recorded by ASH. So parallelism can improve the performance for the end user, but will not reduce the total CPU overhead. If anything is likely to increase overall CPU time.set timi on lines 500
DECLARE
l_counter INTEGER := 0;
BEGIN
dbms_application_info.set_module('DMK LEVENSHTEIN TEST',null);
FOR j IN 1..10 LOOP
FOR i IN (
select /*+PARALLEL(C 4)*/
sum(dmk.levenshtein(cust_first_name,'STEPHEN')) a
, sum(dmk.dolittle(cust_first_name,'STEPHEN')) b
from customers c
) LOOP
l_counter := l_counter+i.b;
END LOOP;
END LOOP;
dbms_application_info.set_module(null,null);
dbms_output.put_line('Executions: '||l_counter);
END;
/
Executions: 555000
PL/SQL procedure successfully completed.
Elapsed: 00:00:09.34
set timi on autotrace off lines 500
column module format a32
column action format a32
select module, action, sql_id, sum(1) ash_Secs
from v$active_Session_History
where module = 'DMK LEVENSHTEIN TEST'
and sample_time >= SYSDATE-1/1440
group by module, action, sql_id
/
MODULE ACTION SQL_ID ASH_SECS
-------------------------------- -------------------------------- ------------- ----------
DMK LEVENSHTEIN TEST 3391gqpzu5g2k 21
DMK LEVENSHTEIN TEST levenshtein() 3391gqpzu5g2k 5
DMK LEVENSHTEIN TEST dolittle() 3391gqpzu5g2k 4
NB: I have not been able to get parallelism to work in a PL/SQL function defined in a WITH clause because you cannot specify PARALLEL_ENABLE, and a pragma can only be specified in a package specification.WITH
FUNCTION levenshtein(p1 VARCHAR2, p2 VARCHAR2) RETURN INTEGER PARALLEL_ENABLE IS
BEGIN
RETURN utl_match.edit_distance(UPPER(p1),UPPER(p2));
END;
select /*+PARALLEL(C 4)*/
sum(levenshtein(cust_first_name,'STEPHEN')) a
from customers c
/
ORA-06553: PLS-712: illegal option for subprogram LEVENSHTEIN