- Putting data files into OCI Object Storage
- Reading from OCI Object Storage (using DBMS_CLOUD)
- Setting up a credential
- Using an external table
- Copy to a heap table
Copy Data into Table
Alternatively, we can copy the data into a normal table, which needs to be created in advance. This time, I am going to run the copy as user SOE rather than ADMIN. I need to:
- Grant CONNECT and RESOURCE privileges to SOE, and quota on the data tablespace.
- Grant execute on DBMS_CLOUD to SOE, so it can execute the command.
- Grant READ and WRITE access on the DATA_PUMP_DIR directory – the log and bad files created by this process are written to this database directory.
connect admin/Password2020!@gofaster1b_tp
CREATE USER soe IDENTIFIED BY Password2020;
GRANT CONNECT, RESOURCE TO soe;
GRANT EXECUTE ON DBMS_CLOUD TO soe;
GRANT READ, WRITE ON DIRECTORY data_pump_dir TO soe;
ALTER USER soe QUOTA UNLIMITED ON data;
I am now going to switch to user SOE and create my table.
connect soe/Password2020@gofaster1b_tp
Drop table soe.ash_hist purge;
CREATE TABLE soe.ASH_HIST
( SNAP_ID NUMBER,
DBID NUMBER,
INSTANCE_NUMBER NUMBER,
SAMPLE_ID NUMBER,
SAMPLE_TIME TIMESTAMP (3),
-- SAMPLE_TIME_UTC TIMESTAMP (3),
-- USECS_PER_ROW NUMBER,
SESSION_ID NUMBER,
SESSION_SERIAL# NUMBER,
SESSION_TYPE VARCHAR2(10),
FLAGS NUMBER,
USER_ID NUMBER,
-----------------------------------------
SQL_ID VARCHAR2(13),
IS_SQLID_CURRENT VARCHAR2(1),
SQL_CHILD_NUMBER NUMBER,
SQL_OPCODE NUMBER,
SQL_OPNAME VARCHAR2(64),
FORCE_MATCHING_SIGNATURE NUMBER,
TOP_LEVEL_SQL_ID VARCHAR2(13),
TOP_LEVEL_SQL_OPCODE NUMBER,
SQL_PLAN_HASH_VALUE NUMBER,
SQL_FULL_PLAN_HASH_VALUE NUMBER,
-----------------------------------------
SQL_ADAPTIVE_PLAN_RESOLVED NUMBER,
SQL_PLAN_LINE_ID NUMBER,
SQL_PLAN_OPERATION VARCHAR2(64),
SQL_PLAN_OPTIONS VARCHAR2(64),
SQL_EXEC_ID NUMBER,
SQL_EXEC_START DATE,
PLSQL_ENTRY_OBJECT_ID NUMBER,
PLSQL_ENTRY_SUBPROGRAM_ID NUMBER,
PLSQL_OBJECT_ID NUMBER,
PLSQL_SUBPROGRAM_ID NUMBER,
-----------------------------------------
QC_INSTANCE_ID NUMBER,
QC_SESSION_ID NUMBER,
QC_SESSION_SERIAL# NUMBER,
PX_FLAGS NUMBER,
EVENT VARCHAR2(64),
EVENT_ID NUMBER,
SEQ# NUMBER,
P1TEXT VARCHAR2(64),
P1 NUMBER,
P2TEXT VARCHAR2(64),
-----------------------------------------
P2 NUMBER,
P3TEXT VARCHAR2(64),
P3 NUMBER,
WAIT_CLASS VARCHAR2(64),
WAIT_CLASS_ID NUMBER,
WAIT_TIME NUMBER,
SESSION_STATE VARCHAR2(7),
TIME_WAITED NUMBER,
BLOCKING_SESSION_STATUS VARCHAR2(11),
BLOCKING_SESSION NUMBER,
-----------------------------------------
BLOCKING_SESSION_SERIAL# NUMBER,
BLOCKING_INST_ID NUMBER,
BLOCKING_HANGCHAIN_INFO VARCHAR2(1),
CURRENT_OBJ# NUMBER,
CURRENT_FILE# NUMBER,
CURRENT_BLOCK# NUMBER,
CURRENT_ROW# NUMBER,
TOP_LEVEL_CALL# NUMBER,
TOP_LEVEL_CALL_NAME VARCHAR2(64),
CONSUMER_GROUP_ID NUMBER,
-----------------------------------------
XID RAW(8),
REMOTE_INSTANCE# NUMBER,
TIME_MODEL NUMBER,
IN_CONNECTION_MGMT VARCHAR2(1),
IN_PARSE VARCHAR2(1),
IN_HARD_PARSE VARCHAR2(1),
IN_SQL_EXECUTION VARCHAR2(1),
IN_PLSQL_EXECUTION VARCHAR2(1),
IN_PLSQL_RPC VARCHAR2(1),
IN_PLSQL_COMPILATION VARCHAR2(1),
-----------------------------------------
IN_JAVA_EXECUTION VARCHAR2(1),
IN_BIND VARCHAR2(1),
IN_CURSOR_CLOSE VARCHAR2(1),
IN_SEQUENCE_LOAD VARCHAR2(1),
IN_INMEMORY_QUERY VARCHAR2(1),
IN_INMEMORY_POPULATE VARCHAR2(1),
IN_INMEMORY_PREPOPULATE VARCHAR2(1),
IN_INMEMORY_REPOPULATE VARCHAR2(1),
IN_INMEMORY_TREPOPULATE VARCHAR2(1),
-- IN_TABLESPACE_ENCRYPTION VARCHAR2(1),
CAPTURE_OVERHEAD VARCHAR2(1),
-----------------------------------------
REPLAY_OVERHEAD VARCHAR2(1),
IS_CAPTURED VARCHAR2(1),
IS_REPLAYED VARCHAR2(1),
-- IS_REPLAY_SYNC_TOKEN_HOLDER VARCHAR2(1),
SERVICE_HASH NUMBER,
PROGRAM VARCHAR2(64),
MODULE VARCHAR2(64),
ACTION VARCHAR2(64),
CLIENT_ID VARCHAR2(64),
MACHINE VARCHAR2(64),
PORT NUMBER,
-----------------------------------------
ECID VARCHAR2(64),
DBREPLAY_FILE_ID NUMBER,
DBREPLAY_CALL_COUNTER NUMBER,
TM_DELTA_TIME NUMBER,
TM_DELTA_CPU_TIME NUMBER,
TM_DELTA_DB_TIME NUMBER,
DELTA_TIME NUMBER,
DELTA_READ_IO_REQUESTS NUMBER,
DELTA_WRITE_IO_REQUESTS NUMBER,
DELTA_READ_IO_BYTES NUMBER,
-----------------------------------------
DELTA_WRITE_IO_BYTES NUMBER,
DELTA_INTERCONNECT_IO_BYTES NUMBER,
PGA_ALLOCATED NUMBER,
TEMP_SPACE_ALLOCATED NUMBER,
DBOP_NAME VARCHAR2(64),
DBOP_EXEC_ID NUMBER,
CON_DBID NUMBER,
CON_ID NUMBER,
-----------------------------------------
CONSTRAINT ash_hist_pk PRIMARY KEY (dbid, instance_number, snap_id, sample_id, session_id)
)
COMPRESS FOR QUERY LOW
/
As Autonomous Databases run on Exadata, I have also specified Hybrid Columnar Compression (HCC) for this table.
Credentials are specific to the database user, so I have to create an additional credential for the same cloud user, but owned by SOE.
ALTER SESSION SET nls_date_Format='hh24:mi:ss dd.mm.yyyy';
set serveroutput on timi on
BEGIN
DBMS_CLOUD.CREATE_CREDENTIAL (
credential_name => 'SOE_BUCKET',
username=> 'oraclecloud1@go-faster.co.uk',
password=> 'K7xfi-mG<1Z:dq#88;1m'
);
END;
/
column owner format a10
column credential_name format a20
column comments format a80
column username format a40
SELECT * FROM dba_credentials;
OWNER      CREDENTIAL_NAME      USERNAME                                 WINDOWS_DOMAIN
---------- -------------------- ---------------------------------------- ------------------------------
COMMENTS                                                                         ENABL
-------------------------------------------------------------------------------- -----
ADMIN      MY_BUCKET            oraclecloud1@go-faster.co.uk
{"comments":"Created via DBMS_CLOUD.create_credential"}                          TRUE

SOE        SOE_BUCKET           oraclecloud1@go-faster.co.uk
{"comments":"Created via DBMS_CLOUD.create_credential"}                          TRUE
The COPY_DATA procedure is similar to CREATE_EXTERNAL_TABLE, described in the previous post, but it doesn't have a column list. The field names must match the column names. It is sensitive to field names with a trailing #; these must be enclosed in double quotes.
TRUNCATE TABLE soe.ash_hist;
DECLARE
l_operation_id NUMBER;
BEGIN
DBMS_CLOUD.COPY_DATA(
table_name =>'ASH_HIST',
credential_name =>'SOE_BUCKET',
file_uri_list =>'https://objectstorage.uk-london-1.oraclecloud.com/n/lrndaxjjgnuu/b/bucket-20200505-1552/o/dba_hist_active_sess_history.txt.gz',
schema_name => 'SOE',
format => json_object('blankasnull' value 'true'      -- load empty fields as NULL
,'compression' value 'gzip'                           -- source file is gzipped
,'dateformat' value 'YYYY-MM-DD/HH24:mi:ss'
,'timestampformat' value 'YYYY-MM-DD/HH24:mi:ss.ff'
,'delimiter' value '<,>'                              -- multi-character field delimiter
,'ignoreblanklines' value 'true'
,'rejectlimit' value '10'                             -- abort the load after 10 rejected rows
,'removequotes' value 'true'
,'trimspaces' value 'lrtrim'                          -- trim leading and trailing whitespace
),
field_list=>'SNAP_ID,DBID,INSTANCE_NUMBER,SAMPLE_ID,SAMPLE_TIME ,SESSION_ID,"SESSION_SERIAL#",SESSION_TYPE,FLAGS,USER_ID
,SQL_ID,IS_SQLID_CURRENT,SQL_CHILD_NUMBER,SQL_OPCODE,SQL_OPNAME,FORCE_MATCHING_SIGNATURE,TOP_LEVEL_SQL_ID,TOP_LEVEL_SQL_OPCODE,SQL_PLAN_HASH_VALUE,SQL_FULL_PLAN_HASH_VALUE
,SQL_ADAPTIVE_PLAN_RESOLVED,SQL_PLAN_LINE_ID,SQL_PLAN_OPERATION,SQL_PLAN_OPTIONS,SQL_EXEC_ID,SQL_EXEC_START,PLSQL_ENTRY_OBJECT_ID,PLSQL_ENTRY_SUBPROGRAM_ID,PLSQL_OBJECT_ID,PLSQL_SUBPROGRAM_ID
,QC_INSTANCE_ID,QC_SESSION_ID,"QC_SESSION_SERIAL#",PX_FLAGS,EVENT,EVENT_ID,"SEQ#",P1TEXT,P1,P2TEXT
,P2,P3TEXT,P3,WAIT_CLASS,WAIT_CLASS_ID,WAIT_TIME,SESSION_STATE,TIME_WAITED,BLOCKING_SESSION_STATUS,BLOCKING_SESSION
,"BLOCKING_SESSION_SERIAL#",BLOCKING_INST_ID,BLOCKING_HANGCHAIN_INFO,"CURRENT_OBJ#","CURRENT_FILE#","CURRENT_BLOCK#","CURRENT_ROW#","TOP_LEVEL_CALL#",TOP_LEVEL_CALL_NAME,CONSUMER_GROUP_ID
,XID,"REMOTE_INSTANCE#",TIME_MODEL,IN_CONNECTION_MGMT,IN_PARSE,IN_HARD_PARSE,IN_SQL_EXECUTION,IN_PLSQL_EXECUTION,IN_PLSQL_RPC,IN_PLSQL_COMPILATION
,IN_JAVA_EXECUTION,IN_BIND,IN_CURSOR_CLOSE,IN_SEQUENCE_LOAD,IN_INMEMORY_QUERY,IN_INMEMORY_POPULATE,IN_INMEMORY_PREPOPULATE,IN_INMEMORY_REPOPULATE,IN_INMEMORY_TREPOPULATE,CAPTURE_OVERHEAD
,REPLAY_OVERHEAD,IS_CAPTURED,IS_REPLAYED,SERVICE_HASH,PROGRAM,MODULE,ACTION,CLIENT_ID,MACHINE,PORT
,ECID,DBREPLAY_FILE_ID,DBREPLAY_CALL_COUNTER,TM_DELTA_TIME,TM_DELTA_CPU_TIME,TM_DELTA_DB_TIME,DELTA_TIME,DELTA_READ_IO_REQUESTS,DELTA_WRITE_IO_REQUESTS,DELTA_READ_IO_BYTES
,DELTA_WRITE_IO_BYTES,DELTA_INTERCONNECT_IO_BYTES,PGA_ALLOCATED,TEMP_SPACE_ALLOCATED,DBOP_NAME,DBOP_EXEC_ID,CON_DBID,CON_ID',
operation_id=>l_operation_id
);
dbms_output.put_line('Operation ID:'||l_operation_id||' finished successfully');
EXCEPTION WHEN OTHERS THEN
dbms_output.put_line('Operation ID:'||l_operation_id||' raised an error');
RAISE;
END;
/
The data copy takes slightly longer than the query on the external table did.
Operation ID:31 finished successfully

PL/SQL procedure successfully completed.

Elapsed: 00:02:01.11
The status of the copy operation is reported in USER_LOAD_OPERATIONS. This includes the number of rows loaded and the names of external tables that are created for the log and bad files.
set lines 120
column type format a10
column file_uri_list format a64
column start_time format a32
column update_time format a32
column owner_name format a10
column table_name format a10
column partition_name format a10
column subpartition_name format a10
column logfile_table format a15
column badfile_table format a15
column tempext_table format a30
select * from user_load_operations where id = &operation_id;
        ID TYPE              SID    SERIAL# START_TIME                       UPDATE_TIME                      STATUS
---------- ---------- ---------- ---------- -------------------------------- -------------------------------- ---------
OWNER_NAME TABLE_NAME PARTITION_ SUBPARTITI FILE_URI_LIST                                                    ROWS_LOADED
---------- ---------- ---------- ---------- ---------------------------------------------------------------- -----------
LOGFILE_TABLE   BADFILE_TABLE   TEMPEXT_TABLE
--------------- --------------- ------------------------------
        31 COPY            19965      44088 07-MAY-20 17.03.20.328263 +01:00 07-MAY-20 17.05.36.157680 +01:00 COMPLETED
SOE        ASH_HIST                         https://objectstorage.uk-london-1.oraclecloud.com/n/lrndaxjjgnuu     1409305
                                            /b/bucket-20200505-1552/o/dba_hist_active_sess_history.txt.gz
COPY$31_LOG     COPY$31_BAD     COPY$Y2R021UKPJ5F75JCMSKL
An external table is temporarily created by the COPY_DATA procedure but is then dropped before the procedure completes. The bad file is empty because the copy operation succeeded without error, but we can query the copy log.
select * from COPY$31_LOG;

RECORD
------------------------------------------------------------------------------------------------------------------------
LOG file opened at 05/07/20 16:03:21
Total Number of Files=1
Data File: https://objectstorage.uk-london-1.oraclecloud.com/n/lrndaxjjgnuu/b/bucket-20200505-1552/o/dba_hist_active_sess_history.txt.gz
Log File: COPY$31_105537.log
LOG file opened at 05/07/20 16:03:21
Total Number of Files=1
Data File: https://objectstorage.uk-london-1.oraclecloud.com/n/lrndaxjjgnuu/b/bucket-20200505-1552/o/dba_hist_active_sess_history.txt.gz
Log File: COPY$31_105537.log
LOG file opened at 05/07/20 16:03:21
KUP-05014: Warning: Intra source concurrency disabled because the URLs specified for the Cloud Service map to compressed data.
Bad File: COPY$31_105537.bad
Field Definitions for table COPY$Y2R021UKPJ5F75JCMSKL
  Record format DELIMITED BY
  Data in file has same endianness as the platform
  Rows with all null fields are accepted
  Table level NULLIF (Field = BLANKS)
Fields in Data Source:
    SNAP_ID                         CHAR (255)
      Terminated by "<,>"
      Trim whitespace from left and right
    DBID                            CHAR (255)
      Terminated by "<,>"
      Trim whitespace from left and right
    INSTANCE_NUMBER                 CHAR (255)
      Terminated by "<,>"
      Trim whitespace from left and right
    SAMPLE_ID                       CHAR (255)
      Terminated by "<,>"
      Trim whitespace from left and right
    SAMPLE_TIME                     CHAR (255)
      Date datatype TIMESTAMP, date mask YYYY-MM-DD/HH24:mi:ss.ff
      Terminated by "<,>"
      Trim whitespace from left and right
…
    CON_ID                          CHAR (255)
      Terminated by "<,>"
      Trim whitespace from left and right

Date Cache Statistics for table COPY$Y2R021UKPJ5F75JCMSKL
  Date conversion cache disabled due to overflow (default size: 1000)

365 rows selected.
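When I have finished with the log and bad file tables, I don't have to drop the COPY$ tables by hand. DBMS_CLOUD provides a DELETE_OPERATION procedure that cleans up the resources associated with a load operation. A minimal sketch, using the operation ID from this run:
BEGIN
  -- remove the log/bad tables and status for load operation 31
  DBMS_CLOUD.DELETE_OPERATION(id => 31);
END;
/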
These files are written to the DATA_PUMP_DIR database directory. We don't have access to the database file system in Autonomous, so Oracle has provided the LIST_FILES table function in DBMS_CLOUD so that we can see what files are in a directory.
Set pages 99 lines 150
Column object_name format a32
Column created format a32
Column last_modified format a32
Column checksum format a20
SELECT * FROM DBMS_CLOUD.LIST_FILES('DATA_PUMP_DIR');

OBJECT_NAME                           BYTES CHECKSUM             CREATED                          LAST_MODIFIED
-------------------------------- ---------- -------------------- -------------------------------- --------------------------------
…
COPY$31_dflt.log                          0                      07-MAY-20 16.03.20.000000 +00:00 07-MAY-20 16.03.20.000000 +00:00
COPY$31_dflt.bad                          0                      07-MAY-20 16.03.20.000000 +00:00 07-MAY-20 16.03.20.000000 +00:00
COPY$31_105537.log                    13591                      07-MAY-20 16.03.21.000000 +00:00 07-MAY-20 16.05.35.000000 +00:00
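Individual files can also be removed from a database directory with DBMS_CLOUD.DELETE_FILE. A sketch, removing one of the empty default log files listed above:
BEGIN
  -- delete a single file from the database directory
  DBMS_CLOUD.DELETE_FILE (
    directory_name => 'DATA_PUMP_DIR',
    file_name      => 'COPY$31_dflt.log');
END;
/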
Statistics are automatically collected on the table by the copy process because it was done in direct-path mode. We can see that the number of rows in the statistics corresponds to the number of rows loaded by the COPY_DATA procedure.
Set pages 99 lines 140
Column owner format a10
Column IM_STAT_UPDATE_TIME format a30
Select * from all_tab_statistics
Where table_name = 'ASH_HIST';

OWNER      TABLE_NAME PARTITION_ PARTITION_POSITION SUBPARTITI SUBPARTITION_POSITION OBJECT_TYPE    NUM_ROWS     BLOCKS EMPTY_BLOCKS
---------- ---------- ---------- ------------------ ---------- --------------------- ------------ ---------- ---------- ------------
 AVG_SPACE  CHAIN_CNT AVG_ROW_LEN AVG_SPACE_FREELIST_BLOCKS NUM_FREELIST_BLOCKS AVG_CACHED_BLOCKS AVG_CACHE_HIT_RATIO IM_IMCU_COUNT
---------- ---------- ----------- ------------------------- ------------------- ----------------- ------------------- -------------
IM_BLOCK_COUNT IM_STAT_UPDATE_TIME            SCAN_RATE SAMPLE_SIZE LAST_ANALYZED       GLO USE STATT STALE_S SCOPE
-------------- ------------------------------ ---------- ----------- ------------------- --- --- ----- ------- -------
SOE        ASH_HIST                                                  TABLE           1409305      19426            0
         0          0         486                         0                   0
                                                              1409305 15:16:14 07.05.2020 YES NO        NO      SHARED
I can confirm that the data is compressed because the compression type of every sampled row is type 8 (HCC QUERY LOW). See also DBMS_COMPRESSION Compression Types.
WITH x AS (
select dbms_compression.get_compression_type('SOE', 'ASH_HIST', rowid) ctype
from soe.ash_hist sample (.1))
Select ctype, count(*) From x group by ctype;
     CTYPE   COUNT(*)
---------- ----------
         8      14097
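The table-level attribute can be cross-checked in the data dictionary. A quick sketch against ALL_TABLES, where COMPRESS_FOR reports the declared HCC level:
column compression format a10
column compress_for format a15
SELECT owner, table_name, compression, compress_for
FROM   all_tables
WHERE  owner = 'SOE' AND table_name = 'ASH_HIST';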
I can find this SQL Statement in the Performance Hub.
INSERT /*+ append enable_parallel_dml */ INTO "SOE"."ASH_HIST" SELECT * FROM COPY$Y2R021UKPJ5F75JCMSKL

Therefore, the data was queried from the temporary external table into the permanent table, in direct-path mode and in parallel.
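The same statement can also be found from SQL, without going through the Performance Hub UI. A sketch against V$SQL (the text filter is illustrative):
SELECT sql_id, executions, elapsed_time/1e6 elapsed_secs
FROM   v$sql
WHERE  sql_text LIKE 'INSERT /*+ append enable_parallel_dml */ INTO "SOE"."ASH_HIST"%';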
I can also look at the OCI Performance Hub and see that most of the time was spent on CPU. I can see the SQL_ID of the insert statement and the call to the DBMS_CLOUD procedure.
I can drill in further to the exact SQL statement.
When I query the table I get exactly the same data as previously with the external table.
set autotrace on timi on lines 180 trimspool on
break on report
compute sum of ash_secs on report
column min(sample_time) format a22
column max(sample_time) format a22
select event, sum(10) ash_Secs, min(sample_time), max(sample_time)
from soe.ash_hist
group by event
order by ash_Secs desc
;
EVENT                                                              ASH_SECS MIN(SAMPLE_TIME)       MAX(SAMPLE_TIME)
---------------------------------------------------------------- ---------- ---------------------- ----------------------
                                                                   10304530 22-MAR-20 09.59.51.125 07-APR-20 23.00.30.395
direct path read                                                    3258500 22-MAR-20 09.59.51.125 07-APR-20 23.00.30.395
SQL*Net more data to client                                          269220 22-MAR-20 10.00.31.205 07-APR-20 22.59.30.275
direct path write temp                                                32400 22-MAR-20 11.39.53.996 07-APR-20 21.43.47.329
gc cr block busy                                                      24930 22-MAR-20 10.51.33.189 07-APR-20 22.56.56.804
latch free                                                               10 28-MAR-20 20.26.11.307 28-MAR-20 20.26.11.307
…
                                                                 ----------
sum                                                                14093050
86 rows selected.
Elapsed: 00:00:00.62
I can see that the execution plan is now a single serial full scan of the table.
Execution Plan
----------------------------------------------------------
Plan hash value: 1336681691
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 84 | 1428 | 1848 (9)| 00:00:01 |
| 1 | SORT ORDER BY | | 84 | 1428 | 1848 (9)| 00:00:01 |
| 2 | HASH GROUP BY | | 84 | 1428 | 1848 (9)| 00:00:01 |
| 3 | TABLE ACCESS STORAGE FULL| ASH_HIST | 1409K| 22M| 1753 (4)| 00:00:01 |
----------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
11 recursive calls
13 db block gets
19255 consistent gets
19247 physical reads
2436 redo size
5428 bytes sent via SQL*Net to client
602 bytes received via SQL*Net from client
7 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
86 rows processed