Skip to Main Content

Breadcrumb

Dev Live Dev Intro

We are celebrating Developers at AskTOM. We welcome Developers of all levels of experience to join us at our FREE Developer Live events coming in August and September. Just click on the left to register today! If you are brand new to Database Technology, then we also have got you covered. Just click on the right for your comprehensive FREE training program to kick start your Oracle Database Development journey!

Question and Answer

Chris Saxon

Thanks for the question.

Asked: March 06, 2017 - 10:56 am UTC

Answered by: Chris Saxon - Last updated: August 12, 2020 - 7:50 am UTC

Category: Developer - Version: 11.2

Viewed 1000+ times

You Asked

I have a table tb_card_data

CARD_NO TYPE EMAIL MODULE CREATED_ON SEQ_NO
------- ---- ----- ------ ---------- ---------
12345 C x@a.com MOD_1 06/03/2017 20170306000001 (create_on + sequence number)
..
so on around 500k records for MODULE - MOD_1

around 500k records for MODULE - MOD_2
around 500k records for MODULE - MOD_3

We need to process all the records for MOD_1 on the basis of CREATED_ON & SEQ_NO.
Once processing gets completed we need to delete only records for MOD_1 from above table.

Index is created on CREATED_ON column.

Since we are doing daily INSERT/DELETE on the table, do we need to take care of issues like High Water Mark Or Shrink table??

and we said...

It depends upon how you're inserting your data. If you use regular insert/delete, new rows will go below the high water mark (if there's space):

create table t as
  select sysdate dt, a.* from all_objects a
  where  1=0;
  
insert into t
  select trunc(sysdate) dt, a.* from all_objects a;
  
commit;

select bytes from user_segments
where  segment_name = 'T';

BYTES       
11,763,712 

delete t
where  dt = trunc(sysdate);

commit;

insert into t
  select trunc(sysdate)+1 dt, a.* from all_objects a;
  
commit;

select bytes from user_segments
where  segment_name = 'T';

BYTES       
11,763,712  

delete t
where  dt = trunc(sysdate)+1;

commit;

select bytes from user_segments
where  segment_name = 'T';

BYTES       
11,763,712  


So the space is reused and there's no need to reclaim it.

But if you use insert /*+ append */, new rows will always go above the high water mark. So even if the number of rows is constant, the size of your table will grow and grow!

insert /*+ append */into t
  select trunc(sysdate) dt, a.* from all_objects a;
  
commit;

select bytes from user_segments
where  segment_name = 'T';

BYTES       
26,304,512  

delete t
where  dt = trunc(sysdate);

commit;

insert /*+ append */into t
  select trunc(sysdate)+1 dt, a.* from all_objects a;
  
commit;

select bytes from user_segments
where  segment_name = 'T';

BYTES       
39,411,712  

delete t
where  dt = trunc(sysdate)+1;

commit;

select bytes from user_segments
where  segment_name = 'T';

BYTES       
39,411,712  


In this case you need to reclaim the space using shrink or move. Or just not use append ;)

You may also want to recover space if the amount of data you're adding shrinks over time. e.g. starting with 500k/day -> 400k/day -> 300k/day -> etc. In this case the table will still consume the space needed for the first 500k rows. But normally data grows over time, so this is rarely an issue!

If you're always accessing rows by module and date there's another option you may want to explore:

Partitioning!

(assuming you're licensed for this).

Partitioning by CREATED_ON and subpartitioning by MODULE (or vice versa) enables you to load the data and read what you need from the (sub)partition. Then wipe it out by truncating the (sub)partition.

and you rated our response

  (14 ratings)

Is this answer out of date? If it is, please let us know via a Review

Reviews

Partition Technique

March 06, 2017 - 1:49 pm UTC

Reviewer: A reader

Hi Chris,

Thanks for the detailed info below!!

Could you please have look at below scenario and give your expert opinion so that I can try it out.

tb_card_data
------------

CARD_NO TYPE EMAIL MODULE CREATED_ON Region SEQ_NO
------- ---- ----- ------ ---------- ------ --------------
12345 C x@a.com MOD_1 06/03/2017 APAC 20170306000001
..
23456 C x@a.com MOD_1 06/03/2017 APAC 20170306001111
..
12345 C x@a.com MOD_1 06/03/2017 EMEA 20170306001112
..
23456 C x@a.com MOD_1 06/03/2017 EMEA 20170306002222

There will be daily around 500k rows inserted each for Region specific (e.g. for APAC - 500k, EMEA - 500k).
There will be separate jobs which will fetch records based on CREATED_ON, Region & Seq_No and post execution of 500k records data from tb_card_data should get deleted.
1. APAC job will delete rows from tb_card_data using CREATED_ON, Region (APAC)
1. EMEA job will delete rows from tb_card_data using CREATED_ON, Region (EMEA)

Could you please assist -
1. Which is the best partition technique can be used to Truncate table considering each job can delete only that Region data from table?
2. RANGE PARTITION CREATED_ON >> SUB PARTITION BY LIST REGION OR
3. RANGE PARTITION REGION >> SUB PARTITION BY LIST CREATED_ON

Chris Saxon

Followup  

March 06, 2017 - 3:02 pm UTC

I'd try out both approaches to see how they function in your application:

- Range partition by date, list subpartition by region
- List subpartition by region, range partition by date

Range-list will be easier to manage up to and including 12.1, because you can't use interval subpartitioning:

create table t (
  region varchar2(10),
  dt     date,
  stuff  varchar2(1000)
) partition by list (region)  
  subpartition by range (dt) interval (interval '1' day)
  subpartition template (
    subpartition p1 values less than (date'2017-03-01')
  ) (
  partition APAC values ('APAC'),
  partition EMEA values ('EMEA')
);

SQL Error: ORA-00922: missing or invalid option


Though this is now possible in 12.2:

select * from v$version;

BANNER                                                                        CON_ID  
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production  0       
PL/SQL Release 12.2.0.1.0 - Production                                        0       
CORE 12.2.0.1.0 Production                                                    0       
TNS for Linux: Version 12.2.0.1.0 - Production                                0       
NLSRTL Version 12.2.0.1.0 - Production                                        0  

create table t (
  region varchar2(10),
  dt     date,
  stuff  varchar2(1000)
) partition by list (region)  
  subpartition by range (dt) interval (interval '1' day)
  subpartition template (
    subpartition p1 values less than (date'2017-03-01')
  ) (
  partition APAC values ('APAC'),
  partition EMEA values ('EMEA')
);

Table T created.


Both methods allow you to truncate a subpartition, which means you're removing rows for a date and region:

create table t (
  region varchar2(10),
  dt     date,
  stuff  varchar2(1000)
) partition by range (dt) interval (interval '1' day) 
  subpartition by list (region)
  subpartition template (
    subpartition APAC values ('APAC'),
    subpartition EMEA values ('EMEA')
  ) (
  partition p0 values less than (date'2017-03-01')
);

insert into t
  select 'APAC', sysdate+rownum, 'XXXXXX' from dual
  connect by level <= 5;

commit;

select par.partition_name, par.high_value, sub.subpartition_name, sub.high_value 
from   sys.user_tab_subpartitions sub
join   sys.user_tab_partitions par
on     sub.partition_name = par.partition_name
where  par.table_name = 'T';

PARTITION_NAME  HIGH_VALUE                                                                           SUBPARTITION_NAME  HIGH_VALUE  
SYS_P15263      TO_DATE(' 2017-03-08 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')  SYS_SUBP15262      'EMEA'      
SYS_P15266      TO_DATE(' 2017-03-09 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')  SYS_SUBP15264      'APAC'      
SYS_P15266      TO_DATE(' 2017-03-09 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')  SYS_SUBP15265      'EMEA'      
SYS_P15269      TO_DATE(' 2017-03-10 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')  SYS_SUBP15267      'APAC'      
SYS_P15269      TO_DATE(' 2017-03-10 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')  SYS_SUBP15268      'EMEA'      
SYS_P15272      TO_DATE(' 2017-03-11 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')  SYS_SUBP15270      'APAC'      
SYS_P15272      TO_DATE(' 2017-03-11 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')  SYS_SUBP15271      'EMEA'      
SYS_P15275      TO_DATE(' 2017-03-12 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')  SYS_SUBP15273      'APAC'      
SYS_P15275      TO_DATE(' 2017-03-12 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')  SYS_SUBP15274      'EMEA'      
P0              TO_DATE(' 2017-03-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')  P0_APAC            'APAC'      
P0              TO_DATE(' 2017-03-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')  P0_EMEA            'EMEA'      
SYS_P15263      TO_DATE(' 2017-03-08 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')  SYS_SUBP15261

select * from t subpartition (SYS_SUBP15264);

REGION  DT                    STUFF   
APAC    08-MAR-2017 06:57:55  XXXXXX 

alter table t truncate subpartition SYS_SUBP15264;

select * from t subpartition (SYS_SUBP15264);

 0 rows selected 

drop table t purge;
create table t (
  region varchar2(10),
  dt     date,
  stuff  varchar2(1000)
) partition by list (region)  
  subpartition by range (dt)
  subpartition template (
    subpartition p1 values less than (date'2017-03-01'),
    subpartition p2 values less than (date'2017-03-07'),
    subpartition p3 values less than (date'2017-03-08'),
    subpartition p4 values less than (date'2017-03-09'),
    subpartition p5 values less than (date'2017-03-10'),
    subpartition pmax values less than (maxvalue)
  ) (
  partition APAC values ('APAC'),
  partition EMEA values ('EMEA')
);

select partition_name, subpartition_name from sys.user_tab_subpartitions
where  table_name = 'T';

insert into t
  select 'APAC', sysdate+rownum, 'XXXXXX' from dual
  connect by level <= 5;

commit;

select * from t subpartition (APAC_P3);

REGION  DT                    STUFF   
APAC    07-MAR-2017 07:00:12  XXXXXX 

alter table t truncate subpartition APAC_p3;

select * from t subpartition (APAC_P3);

 0 rows selected 


Beyond that, test how your queries perform using the two methods. Then evaluate to decide which is the most suitable for you.

interval '1' day

March 06, 2017 - 3:14 pm UTC

Reviewer: Nikhil

Hi Chris,
Thanks for giving below inputs, definitely useful for me. Thanks a lot for that!!

One query here - we are on 11g and partition option interval '1' day is not working in case range partition only interval month/year is getting acception while creating table.
Does it applicable only in 12.2?
Chris Saxon

Followup  

March 06, 2017 - 5:45 pm UTC

What is your create table statement?

Subpartitioning by interval is only available in 12.2 as shown in the example above. Top-level partitioning by interval should work.

Inputs on LiveSQL

March 07, 2017 - 12:52 pm UTC

Reviewer: A reader

Hi Chirs,

Please have a look at table created in LiveSQL - https://livesql.oracle.com/apex/livesql/s/eoawd03g1uacx5u8egqeq13kg

I have created Range Partition on STMT_CREATED_ON, but when i try to execute SELECT with STMT_CREATED_ON in WHERE clause, PLAN still shows TABLE_ACCESS_FULL.

EXPLAIN PLAN FOR
select * from tb_123456 where STMT_CREATED_ON = '09-SEP-2016';

Plan hash value: 1196700793

----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 | 17508 | 14 (0)| 00:00:01 | | |
| 1 | PARTITION RANGE SINGLE| | 4 | 17508 | 14 (0)| 00:00:01 | 1 | 1 |
|* 2 | TABLE ACCESS FULL | TB_123456 | 4 | 17508 | 14 (0)| 00:00:01 | 1 | 1 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("STMT_CREATED_ON"=TO_DATE(' 2016-09-09 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

Note
-----
- dynamic sampling used for this statement (level=2)
Chris Saxon

Followup  

March 07, 2017 - 3:05 pm UTC

And above that it says

"PARTITION RANGE SINGLE"

Meaning it's only accessed one partition! This is what you'd expect.

Though you have made your initial partition 1 March 2017. And inserted dates in Sept 2016. Interval partitioning only creates new partitions for rows above the highest value in the partitions you defined at table creation. It doesn't split the existing ones.

If you want your September dates to go in different partitions, you'll need to make the high value for p0 sometime in 2016.

Perfect!!

March 07, 2017 - 3:18 pm UTC

Reviewer: Nikhil

Hi Chris,
Thanks for your quick response.

On last query -
This table will hold around 500k records for one date.
Since I will be accessing this table using one comman condition I.e. Date along with card_no.

Does it require to create any local Index on card_no if created_on is used along with it??
Any suggestion?

Connor McDonald

Followup  

March 09, 2017 - 1:46 am UTC

"Does it require to create any local Index on card_no if created_on is used along with it?? "

As with any index, the decision criteria is simple:

- Does the benefit I get from having the index outweigh the cost of maintaining it.

Only you can answer that question

Blank Partitions

March 10, 2017 - 9:48 am UTC

Reviewer: A reader

Hi Chris/Connor,

Have created Partition based on dt & region and queries on this table are running very fast as expected.
At the end of week, we are deleting old records from table as a part of Purging activity.
After prging data from older partition are gettting delete, still i can find those partitions in user_tab_partition table.

Do i need to take any action on this like drop partition, etc.


create table t (
region varchar2(10),
dt date,
stuff varchar2(1000)
) partition by range (dt) interval (interval '1' day)
subpartition by list (region)
subpartition template (
subpartition APAC values ('APAC'),
subpartition EMEA values ('EMEA')
) (
partition p0 values less than (date'2017-03-01')
);


Chris Saxon

Followup  

March 10, 2017 - 4:29 pm UTC

If you want to remove partitions, then yes, you have to drop them.

12.2 Interval partitioning

March 15, 2017 - 10:22 pm UTC

Reviewer: Rozalicska from Hungary

Hi Chris,

I'm a little confused.
I ran your sql on livesql.oracle.com, and I got an error.

create table t (
  region varchar2(10),
  dt     date,
  stuff  varchar2(1000)
) partition by list (region)  
  subpartition by range (dt) interval (interval '1' day)
  subpartition template (
    subpartition p1 values less than (date'2017-03-01')
  ) (
  partition APAC values ('APAC'),
  partition EMEA values ('EMEA') 
);


ORA-14179: An unsupported partitioning method was specified in this context.


Besides that I read about this in the 12.2 SQL Reference, and the Reference says:
Interval partitioning is not supported at the subpartition level.
http://docs.oracle.com/database/122/SQLRF/CREATE-TABLE.htm#GUID-F9CE0CC3-13AE-4744-A43C-EAC7A71AAAB6__BABGCEFI

What did I miss here?

Thanks

Rozalicska
Chris Saxon

Followup  

March 17, 2017 - 3:26 pm UTC

You're right, my mistake...

July 13, 2020 - 5:17 am UTC

Reviewer: Rich from USA

Hi Chris,

Database is on Oracle 12.2

I have a table which is currently range partitioned on timestamp column. Application pulls the data from this table and insert the data into another table (history table) every 1 hr based on another column db_update_time (time range). Daily partition data volume is 160 million rows every day. Every hour the partition delta volume would be about ~10 million rows. So we are thinking of subpartition the DB_update_time to reduce the I/O against the partition by creating hourly subpartition (0-23 hrs) on DB_UPDAET_TIME.

so, partition by range ( SUPP_SEQ_TIME TIMESTAMP column),
subpartition by ( DB_UPDATE_TIME TIMESTAMP)

Can we use range-list subpartition for hourly paritioning.

Can you please suggest how to implement hourly subpartition on Oracle 12.2..



Chris Saxon

Followup  

July 13, 2020 - 12:51 pm UTC

How are you expecting subpartitioning to decrease the I/O? You'll still be inserting the same number of rows, right?

To subpartition by hour, you can create a virtual column that does

extract ( hour from date_col )


and subpartition by this for the values 0 - 23.

July 13, 2020 - 11:20 pm UTC

Reviewer: Rich from USA

Hi Chris,

Thanks!

Since the data is pulled hourly from the source table to target table based on time range on db_update_time, we will have only few million rows in hourly subpartition instead 10's of million of rows at partition level.

supp_sequene_time is partition column
db_update_time is subpartition column

Query has the condition like, db_update_time is pulled every hour.

Since db_time_update has time_range (1hr) passed, With virtual partitioning (hourly subpartition), will the query able to partition elimination to the specific hour partition ?

Where supp_seq_time > to_timestamp('06-24-2020 00:00:00','MM-DD-YYYY HH24:MI:SS')
AND supp_seq_time <= to_timestamp('06-27-2020 00:00:00','MM-DD-YYYY HH24:MI:SS')
AND db_update_time > to_timestamp('06-26-2020 14:00:00','MM-DD-YYYY HH24:MI:SS')
AND db_update_time <= to_timestamp('06-26-2020 15:00:00','MM-DD-YYYY HH24:MI:SS');

Chris Saxon

Followup  

July 14, 2020 - 10:06 am UTC

You'll need to include the virtual column (or the same expression you use in it) in your query:

create table t (
  c1 int, c2 date, c3 timestamp,
  c4 number as ( extract ( hour from c3 ) ) 
) partition by range ( c2 ) 
  interval ( interval '1' day )
  subpartition by list ( c4 ) 
  subpartition template (
    subpartition p0 values ( 0 ),
    subpartition p1 values ( 1 ),
    subpartition p2 values ( 2 ),
    subpartition p3 values ( 3 ),
    subpartition p4 values ( 4 ),
    subpartition p5 values ( 5 ),
    subpartition p6 values ( 6 ),
    subpartition p7 values ( 7 ),
    subpartition p8 values ( 8 ),
    subpartition p9 values ( 9 ),
    subpartition p10 values ( 10 ),
    subpartition p11 values ( 11 ),
    subpartition p12 values ( 12 ),
    subpartition p13 values ( 13 ),
    subpartition p14 values ( 14 ),
    subpartition p15 values ( 15 ),
    subpartition p16 values ( 16 ),
    subpartition p17 values ( 17 ),
    subpartition p18 values ( 18 ),
    subpartition p19 values ( 19 ),
    subpartition p20 values ( 20 ),
    subpartition p21 values ( 21 ),
    subpartition p22 values ( 22 ),
    subpartition p23 values ( 23 )
  ) (
    partition p0 values less than ( date'2020-01-01' )
  );

insert into t ( c1, c2, c3 )
with rws as (
  select trunc ( sysdate ) + ( level /24 ) dt 
  from dual
  connect by level <= 24
)
  select rownum, dt, dt from rws;
  
set serveroutput off
select * from t
where  c2 >= trunc ( sysdate )
and    c2 < trunc ( sysdate ) + 1
and    c3 >= trunc ( sysdate ) + 1/24 
and    c3 < trunc ( sysdate ) + 2/24;

select * 
from   table(dbms_xplan.display_cursor(null, null, 'BASIC LAST +PARTITION'));

----------------------------------------------------------                 
| Id  | Operation                 | Name | Pstart| Pstop |                 
----------------------------------------------------------                 
|   0 | SELECT STATEMENT          |      |       |       |                 
|   1 |  FILTER                   |      |       |       |                 
|   2 |   PARTITION RANGE ITERATOR|      |   KEY |   KEY |                 
|   3 |    PARTITION LIST ALL     |      |     1 |    24 |                 
|   4 |     TABLE ACCESS FULL     | T    |   KEY |   KEY |                 
---------------------------------------------------------- 

select * from t
where  c2 >= trunc ( sysdate )
and    c2 < trunc ( sysdate ) + 1
and    c4 >= 1
and    c4 < 2;

select * 
from   table(dbms_xplan.display_cursor(null, null, 'BASIC LAST +PARTITION'));

----------------------------------------------------------             
| Id  | Operation                 | Name | Pstart| Pstop |             
----------------------------------------------------------             
|   0 | SELECT STATEMENT          |      |       |       |             
|   1 |  FILTER                   |      |       |       |             
|   2 |   PARTITION RANGE ITERATOR|      |   KEY |   KEY |             
|   3 |    PARTITION LIST SINGLE  |      |     2 |     2 |             
|   4 |     TABLE ACCESS FULL     | T    |   KEY |   KEY |             
----------------------------------------------------------

select * from t
where  c2 >= trunc ( sysdate )
and    c2 < trunc ( sysdate ) + 1
and    extract ( hour from c3 ) >= 1
and    extract ( hour from c3 ) < 2;

select * 
from   table(dbms_xplan.display_cursor(null, null, 'BASIC LAST +PARTITION'));

----------------------------------------------------------              
| Id  | Operation                 | Name | Pstart| Pstop |              
----------------------------------------------------------              
|   0 | SELECT STATEMENT          |      |       |       |              
|   1 |  FILTER                   |      |       |       |              
|   2 |   PARTITION RANGE ITERATOR|      |   KEY |   KEY |              
|   3 |    PARTITION LIST SINGLE  |      |     2 |     2 |              
|   4 |     TABLE ACCESS FULL     | T    |   KEY |   KEY |              
---------------------------------------------------------- 

Running Insert statements in a Anonymous block

July 22, 2020 - 5:20 pm UTC

Reviewer: Rajasekhar from India

Hello chris,

Is it possible to continue the begin end block, to insert value 3 even after getting exception for 1 and 2 values as unique constraint violation
for below code. sorry, if it doesn't make sense
create table test_2(id number primary key);

insert into test_2 values ('1');
insert into test_2 values ('2');
commit;


declare
v_Count number:=0;
begin
insert into test_2 values ('1');
insert into test_2 values ('2');
insert into test_2 values ('3');
exception
when DUP_VAL_ON_INDEX then
v_count:=v_vount+1;
end;
/


Thanks in advance


Connor McDonald

Followup  

July 23, 2020 - 4:00 am UTC

You'll need nested begin-ends for each insert, or better, look at doing FORALL for a BULK EXCEPTIONS handler.

Example here

https://livesql.oracle.com/apex/livesql/file/content_CMSE3VBYFCO9IX3PGE4S5C1Q8.html

August 10, 2020 - 6:10 am UTC

Reviewer: Rich from USA

Hi Chris,

Database is on Oracle 12.2.

I have a table which is currently range partitioned on timestamp column (test_sequence_time).
Application pulls the data from this table and insert the data into another table (history table) every 1 hr based on another column db_update_time (timestamp).
Query has JSON column and so JSON parsing involved.

Daily partition data volume is 160 million rows every day. Every hour the partition delta volume would be about ~10 million rows.

I have composite index "TEST_QUOTE_JSON_test_SEQ_UPDATE_TIME_IDX" on (test_Sequence_time, db_update_time) columns.


1) When i run this INSERT..SELECT, the plan which uses INDEX RANGE SCAN , it pulls the data from the source table using this composite index, its very slow it takes ~30 min to insert about ~15 million rows.
I/O usage is about 39GB


14962962 rows created.

Elapsed: 00:28:39.18


2) When i run this INSERT..SELECT, the plan which uses TABLE ACCESS STORAGE FULL, (parallel range scan partitions), it takes about 1 min 29 secs to insert ~15 million rows. But I/O usage is ~336 GB.

14962962 rows created.

Elapsed: 00:01:29.23


My questions:
=============

1) With index range scan, any idea why its going very slow Vs cell smart scan (Table access storage full). This is on Exadata.
Partition pruning is happening. I have collected stats on this table and index.

2) Any ways to improve the performance the index range scan plan (I/O is 39 GB) as Parallel range scan due to Parallel hint does lot of I/O (336 GB).



Below are the details:


TABLE_NAME PARTITION_POSITION PARTITION_NAME HIGH_VALUE PCT_FREE SEGM NUM_ROWS
---------------------------------------- ------------------ ---------------------------------------- ----------------------------------- ---------- ---- ----------
TEST_QUOTE_JSON_PART 1 V_QUOTE_JSON_PART_P0 TIMESTAMP' 2020-06-19 00:00:00' 10 NO
TEST_QUOTE_JSON_PART 2 SYS_P36501 TIMESTAMP' 2020-06-20 00:00:00' 10 YES 138746562
TEST_QUOTE_JSON_PART 3 SYS_P36541 TIMESTAMP' 2020-06-21 00:00:00' 10 YES 99009
TEST_QUOTE_JSON_PART 4 SYS_P36519 TIMESTAMP' 2020-06-22 00:00:00' 10 YES 1663940
TEST_QUOTE_JSON_PART 5 SYS_P36504 TIMESTAMP' 2020-06-23 00:00:00' 10 YES 158098563
TEST_QUOTE_JSON_PART 6 SYS_P36513 TIMESTAMP' 2020-06-24 00:00:00' 10 YES 162893808
TEST_QUOTE_JSON_PART 7 SYS_P36516 TIMESTAMP' 2020-06-25 00:00:00' 10 YES 175050916
TEST_QUOTE_JSON_PART 8 SYS_P36507 TIMESTAMP' 2020-06-26 00:00:00' 10 YES 178352935
TEST_QUOTE_JSON_PART 9 SYS_P36510 TIMESTAMP' 2020-06-27 00:00:00' 10 YES 155734642

9 rows selected.



SQL> INSERT /*+ ENABLE_PARALLEL_DML PARALLEL(tgt,16) */ into TEST.TEST_QUOTE_TMP tgt(uid_md,test_entity_id,test_key,test_sequence_time,test_source,test_entity_data_type,test_entity_type
2 ,quote_time,quote_source,bid_size,bid_price,bid_yield,ask_size,ask_price,ask_yield,quote_time_estimated_from)
3 --SELECT /*+ MONITOR PARALLEL(src,8) */
4 SELECT /*+ MONITOR */
7 uid_md,
8 test_entity_id,
9 CASE
10 WHEN test_entity_data_type = 'RTSXref' THEN JSON_VALUE(json,'$.RTS_TICKER')
11 || ', '
12 || JSON_VALUE(json,'$.RTS_SOURCE_ID')
13 END,
14 test_sequence_time,
15 test_source,
16 test_entity_data_type,
17 test_entity_type,
18 CASE
19 WHEN JSON_VALUE(json,'$[*].QUOTE_TIME') IS NOT NULL
20 AND JSON_VALUE(json,'$[*].QUOTE_TIME') <> 'null' THEN to_timestamp(JSON_VALUE(json,'$[*].QUOTE_TIME'),'YYYYMMDD-HH24:MI:SS.FF'
21 )
22 END AS quote_time,
23 JSON_VALUE(json,'$.QUOTE_SOURCE' RETURNING VARCHAR2) end,
24 CASE JSON_VALUE(json,'$[*].BID_SIZE')
25 WHEN 'null' THEN NULL
26 ELSE to_number(regexp_replace(JSON_VALUE(json,'$.BID_SIZE'),'\,+','') )
27 END,
28 CASE JSON_VALUE(json,'$[*].BID_PRICE')
29 WHEN 'null' THEN NULL
30 ELSE to_number(regexp_replace(JSON_VALUE(json,'$.BID_PRICE'),'\,+','') )
31 END,
32 CASE JSON_VALUE(json,'$[*].BID_YIELD')
33 WHEN 'null' THEN NULL
34 ELSE to_number(regexp_replace(JSON_VALUE(json,'$.BID_YIELD'),'\,+','') )
35 END,
36 CASE JSON_VALUE(json,'$[*].ASK_SIZE')
37 WHEN 'null' THEN NULL
38 ELSE to_number(regexp_replace(JSON_VALUE(json,'$.ASK_SIZE'),'\,+','') )
39 END,
40 CASE JSON_VALUE(json,'$[*].ASK_PRICE')
41 WHEN 'null' THEN NULL
42 ELSE to_number(regexp_replace(JSON_VALUE(json,'$.ASK_PRICE'),'\,+','') )
43 END,
44 CASE JSON_VALUE(json,'$[*].ASK_YIELD')
45 WHEN 'null' THEN NULL
46 ELSE to_number(regexp_replace(JSON_VALUE(json,'$.ASK_YIELD'),'\,+','') )
47 END,
48 CASE
49 WHEN JSON_VALUE(json,'$[*].QUOTE_TIME') IS NOT NULL
50 AND JSON_VALUE(json,'$[*].QUOTE_TIME') <> 'null' THEN NULL
51 END AS quote_time_estimated_from
52 FROM
53 test.TEST_QUOTE_JSON_PART src
55 WHERE
56 ( CASE
57 WHEN JSON_VALUE(json,'$[*].QUOTE_TIME') IS NOT NULL
58 AND JSON_VALUE(json,'$[*].QUOTE_TIME') <> 'null' THEN to_timestamp(JSON_VALUE(json,'$[*].QUOTE_TIME'),'YYYYMMDD-HH24:MI:SS.FF'
59 )
60 END ) >= TO_DATE('01-01-2012','MM-DD-YYYY')
61 AND ( test_source,
62 test_entity_data_type,
63 test_entity_type ) IN (
64 ( 'V-RTS/1105',
65 'RTSXref',
66 'Instrument' ),
67 ( 'V-RTS/205',
68 'RTSXref',
69 'Instrument' ),
70 ( 'V-RTS/257',
71 'RTSXref',
72 'Instrument' ),
73 ( 'V-RTS/258',
74 'RTSXref',
75 'Instrument' ),
76 ( 'V-RTS/297',
77 'RTSXref',
78 'Instrument' ),
79 ( 'V-RTS/300',
80 'RTSXref',
81 'Instrument' ),
82 ( 'V-RTS/309',
83 'RTSXref',
84 'Instrument' ),
85 ( 'V-RTS/314',
86 'RTSXref',
87 'Instrument' ),
88 ( 'V-RTS/344',
89 'RTSXref',
90 'Instrument' ),
91 ( 'V-RTS/357',
92 'RTSXref',
93 'Instrument' ),
94 ( 'V-RTS/649',
95 'RTSXref',
96 'Instrument' ),
97 ( 'V-RTS/655',
98 'RTSXref',
99 'Instrument' ),
100 ( 'V-RTS/692',
101 'RTSXref',
102 'Instrument' ),
103 ( 'V-RTS/747',
104 'RTSXref',
105 'Instrument' ),
106 ( 'V-RTS/779',
107 'RTSXref',
108 'Instrument' ),
109 ( 'V-RTS/808',
110 'RTSXref',
111 'Instrument' ),
112 ( 'V-RTS/839',
113 'RTSXref',
114 'Instrument' ),
115 ( 'V-RTS/848',
116 'RTSXref',
117 'Instrument' ),
118 ( 'V-RTS/877',
119 'RTSXref',
120 'Instrument' ),
121 ( 'V-RTS/882',
122 'RTSXref',
123 'Instrument' ),
124 ( 'V-RTS/904',
125 'RTSXref',
126 'Instrument' ),
127 ( 'V-RTS/918',
128 'RTSXref',
129 'Instrument' ),
130 ( 'V-RTS/949',
131 'RTSXref',
132 'Instrument' ),
133 ( 'V-RTS/953',
134 'RTSXref',
135 'Instrument' ),
136 ( 'V-RTS/968',
137 'RTSXref',
138 'Instrument' ),
139 ( 'V-RTS/978',
140 'RTSXref',
141 'Instrument' )
142 )
143 AND test_sequence_time > to_timestamp('06-24-2020 00:00:00','MM-DD-YYYY HH24:MI:SS')
144 AND test_sequence_time <= to_timestamp('06-27-2020 00:00:00','MM-DD-YYYY HH24:MI:SS')
145 AND db_update_time > to_timestamp('06-26-2020 14:00:00','MM-DD-YYYY HH24:MI:SS')
146 AND db_update_time <= to_timestamp('06-26-2020 15:00:00','MM-DD-YYYY HH24:MI:SS')
;



Plan hash value: 1996136171

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | | | 302M(100)| | | | | | |
| 1 | PX COORDINATOR | | | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10002 | 6827M| 3351G| 302M (3)| 03:17:06 | | | Q1,02 | P->S | QC (RAND) |
| 3 | INDEX MAINTENANCE | TEST_QUOTE_TMP | | | | | | | Q1,02 | PCWP | |
| 4 | PX RECEIVE | | 6827M| 3351G| 302M (3)| 03:17:06 | | | Q1,02 | PCWP | |
| 5 | PX SEND RANGE | :TQ10001 | 6827M| 3351G| 302M (3)| 03:17:06 | | | Q1,01 | P->P | RANGE |
| 6 | LOAD AS SELECT (HYBRID TSM/HWMB) | TEST_QUOTE_TMP | | | | | | | Q1,01 | PCWP | |
| 7 | PX RECEIVE | | 6827M| 3351G| 302M (3)| 03:17:06 | | | Q1,01 | PCWP | |
| 8 | PX SEND ROUND-ROBIN | :TQ10000 | 6827M| 3351G| 302M (3)| 03:17:06 | | | | S->P | RND-ROBIN |
| 9 | NESTED LOOPS | | 6827M| 3351G| 302M (3)| 03:17:06 | | | | | |
| 10 | PARTITION RANGE ITERATOR | | 16M| 7892M| 4606K (1)| 00:03:00 | 7 | 10 | | | |
| 11 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| TEST_QUOTE_JSON_PART | 16M| 7892M| 4606K (1)| 00:03:00 | 7 | 10 | | | |
| 12 | INDEX RANGE SCAN | TEST_QUOTE_JSON_test_SEQ_UPDATE_TIME_IDX | 4073K| | 2297K (1)| 00:01:30 | 7 | 10 | | | |
| 13 | JSONTABLE EVALUATION | | | | | | | | | | |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

1 - SEL$B0868565
11 - SEL$B0868565 / SRC@SEL$1
12 - SEL$B0868565 / SRC@SEL$1

Outline Data
-------------

/*+
BEGIN_OUTLINE_DATA
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('12.2.0.1')
DB_VERSION('12.2.0.1')
OPT_PARAM('_px_adaptive_dist_method' 'off')
OPT_PARAM('_optimizer_strans_adaptive_pruning' 'false')
OPT_PARAM('_optimizer_nlj_hj_adaptive_join' 'false')
OPT_PARAM('optimizer_dynamic_sampling' 8)
ALL_ROWS
OUTLINE_LEAF(@"SEL$B0868565")
MERGE(@"SEL$D317C8F7" >"SEL$1")
OUTLINE_LEAF(@"INS$1")
OUTLINE(@"SEL$1")
OUTLINE(@"SEL$D317C8F7")
OUTLINE(@"SEL$828D59B9")
PQ_DISTRIBUTE(@"INS$1" "TGT"@"INS$1" RANDOM)
FULL(@"INS$1" "TGT"@"INS$1")
INDEX_RS_ASC(@"SEL$B0868565" "SRC"@"SEL$1" ("TEST_QUOTE_JSON_PART"."test_SEQUENCE_TIME" "TEST_QUOTE_JSON_PART"."DB_UPDATE_TIME"))
BATCH_TABLE_ACCESS_BY_ROWID(@"SEL$B0868565" "SRC"@"SEL$1")
FULL(@"SEL$B0868565" "P"@"SEL$828D59B9")
LEADING(@"SEL$B0868565" "SRC"@"SEL$1" "P"@"SEL$828D59B9")
USE_NL(@"SEL$B0868565" "P"@"SEL$828D59B9")
END_OUTLINE_DATA
*/

Note
-----
- dynamic statistics used: dynamic sampling (level=8)
- Degree of Parallelism is 16 because of table property





With index plan:

Plan hash value: 1996136171

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | | | 302M(100)| | | | | | |
| 1 | PX COORDINATOR | | | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10002 | 6827M| 3351G| 302M (3)| 03:17:06 | | | Q1,02 | P->S | QC (RAND) |
| 3 | INDEX MAINTENANCE | TEST_QUOTE_TMP | | | | | | | Q1,02 | PCWP | |
| 4 | PX RECEIVE | | 6827M| 3351G| 302M (3)| 03:17:06 | | | Q1,02 | PCWP | |
| 5 | PX SEND RANGE | :TQ10001 | 6827M| 3351G| 302M (3)| 03:17:06 | | | Q1,01 | P->P | RANGE |
| 6 | LOAD AS SELECT (HYBRID TSM/HWMB) | TEST_QUOTE_TMP | | | | | | | Q1,01 | PCWP | |
| 7 | PX RECEIVE | | 6827M| 3351G| 302M (3)| 03:17:06 | | | Q1,01 | PCWP | |
| 8 | PX SEND ROUND-ROBIN | :TQ10000 | 6827M| 3351G| 302M (3)| 03:17:06 | | | | S->P | RND-ROBIN |
| 9 | NESTED LOOPS | | 6827M| 3351G| 302M (3)| 03:17:06 | | | | | |
| 10 | PARTITION RANGE ITERATOR | | 16M| 7892M| 4606K (1)| 00:03:00 | 7 | 10 | | | |
| 11 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| TEST_QUOTE_JSON_PART | 16M| 7892M| 4606K (1)| 00:03:00 | 7 | 10 | | | |
| 12 | INDEX RANGE SCAN | TEST_QUOTE_JSON_test_SEQ_UPDATE_TIME_IDX | 4073K| | 2297K (1)| 00:01:30 | 7 | 10 | | | |
| 13 | JSONTABLE EVALUATION | | | | | | | | | | |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------- -------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

1 - SEL$B0868565
11 - SEL$B0868565 / SRC@SEL$1
12 - SEL$B0868565 / SRC@SEL$1

Outline Data
-------------

/*+
BEGIN_OUTLINE_DATA
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('12.2.0.1')
DB_VERSION('12.2.0.1')
OPT_PARAM('_px_adaptive_dist_method' 'off')
OPT_PARAM('_optimizer_strans_adaptive_pruning' 'false')
OPT_PARAM('_optimizer_nlj_hj_adaptive_join' 'false')
OPT_PARAM('optimizer_dynamic_sampling' 8)
ALL_ROWS
OUTLINE_LEAF(@"SEL$B0868565")
MERGE(@"SEL$D317C8F7" >"SEL$1")
OUTLINE_LEAF(@"INS$1")
OUTLINE(@"SEL$1")
OUTLINE(@"SEL$D317C8F7")
OUTLINE(@"SEL$828D59B9")
PQ_DISTRIBUTE(@"INS$1" "TGT"@"INS$1" RANDOM)
FULL(@"INS$1" "TGT"@"INS$1")
INDEX_RS_ASC(@"SEL$B0868565" "SRC"@"SEL$1" ("TEST_QUOTE_JSON_PART"."test_SEQUENCE_TIME" "TEST_QUOTE_JSON_PART"."DB_UPDATE_TIME"))
BATCH_TABLE_ACCESS_BY_ROWID(@"SEL$B0868565" "SRC"@"SEL$1")
FULL(@"SEL$B0868565" "P"@"SEL$828D59B9")
LEADING(@"SEL$B0868565" "SRC"@"SEL$1" "P"@"SEL$828D59B9")
USE_NL(@"SEL$B0868565" "P"@"SEL$828D59B9")
END_OUTLINE_DATA
*/

Note
-----
- dynamic statistics used: dynamic sampling (level=8)
- Degree of Parallelism is 16 because of table property




With Parallel partition range scan (same query as above with hint SELET /*+ PARALLEL(src,16) ...;)
==================================

Plan hash value: 3614240307

-------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
-------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | | | 4180K(100)| | | | | | |
| 1 | PX COORDINATOR | | | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | 1204M| 591G| 4180K (3)| 00:02:44 | | | Q1,01 | P->S | QC (RAND) |
| 3 | INDEX MAINTENANCE | TEST_QUOTE_TMP | | | | | | | Q1,01 | PCWP | |
| 4 | PX RECEIVE | | 1204M| 591G| 4180K (3)| 00:02:44 | | | Q1,01 | PCWP | |
| 5 | PX SEND RANGE | :TQ10000 | 1204M| 591G| 4180K (3)| 00:02:44 | | | Q1,00 | P->P | RANGE |
| 6 | LOAD AS SELECT (HYBRID TSM/HWMB)| TEST_QUOTE_TMP | | | | | | | Q1,00 | PCWP | |
| 7 | NESTED LOOPS | | 1204M| 591G| 4180K (3)| 00:02:44 | | | Q1,00 | PCWP | |
| 8 | PX BLOCK ITERATOR | | 2950K| 1392M| 526K (1)| 00:00:21 | 7 | 10 | Q1,00 | PCWC | |
| 9 | TABLE ACCESS STORAGE FULL | TEST_QUOTE_JSON_PART | 2950K| 1392M| 526K (1)| 00:00:21 | 7 | 10 | Q1,00 | PCWP | |
| 10 | JSONTABLE EVALUATION | | | | | | | | Q1,00 | PCWP | |
-------------------------------------------------------------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

1 - SEL$B0868565
9 - SEL$B0868565 / SRC@SEL$1

Outline Data
-------------

/*+
BEGIN_OUTLINE_DATA
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('12.2.0.1')
DB_VERSION('12.2.0.1')
OPT_PARAM('optimizer_dynamic_sampling' 8)
OPT_PARAM('_px_adaptive_dist_method' 'off')
OPT_PARAM('_optimizer_strans_adaptive_pruning' 'false')
OPT_PARAM('_optimizer_nlj_hj_adaptive_join' 'false')
ALL_ROWS
OUTLINE_LEAF(@"SEL$B0868565")
MERGE(@"SEL$D317C8F7" >"SEL$1")
OUTLINE_LEAF(@"INS$1")
OUTLINE(@"SEL$1")
OUTLINE(@"SEL$D317C8F7")
OUTLINE(@"SEL$828D59B9")
PQ_DISTRIBUTE(@"INS$1" "TGT"@"INS$1" NONE)
FULL(@"INS$1" "TGT"@"INS$1")
FULL(@"SEL$B0868565" "SRC"@"SEL$1")
FULL(@"SEL$B0868565" "P"@"SEL$828D59B9")
LEADING(@"SEL$B0868565" "SRC"@"SEL$1" "P"@"SEL$828D59B9")
USE_NL(@"SEL$B0868565" "P"@"SEL$828D59B9")
PQ_DISTRIBUTE(@"SEL$B0868565" "P"@"SEL$828D59B9" NONE BROADCAST)
END_OUTLINE_DATA
*/


Chris Saxon

Followup  

August 10, 2020 - 8:54 am UTC

A couple of things:

- Those look like EXPLAIN plans not EXECUTION plans - i.e. predictions about what's happening instead of what's actually happening. Please share the SQL monitor reports, showing the real execution details.
- Please wrap code in code tags preserving the whitespace! It's hard to read the plans without this

1. The interesting part is here:

| 9  | NESTED LOOPS                               |                                          | 6827M| 3351G|  302M (3)| 03:17:06 | | | | | |
| 10 |  PARTITION RANGE ITERATOR                  |                                          |   16M| 7892M| 4606K (1)| 00:03:00 | 7 | 10 | | | |
| 11 |   TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| TEST_QUOTE_JSON_PART                     |   16M| 7892M| 4606K (1)| 00:03:00 | 7 | 10 | | | |
| 12 |    INDEX RANGE SCAN                        | TEST_QUOTE_JSON_test_SEQ_UPDATE_TIME_IDX | 4073K|      | 2297K (1)| 00:01:30 | 7 | 10 | | | |
| 13 |  JSONTABLE EVALUATION | | | | | | | | | | |


Assuming these predictions are accurate, the index is finding 4 million rows and doing 16 million indexed lookups of the table. That's going to take a while! The database is fetching one row at a time => lots of I/O => slow.

2. Share the SQL monitor reports and we'll see if we can help you.

August 10, 2020 - 1:54 pm UTC

Reviewer: Rich from USA

Hi Chris

I have SQL monitor report in html. Can i send you as an attachment via email ? thanks.
Chris Saxon

Followup  

August 11, 2020 - 2:23 pm UTC

select dbms_sqltune.report_sql_monitor(
sql_id => '<sqlid>',
type => 'TEXT',
report_level => 'ALL')
from dual;

SQL monitor report

August 11, 2020 - 4:41 am UTC

Reviewer: Rich from USA

REPORT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL Monitoring Report

SQL Text
------------------------------
INSERT /*+ ENABLE_PARALLEL_DML PARALLEL(tgt,16) */ into test.test_quote_tmp tgt(uid_md,omega_entity_id,omega_key,omega_sequence_time,omega_source,omega_entity_data_type,omega_entity_type ,quote_time,quote_source,bid_size,bid_price,bid_yield,ask_size,ask_price,ask_yield,quote_time_estimated_from)
SELECT /*+ MONITOR */ uid_md, omega_entity_id, CASE WHEN omega_entity_data_type = 'RTSXref' THEN JSON_VALUE(json,'$.RTS_TICKER') || ', ' || JSON_VALUE(json,'$.RTS_SOURCE_ID') END, omega_sequence_time, omega_source, omega_entity_data_type, omega_entity_type, CASE WHEN JSON_VALUE(json,'$[*].QUOTE_TIME') IS NOT NULL AND JSON_VALUE(json,'$[*].QUOTE_TIME') <> 'null' THEN to_timestamp(JSON_VALUE(json,'$[*].QUOTE_TIME'),'YYYYMMDD-HH24:MI:SS.FF' ) END
AS quote_time, JSON_VALUE(json,'$.QUOTE_SOURCE' RETURNING VARCHAR2) end, CASE JSON_VALUE(json,'$[*].BID_SIZE') WHEN 'null' THEN NULL ELSE to_number(regexp_replace(JSON_VALUE(json,'$.BID_SIZE'),'\,+','') ) END, CASE JSON_VALUE(json,'$[*].BID_PRICE') WHEN 'null' THEN NULL ELSE to_number(regexp_replace(JSON_VALUE(json,'$.BID_PRICE'),'\,+','') ) END, CASE JSON_VALUE(json,'$[*].BID_YIELD') WHEN 'null' THEN NULL ELSE to_number(regexp_replace(JSON_VALUE(json,'$.BID_YIELD'),'\,+','') ) END, CASE
JSON_VALUE(json,'$[*].ASK_SIZE') WHEN 'null' THEN NULL ELSE to_number(regexp_replace(JSON_VALUE(json,'$.ASK_SIZE'),'\,+','') ) END, CASE JSON_VALUE(json,'$[*].ASK_PRICE') WHEN 'null' THEN NULL ELSE to_number(regexp_replace(JSON_VALUE(json,'$.ASK_PRICE'),'\,+','') ) END, CASE JSON_VALUE(json,'$[*].ASK_YIELD') WHEN 'null' THEN NULL ELSE to_number(regexp_replace(JSON_VALUE(json,'$.ASK_YIELD'),'\,+','') ) END, CASE WHEN JSON_VALUE(json,'$[*].QUOTE_TIME') IS NOT NULL AND
JSON_VALUE(json,'$[*].QUOTE_TIME') <> 'null' THEN NULL END AS quote_time_estimated_from FROM test.test_quote_json_part src WHERE ( CASE WHEN JSON_VALUE(json,'$[*].QUOTE_TIME') IS NOT NULL AND JSON_VALUE(json,'$[*].QUOTE_TIME') <> 'null' THEN to_timestamp(JSON_VALUE(json,'$[*].QUOTE_TIME'),'YYYYMMDD-HH24:MI:SS.FF' ) END ) >= TO_DATE('01-01-2012','MM-DD-YYYY') AND ( omega_source, omega_entity_data_type, omega_entity_type ) IN ( ( 'V-RTS/1105',
'RTSXref', 'Instrument' ), ( 'V-RTS/205', 'RTSXref', 'Instrument' ), ( 'V-RTS/257', 'RTSXref', 'Instrument' ), ( 'V-RTS/258', 'RTSXref', 'Instrument' ), ( 'V-RTS/297', 'RTSXref', 'Instrument' ), ( 'V-RTS/300', 'RTSXref', 'Instrument' ), ( 'V-RTS/309', 'RTSXref', 'Instrument' ), ( 'V-RTS/314', 'RTSXref', 'Instrument' ), ( 'V-RTS/344', 'RTSXref', 'Instrument' ), ( 'V-RTS/357', 'RTSXref', 'Instrument' ), ( 'V-RTS/649', 'RTSXref', 'Instrument' ), ( 'V-RTS/655', 'RTSXref', 'Instrument' ), (
'V-RTS/692', 'RTSXref', 'Instrument' ), ( 'V-RTS/747', 'RTSXref', 'Instrument' ), ( 'V-RTS/779', 'RTSXref', 'Instrument' ), ( 'V-RTS/808', 'RTSXref', 'Instrument' ), ( 'V-RTS/839', 'RTSXref', 'Instrument' ), ( 'V-RTS/848', 'RTSXref', 'Instrument' ), ( 'V-RTS/877', 'RTSXref', 'Instrument' ), ( 'V-RTS/882', 'RTSXref', 'Instrument' ), ( 'V-RTS/904', 'RTSXref', 'Instrument' ), ( 'V-RTS/918', 'RTSXref', 'Instrument' ), ( 'V-RTS/949', 'RTSXref', 'Instrument' ), ( 'V-RTS/953', 'RTSXref', 'Instrument'
), ( 'V-RTS/968', 'RTSXref', 'Instrument' ), ( 'V-RTS/978', 'RTSXref', 'Instrument' ) ) AND omega_sequence_time > to_timestamp('06-24-2020 00:00:00','MM-DD-YYYY HH24:MI:SS') AND omega_sequence_time <= to_timestamp('06-27-2020 00:00:00','MM-DD-YYYY HH24:MI:SS') AND db_update_time > to_timestamp('06-26-2020 14:00:00','MM-DD-YYYY HH24:MI:SS') AND db_update_time <= to_timestamp('06-26-2020 15:00:00','MM-DD-YYYY HH24:MI:SS') --and extract ( hour from db_update_time) between 14 and 15

Global Information
------------------------------
Status : DONE
Instance ID : 1
Session : DBMAINT (2108:21276)
SQL ID : 7fvaw62d2w6m8
SQL Execution ID : 16777216
Execution Started : 08/10/2020 22:45:32
First Refresh Time : 08/10/2020 22:45:32
Last Refresh Time : 08/10/2020 23:15:16
Duration : 1784s
Module/Action : SQL*Plus/-
Service : SYS$USERS
Program : sqlplus


Global Stats
=============================================================================================================================
| Elapsed | Cpu | IO | Concurrency | Cluster | Other | Buffer | Read | Read | Write | Write | Offload |
| Time(s) | Time(s) | Waits(s) | Waits(s) | Waits(s) | Waits(s) | Gets | Reqs | Bytes | Reqs | Bytes | Returned Bytes |
=============================================================================================================================
| 2500 | 1151 | 1150 | 37 | 4.49 | 158 | 12M | 5M | 40GB | 605 | 277MB | 277MB |
=============================================================================================================================

====================================================================================================================================================================================================================================================================================================================================================

SQL Plan Monitoring Details (Plan Hash Value=1996136171)
=================================================================================================================================================================================================================================================================
| Id | Operation | Name | Rows | Cost | Time | Start | Execs | Rows | Read | Read | Write | Write | Mem | Activity | Activity Detail |
| | | | (Estim) | | Active(s) | Active | | (Actual) | Reqs | Bytes | Reqs | Bytes | (Max) | (%) | (# samples) |
=================================================================================================================================================================================================================================================================
| 0 | INSERT STATEMENT | | | | 593 | +1192 | 33 | 16 | | | | | . | 0.31 | enq: HW - contention (3) |
| | | | | | | | | | | | | | | | Cpu (2) |
| | | | | | | | | | | | | | | | PX Deq: reap credit (2) |
| 1 | PX COORDINATOR | | | | 1784 | +1 | 33 | 16 | | | | | 65MB | 0.31 | Cpu (1) |
| | | | | | | | | | | | | | | | PX Deq: Execute Reply (1) |
| | | | | | | | | | | | | | | | PX Deq: reap credit (4) |
| | | | | | | | | | | | | | | | name-service call wait (1) |
| 2 | PX SEND QC (RANDOM) | :TQ10002 | 7G | 303M | 13 | +1772 | 16 | 16 | | | | | . | | |
| 3 | INDEX MAINTENANCE | test_quote_TMP | | | 36 | +1749 | 16 | 16 | 346K | 3GB | | | . | 18.13 | gc current grant 2-way (6) |
| | | | | | | | | | | | | | | | gc current multi block request (1) |
| | | | | | | | | | | | | | | | buffer busy waits (5) |
| | | | | | | | | | | | | | | | latch: cache buffers chains (1) |
| | | | | | | | | | | | | | | | Cpu (91) |
| | | | | | | | | | | | | | | | enq: FB - contention (179) |
| | | | | | | | | | | | | | | | enq: TX - contention (9) |
| | | | | | | | | | | | | | | | cell single block physical read (113) |
| 4 | PX RECEIVE | | 7G | 303M | 10 | +1750 | 16 | 15M | | | | | . | | |
| 5 | PX SEND RANGE | :TQ10001 | 7G | 303M | 9 | +1749 | 16 | 16M | | | | | . | | |
| 6 | LOAD AS SELECT (HYBRID TSM/HWMB) | test_quote_TMP | | | 1131 | +627 | 16 | 32 | 660 | 277MB | 605 | 277MB | . | 4.39 | Cpu (94) |
| | | | | | | | | | | | | | | | PGA memory operation (1) |
| | | | | | | | | | | | | | | | direct path read (3) |
| 7 | PX RECEIVE | | 7G | 303M | 1124 | +627 | 16 | 15M | | | | | . | 0.13 | Cpu (2) |
| | | | | | | | | | | | | | | | PX Deq: reap credit (1) |
| 8 | PX SEND ROUND-ROBIN | :TQ10000 | 7G | 303M | 1123 | +627 | 1 | 15M | | | | | . | 11.99 | Cpu (266) |
| | | | | | | | | | | | | | | | PX Deq: reap credit (2) |
| 9 | NESTED LOOPS | | 7G | 303M | 1123 | +627 | 1 | 15M | | | | | . | | |
| 10 | PARTITION RANGE ITERATOR | | 17M | 5M | 1123 | +627 | 1 | 15M | | | | | . | | |
| 11 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED | test_quote_JSON_PART | 17M | 5M | 1124 | +626 | 3 | 15M | 2M | 18GB | | | . | 22.43 | Cpu (125) |
| | | | | | | | | | | | | | | | cell list of blocks physical read (179) |
| | | | | | | | | | | | | | | | cell multiblock physical read (30) |
| | | | | | | | | | | | | | | | cell single block physical read (166) |
| | | | | | | | | | | | | | | | cell single block read request (1) |
| 12 | INDEX RANGE SCAN | test_quote_JSON_OMEGA_SEQ_UPDATE_TIME_IDX | 4M | 2M | 1748 | +1 | 3 | 22M | 2M | 19GB | | | . | 31.78 | Cpu (76) |
| | | | | | | | | | | | | | | | cell single block physical read (632) |
| | | | | | | | | | | | | | | | cell single block read request (2) |
| 13 | JSONTABLE EVALUATION | | | | 1123 | +627 | 15M | 15M | | | | | . | 10.52 | Cpu (235) |
=================================================================================================================================================================================================================================================================

Chris Saxon

Followup  

August 11, 2020 - 2:34 pm UTC

Please, use the < code > tags!

As suspected, indexed lookups (operations 11 & 12) account for most of the time here; there's a huge amount of single block I/O. That's going to take a while.

You may be able to get some gains by using attribute clustering so consecutive indexed values are near each other in the table. I doubt this will make the gains you're hoping for though.

The full table scan is significantly faster - exactly what problem is using lots of I/O causing you?

SQL monitor report

August 11, 2020 - 2:47 pm UTC

Reviewer: Rich from USA

Hi Chris,

Thanks for the feedback.

>> You may be able to get some gains by using attribute clustering so consecutive indexed values are near each other in the table. I doubt this will make the gains you're hoping for though.

Any idea how to implement the "atrribute clustering" ?

>> The full table scan is significantly faster - exactly what problem is using lots of I/O causing you?

1) yes to reduce I/O (336GB of IO to scan the 3 partitions) because of full table scan at partition

2) Unless i give, /*+ PARALLEL(src,16) */ in the SELECT, Oracle chooses to go with INDEX RANGE SCAN and it runs very long time.

3) Do you have any recommendation on any optimization of the code.



Chris Saxon

Followup  

August 11, 2020 - 4:53 pm UTC

Any idea how to implement the "atrribute clustering" ?

https://richardfoote.wordpress.com/2014/08/26/12-1-0-2-introduction-to-attribute-clustering-the-division-bell/

336GB of IO to scan the 3 partitions

Yeah, but so what? Other than being a big number, precisely how is this affecting the database? After all, the you said the full scan runs in 1.5 minutes instead of 30.

Unless i give, /*+ PARALLEL(src,16) */ in the SELECT, Oracle chooses to go with INDEX RANGE SCAN and it runs very long time.

If the explain plan estimates are accurate, there are ~3 billion rows in those partitions. And the index fetches "just" a few million. So this may be a case where a small percent of the data is still a huge number of rows, which makes this tricky.

Can you use SQL plan baselines to force the full table scan? Does adding the dynamic_sampling hint help?

Do you have any recommendation on any optimization of the code.

Why do you need to transform so many rows from one table to another? Is this a one-off load? If it's an ongoing process, can you process smaller batches more often?

August 12, 2020 - 6:37 am UTC

Reviewer: Rich from USA

Hi Chris,

Thanks much. I will try out the attribute clustering.

>> If the explain plan estimates are accurate, there are ~3 billion rows in those partitions. And the index fetches "just" a few million. So this may be a case where a small percent of the data is still a huge number of rows, which makes this tricky.

Total number of rows are about 500 million rows in all 3 partitions combined. Out of 500 millions, local index range scan fetches about 22 million rows.
partitions position - 7 to 9 are the ones scanned as per the plan.
Also, in the execution plan (in the step 12 INDEX RANGE SCAN,, Estimated shows 4M but actual 22M. This is the step taking longer as you mentioned.
I did gather stats on the table/index. I can try with dynamic sampling hint.


TABLE_NAME PARTITION_POSITION PARTITION_NAME HIGH_VALUE PCT_FREE SEGM NUM_ROWS
---------------------------------------- ------------------ ---------------------------------------- ----------------------------------- ---------- ---- ----------
TEST_QUOTE_JSON_PART 1 V_QUOTE_JSON_PART_P0 TIMESTAMP' 2020-06-19 00:00:00' 10 NO
TEST_QUOTE_JSON_PART 2 SYS_P36501 TIMESTAMP' 2020-06-20 00:00:00' 10 YES 138746562
TEST_QUOTE_JSON_PART 3 SYS_P36541 TIMESTAMP' 2020-06-21 00:00:00' 10 YES 99009
TEST_QUOTE_JSON_PART 4 SYS_P36519 TIMESTAMP' 2020-06-22 00:00:00' 10 YES 1663940
TEST_QUOTE_JSON_PART 5 SYS_P36504 TIMESTAMP' 2020-06-23 00:00:00' 10 YES 158098563
TEST_QUOTE_JSON_PART 6 SYS_P36513 TIMESTAMP' 2020-06-24 00:00:00' 10 YES 162893808
TEST_QUOTE_JSON_PART 7 SYS_P36516 TIMESTAMP' 2020-06-25 00:00:00' 10 YES 175050916
TEST_QUOTE_JSON_PART 8 SYS_P36507 TIMESTAMP' 2020-06-26 00:00:00' 10 YES 178352935
TEST_QUOTE_JSON_PART 9 SYS_P36510 TIMESTAMP' 2020-06-27 00:00:00' 10 YES 155734642


2) >>Can you use SQL plan baselines to force the full table scan? Does adding the dynamic_sampling hint help?

Yes i will try SQL PLAN baselines to force to FTS. I will try with dynamic sampling also.

3) >>Why do you need to transform so many rows from one table to another? Is this a one-off load? If it's an ongoing process, can you process smaller batches more often?

This is ongoing process runs every hour to move the data from JSON table to another history table (non-json table with decomposed json entries) based on the time range as you see it in the WHERE clause of the query. We are going to try processing in 15 minutes interval instead of 1 hr.


Chris Saxon

Followup  

August 12, 2020 - 7:50 am UTC

OK, so the optimizer is over estimating the rows for a full table scan, but underestimating the rows using an index. If the stats are up-to-date, the dynamic_sampling hint should help the optimizer get better estimates.