Skip to Main Content

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question.

Asked: February 28, 2017 - 8:46 am UTC

Last updated: March 07, 2017 - 4:06 am UTC

Version: 11.2

Viewed 1000+ times

You Asked

Hi Connor/Chris,

Please have a look at below table structure
This is load table and there is no uniqueness related to any column, hence we cannot go for Primary key for this.

tb_card_details
---------------

card_id  start_date  end_date  address  email   created_on card_type sent_flag
-------  ----------  --------  -------  -----   ---------- --------- ---------
1000001  26/02/2017  26/03/2017  UK   dummy@abc.com 26/02/2017 Retail  1 or 0
..                        
so on
1000001  23/02/2017  26/04/2017  US   asdex@abc.com 27/02/2017 Corp  1 or 0
..
upto 100k records
1000002  26/03/2017  26/05/2017  UK   12344@abc.com 28/02/2017 Retail      1 or 0
..
upto 150k records
1000003  23/04/2017  26/04/2017  US   15456@abc.com 10/03/2017 Corp        1 or 0
..
so on


Data from this table can be fetch using WHERE columns - created_on, card_type & sent_flag.
I thought of partition this table using RANGE partition using (created_on as partition key).

For every created_on there are currently 100k records and in future it may grow.

Could you please suggest any better approach/suggestion like PARTITION technique, LOCAL index that considering WHERE condition based on created_on, card_type & sent_flag.



---------Additional Info------------------
tb_card_details
---------------

card_id start_date end_date address region created_on card_type sent_flag sequence_number
------- ---------- -------- ------- ----- ---------- --------- --------- ----------------
1000001 26/02/2017 26/03/2017 UK APAC 26/02/2017 Retail 1 or 0 2017022600000001
..
so on
1000001 26/02/2017 26/03/2017 UK APAC 26/02/2017 Retail 1 or 0 2017022699999999
..
upto 100k records
1000001 26/02/2017 26/03/2017 UK EMEA 27/02/2017 Retail 1 or 0 2017022700000001
..
1000001 26/02/2017 26/03/2017 UK EMEA 27/02/2017 Retail 1 or 0 2017022799999999


Whenever any select hit this table WHERE will be having column created_on, card_type, sent_flag only.
From the result after executing WHERE created_on = ? AND card_type = ? AND sent_flag = ? (e.g 300k rows),
Java will be multi-threading the resultsset by fetching rows lets say 100 rows per thread (using sequence_number)..

and Connor said...

Certainly partitioning on created_on will be of us. In terms of more granular partitioning, a lot depends on the distribution/num distinct values of card_type and/or sent_flag.

You might see benefit in (for example) a hash or list subpartitioning scheme on those columns, because once you get up to retrieving hundreds of thousands of rows, indexes often end up being more harm than good.

Partitioning rarely harms large scale scans - but one thing to keep in mind is that once you get into large numbers of partitions, you can end up with large optimization times, so thats the granularity trade off you should consider.

Is this answer out of date? If it is, please let us know via a Comment

More to Explore

VLDB

If you are new to partitioning, check out Connor McDonald's introduction series here.

VLDB

Documentation set on VLDB and Partitioning.