Hi Connor/Chris,
Please have a look at below table structure
This is load table and there is no uniqueness related to any column, hence we cannot go for Primary key for this.
tb_card_details
---------------
card_id start_date end_date address email created_on card_type sent_flag
------- ---------- -------- ------- ----- ---------- --------- ---------
1000001 26/02/2017 26/03/2017 UK dummy@abc.com 26/02/2017 Retail 1 or 0
..
so on
1000001 23/02/2017 26/04/2017 US asdex@abc.com 27/02/2017 Corp 1 or 0
..
upto 100k records
1000002 26/03/2017 26/05/2017 UK 12344@abc.com 28/02/2017 Retail 1 or 0
..
upto 150k records
1000003 23/04/2017 26/04/2017 US 15456@abc.com 10/03/2017 Corp 1 or 0
..
so on
Data from this table can be fetch using WHERE columns - created_on, card_type & sent_flag.
I thought of partition this table using RANGE partition using (created_on as partition key).
For every created_on there are currently 100k records and in future it may grow.
Could you please suggest any better approach/suggestion like PARTITION technique, LOCAL index that considering WHERE condition based on created_on, card_type & sent_flag.
---------Additional Info------------------
tb_card_details
---------------
card_id start_date end_date address region created_on card_type sent_flag sequence_number
------- ---------- -------- ------- ----- ---------- --------- --------- ----------------
1000001 26/02/2017 26/03/2017 UK APAC 26/02/2017 Retail 1 or 0 2017022600000001
..
so on
1000001 26/02/2017 26/03/2017 UK APAC 26/02/2017 Retail 1 or 0 2017022699999999
..
upto 100k records
1000001 26/02/2017 26/03/2017 UK EMEA 27/02/2017 Retail 1 or 0 2017022700000001
..
1000001 26/02/2017 26/03/2017 UK EMEA 27/02/2017 Retail 1 or 0 2017022799999999
Whenever any select hit this table WHERE will be having column created_on, card_type, sent_flag only.
From the result after executing WHERE created_on = ? AND card_type = ? AND sent_flag = ? (e.g 300k rows),
Java will be multi-threading the resultsset by fetching rows lets say 100 rows per thread (using sequence_number)..
Certainly partitioning on created_on will be of us. In terms of more granular partitioning, a lot depends on the distribution/num distinct values of card_type and/or sent_flag.
You might see benefit in (for example) a hash or list subpartitioning scheme on those columns, because once you get up to retrieving hundreds of thousands of rows, indexes often end up being more harm than good.
Partitioning rarely harms large scale scans - but one thing to keep in mind is that once you get into large numbers of partitions, you can end up with large optimization times, so thats the granularity trade off you should consider.