Question and Answer

Chris Saxon

Thanks for the question, Rick.

Asked: July 07, 2008 - 3:54 pm UTC

Last updated: June 28, 2016 - 1:11 am UTC

Version: 10.2.0

Viewed 10K+ times

You Asked

Tom,

I have this table with around one million records. The table has the potential to grow to 2 million, 3 million or even more...

I am planning on partitioning this table using hash partitioning on the most queried column. A few questions:

1- I have read somewhere that due to the hashing algorithm that Oracle uses, it is better to use power of 2 number of partitions. Is that correct?

2- Is there a recommendation upon how many partitions are optimal for a given number of records? What would you take into consideration when deciding upon the number of partitions? Is it true that the more the merrier?

3- Any other recommendations?

Thank you,
Rick

and Tom said...

1) it is paramount, yes - 2, 4, 8, 16, 32, 64, ... and NO other numbers in between.

Else the data will be skewed across partitions with most of the data in the "middle" partitions and little data on the end partitions.

Also, a hash key should be fairly unique - lots and lots and lots of distinct values - otherwise you won't achieve an even distribution of data.

2) it is not record driven. 1,000,000 records might be 10MB, 100MB, or 1TB - it is about volume.

What is the biggest segment you want to have?
What are you trying to accomplish with partitioning - what is your reason for using it - that will drive your partition sizes and your partitioning scheme.


3)
understand WHY you are partitioning. then and only then APPLY it

when you APPLY it, make sure you understand the ramifications of doing so and consider each and every index independently of every other index. eg: say this is an employee table and you hash partition it into 16 partitions by empno.

Say you have an index on last_name

Say you just locally partition that index on last_name


when you query: select * from t where last_name=:x

you will do 16 index range scans - you will do 16 times the work!

you would have wanted to either

a) not partition that index
b) partition that index by range - A-K go into this partition, L-Z go into this one
c) hash partition it (if you always use equals)
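
For instance, here is a minimal sketch of option (b) - the table and index names (EMP_DEMO and so on) are hypothetical, not from any real schema:

-- hypothetical employee table, hash partitioned into 16 partitions by EMPNO
create table emp_demo
( empno     int,
  last_name varchar2(30)
)
partition by hash(empno) partitions 16;

-- option (b): a global range-partitioned index - A-K in one partition, L-Z in the other
create index emp_demo_lname_idx on emp_demo(last_name)
global partition by range(last_name)
( partition p_a_k values less than ('L'),
  partition p_l_z values less than (MAXVALUE) );

Option (c) would instead be GLOBAL PARTITION BY HASH(last_name) PARTITIONS 16 - useful only if the predicates against LAST_NAME are always equality predicates.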


If you have access to my book Expert Oracle Database Architecture, I discuss all of this in some detail - why you want to know WHY you are partitioning and how to approach it and what not to do


Comments

Excellent answer

Rick, July 08, 2008 - 3:03 pm UTC

Thanks Tom!

I am actually partitioning on a unique key that is the most highly queried (WHERE partitioning_key_used = X), so I guess I should be fine on that side.

I will investigate the volumes further and use that to decide on the number of partitions. I will also order your book right after submitting this review :).

Thanks again!

Index local partition

Albert Nelson A., July 09, 2008 - 12:13 am UTC

Hi Tom,

You have mentioned the increase in the work done by a locally partitioned index and offered not partitioning the index as an option.

<quote>
you will do 16 index range scans - you will do 16 times the work!

you would have wanted to either

a) not partition that index
b) partition that index by range - A-K go into this partition, L-Z go into this one
c) hash partition it (if you always use equals)

</quote>

We have a heavily inserted table (hundreds of records every second, round the clock) which is range partitioned for the purpose of purging old data. The table has a few non-partitioned indexes for fast query response. They are not locally partitioned for the very reason you have mentioned.

But during the drop of old partitions, while the non-partitioned indexes are rebuilt, there is contention on those indexes (enq: TX - index contention). As the table is heavily inserted into, this causes "issues" in the application.

Hash partitioning the index might not be appropriate since our queries will not use equals. Do we have any alternative? Or is it a compromise between query response time and partition drop time?

Regards,

Albert Nelson A.
Tom Kyte
July 09, 2008 - 10:20 am UTC

... You have mentioned the increase in the work done by a locally partitioned index and offered not partitioning the index as an option. ...

just to be clear, what I was pointing out is "if you don't think about the physics behind partitioning - whether you use local or global indexes - you will be sorry and negatively impact performance"

when you drop the old partitions, you are maintaining the indexes - so in effect, it is like you are issuing a delete against the indexes
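
As a sketch (hypothetical table T and partition P_OLD - not your actual names), the drop that maintains the non-partitioned indexes in place looks like:

-- maintain the global (non-partitioned) indexes as part of the DDL
alter table t drop partition p_old update global indexes;

Without the UPDATE GLOBAL INDEXES clause the non-partitioned indexes would be marked UNUSABLE by the drop and would have to be rebuilt afterwards.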

Can you clarify a little more why, in your case, this would cause the contention you describe?

The data being deleted won't be read or written by anyone else.
It would be done in a single transaction (so I cannot imagine it would be leading to ITL waits)


if you globally partition the existing indexes by range (eg: option (b)) you will create many index segments - spreading any contention out over many structures

but if the contention is on the leaf blocks themselves, that won't help - since the same range of leaves would still be there.


Powers of two and Hash partitioning

Ben, July 09, 2008 - 7:58 am UTC

Tom, can you please say a bit more about the algorithm used to determine which hash partition to use for a specific key? I'm having trouble understanding why a power of two would be important. We're considering implementing hash partitioning on a completely unique column, and it seems to me as if it should be easy to keep that evenly hash partitioned without having to resort to a specific number of partitions.

Tom Kyte
July 09, 2008 - 11:09 am UTC

it is just the way it works - powers of two. We have to hash a field to a value; unless that set of values is a power of two, our algorithm does not distribute the values evenly.

ops$tkyte%ORA10GR2> CREATE TABLE t1 ( x   int ) PARTITION BY hash(x) ( partition part1 , partition part2 , partition part3 , partition part4 );

Table created.

ops$tkyte%ORA10GR2> CREATE TABLE t2 ( x   int ) PARTITION BY hash(x) ( partition part1 , partition part2 , partition part3 , partition part4, partition part5 );

Table created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into t1 select rownum from all_objects;

49961 rows created.

ops$tkyte%ORA10GR2> insert into t2 select rownum from all_objects;

49961 rows created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select h, count(*), sum(count(*)) over (),
  2            substr( rpad('*',100,'*'), 1, 100*ratio_to_report(count(*)) over ()) hist
  3    from (select 1 h from t1 partition (part1)
  4          union all select 2 h from t1 partition(part2)
  5          union all select 3 h from t1 partition(part3)
  6          union all select 4 h from t1 partition(part4)
  7                  )
  8   group by h
  9   order by h
 10  /

         H   COUNT(*) SUM(COUNT(*))OVER() HIST
---------- ---------- ------------------- --------------------------------------------------
         1      12435               49961 ************************
         2      12430               49961 ************************
         3      12719               49961 *************************
         4      12377               49961 ************************

ops$tkyte%ORA10GR2> select h, count(*), sum(count(*)) over (),
  2            substr( rpad('*',100,'*'), 1, 100*ratio_to_report(count(*)) over ()) hist
  3    from (select 1 h from t2 partition (part1)
  4          union all select 2 h from t2 partition(part2)
  5          union all select 3 h from t2 partition(part3)
  6          union all select 4 h from t2 partition(part4)
  7          union all select 5 h from t2 partition(part5)
  8                  )
  9   group by h
 10   order by h
 11  /

         H   COUNT(*) SUM(COUNT(*))OVER() HIST
---------- ---------- ------------------- --------------------------------------------------
         1       6243               49961 ************
         2      12430               49961 ************************
         3      12719               49961 *************************
         4      12377               49961 ************************
         5       6192               49961 ************

That's a very inflexible algorithm...

Ben, July 09, 2008 - 2:16 pm UTC

Tom-

I realize I'm about to shoot the messenger here, so let me apologize in advance. Why, why, why would Oracle push a restriction on the number of hash partitions as restrictive as powers of two? It seems much simpler and more flexible to hash the key to an integer and use modulo arithmetic to determine which hash partition to use for the data. Can you explain why that algorithm does not work?
Tom Kyte
July 09, 2008 - 3:04 pm UTC

no, i cannot. I can only tell you how it does work - and that of course changing it would be disastrous.

Now, given that hash partitioning is good for one thing and one thing only - you use it when you are in the situation "we'd like smaller segments, but have no other idea how to partition, there is nothing to intelligently partition on - nothing to range or list on"...

the fact that you use 16, 32, 64, 128, 256 ... segments doesn't really bother me. You'd pick what you wanted to 'stripe' across and that would be it.

Index contention

Albert Nelson A., July 10, 2008 - 10:48 pm UTC

Hi Tom,

In your followup on July 9, 2008 - 10am US/Eastern, you have asked

<quote>
when you drop the old partitions, you are maintaining the indexes - so in effect, it is like you are issuing a delete against the indexes

Can you clarify a little more why this would in your case cause the contention you describe.

</quote>

Sorry, I am unable to clarify further. It is observed in OEM as the wait event experienced by lots of active sessions trying to insert into the table whose partition was being dropped.

Thanks and regards,

Albert Nelson A.
Tom Kyte
July 11, 2008 - 8:13 am UTC

are these unique indexes?

Unique indexes?

Albert Nelson A., July 14, 2008 - 6:25 am UTC

Hi Tom,

No, they are not unique indexes.

Regards,

Albert Nelson A
Tom Kyte
July 15, 2008 - 9:29 am UTC

they are blocking each other somehow. Do you have the entire schema needed to reproduce this?

Schema

Albert Nelson A, July 16, 2008 - 2:07 am UTC

Hi Tom,

I did not understand what you meant by "entire schema", but here is the DDL for the table and its indexes:


CREATE TABLE "CITYNET2"."ESC_VEHICLE_IVD_LOG"
( "VEHICLE_ID" VARCHAR2(10 BYTE) NOT NULL ENABLE,
"JOB_NO" VARCHAR2(12 BYTE),
"HEADER_INFO" RAW(12),
"DETAIL" VARCHAR2(60 BYTE),
"EVENT_CODE" VARCHAR2(5 BYTE) NOT NULL ENABLE,
"LOG_DT" DATE NOT NULL ENABLE,
"DRIVER_ID" VARCHAR2(9 BYTE),
"EVENT_STATUS" NUMBER(1,0),
"EVENT_DT" DATE,
CONSTRAINT "FK_ESC_VEH_IVD_LOG__EVNT" FOREIGN KEY ("EVENT_CODE")
REFERENCES "CITYNET2"."ESC_EVENT_CODES" ("EVENT_CODE") DISABLE,
CONSTRAINT "FK_ESC_VEH_IVD_LOG__VEH" FOREIGN KEY ("VEHICLE_ID")
REFERENCES "CITYNET2"."ESC_VEHICLE" ("VEHICLE_ID") ENABLE
) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOLOGGING
STORAGE(
BUFFER_POOL DEFAULT)
TABLESPACE "CN2TABPART1"
PARTITION BY RANGE ("LOG_DT")
(PARTITION "IVD_LOG_20080713" VALUES LESS THAN (TO_DATE(' 2008-07-13 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'))
PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOLOGGING
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "CN2TABPART1" NOCOMPRESS ,
PARTITION "IVD_LOG_20080720" VALUES LESS THAN (TO_DATE(' 2008-07-20 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'))
PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOLOGGING
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "CN2TABPART1" NOCOMPRESS ,
/* Other partitions deleted for brevity */
PARTITION "IVD_LOG_20080907" VALUES LESS THAN (TO_DATE(' 2008-09-07 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'))
PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOLOGGING
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "CN2TABPART1" NOCOMPRESS ,
PARTITION "IVD_LOG_OTHERS" VALUES LESS THAN (MAXVALUE)
PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOLOGGING
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "CN2TABPART1" NOCOMPRESS ) ;


CREATE INDEX "CITYNET2"."IDX_ESC_VEH_IVD_LOG__EVN_COD" ON "CITYNET2"."ESC_VEHICLE_IVD_LOG" ("EVENT_CODE")
PCTFREE 10 INITRANS 2 MAXTRANS 255 NOLOGGING COMPUTE STATISTICS
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "CN2IND3" ;

CREATE INDEX "CITYNET2"."IDX_ESC_VEH_IVD_LOG__JOB_NO" ON "CITYNET2"."ESC_VEHICLE_IVD_LOG" ("JOB_NO")
PCTFREE 10 INITRANS 2 MAXTRANS 255 NOLOGGING COMPUTE STATISTICS
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "CN2IND3" ;

CREATE INDEX "CITYNET2"."IDX_ESC_VEH_IVD_LOG__LOG_DT" ON "CITYNET2"."ESC_VEHICLE_IVD_LOG" ("LOG_DT")
PCTFREE 10 INITRANS 2 MAXTRANS 255 LOGGING
STORAGE(
BUFFER_POOL DEFAULT)
TABLESPACE "CN2INDPART1" LOCAL
(PARTITION "IVD_LOG_20080713"
PCTFREE 10 INITRANS 2 MAXTRANS 255 NOLOGGING
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "CN2INDPART1" ,
PARTITION "IVD_LOG_20080720"
PCTFREE 10 INITRANS 2 MAXTRANS 255 NOLOGGING
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "CN2INDPART1" ,
/* Other partitions deleted for brevity */
PARTITION "IVD_LOG_20080907"
PCTFREE 10 INITRANS 2 MAXTRANS 255 NOLOGGING
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "CN2INDPART1" ,
PARTITION "IVD_LOG_OTHERS"
PCTFREE 10 INITRANS 2 MAXTRANS 255 NOLOGGING
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "CN2INDPART1" ) ;

CREATE INDEX "CITYNET2"."IDX_ESC_VEH_IVD_LOG__VEH_ID" ON "CITYNET2"."ESC_VEHICLE_IVD_LOG" ("VEHICLE_ID")
PCTFREE 10 INITRANS 2 MAXTRANS 255 NOLOGGING COMPUTE STATISTICS
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "CN2IND3" ;

CREATE INDEX "CITYNET2"."IDX_VEH_IVD_LOG__DRV_ID$DT" ON "CITYNET2"."ESC_VEHICLE_IVD_LOG" ("DRIVER_ID", "LOG_DT")
PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "CN2IND2" ;

Regards,

Albert Nelson A.

Tom Kyte
July 16, 2008 - 10:07 am UTC

I would like a schema (set of objects) that can be used to reproduce the issue - a schema that we can test with - hence no storage, no tablespaces - but complete (eg: you have foreign keys, to what?)

something I can actually run.

Understood. Will provide one.

Albert Nelson A, July 17, 2008 - 2:02 am UTC


Saurabh, September 14, 2008 - 2:55 am UTC

Hi Tom,

I was trying to figure out the hashing algorithm used by Oracle to allocate rows to a particular hash partition. Then I stumbled upon the ORA_HASH function.

I observed the following behavior.

Consider these four variables: part_col, no_of_part, bucket_size, prev_2_power.

part_col is the candidate partitioning column.

no_of_part is the number of partitions we are considering.

bucket_size is derived from no_of_part as follows:

if no_of_part is a power of 2 (2, 4, 8, 16, ...) then
bucket_size = no_of_part - 1

else
bucket_size = "next higher power of 2" - 1

e.g., for 24 partitions, bucket_size = 32 - 1 = 31.

prev_2_power is no_of_part if it is a power of 2, else the previous power of 2; e.g., for 24 partitions, prev_2_power = 16.

THEN

Partition Number =

case
  when ora_hash(part_col, bucket_size) + 1 > no_of_part
  then ora_hash(part_col, bucket_size) + 1 - prev_2_power
  else ora_hash(part_col, bucket_size) + 1 /* +1 since buckets start with 0 but partitions start with 1 */
end;




I tested the above algorithm for a couple of cases and it worked correctly. It also explains why there is skew if we don't use a power-of-2 number of partitions: Oracle buckets the data into the next power-of-2 number of buckets anyway, and then merges buckets using the above logic to produce the desired number of partitions.

The above expression can be used along with a GROUP BY to evaluate partitioning skew before actually creating the partitioned table, as sketched below.
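
For example, here is a sketch of such a skew check - MY_TABLE and PART_COL are hypothetical names, and I have plugged in 24 candidate partitions (so bucket_size = 31 and prev_2_power = 16):

select part_no, count(*) row_count
  from (select case
                 when ora_hash(part_col, 31) + 1 > 24
                 then ora_hash(part_col, 31) + 1 - 16
                 else ora_hash(part_col, 31) + 1
               end part_no
          from my_table)
 group by part_no
 order by part_no;

A heavily uneven ROW_COUNT across PART_NO values would indicate skew before any partitioned table is ever created.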

I am not sure whether this is the actual logic or whether it is documented anywhere, but it seems to hold good.

PS: Pardon the re-post. I had posted this on the "hash partitioning" thread but got no response there. It seems that thread is inactive.

Regards,
Saurabh
Tom Kyte
September 16, 2008 - 9:47 pm UTC

all threads are inactive.
all threads are active.

I didn't see anything to "respond" to?

There isn't a question here... It is all statements?

number of partition

Myke T., November 26, 2009 - 5:42 am UTC

Hi Tom,
Excuse my English. I have a table of 3,000,000 records with a composite range/hash partitioning scheme (2/16) that I must join with tables of thousands of records partitioned the same way.
How many partitions can I add for the best query and data-loading performance?

Myke
Tom Kyte
November 28, 2009 - 1:05 pm UTC

I have a car, it holds 10 gallons of gas. I get 50 miles to the gallon.

What is the best road to drive on to get to California.



We have both asked for basically the same thing - a magic answer that no one could provide. I gave you no information that would be useful to answer my question. You'd have to start with a long list of questions - probably starting with "is your car currently in North America" and going down from there.


It would depend on how your data arrives to be loaded.

It would depend on what queries you execute.

It would depend.


Think about the physics of the data here, close your eyes and envision how the data could be, would be, physically organized on disk using partitioning scheme "X" (you come up with as many X's as you can). Now, ask yourself - would that help me load? Could I just create a new partition and direct path load into it and voila - we are done (probably not if you use hash partitioning). And if I partition to make loading easy - have I made it so that queries can benefit from the partitioning scheme?

Eg: I have sales data, we need to keep seven years of sales data online. If I partition by month, it makes it really easy to purge old data (drop partition) and load new data (add partition, direct path it).

However, most of my queries are by customer_id or by location or by product. The fact that I partitioned by sales_date (making loading easier) does NOTHING for my queries.
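
As a minimal sketch of that monthly scheme (SALES_DEMO and the specific dates are hypothetical):

-- hypothetical sales table, one range partition per month
create table sales_demo
( sale_date   date,
  customer_id int,
  amount      number
)
partition by range (sale_date)
( partition p_2009_10 values less than (to_date('2009-11-01','yyyy-mm-dd')),
  partition p_2009_11 values less than (to_date('2009-12-01','yyyy-mm-dd')) );

-- purging old data is a fast dictionary operation, not a big DELETE
alter table sales_demo drop partition p_2009_10;

-- loading new data: add a partition, then direct path load into it
-- (eg: insert /*+ append */ ... selecting the new month's data)
alter table sales_demo add partition p_2009_12
      values less than (to_date('2010-01-01','yyyy-mm-dd'));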



Think about the physics of the data, think about what you are trying most to optimize, and apply the tool. That is about all I can say given your level of detail here.

Partitioning on VPD tables

Siva Kumar, July 27, 2012 - 8:31 am UTC

Hi Tom,

We have two audit tables (VPD-enabled) which store all OLTP transaction information, and there are views on top of them to fetch point-in-time data with a large number of columns and fairly complex business logic. These views are used by batch jobs to pull report information, and the batch jobs spend most of their time (80% of it) querying the database.

Table1 has 10 million records (it stores which table was changed, on what date, and some additional info).

Table2 has 50 million records (it refers to Table1 and contains the list of columns that were changed for that transaction, storing the old and new value for each column).

I wanted to implement hash partitioning for these tables based on client_key value to decrease the query response times.

I have following questions.

1) When we set the context for a session, that session can see data only for that context. In a way, Oracle filters the data by looking only at that context variable (what I mean is, though my table has millions of records, it only looks at the records for that context - maybe I'm wrong). Will partitioning VPD-enabled tables really improve performance?

2) By what logic does Oracle place records in different partitions for hash partitioning? Let's assume I have 8 partitions and 8 clients, each client with a different CLIENT_KEY. If I implement hash partitioning on CLIENT_KEY, does Oracle keep all records that belong to the same CLIENT_KEY in the same partition?

3) How many partitions will be better in this case?

4) Sometimes we face "ORA-06910 Sort area size too small" errors from the application, even though WORKAREA_SIZE_POLICY is set to "auto". Ideally Oracle should ignore the SORT_AREA_SIZE parameter value if WORKAREA_SIZE_POLICY is set to auto; in that case, what is the maximum sort area size that can be allocated to a session?

5) If a query involves a lot of sorting, it may simply take more time to execute (by doing the sort operation on disk and sending the result back if the sort area is not big enough). Given that WORKAREA_SIZE_POLICY is set to auto, how come we are receiving the ORA-06910 error? Is there any other property limiting the sort area size, or is something wrong in the DB queries or the application? Could you please suggest some areas to focus on to avoid that error? Because of it, our jobs are failing without completing.

Thanks in Advance

Thanks,
Siva.

Tom Kyte
July 30, 2012 - 11:43 am UTC

I wanted to implement hash partitioning for these tables based on client_key
value to decrease the query response times.


are you full scanning these tables?

if not, stop - you won't be able to achieve your goal, not at all.

so, before I even read anything further - answer this: are you full scanning either or both of these tables.


Partitioning on VPD tables

A reader, August 06, 2012 - 7:00 am UTC

Hi Tom,

Thanks for your reply and sorry for my late response.

There are no full table scans on either of the tables; the explain plan shows "TABLE ACCESS BY LOCAL INDEX ROWID" for both.

I just wanted to know why partitioning won't help if there are no full table scans.

Thanks,
Siva.
Tom Kyte
August 17, 2012 - 12:07 pm UTC

if you are using an index, you will typically do two to four I/Os against the index to get a rowid, and then a table access by index rowid to get the row.


It doesn't matter if there are 100,000 rows in the segment (table/partition) or 1,000,000,000 rows - it would be the same.


think about it - how could partitioning speed up an index retrieval? We have to scan the index, find a rowid, get a row. If you partition an index incorrectly, you might end up making performance much WORSE (you might have to do up to N index range scans instead of 1, where N = number of partitions). How would it make it faster???

Partitioning on VPD tables

A reader, August 10, 2012 - 11:46 am UTC

Hi Tom,

Could you please help me out here, we are really suffering from the query response times.

I have tried applying partitioning to those tables by creating replicas of them, and I have seen a significant improvement (from about 4 minutes 30 seconds to 2 minutes 30 seconds) in query response time on the new partitioned tables.

Well, this is how I implemented the partitioning: I created 8 partitions for Table1 (including its indexes) and 16 partitions for Table2 (including its indexes), with hash partitioning on CLIENT_KEY. That way, all the information for one client is stored in one and only one partition.

So, if we run a query on those tables with VPD applied on CLIENT_KEY, then what I am assuming is that the table/index scan happens only within the single partition where that CLIENT_KEY is located. So in this way we minimized the number of records to be searched/scanned/filtered (since only one partition will be involved at any point in time). Is this the way partitioning works, or am I wrong here?

If I'm wrong, how did the query response time come down from about 4 minutes to 2:30?

Note: I measured the query time after flushing the shared pool and buffer cache.

Thanks,
Siva.
Tom Kyte
August 17, 2012 - 1:46 pm UTC

what I am assuming is that the table/index scan happens only within the single partition where that CLIENT_KEY is located


correct


So in this way we minimized the number of records to be searched/scanned/filtered


false. Why do you think that? If a client had 100 records before you moved the data, they still have 100 records afterwards - the number of rows processed by your query will not really have changed. There could be exceptions to that, but in general you are processing the same amount of data.

it might be on *fewer* blocks than before, so you might be experiencing slightly better caching.


remember also that when you moved to a partitioned table you rebuilt *everything*. It might have nothing to do with the partitioning scheme at all.


do you have the old data? Compare TKPROF reports of your queries to see what is materially different.



never flush the buffer cache and shared pool to benchmark. that is more artificial than anything. You never have those empty in real life.

Large number of hash partitions

Luke, May 15, 2013 - 8:24 pm UTC

Hi Tom - I have a non-partitioned table with 250 million rows. The table has a not-null 'DATASET_ID' column. On average about 75,000 rows will share the same DATASET_ID value, so we have roughly 3,333 unique DATASET_IDs in the table.

Our application frequently needs to select all the rows for a given DATASET_ID, so I'm thinking of hash partitioning this table on DATASET_ID. All the queries we execute against this table include 'DATASET_ID=x' in the where clause, so every select would hit only a single partition.

Is there a reason I shouldn't create a large number of partitions if I know all my queries will be based on DATASET_ID? For example, if I create 1024 partitions then there should be about 3 DATASET_IDs per partition so I would expect a full-partition scan would be a pretty efficient way to retrieve the rows given a DATASET_ID. Or why not go with 4096 partitions and have about 1 DATASET_ID per partition?

Thanks...


Luke

Tom Kyte
May 16, 2013 - 9:25 am UTC

well, depending on the values of dataset_id, you could end up with 1 partition having ALL values and the rest with 0 (you'd have to be really unlucky ;))

for example, if the values are contiguous starting at one, you would get:

ops$tkyte%ORA11GR2> CREATE TABLE t
  2  (
  3    x   int
  4  )
  5  tablespace ptest
  6  PARTITION BY hash(x) partitions 4096
  7  /

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> insert into t select level from dual connect by level <= 3333;

3333 rows created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> variable x refcursor
ops$tkyte%ORA11GR2> declare
  2          l_sql clob := q'|
  3          select cnt, count(*) cnt_with_that_cnt
  4            from (
  5          select part#,
  6                 count(*) cnt
  7            from (|';
  8  
  9          l_str long := null;
 10  begin
 11          for x in (select partition_name, PARTITION_POSITION
 12                      from user_tab_partitions
 13                             where table_name = 'T'
 14                     order by PARTITION_POSITION )
 15          loop
 16                  l_str := l_str || ' select ' || x.PARTITION_POSITION || ' part#
 17                                        from t partition(' || x.partition_name || ') ';
 18                  dbms_lob.writeAppend( l_sql, length(l_str), l_str );
 19                  l_str := ' union all ';
 20          end loop;
 21          l_str := ' ) group by part# ) group by cnt order by cnt';
 22          dbms_lob.writeAppend( l_sql, length(l_str), l_str );
 23  
 24  
 25          open :x for l_sql;
 26  end;
 27  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> print x

       CNT CNT_WITH_THAT_CNT
---------- -----------------
         1              1498
         2               572
         3               173
         4                33
         5                 8



so, about half of them would get their own partition, 1/3 would share a partition, 8 of them would share a partition with four others.


if we go with random numbers, we might see:

ops$tkyte%ORA11GR2> insert into t select floor(dbms_random.value(1,1000000)) from dual connect by level <= 3333;


ops$tkyte%ORA11GR2> print x

       CNT CNT_WITH_THAT_CNT
---------- -----------------
         1              1507
         2               578
         3               164
         4                42
         5                 2





which looks about the same - but you could get a degenerate set of values such that they all hash or many hash to the same partition.

You would want to play with your data to see the spread you would achieve. Note that ptest was defined as:

create tablespace ptest datafile size 10m autoextend on next 1m uniform size 64k;


otherwise, we tend to start with 8mb initial extents in an auto-allocate tablespace and that table is huge even when empty! I used 64k for testing...


Why not list partition?

S, May 16, 2013 - 2:09 pm UTC

3K is not a huge number. Wouldn't it be better to list partition this table and have all rows with the same DATASET_ID in one segment? Given the name, each one might even be loaded as a dataset at a time, as sketched below ...
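
A minimal sketch of that idea (hypothetical names; it assumes the set of DATASET_IDs is known when the partitions are created):

-- hypothetical: one list partition per known DATASET_ID,
-- plus a DEFAULT partition to catch anything unexpected
create table dataset_demo
( dataset_id number not null,
  payload    varchar2(100)
)
partition by list (dataset_id)
( partition p_1     values (1),
  partition p_2     values (2),
  partition p_other values (DEFAULT) );

New DATASET_IDs would land in the DEFAULT partition unless a partition is added (or split) for them - which is the maintenance trade-off behind the follow-up below.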

Tom Kyte
May 21, 2013 - 1:54 pm UTC

I would presume that the dataset_id is something that comes and goes over time. they probably do not have a fixed set of dataset ids.

List vs Hash Partitioning

Bhushan Yavagal, June 27, 2016 - 7:02 pm UTC

Greetings Tom,

I have a question on your latest follow up.

I have a table of about 16 million rows and growing, with pretty large columns but no unique keys. Retention is one year and it is batch loaded once a month, sometimes twice a month.

There is a column TRANS_ID (a sequence-generated number) that is the same for all records loaded into the table during a single batch insert transaction.

At this point the volume of records loaded per batch is pretty skewed: sometimes it's huge (2-3 million records), sometimes as little as 20k.

Is list partitioning or hash partitioning better for this table, using the TRANS_ID key?
Chris Saxon
June 28, 2016 - 1:11 am UTC

Without knowing all of your requirements, it's hard to say. But you have given one requirement, namely "retention is one year". That would perhaps suggest a range/interval partitioning scheme on an appropriate date column, as sketched below.
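
As a minimal sketch of the interval approach (the table and column names are hypothetical) - Oracle creates a new monthly partition automatically as data arrives:

-- hypothetical: monthly interval partitioning on a load date
create table trans_log_demo
( trans_id number,
  load_dt  date not null
)
partition by range (load_dt)
interval (numtoyminterval(1,'MONTH'))
( partition p_initial values less than (date '2016-01-01') );

Dropping partitions older than one year would then satisfy the retention requirement with a simple ALTER TABLE ... DROP PARTITION.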
