Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, AMIR.

Asked: August 06, 2007 - 3:30 am UTC

Last updated: September 23, 2013 - 6:00 pm UTC

Version: 10.0.1

Viewed 1000+ times

You Asked

hi tom

thanks for this wonderful forum

I have a question regarding oracle parallel processing and partition. before oracle 9.2 parallel processing is not possible without partitioning and DOP was equal to number of partitions. however this behaviour has been changed in oracle 10g. Now parallel processing is possible without partitioning. so.

1. does oracle 10g still have this restriction that number of parition is equal to DOP

2. if i perform update or delete using direct load with parallel processing how will it work. because direct load works with extend by pass the cache. does update or delete also creates an extend like insert.

3. if 1 is true. then we have a problem. lets suppose that we have partition p1,p2,p3 containing records r1,r2,r3. the records are unevenly distributed means. in p1 partition we have 10 million records, on p2 partition i have 1 million records and on p3 partition i have only 10 records. then time to finish the job will be

max{t1,t2,t3}= t1 where t1,t2,t3 are the time to finish full scan p1,p2,p3

now if t1 is very very large then processes working on partition p2 and p3 will be idle and time of parallel processing will be nearly equal to serial processing. so parallel processing failed. One of my friend is of the view point that when this type of situation occurs oracle processes p2 and p3 after finishing their work start working on partition p1.

while it may be possible that optimizer allocate two or three processes to big partition p1 and one process to small partition to even up the work load.

Which one oracle follows?

Thanks alot for bearing us


and Tom said...

this:

...
have a question regarding oracle parallel processing and partition. before oracle 9.2 parallel processing is not possible without partitioning and DOP was equal to number of partitions. however this behaviour has been changed in oracle 10g. Now parallel processing is possible without partitioning. so.
..

is not TRUE, not true at all. Parallel processing PREDATES the introduction of partitioning by many releases. Parallel processing - 7.1.6, Partitioning - 8.0 (so, 7.1, 7.2, 7.3 - all had parallel without partitioning!!)



Now, there was parallel DML that was limited (updates, deletes) by partitioning. but in general - 99.9% of the useful stuff was never impacted by partitions (or lack thereof).


1) never was true.
2) how do you do an update or delete with a LOAD?? there are no "direct path" updates or deletes.
3) not true, never was true.


Rating

  (8 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

A lesson learned well!

Doug Burns, August 08, 2007 - 4:43 am UTC

I still remember the day that Tom (and Jonathan Lewis) pointed out this glaring error in one of my papers. Yet another myth, based on a partial truth, that I almost managed to propagate.

Thank goodness there are people like that around to point out such errors when they see them and I mean that just from my *own* point of view and my *own* errors!

thanks guys. that's so honest...

AMIR, August 08, 2007 - 6:01 am UTC

hi tom

thank for reply. 3 is false.
so
1, in case of uneven distribution of data how oracle behaves and balances the load on partition.

2, did we have to balance data over partition manually.

with apology an example from you needed.

regards
AMIR RIAZ
Tom Kyte
August 14, 2007 - 9:56 am UTC

Oracle looks at the allocated data on disk and splits it out. Forget partitioning for a minute, just think of the sum total collection of extents out there - they get split out and processed in parallel.


I did not understand your #2 directly above.

partition on update and delete

Amir Riaz, August 15, 2007 - 6:02 am UTC

thanks tom

let me summaried what i understand. in case of parallel query oracle uses partitions for partition elimination provided that we have histograms on these tables (does partition elimination possible without histograms). For full table scan for the table oracle while querying oracle forgets about the partition and divide the data on the disk.

if a bitmap index is on a table oracle cannot perform parallel processing(DOP is one) why?? and if a local bitmap index on the partition exists then the degree of parallelism is equal to number of partition why?

Now lets come to #2 what i want to say is if we perform parallel insert then degree of parallism is optimizer determined means dop can be greater than partitions. but for update and delete we divide data into small partitions manually for example if a update takes 30 minutes on a big table without partitions and i divide it manually into 10 partition then that update task should complete in 3 minutes provided that the data is evenly divided by hash partitions. I know you recommend droping table for delete and use insert for update but in case we cannot drop table what do you think about this option

regards
Amir Riaz





Tom Kyte
August 20, 2007 - 1:08 pm UTC

... partitions for partition elimination provided that we have histograms on these tables ....

Histograms have NOTHING to do with partition elimination - the actual partitioning information (eg: partition by range(dt) values less than 01-jan-2007 go into partition 1) plus the predicate (eg: where dt < 1-jan-2007) do that.


so, what you summarize is not accurate. Oracle figures out

a) what needs to be scanned
b) divides that into portions (granules, slices, whatever you want to call them)
c) assigns each portion to a parallel execution server


bitmap indexes do not preclude parallel operations. they do not force the degree to be the same as the number of partitions.


Now, you get into paragraph three which is a very specific case - parallel DML, bitmap indexes and an insert. Now you are doing something with a very specific implementation restriction (that makes utter sense when you think about how bitmap indexes work)


If you partition for this update/delete operation you talk of, you might absolutely KILL query response time on the other hand. You cannot partition in isolation from all of the other uses of the data.


I do not recommend dropping tables, I do not recommend inserts for updates. I don't know where you got that from - there are limited cases (eg: updating every row in a table) were a create table as select + drop + rename makes sense - but I've not said what you say I have.

AMIR RIAZ, August 23, 2007 - 8:44 am UTC

hi tom you said

"bitmap indexes do not preclude parallel operations. they do not force the degree to be the same as the number of partitions"

okay but here what oracle documentation says:

INSERT, UPDATE, MERGE, and DELETE operations on nonpartitioned tables are not parallelized if there is a bitmap index on the table. If the table is partitioned
and there is a bitmap index on the table, the degree of parallelism will be restricted to at most the number of partitions accessed.

Now these lines clearly says that DOP is at most number of partitions in case there is a bitmap index on the table. if we assume oracle will access all the partitions then degree of parallelism will be equal to number of partitions and is restricted by bitmpa index because one partition can have only on server process. I hope i am right this time

If you partition for this update/delete operation you talk of, you might absolutely KILL query response time on the other hand. You cannot partition in isolation from all of the other uses of the data.

dont you think this situation will be work in warehouse? if not why?

regard
Amir Riaz
Tom Kyte
August 23, 2007 - 1:04 pm UTC

In general - bitmap indices do not force the degree (you are looking at PDML (parallel dml) - a rarity - I'm talking PARALLEL - period, in general)


And I said above - yes in that one case there is this implementation feature, however IN GENERAL, no - it is not right to say "if you have a bitmap index, you are limited in the DOP" - that is not a correct statement, it needs a ton of qualification before it approaches accurate.



I think you need to understand how your data will be USED before you start partitioning. If, as stated, you partition for your update - you might literally KILL query response time.

Partitioning is a tool.
Use it wisely.

just confused by oracle doc

AMIR RIAZ, August 23, 2007 - 8:47 am UTC

hi tom you said

"bitmap indexes do not preclude parallel operations. they do not force the degree to be the same as the number of partitions"

okay but here what oracle documentation says:

INSERT, UPDATE, MERGE, and DELETE operations on nonpartitioned tables are not parallelized if there is a bitmap index on the table. If the table is partitioned
and there is a bitmap index on the table, the degree of parallelism will be restricted to at most the number of partitions accessed.

Now these lines clearly says that DOP is at most number of partitions in case there is a bitmap index on the table. if we assume oracle will access all the partitions then degree of parallelism will be equal to number of partitions and is restricted by bitmpa index because one partition can have only on server process. I hope i am right this time

If you partition for this update/delete operation you talk of, you might absolutely KILL query response time on the other hand. You cannot partition in isolation from all of the other uses of the data.

dont you think this situation will be work in warehouse? if not why?

thanks very much for you time
regard
Amir Riaz

just got your book

AMIR RIAZ, August 24, 2007 - 2:38 am UTC

thanks tom for quick reply let me try to summaries again

1. for parallel query without bitmap index DOP is not restricted by partitioning

2. for parallel query with a bitmap index DOP is not restricted by partitioning.

3. for PDML without bitmap index DOP is not restricted by partitioning

4a. for PDML with bitmap index DOP is not restricted by partitioning

4b. for PDML with bitmap index DOP is restricted by partitioning


also in you book oracle database architecture page 565 you said

5. if the table you wish to perform PDML on has a bitmap index in place of lob column then table must be partitioned in order to have the operation take place in parallel and degree of parallelism will be restricted to the number of partitions

now #5 is always right but either #4a is right or #4b is right

if #4a is right then #5 should be the case in which restriction take place.

however another theory says #4b is right because the way bitmap index works. when we perform DML oracle locks the bitmap of the bitmap index. in case of PDML when two server processes perform DML on the same partition we can have dead lock. so #4b is right.

thanks again for you kind help.

regards
Amir Riaz
Tom Kyte
August 24, 2007 - 3:03 pm UTC


I wrote exactly:

<quote>
PDML may only be performed on tables that have bitmap indexes or LOB columns if the table is partitioned, and then the degree of parallelism would be capped at the number of partitions. You cannot parallelize an operation within partitions in this case, as each partition would get a single parallel execution server to operate on it.
</quote>

that is very different from what you wrote as a supposed quote by me.

A better way to say your #4's would be:

in order to achieve PDML on a table with a bitmap index, you must partition it and the degree of parallelism would be limited to the number of partitions.


thousands of thanks to respectable TOM

AMIR RIAZ, August 25, 2007 - 2:07 am UTC

tom thanks for you response. Its the most comprehensive help and learning i ever have. Now i am completely clear able parallel processing

regards
Amir Riaz

how parallel processes is allocated

abhishek, September 12, 2013 - 11:24 am UTC

hi tom,

i still have some doubt about the point 3 asked by AMIR at start of this thread

Below is my doubt about uneven distribution of data in subpartition and parallel execution:


We have 4 subpartitions p1, p2 ,p3 ,p4. The no. of rows are 10000, 100,100,10.(u can consider order of data according to you :))

Case 1:

Now when we go for parallel execution of the query say 4 parallel processes(q1-4) are allocated to the 4 partitions. Partition p2,p3,p4 will be read quickly but p1 needs time. Even if q2-4 are done working. They will have to wait for process q1 to complete i.e. 3 paralell process are sitting idle. Once all are complete the result will be sent back to Query coordinator(QC).
Thus the time taken to read all 4 partition is depends on the maximum size of the subpartition.

Case 2:

As p1 is bigger, q1 is allocated for it. once it is done reading, process q2-4 are allocated and once all read are done , QC will combine the result and take appropriate action


I am confused which one of the above 2 cases are implemented by oracle. I am more in favour of case 1. But In that case unbalanced subpartitions will result in parallel processes to get blocked. It can be worse with more number of subpartition with unbalanced data distribution.


please clarify :)
Tom Kyte
September 23, 2013 - 6:00 pm UTC

we do not strictly partition by partitions, we would not assign a partition to each parallel execution server - we would do it be ranges of data based on volume.


More to Explore

VLDB

If you are new to partitioning, check out Connor McDonald's introduction series here.

VLDB

Documentation set on VLDB and Partitioning.