A lesson learned well!
Doug Burns, August 08, 2007 - 4:43 am UTC
I still remember the day that Tom (and Jonathan Lewis) pointed out this glaring error in one of my papers. Yet another myth, based on a partial truth, that I almost managed to propagate.
Thank goodness there are people like that around to point out such errors when they see them and I mean that just from my *own* point of view and my *own* errors!
thanks guys. that's so honest...
AMIR, August 08, 2007 - 6:01 am UTC
hi tom
thank for reply. 3 is false.
so
1, in case of uneven distribution of data how oracle behaves and balances the load on partition.
2, did we have to balance data over partition manually.
with apology an example from you needed.
regards
AMIR RIAZ
August 14, 2007 - 9:56 am UTC
Oracle looks at the allocated data on disk and splits it out. Forget partitioning for a minute, just think of the sum total collection of extents out there - they get split out and processed in parallel.
I did not understand your #2 directly above.
partition on update and delete
Amir Riaz, August 15, 2007 - 6:02 am UTC
thanks tom
let me summaried what i understand. in case of parallel query oracle uses partitions for partition elimination provided that we have histograms on these tables (does partition elimination possible without histograms). For full table scan for the table oracle while querying oracle forgets about the partition and divide the data on the disk.
if a bitmap index is on a table oracle cannot perform parallel processing(DOP is one) why?? and if a local bitmap index on the partition exists then the degree of parallelism is equal to number of partition why?
Now lets come to #2 what i want to say is if we perform parallel insert then degree of parallism is optimizer determined means dop can be greater than partitions. but for update and delete we divide data into small partitions manually for example if a update takes 30 minutes on a big table without partitions and i divide it manually into 10 partition then that update task should complete in 3 minutes provided that the data is evenly divided by hash partitions. I know you recommend droping table for delete and use insert for update but in case we cannot drop table what do you think about this option
regards
Amir Riaz
August 20, 2007 - 1:08 pm UTC
... partitions for partition elimination provided that we have histograms on these tables ....
Histograms have NOTHING to do with partition elimination - the actual partitioning information (eg: partition by range(dt) values less than 01-jan-2007 go into partition 1) plus the predicate (eg: where dt < 1-jan-2007) do that.
so, what you summarize is not accurate. Oracle figures out
a) what needs to be scanned
b) divides that into portions (granules, slices, whatever you want to call them)
c) assigns each portion to a parallel execution server
bitmap indexes do not preclude parallel operations. they do not force the degree to be the same as the number of partitions.
Now, you get into paragraph three which is a very specific case - parallel DML, bitmap indexes and an insert. Now you are doing something with a very specific implementation restriction (that makes utter sense when you think about how bitmap indexes work)
If you partition for this update/delete operation you talk of, you might absolutely KILL query response time on the other hand. You cannot partition in isolation from all of the other uses of the data.
I do not recommend dropping tables, I do not recommend inserts for updates. I don't know where you got that from - there are limited cases (eg: updating every row in a table) were a create table as select + drop + rename makes sense - but I've not said what you say I have.
AMIR RIAZ, August 23, 2007 - 8:44 am UTC
hi tom you said
"bitmap indexes do not preclude parallel operations. they do not force the degree to be the same as the number of partitions"
okay but here what oracle documentation says:
INSERT, UPDATE, MERGE, and DELETE operations on nonpartitioned tables are not parallelized if there is a bitmap index on the table. If the table is partitioned
and there is a bitmap index on the table, the degree of parallelism will be restricted to at most the number of partitions accessed.
Now these lines clearly says that DOP is at most number of partitions in case there is a bitmap index on the table. if we assume oracle will access all the partitions then degree of parallelism will be equal to number of partitions and is restricted by bitmpa index because one partition can have only on server process. I hope i am right this time
If you partition for this update/delete operation you talk of, you might absolutely KILL query response time on the other hand. You cannot partition in isolation from all of the other uses of the data.
dont you think this situation will be work in warehouse? if not why?
regard
Amir Riaz
August 23, 2007 - 1:04 pm UTC
In general - bitmap indices do not force the degree (you are looking at PDML (parallel dml) - a rarity - I'm talking PARALLEL - period, in general)
And I said above - yes in that one case there is this implementation feature, however IN GENERAL, no - it is not right to say "if you have a bitmap index, you are limited in the DOP" - that is not a correct statement, it needs a ton of qualification before it approaches accurate.
I think you need to understand how your data will be USED before you start partitioning. If, as stated, you partition for your update - you might literally KILL query response time.
Partitioning is a tool.
Use it wisely.
just confused by oracle doc
AMIR RIAZ, August 23, 2007 - 8:47 am UTC
hi tom you said
"bitmap indexes do not preclude parallel operations. they do not force the degree to be the same as the number of partitions"
okay but here what oracle documentation says:
INSERT, UPDATE, MERGE, and DELETE operations on nonpartitioned tables are not parallelized if there is a bitmap index on the table. If the table is partitioned
and there is a bitmap index on the table, the degree of parallelism will be restricted to at most the number of partitions accessed.
Now these lines clearly says that DOP is at most number of partitions in case there is a bitmap index on the table. if we assume oracle will access all the partitions then degree of parallelism will be equal to number of partitions and is restricted by bitmpa index because one partition can have only on server process. I hope i am right this time
If you partition for this update/delete operation you talk of, you might absolutely KILL query response time on the other hand. You cannot partition in isolation from all of the other uses of the data.
dont you think this situation will be work in warehouse? if not why?
thanks very much for you time
regard
Amir Riaz
just got your book
AMIR RIAZ, August 24, 2007 - 2:38 am UTC
thanks tom for quick reply let me try to summaries again
1. for parallel query without bitmap index DOP is not restricted by partitioning
2. for parallel query with a bitmap index DOP is not restricted by partitioning.
3. for PDML without bitmap index DOP is not restricted by partitioning
4a. for PDML with bitmap index DOP is not restricted by partitioning
4b. for PDML with bitmap index DOP is restricted by partitioning
also in you book oracle database architecture page 565 you said
5. if the table you wish to perform PDML on has a bitmap index in place of lob column then table must be partitioned in order to have the operation take place in parallel and degree of parallelism will be restricted to the number of partitions
now #5 is always right but either #4a is right or #4b is right
if #4a is right then #5 should be the case in which restriction take place.
however another theory says #4b is right because the way bitmap index works. when we perform DML oracle locks the bitmap of the bitmap index. in case of PDML when two server processes perform DML on the same partition we can have dead lock. so #4b is right.
thanks again for you kind help.
regards
Amir Riaz
August 24, 2007 - 3:03 pm UTC
I wrote exactly:
<quote>
PDML may only be performed on tables that have bitmap indexes or LOB columns if the table is partitioned, and then the degree of parallelism would be capped at the number of partitions. You cannot parallelize an operation within partitions in this case, as each partition would get a single parallel execution server to operate on it.
</quote>
that is very different from what you wrote as a supposed quote by me.
A better way to say your #4's would be:
in order to achieve PDML on a table with a bitmap index, you must partition it and the degree of parallelism would be limited to the number of partitions.
thousands of thanks to respectable TOM
AMIR RIAZ, August 25, 2007 - 2:07 am UTC
tom thanks for you response. Its the most comprehensive help and learning i ever have. Now i am completely clear able parallel processing
regards
Amir Riaz
how parallel processes is allocated
abhishek, September 12, 2013 - 11:24 am UTC
hi tom,
i still have some doubt about the point 3 asked by AMIR at start of this thread
Below is my doubt about uneven distribution of data in subpartition and parallel execution:
We have 4 subpartitions p1, p2 ,p3 ,p4. The no. of rows are 10000, 100,100,10.(u can consider order of data according to you :))
Case 1:
Now when we go for parallel execution of the query say 4 parallel processes(q1-4) are allocated to the 4 partitions. Partition p2,p3,p4 will be read quickly but p1 needs time. Even if q2-4 are done working. They will have to wait for process q1 to complete i.e. 3 paralell process are sitting idle. Once all are complete the result will be sent back to Query coordinator(QC).
Thus the time taken to read all 4 partition is depends on the maximum size of the subpartition.
Case 2:
As p1 is bigger, q1 is allocated for it. once it is done reading, process q2-4 are allocated and once all read are done , QC will combine the result and take appropriate action
I am confused which one of the above 2 cases are implemented by oracle. I am more in favour of case 1. But In that case unbalanced subpartitions will result in parallel processes to get blocked. It can be worse with more number of subpartition with unbalanced data distribution.
please clarify :)
September 23, 2013 - 6:00 pm UTC
we do not strictly partition by partitions, we would not assign a partition to each parallel execution server - we would do it be ranges of data based on volume.