Parallel Query

More doubts

Suresh, July 21, 2001 - 11:38 pm UTC

Tom,
I understand from your answer that Parallel query would work only if it did full table scan. Now, If I were to issue a join select statement against a table with 10 million rows(index present, parallel query enabled) and another small table (1000 rows, index present, 'driving' table in the query), then Oracle would not perform parallel query against the first table as it is going to do a index scan in the first table..Is this right or am I missing some thing here?

Kalpesh, April 22, 2002 - 4:02 pm UTC

Hi Tom,
I am executing following query on two databases.
Two databases are same means they have same tables,same data, same indexes.Same init.ora parameters.
Those are copy of eachother.

SELECT NVL(b.trade_price,0), NVL(b.previous_day_change,0), NVL(b.daily_high,0), NVL(b.daily_low,0), NVL(b.w_52_high,0), NVL(b.w_52_low,0), NVL(b.volume,0)
INTO :trade_price, :previous_day_change, :daily_high,
:daily_low, :w_52_high, :w_52_low, :volume
from tkmaster.tbsymbols a,tkdaily.tbtick b
where a.symbol_id=b.symbol_id
and concat(a.symbol_name_reuters,'.N') = 'GE.N'
and a.symbol_type_id = 1

First one is producing followin explain plan:

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=104 Card=73 Bytes=81
76)

1 0 HASH JOIN* (Cost=104 Card=73 Bytes=8176) :Q429300
2

2 1 TABLE ACCESS* (FULL) OF 'TBSYMBOLS' (Cost=56 Card=73 Byt :Q429300
es=2920) 0

3 1 TABLE ACCESS* (FULL) OF 'TBTICK' (Cost=46 Card=22545 Byt :Q429300
es=1623240) 1

1 PARALLEL_TO_SERIAL SELECT /*+ ORDERED NO_EXPAND USE_HASH(A2) SW
AP_JOIN_INPUTS(A2) */ A1.C1,A1.C2,A1.C3,A1.C
4,A1.C5,A1.C6,A1.C7 FROM :Q4293001 A1,:Q4293
000 A2 WHERE A2.C0=A1.C0

2 PARALLEL_FROM_SERIAL
3 PARALLEL_TO_PARALLEL SELECT /*+ NO_EXPAND ROWID(A1) */ A1."SYMBOL
_ID" C0,A1."VOLUME" C1,A1."TRADE_PRICE" C2,A
1."PREVIOUS_DAY_CHANGE" C3,A1."DAILY_HIGH" C
4,A1."DAILY_LOW" C5,A1."W_52_HIGH" C6,A1."W_
52_LOW" C7 FROM "TKDAILY"."TBTICK" A1 WHERE
ROWID BETWEEN :B1 AND :B2

Second one is producing following explain plan:

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=139 Card=73 Bytes=72
27)

1 0 NESTED LOOPS (Cost=139 Card=73 Bytes=7227)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'TBSYMBOLS' (Cost=66 Ca
rd=73 Bytes=2920)

3 2 INDEX (RANGE SCAN) OF 'IDX_SYMBOL_TYPE_ID' (NON-UNIQUE
) (Cost=15 Card=73)

4 1 TABLE ACCESS (BY INDEX ROWID) OF 'TBTICK' (Cost=1 Card=5
7740 Bytes=3406660)

5 4 INDEX (UNIQUE SCAN) OF 'PK_TICK_SYMID' (UNIQUE)

It is taking more time on First one then the second one.
We have same indexes on both the datapools for both the tables:
Table_name Index_name
TBTICK PK_TICK_SYMID (SYMBOL_ID)
TBSYMBOLS IDX_EXCHANGE_ID (EXCHANGE_ID)
TBSYMBOLS IDX_SYMBOL_TYPE_ID(SYMBOL_TYPE_ID)
TBSYMBOLS IDX_SYMNM(SYMBOL_NAME)
TBSYMBOLS IDX_SYMNM_REUTER(SYMBOL_NAME_REUTERS)
TBSYMBOLS PK_SYM_SYMID(SYMBOL_ID)

Tbtick is having one index, tbsymbols is having 5 different indexes on different columns.

What could be the reason behind this ?
Why it is executing parallely on first one and why it is using index on second one ?
Do I need to rebuild the indexes on first one ?
How can I eliminate this ?
Please let me know your suggestions.

Thanks

April 22, 2002 - 9:03 pm UTC

I doubt the init.ora's are the same OR the statistics in one differ significantly from the other.

verify the init.ora's (diff a show parameter from each database) and check the stats.

Parallel Query

Kalpesh, April 23, 2002 - 3:01 pm UTC

Hi Tom,
As per your suggestion, I have taken the difference of "show parameters" (init.ora) for both the Databases and found no difference in any of the parameters.

Tom, you have said statistics might be different ?
Can you explain that and How can I find and fix that ?
Thanks

April 23, 2002 - 3:05 pm UTC

select table_name, num_rows from user_tables; compare the results.

do the same for indexes, look in user_tab_histograms as well.

Parallel Query

Kalpesh, April 25, 2002 - 2:14 pm UTC

Hi Tom,
I have taken the statistics from user_tables, user_indexes and user_tab_histograms. I am seeing differences between both the Databases.
Now what is the next step to fix the problem.
Please let me know.

Thank You

April 25, 2002 - 6:36 pm UTC

make the statistics the same.

analyze the tables in the "bad" database.

Another problem.

Ganesh Raja, April 24, 2003 - 11:41 pm UTC

Hi,

I am experiencing Problems with Parallel Query. The query is running in parallel if No Condition is given, but if there is any condition then the Query is not running in parallel.

There is no Change in the Plan between these two queries except that the Query is not running in parallel.

Since the Inner Query will be Submiited for EUL in discoverer i am Simulating what Discoverer will write as a Query.

Thanks a Ton !!!!

Regards,
Ganesh R

===========================================================
Query
============================================================
Select *
From
(
Select
Revenue.Business_Unit,
Revenue.Business_Division,
Revenue.Product_Line,
Revenue.Revenue_Mu,
Revenue.Product_Sub_Family,
Revenue.Part_Number,
Revenue.Part_Description,
Revenue.Order_Type,
Al.Meaning Return_Reason,
Revenue.Order_Number,
Revenue.Order_Line,
Revenue.GL_Date Revenue_Date,
Cogs.GL_Date Cogs_Date,
Decode(Revenue.Currency,
'USD',1,GDR.Conversion_Rate
) Conversion_Rate,
Revenue.USP
*
Decode(Revenue.Currency,
'USD',1,GDR.Conversion_Rate
) USP,
Decode(OOLA.Line_Category_Code,
'RETURN',
-Ordered_Quantity,
Ordered_Quantity) Quantity,
Revenue.Revenue
*
Decode(Revenue.Currency,
'USD',1,GDR.Conversion_Rate
) Revenue,
COGS.Cogs COGS
From
OE_Order_Headers_All OOHA,
OE_Order_Lines_All OOLA,
AR_Lookups AL,
GL_Daily_Rates GDR,
(
Select Business_Unit,
Business_Division,
Product_Line,
Revenue_MU,
Product_Sub_Family,
Part_Number,
Part_Description,
Order_Type,
Order_Number,
Order_Line,
GL_Date,
Currency,
Sum(USP) USP,
Sum(Revenue) Revenue
From
(
Select XAMT.D_BU Business_Unit,
XAMT.E_Dv Business_Division,
XAMT.F_PL Product_Line,
MC.Segment4 Revenue_Mu,
MC.Segment5 Product_Sub_Family,
MSIB.Segment1 Part_Number,
MSIB.Description Part_Description,
RCTA.Trx_Number Invoice_Number,
RCTLGDA.GL_Date GL_Date,
RCTA.Interface_Header_Attribute1 Order_Number,
RCTLA.Interface_Line_Attribute6 Order_Line,
RCTA.Interface_Header_Attribute2 Order_Type,
RCTA.Invoice_Currency_Code Currency,
RCTLA.Unit_Selling_Price USP,
RCTLGDA.Amount Revenue
From Mtl_System_Items_B MSIB,
Mtl_Item_Categories MIC,
MTl_Categories MC,
Ra_Customer_Trx_All RCTA,
Ra_Customer_Trx_Lines_All RCTLA,
RA_Cust_Trx_Line_GL_Dist_All RCTLGDA,
GL_Code_Combinations GCC,
XXATORCL.XXAT_AR_MU_TAB XAMT
Where 1=1
And MSIB.Inventory_Item_Id = MIC.Inventory_Item_Id
And MSIB.Organization_Id = MIC.Organization_Id
And Mic.Category_Id = Mc.Category_Id
And Mic.Category_Set_Id = 73 /* Product Hierarchy */
And RCTA.Customer_Trx_Id = RCTLA.Customer_Trx_Id
And RCTLA.Inventory_Item_Id = MSIB.Inventory_Item_Id
And RCTLA.Warehouse_Id = MSIB.Organization_Id
And RCTLA.Customer_Trx_Line_Id = RCTLGDA.Customer_Trx_Line_Id
And RCTLGDA.Code_Combination_Id = GCC.Code_Combination_Id
And XAMT.MU = MC.Segment4
And GCC.Segment2 = '400001'
And RCTLGDA.Amount != 0
And RCTLGDA.Account_Set_Flag = 'N'
And Interface_Line_Context = 'ORDER ENTRY'
) Revenue
Group By
Business_Unit,
Business_Division,
Product_Line,
Product_Sub_Family,
Revenue_Mu,
Part_Number,
Part_Description,
Order_Type,
Order_Number,
Order_Line,
GL_Date,
Currency
) Revenue,
(
Select Order_Line,
GL_Date,
Sum(Nvl(Trans_Cogs,Base_Cogs))
* Decode(Cogs.Currency_Code,
'USD',1,GDR.Conversion_Rate) Cogs
From
(
Select /*+ Parallel(MTA,5) Index(MMT MTL_MATERIAL_TRANSACTIONS_U1) */
MMT.Source_Line_Id Order_Line,
MTA.Transaction_Date GL_Date,
Nvl(MTA.Currency_Code,
GSOB.Currency_Code) Currency_Code,
MTA.Transaction_Value Trans_Cogs,
MTA.Base_Transaction_Value Base_Cogs
From MTL_Material_Transactions MMT,
MTL_Transaction_Accounts MTA,
GL_Code_Combinations GCC,
HR_Organization_Information HOI,
GL_Sets_Of_Books GSOB
Where MMT.Transaction_Id = MTA.Transaction_Id
And MTA.Reference_Account = GCC.Code_Combination_Id
And MTA.Organization_Id = HOI.Organization_Id
And HOI.Org_Information1 = GSOB.Set_Of_Books_Id
And HOI.ORG_Information_Context = 'Accounting Information'
And GCC.Segment2 = '700001'
And MTA.Accounting_Line_Type = '2'
) COGS,
GL_Daily_Rates GDR
Where COGS.Currency_Code = GDR.From_Currency (+)
And Trunc(COGS.GL_Date) = Trunc(GDR.Conversion_Date(+))
And GDR.To_Currency(+) = 'USD'
And GDR.Conversion_Type(+) = 'Corporate'
Group By Order_Line,
Gl_Date,
Cogs.Currency_Code,
GDR.Conversion_Rate
) COGS
Where OOHA.Header_Id = OOLA.Header_Id
And OOLA.Line_Id = Revenue.Order_Line
And AL.Lookup_Code(+) = OOLA.RETURN_REASON_CODE
And AL.Lookup_Type(+) = 'CREDIT_MEMO_REASON'
And Revenue.Order_Line = COGS.Order_Line(+)
And To_Char(Revenue.GL_Date,
'YYYYMM') = To_Char(COGS.GL_Date(+),'YYYYMM')
And To_Char(OOHA.Order_Number) = Revenue.Order_Number
And Revenue.Currency = GDR.From_Currency (+)
And Trunc(Revenue.GL_Date) = Trunc(GDR.Conversion_Date(+))
And GDR.To_Currency(+) = 'USD'
And GDR.Conversion_Type(+) = 'Corporate'
)
--Where Product_Line Like 'F47515%'
--And Revenue_Date Between '01-MAR-2003' and Last_Day('01-MAR-2003')

/*
Uncomment the Abve Two Lines and the query is Going for Serial Access. With Comment or Without Comment the The Material_Transaction_Accounts Table (MTA) which i have hinted to run in parallel is Going for a Full Table Scan
*/

============================================================

April 25, 2003 - 7:46 am UTC

I don't really know what to do with a N page query like that. The sql parser in my head overloaded.

I doubt the plans are the same -- look closer -- you'll find an index path in there somewhere.

In any case, the query itself is not useful. one would need the plans AND this would not be the right place to do that -- when I'm taking questions, this would be a candidate for a new question.

Vipin, July 07, 2003 - 11:07 am UTC

Hi Tom,

could you please clear this basic doubt for me. Will I be benefited from the Oracle Parallel query option unless and until my table is stripped across multiple data files which are located on multiple disks, so that mutiple controllers can do the IO for me in parallel.

July 07, 2003 - 11:17 am UTC

that will increase the benfit, yes.

is it mandatory -- no, not necessarily, will it be benefical, absolutely.

Vipin, July 07, 2003 - 11:35 am UTC

Could you please tell in which scenario the parallel query option will help when the table is not stripped that is the table is residing in the same tablespace.

July 07, 2003 - 11:44 am UTC

are you a multi-user system?
do you have some degree of scalability even when more then one person hits your database at the same time?

so why not for parallel?

select sum( some_number ) from big_table;

say that is done in parallel 2 on a single disk machine. Might it be true that while process 1 is reading disk , process 2 is summing some numbers and vice versa.

Unless you run everything in single user mode, you are in fact doing "parallel" stuff right now.

Consider an index build

i read
i sort
i write

might it be that two processes doing it get it done faster, even though it is the same disk. You do not need to have >1 disk to increase throughput -- unless your process is 100% disk based.

Full scans read ahead -- so every IO reads many blocks. It takes time to process those blocks, during which you are not reading (and so someone else might be able to!)

A reader, July 07, 2003 - 12:02 pm UTC

Tom,

is it just benchmarking that would drive me to use Parallel query option or is there any situation in which by seeing the problem i get a spark in head "Oh here I can use PARALLEL query"...

July 07, 2003 - 12:35 pm UTC

administrative things pop into my head first and foremost.... (create index, other "big" things)

then there is DIY parallelism (do it yourself) - make your CODE parallelizable - that big ETL (extract/transform/load) you do, make sure your T is parallelizable but writing code that can run concurrently on bits of the data

then there is parallel query, sort of what I would use last... it is of most use when you have few users -- and how often does that happen anymore ;)

A reader, July 07, 2003 - 12:39 pm UTC

Inspite of Noparallel Hint Parallel Query is executed.

A reader, August 12, 2003 - 8:00 am UTC

Tom,
I've a index on a table with degree 10. Our parallel query was dying with error ORA-12805. We decided not use the parallel query but we cannot set the degree to 1. Hence we used hint NOPARALLEL. Inspite of of hint my query is executed in parallel. Below are 3 trials of the query with changes in the hint.

Can you suggest any reason why my parallel query is used inspite of hint or without hint ?

12:49:58 OPS$QDW@QDWP:tpcds1> set autotrace traceonly explain
12:50:06 OPS$QDW@QDWP:tpcds1> SELECT /*+ NOPARALLEL INDEX(pstn_order_c_2 pstn_order_c_a_pk) */ data_date
12:50:17 2 FROM pstn_order_c_2
12:50:21 3 WHERE order_number = '123456'
12:50:25 4 AND sub_zone = 'ZA';

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=85 Card=1 Bytes=15)
1 0 INDEX* (FAST FULL SCAN) OF 'PSTN_ORDER_C_N_PK' (UNIQUE) (C :Q108810
ost=85 Card=1 Bytes=15) 7000

1 PARALLEL_TO_SERIAL SELECT /*+ Q1088107000 */ A1."DATA_DATE" FRO
M :I."PSTN_ORDER_C_2"."PSTN_ORDER_C_N_PK" P
X_GRANULE(0, BLOCK_RANGE, DYNAMIC) A1 WHERE
A1."ORDER_NUMBER"='123456' AND A1."SUB_ZONE"
='ZA'

2:51:29 OPS$QDW@QDWP:tpcds1> SELECT /*+ NOPARALLEL */ data_date
12:51:32 2 FROM pstn_order_c_2
12:51:36 3 WHERE order_number = '123456'
12:51:41 4 AND sub_zone = 'ZA';

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=85 Card=1 Bytes=15)
1 0 INDEX* (FAST FULL SCAN) OF 'PSTN_ORDER_C_N_PK' (UNIQUE) (C :Q108811
ost=85 Card=1 Bytes=15) 0000

1 PARALLEL_TO_SERIAL SELECT /*+ Q1088110000 */ A1."DATA_DATE" FRO
M :I."PSTN_ORDER_C_2"."PSTN_ORDER_C_N_PK" P
X_GRANULE(0, BLOCK_RANGE, DYNAMIC) A1 WHERE
A1."ORDER_NUMBER"='123456' AND A1."SUB_ZONE"
='ZA'

12:52:29 OPS$QDW@QDWP:tpcds1> 12:52:29 OPS$QDW@QDWP:tpcds1> SELECT data_date
12:52:35 2 FROM pstn_order_c_2
12:52:39 3 WHERE order_number = '123456'
12:52:43 4 AND sub_zone = 'ZA';

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=85 Card=1 Bytes=15)
1 0 INDEX* (FAST FULL SCAN) OF 'PSTN_ORDER_C_N_PK' (UNIQUE) (C :Q108811
ost=85 Card=1 Bytes=15) 3000

1 PARALLEL_TO_SERIAL SELECT /*+ Q1088113000 */ A1."DATA_DATE" FRO
M :I."PSTN_ORDER_C_2"."PSTN_ORDER_C_N_PK" P
X_GRANULE(0, BLOCK_RANGE, DYNAMIC) A1 WHERE
A1."ORDER_NUMBER"='123456' AND A1."SUB_ZONE"
='ZA'

August 12, 2003 - 8:58 am UTC

and you have filed a tar with support right?

NOPARALLEL(pstn_order_c_2)

Parallel Query

wor, November 12, 2003 - 10:53 am UTC

I have read several bits and peices about Parallel Query and bind variables.
We have Oracle on Win2000, 2 CPU's and around 80GB hard drive.
1) Parallel Query is good if you have huge tables that requires full table scan. Is this correct ?
2) When you use Bind variables in the SQL the oracle does Full Table Scan beacase the oracle does not know what value the bind variable may have. Is this correct ?

If both the answer are correct than
What is the harm in writing all the SQl statements with parallel hint when using bind variables ( even for tables with 100,000 rows ) ?
Is there only a win-win situation in this or is there any drawback ?

November 12, 2003 - 11:26 am UTC

1) that is one of the uses, yes. you need to meet other criteria (eg: you want to allow a single user to use tons of resources for example) but yes.

2) no, that is patently "not correct"

cpu utilization

A reader, January 23, 2004 - 3:39 pm UTC

We have a Dual Xeon 2.8 Dell Windows 2000 Server running 9.2.0.4.0.

SQL> show parameter cpu_count

cpu_count                            integer     4

Here are my parallel query init.ora settings:

parallel_adaptive_multi_user         boolean     TRUE
parallel_automatic_tuning            boolean     TRUE
parallel_instance_group              string
parallel_max_servers                 integer     100
parallel_min_percent                 integer     0
parallel_min_servers                 integer     4
parallel_server                      boolean     FALSE
parallel_execution_message_size      integer     4096
parallel_server_instances            integer     1
parallel_threads_per_cpu             integer     2

On the server desktop, I open task manager, and a command line sqlplus session.

SQL> select count(*) from user_objects;

  COUNT(*)
----------
       518

Elapsed: 00:00:00.00

SQL> select count(*) from user_objects,user_objects,user_objects;

  COUNT(*)
----------
 138991832

Elapsed: 00:04:59.06

SQL> select /*+ PARALLEL (user_objects 4) */ count(*) from user_objects,user_objects,user_objects;

  COUNT(*)
----------
 138991832

Elapsed: 00:00:18.08

So the parallel query runs much more quickly than the non-parallel query.

Now - What is disturbing me, is that when I execute these CPU intensive operations, that my oracle.exe process shoots to 25%, and stays there, for both the non-parallel *and* parallel executions. When I switch tabs from "Processes" to "Performance", only one of the CPU's is showing work accomplished in the line chart.

If I kick off an export of the schema (as an example of an operating system command that is pretty CPU intense) I see all the CPU's showing work accomplished. So it can't be a problem with the OS or hardware. I hope it's a problem with how I'm doing what I'm doing or one of my settings.


Why am I not seeing other CPU's getting involved in the processing? In fact, shouldn't I be seeing all the CPU's being utilized when I kick off a query whether it is in parallel mode or not? In other words, I would think the oracle.exe process should show processor utilization across all 4 processors. Am I wrong?? :L

January 23, 2004 - 6:34 pm UTC

perhaps -- your queries are in fact not cpu intensive but rather IO bound.

I'd guess that cartesian join ate a bit of temp, don't you?

You'd have to do something cpu intensive and I think you did something io bound here. In any case, 18 seconds -- not bad eh?

Quad Xeon in above post

A reader, January 23, 2004 - 3:41 pm UTC

Sorry.

Multiple CPUs and utilization

Mark A. Williams, January 23, 2004 - 9:34 pm UTC

> Why am I not seeing other CPU's getting involved in the processing?

Windows is responsible for spreading the load across CPUs, not Oracle, so that may be a consideration.

- Mark

Windows is spreading the utilization across

A reader, January 25, 2004 - 3:28 pm UTC

multiple CPU's for other processes.

But not for the oracle.exe process.

I think the issue is the threads. Of course Oracle is a multi-threaded application, but is it properly informing the O/S that it wants to utilize all 4 CPUs? It is obviously mult-threading within one CPU (If my terminology is correct), but I don't necessarily blame the O/S when I see other programs able to utilize more than one CPU when executed.

The process is indeed CPU intensive. If I run it on a single CPU Windows 2000 server, the CPU goes to 100% until completion, so although I/O may be intensive, it is still whacking the CPU hard.

I am not understanding what I am seeing, and would wonder if anyone else out there that runs M$ 2000 on a multi-cpu machine can be of help? Does anyone have any hard and fast answers with proof?

Thanks much!
The feedback is truly appreciated.

January 25, 2004 - 4:09 pm UTC

oracle.exe will use many cpu's on windows.

windows is responsible for scheduling threads on more then one cpu.

tell you want -- start up a couple of job queue processes (alter system set job_queue_processes=20) and submit to dbms_job 20 background "big queries". watch that cpu go up.

If the process was CPU intensive -- it would take lots lots longer then 18 seconds there.

tell me -- when you run it on that single cpu going to 100%, does it run

a) faster
b) slower

than the 4 cpu with 25% utilization? I'm not seeing anything to get panicked over here -- windows is the guy that schedules threads.

windows model is all about threads.

Dave, January 25, 2004 - 4:55 pm UTC

I have a data warehouse on 9.2, with all the parallel query parameters set to automatic, and cannot get a query on a large, range-partitioned table to parallelize.

This is with the table's degree set to default, a system that is otherwise completely unutilized, and most importantly I have set "alter session force parallel query" which appears to be medicine so powerful that even "select * from dual" is parallelizing.

Would you say that it's time to open a TAR?

January 25, 2004 - 5:37 pm UTC

nope, cause you give us not a single example to work with.

Not every query is parallelizable and queries that we don't think need be -- won't be.

What you would have to do is

a) run the query and let the optimizer do its thing
b) hint the query and let it do what you think is best

in order to show us that "hey, it should have been parallel"

Dave, January 25, 2004 - 8:03 pm UTC

Well, it's a 30Gb data warehouse fact table that we are trying to build a summary table on -- a full table scan is taking 11 minutes as a non-parallel process. We have about 40 partitions on the table, 8 cpu's, RAID1+0, but the most compelling evidence for me that something is amiss is that even "force parallel query" will not parallelize the query ...

select /*+ full(t) */ count(*) from my_table t;

... and that much smaller tables -- summaries of this table, in fact -- parallelize at the drop of a hat?

Does it seem anomalous that this table should not be blessed, no matter how hard we try to force the issue?

January 25, 2004 - 8:15 pm UTC

what is the plan for that query (and is that the query in question)

parallel query

rabeila, January 26, 2004 - 2:16 am UTC

follow up to oracle.exe across multiple cpus

Justin, January 28, 2004 - 3:08 pm UTC

More information for those interested.

I figured out what the "problem" was.

I had changed what is called the "Priority Class" in Windows Task Manager from "Normal" to "Real Time" for the oracle.exe process, and it actually increased performance for a query I was running. At that time, I did not realize that the performance tab of task manager switched from showing all 4 CPUs doing work, to only 1 CPU doing the work.

So, I was relieved to figure out the answer.

I am still in a quandry though over the fact that I am being limited to 25% of the total CPU processing power. If I kick off to of the queries listed in my previous post, the total CPU utilization goes to 50% exactly, and stays there for the duration of the queries. Odd that the total CPU consumption for both settings is 25%...

Tom, what query would you use to max out a quad CPU machine? As you inferred earlier, my particular query would have been disk intensive also. What would be an example of a query that would be almost totally CPU intensive? Something using a bunch of mathematical functions I would imagine?

Would you also have an idea for a way to execute a command at that os that would be very CPU intensive? That way I could determine if the limit is an Oracle related issue at all.

Thanks much

January 29, 2004 - 7:38 am UTC

I might use the sum(ln(some_number)) in parallel. you'll find ln (natural log) to be fairly cpu consuming (until 10g with binary_float and binary_double anyway)

Spawning multiple Jobs for parallel processine

Vivek Sharma, April 13, 2004 - 6:15 am UTC

Dear Tom,

We have monthly Batch Jobs which run for 12-16 Hours and the records are processed for 5-6 products. Our application team has come up with a solution to reduce the time taken for completion of this job from 12-16 hours to 4-5 hours. Currently, for each products the main query has a IN Clause
for example
prodcd in (1,2,3,4,5,6).

The Application tean has come up with a solution that at runtime based on the number of products, several jobs will be scheduled (using dbms_job) at runtime and each of these jobs will process individual product codes. This will complete the processing jobs in parallel (6 difference jobs) and hence will take less time.

Is there any flaw or risk in this method of improving the response time.

I see a lot of scope in tuning the code as the optimizer is RBO and the indexes used by the queries are not effecient.

What do you suggest ?

Thanks and Regards
Vivek Sharma

April 13, 2004 - 8:10 am UTC

I wrote a chapter on DIY parallelism (do it yourself) in my last book "Effective Oracle By Design"

It is a sound approach as long as the parallel processes do not contend with eachother for resources.

parallel and free lists

Sam, April 13, 2004 - 11:42 am UTC

If I have DOP 8 for my parallel insert into a table. Does it mean that I should have 8 free lists defined for the table segment one for each of the parallel server process? How does it work? Thanks.

April 13, 2004 - 6:13 pm UTC

parallel insert is going to be a parallel direct path insert typically (using append).

it won't use freelists at all, each pq slave will write to its own extent.

sqlloader

reader, April 14, 2004 - 11:06 am UTC

If I used sqlldr direct path mode, do i have to put the table in nologging first and then load the data? Thanks.

April 14, 2004 - 11:35 am UTC

"have to" - no

"if you did" - it would bypass redo log generating in an archive log mode database and have no effect in a noarchive log mode database

"should you" -- please consult your DBA first, they sort of need to be aware if you do so they can schedule a backup ASAP

Parallel query problem

Arun Gupta, April 15, 2005 - 2:47 pm UTC

Tom,
Please see the example below. I do not understand why I am getting ora-00932 with PARALLEL hint with IOT.
Thanks...

Agupta@Devl@1> drop table t1;

Table dropped.

Agupta@Devl@1> CREATE TABLE T1
2 (
3 C1 NUMBER(10) NULL,
4 C2 NUMBER(10) NULL,
5 C3 NUMBER(10) NULL,
6 CONSTRAINT PK_T1 PRIMARY KEY (C1)
7 )
8 ORGANIZATION INDEX
9 NOPARALLEL;

Table created.

Agupta@Devl@1> select /*+ PARALLEL (a, 5) */ count(*)
2 from t1 a,
3 (select c2, max(c1) as c1
4 from t1
5 where c2=100
6 group by c2
7 ) b
8 where a.c2=100
9 and a.c2 = b.c2
10 and a.c3 = b.c1;
select /*+ PARALLEL (a, 5) */ count(*)
*
ERROR at line 1:
ORA-12801: error signaled in parallel query server P003
ORA-00932: inconsistent datatypes: expected NUMBER got ROWID

Agupta@Devl@1> select count(*)
2 from t1 a,
3 (select c2, max(c1) as c1
4 from t1
5 where c2=100
6 group by c2
7 ) b
8 where a.c2=100
9 and a.c2 = b.c2
10 and a.c3 = b.c1;

COUNT(*)
----------
0

Agupta@Devl@1>

April 15, 2005 - 2:57 pm UTC

looks like "a bug", please contact support.

more parallel slaves,

sns, April 27, 2005 - 10:51 am UTC

Tom,

I ran a insert..select statement having parallel degree 4 for the insert table and parallel degree 8 for the table in the select statement. The job took 3 hours to complete.

The parallel_max_servers in the database has been set to 128. I thought to make use of more degree of parallelism and increased the parallel degree to 8 for my insert table and 16 for my select table.

The job took almost 12 hours and we killed it. Nothing has changed in the SQL query other than increasing the parallel degree in the hint.

What could cause this kind of extremism?
How do we judge the amount of parallelism the job can take?

I am running in 8.1.7 database.

thanks,

April 27, 2005 - 10:59 am UTC

how many cpu's do you have, is the data "spread out", is the table you are writing to striped over tons of disk?

follow up,

sns, April 27, 2005 - 11:07 am UTC

4 cpu box. The table I am selecting from is spread across 4 different tablespaces (partition table). Each tablespace is having 4 data files under the same mount point. The table I am inserting into is stored in one tablespace (4 data files). the amount of data I am inserting is pretty less (16k records).

My query does not do any sorting. It is pure IO and join with other dimension tables using hash join.
My hash area size is 26214400.

thanks,

April 27, 2005 - 11:11 am UTC

doubt I'd be going over parallel 8 then.

you might not have sufficient IO capabilities. parallel query tends to do direct io (read from disk). parallel insert does direct path writes (write direct to disk).

You have one big mount point, if that is a single disk, the more you hit it the slower it'll go.

thanks

sns, April 27, 2005 - 11:16 am UTC

Then I would stick on to my previous parallel degrees.

If the data is stored in single mount point, it is better not use big parallel degrees.
Does this apply for tables in the select statement or for the tables in the insert statement?

thanks

April 27, 2005 - 11:28 am UTC

if that single mount point is on top of a big striped array with hundreds of disks and tons of controllers -- the answer is one thing...

if that single mount point is say a single 96gig drive the answer is something entirely different.

It is all about "how much IO capabilities do you have", not so much "single mount point". insufficient data to say here..

makes sense,

sns, April 27, 2005 - 11:44 am UTC

Well,
"if that single mount point is on top of a big striped array with hundreds of disks and tons of controllers"
WHAT IS THE ANSWER?

"if that single mount point is say a single 96gig drive the answer is something entirely different"
WHAT IS THE ANSWER?

It is all about "how much IO capabilities do you have"
HOW TO FIND OUT HOW MUCH IP CAPABILITIES I HAVE?

thanks for the quick response.

April 27, 2005 - 11:47 am UTC

lots of IO capabilities ==> will scale to a higher degree of parallelism.

little IO capabilities ==> will NOT scale

ask your system administrator what your system is built on.

David Aldridge, April 28, 2005 - 2:00 pm UTC

On the subject of parallel query, I'm wondering how the consistency of data scanned with a parallel query is guaranteed?

I was looking at the example of inter- and intra-parallelsim here </code> http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96524/c20paral.htm#29345 <code>in which ...

* The employees table is read
* a hash table is built from it
* the departments table is read
* the employees hash table is probed for each department row

(by the way, would it generally not be the other way around? Build hash of the smaller department table then probe with the employee rows?)

I was considering the problem of what happens if a commited change to the department table is written to disk after the query starts and before the changed block is read by the server process (parallel query being direct and all). Does the query server recognise this situation and then read resolve the problem in the regular way, by reading the rollback segments?

What about if a commited change has not yet been written to disk?

Lastly, can delayed block cleanout occur on a parallel read?

April 28, 2005 - 2:09 pm UTC

there is a use_hash hint in there.... it did the hash order.

<quote>
Note that hints have been used in the query to force the join order and join method, and to specify the degree of parallelism (DOP) of the tables employees and departments. In general, you should let the optimizer determine the order and method.
</quote>

read consistency still works, they all know the "as of" scn and how to do consistent reads.

hash join,

sns, April 28, 2005 - 2:19 pm UTC

I had a query to do hash join between a million record table and 400 records table. The query on tuesday executed in less than a minute.

Yesterday the same query (no change at all) ran for the whole night.

I am not sure where to look the problem is. But I have to answer my managers the reason for its prolonged execution.

Could you explain me the behaviour of hash join and how to monitor the join process while it is executing and any other related information?

thanks

April 28, 2005 - 2:32 pm UTC

v$session_longops for monitoring

v$sql_plan to see what the plans are and if they changed.

David Aldridge, April 28, 2005 - 2:31 pm UTC

Thanks

So if the parallel query server is reading the data files directly, but the commited change has not been written to disk yet, what mechanism does it use to find out that there are commited changes that it needs to consider?

April 28, 2005 - 2:53 pm UTC

PQ will issue a global checkpoint (reason 122223 why parallel query on a transactional system is deadly).

If you start seeing your major waits are "checkpoint" waits on a transaction system -- odds are someone rebuilt an index in parallel and left is "in parallel" mode and you are doing parallel queries by accident...

David Aldridge, April 28, 2005 - 3:47 pm UTC

>> PQ will issue a global checkpoint (reason 122223 why parallel query on a transactional system is deadly). <<

There's not a "get out of jail free" card on that global checkpoint if the tablespaces being accessed are read-only, is there? Just a thought.

April 28, 2005 - 4:24 pm UTC

if they are read only, the blocks would not have been dirty in the buffer cache and are not going to be modified anyway.

alter tablespace read only checkpoints the tablespace first. so, there is nothing to flush for them

Checkpoints

Jonathan Lewis, April 28, 2005 - 4:32 pm UTC

Unless it's changed in 9, the checkpoint is a segment level checkpoint - which is not nice, but isn't the disaster it used to be.

V7 used an extent based checkpoint - and didn't have a checkpoint queue, so if you parallel scanned a table, you ended up doing a full buffer search and flush once for every extent in the table. (Which is one reason why you had to be a little careful about the number of extents in a segment in v7 !! Total killer if you used parallel query on OPS with partition views.).

April 28, 2005 - 4:37 pm UTC

thanks -- appreciate the followup.

David Aldridge, May 03, 2005 - 11:05 am UTC

Excellent stuff chaps.

Couple more thoughts ... for a partitioned table the segment-level checkpoint would ideally just flush particular partitions, but those aren't necessarily known until execution time, right? Does this imply that the checkpoint might be over-inclusive of which partitions it affects?

Also, can you share the method for working out that the checkpoint (of a particular type) has taken place?

Segment Checkpoint

Jonathan Lewis, May 04, 2005 - 2:53 pm UTC

I'm going to have to do a little work on this one.
On a quick test, I saw that 8i, 9i, and 10g show the query coordinator waiting on a TC (tablespace checkpoint) event, and a couple of TC enqueues being taken out. But no sign of a segment specific checkpoint.

This occured when I had 16 partitions which should have been dirty - so I was looking for 16 of 'something' to appear in the stats somewhere. No such luck. Also, the partitions were spread across two tablespaces, so I might have expected to see two waits on the TC events - and didn't.

My memory of segment based checkpoints may be older than I thought - possibly the 'obvious' indications were in 7.3.4 - as I recall seeing N waits for 'CI' (cross instance) enqueues when running a parallel query against N segments, but in 7.3.4 the segments would have been tables in a partition view, not partitions in a table.

It is likely that whatever checkpoint is issued, it will not be excessive, as it is issued by query co-ordinator, which has to know at run-time exactly which partitions are going to be visited.

Parallel hint not used

Yogesh, May 10, 2005 - 5:43 am UTC

Under which circumstances oracle 8.0.4 will or will not use parallel hint?

I executed following query yesterday

create table x as select /*+ parallel (f,8,8) */ ...

This query did not spawned any additional process and the query was running forever...

Today I executed same query, not only it-spawned processes, but it finished in 25 mins.

I could not understand what was the problem. Can you please explain?

There was no change in init.ora parameters

May 10, 2005 - 9:10 am UTC

insufficient detail, could have been as simple as a typo on your part. could have been that the pq slaves were all already busy.

If all parallel server processes in the pool are occupied and the maximum
number of parallel server processes has been started, the parallel coordinator
switches to serial processing.

Parallel processes

Yogesh, May 10, 2005 - 12:34 pm UTC

typo on your part

---- I was using a script for execution. So possiblities of typo are rare.

If all parallel server processes in the pool are occupied and the maximum number of parallel server processes has been started, the parallel coordinator switches to serial processing.

--- I checked v$session, there was no session using parallel process.

May 10, 2005 - 1:50 pm UTC

they could have been when your query began. You looked perhaps far after the event started.

Everything is possible, you asked for "possibilities"

relation between parallel query and db_file_multiblock_read_count

deba, May 17, 2005 - 5:14 am UTC

Tom,

I want to know one little thing. If I use parallel query to read a full table ( full table scan ), then parallel query co-ordinator will break into few rowid range scans. Am I right ? If so then where is the use of db_file_multiblock_read_count (what I know this is used in FTS ) in this respect ? Can you explain this ?

The second question:
In 9i , DB_FILE_DIRECT_IO_COUNT becomes hidden parameter (_DB_FILE_DIRECT_IO_COUNT ). By default it is 1 mb , not in terms of blocks. Now if I chnage this parameter to some large value ( if io size of our OS allows the new large size ) then will the new value help to speed up the parallel query ? I know any change in hidden parameter is not reccomended but still I want to know the answer.

Thanks
Deba

May 17, 2005 - 8:55 am UTC

if you take an 8 gigabyte table and break it into 8 - 1 gigabyte "pieces" and ask 8 processes to read it, they will each "full scan" (multi-block read) their piece using the db file multiblock read count.

I do not, will not, recommend touching undocument parameters.

Parallel query problem

Rakesh Bansal, May 17, 2005 - 7:37 am UTC

Tom,
In continution of the response from Arun Gupta dated April 15, 2005 in same thread, I'm also getting parallel server query error with different use case. I'm running following query

select a.i_iss_id,a.instr_id,a.a_ast_prc_rpt new_ast_prc,c.a_ast_prc_rpt old_ast_prc,
cast(ROUND(cast((a.a_ast_prc_rpt - c.a_ast_prc_rpt) * 100 as decimal(26,10))/c.a_ast_prc_rpt,3) as decimal(18,3)) percent_chg
from sld_iss a,
sld_ctl b,
sld_iss_hist c
where c.d_cal = (b.d_cal- 1)
and a.i_iss_id = c.i_iss_id
and c.a_ast_prc_rpt <> 0
and abs(cast(ROUND(cast((a.a_ast_prc_rpt - c.a_ast_prc_rpt) * 100 as decimal(26,10))/c.a_ast_prc_rpt,3) as decimal(18,3))) > 50
order by a.i_iss_id;

and I'm getting following error

ORA-12801: error signaled in parallel query server P025
ORA-01727: numeric precision specifier is out of range (1 to 38)

But if I remove "ORDER BY" clause or if I add "ROWNUM < 9999999999999999999999999999" (read 38 times 9) I'm not getting any error.
Moreover same query is working in DB2 UDB Database.
I'm not sure whether it is a query related problem or Database issue.
Please give resolution for the same.
Thanks.

May 17, 2005 - 9:13 am UTC

please contact support for something like this.

relation between parallel query and db_file_multiblock_read_count

deba, May 26, 2005 - 12:37 am UTC

Hi Tom,

Thanks for the reply. But I can not accept one thing. In our datawarehouse environment, currently value of db_file_multiblock_read_count is 32. We want to see the effect of changing this value to 64. WE open a session and run 10 critical sqls and in each session we use db_file_multiblock_read_count = 64 ( using alter session ). Every time we restart the database before running sqls.So we see a really good change in response time.

We tried to use parallel query. So we have set following:

parallel_adaptive_multi_user boolean TRUE
parallel_automatic_tuning boolean TRUE
parallel_execution_message_size integer 4096
parallel_max_servers integer 120
parallel_min_percent integer 0
parallel_min_servers integer 0
parallel_server boolean FALSE
parallel_threads_per_cpu integer 2
cpu_count integer 12
parallel_threads_per_cpu integer 2

Now using this setting, I ran the same 10 sqls where db_file_multiblock_read_count = 64 and once again I ran the same 10 sqls where db_file_multiblock_read_count = 32.

Statistics when db_file_multiblock_read_count=32 of a one sql
------------------------------------------------

call count cpu elapsed disk query current rows misses
------- ----- ---- ------- ----- ----- -------- ---- ------
Parse 1 1.59 1.57 111 13459 3 0 1
Execute 1 0.03 0.69 1 3 0 0 0
Fetch 505 2.14 34.11 11962 11975 0 7552 0
------- ----- ---- ------- ----- ----- -------- ---- ------
total 507 3.76 36.37 12074 25437 3 7552 1

Event Times Count Max. Total Blocks
waited on Waited Zero Wait Waited Accessed
Time
----------------------------------- ------ ----- ---- ------ ---------
PX Deq: Execute Reply.............. 290 0 1.95 28.30
db file scattered read............. 399 0 0.06 2.63 11959
(multiblock full scan)
PX Deq Credit: send blkd........... 1 0 1.93 1.93
process startup.................... 12 0 0.06 0.56
PX Deq: Parse Reply................ 5 0 0.02 0.02
PX Deq: Join ACK................... 10 0 0.00 0.01
PX Deq: Signal ACK................. 4 0 0.00 0.00
db file sequential read............ 4 0 0.00 0.00 4
(single block scan)..
enqueue............................ 1 0 0.00 0.00
PX Deq Credit: need buffer......... 3 0 0.00 0.00
SQL*Net message from client (idle). 505 0 0.01 1.05
SQL*Net message to client (idle)... 505 70 0.00 0.00
----------------------------------- ------ ----- ---- ------ ---------

Statistics when db_file_multiblock_read_count=64 of the same sql
------------------------------------------------

call count cpu elapsed disk query current rows misses
------- ----- ---- ------- ----- ----- ------- ---- ------
Parse 1 1.59 5.81 95 13047 3 0 1
Execute 1 0.07 0.77 1 3 0 0 0
Fetch 505 2.01 80.65 11962 11975 0 7552 0
------- ----- ---- ------- ----- ----- ------- ---- ------
total 507 3.67 87.22 12058 25025 3 7552 1

Event Times Count Max. Total Blocks
waited on Waited Zero Wait Waited Accessed
Time
---------------------------------- ------ ----- ---- ------ ---------
PX Deq: Execute Reply............. 300 0 1.96 63.31
db file scattered read............ 213 0 0.80 14.89 11959
(multiblock full scan)
PX Deq Credit: send blkd.......... 2 0 0.80 0.80
process startup................... 12 0 0.05 0.57
db file sequential read .......... 4 0 0.10 0.30 4
(single block scan)
PX Deq: Signal ACK................ 5 0 0.10 0.10
PX Deq: Parse Reply............... 8 0 0.03 0.03
PX Deq: Join ACK.................. 9 0 0.00 0.01
PX Deq Credit: need buffer........ 4 0 0.00 0.00
latch free (004).................. 1 0 0.00 0.00
SQL*Net message from client (idle) 505 0 0.01 1.13
SQL*Net message to client (idle).. 505 57 0.00 0.00
---------------------------------- ------ ----- ---- ------ ---------

I have observed that when I am using the value 64 , elapsed time is becoming more (87.22) in compared to 32 (36.37). Not only that, I have also checked that in 64, no. of waits(213) for db file scattered read is 50% less in compared to the same in 32(399) but total waited column(14.89) in 64 is 7 times more in compared to the same in 32(2.63).

So my questions is :
a) I am getting significant improvement in response time while changing the value of db_file_multiblock_read_count from 32 to 64, but when I am using the parallel query then I am getting better response time in 32 instead of 64 though I got better response time in 64 ( not using parallel query). Why ?

b) the total waited time in "db_file_multiblock_read_count = 64" is 14.89 where no. of times waited is 213 , but total waited time in "db_file_multiblock_read_count = 32" is only 2.63 where no. of times waited is 399 ? What could be possible reasons ?

Thanks
Deba

May 26, 2005 - 8:43 am UTC

how are your disks mounted. unless you are using directio or RAW, you have to reboot the MACHINE -- not just restart the database.

Else your IO is coming from the OS file system cache, not the disk.

Which is why flushing the buffer cache in most cases is far less real world than just leaving it be.

insufficient data to say anything about the PQ, don't know the degree of parallelism, but unless you have tons of disk, I would expect the waits to go up with the degree as the PQ slaves contend with eachother.

why?

reader, June 14, 2005 - 4:18 pm UTC

<quote>Which is why flushing the buffer cache in most cases is far less real world than just leaving it be.<quote>

But parallel query does not use buffer cache to store the blocks from table(s)? right? Thanks.

June 14, 2005 - 4:39 pm UTC

parallel query may use direct io, it may not use direct io.
parallel query may use the buffer cache, it may bypass it.

but I don't understand a "why" with regards to those two statements put together.

Lets say PQ never used the buffer cache.
then flushing the buffer cache, besides being pointless, would only make it much less real world for the buffer cache holds lots of stuff, not just our queries data.

Lets say PQ always used the buffer cache.
then flushing the buffer cache would be as non-real world as anything else.

parallel query

reader, June 15, 2005 - 9:53 am UTC

Tom, in what conditions parallel query would entirely bypass buffer cache? Thanks.

June 15, 2005 - 10:16 am UTC

when it does direct IO to read the blocks.

parallel query and direct io

reader, June 15, 2005 - 11:29 am UTC

may be I am confused. I thought parallel query processes *always* bring the blocks to pga. Is this what you call direct io? If so, in what conditions, parallel query process would not do direct io? Thanks.

June 15, 2005 - 3:39 pm UTC

when they do a consistent get from the buffer cache.

In PQ, we can

a) do a checkpoint to flush blocks to disk so we don't have to bother going to cache or

b) read blocks from cache

both may happen during the processing of the same query.

How do you determine the degree of parallelism

David, July 13, 2005 - 5:19 am UTC

Tom,
Thank you for being the energizer bunny for Oracle solutions.

Don't know if this qualfies as a new question or continuation of this "parallel query" thread. If it doesn't fit here, feel free to run me off. But since you are already reading this line maybe you can read a couple(or 20) more.

What is the best technique to determine the degree of parallelism to use? By that I mean , what metrics do I need to gather about the OS, # of users, CPUs etc. to conclude that the degree of parallelism I am going to use is say..4 or whatever. And how do I apply all those metrics to get the answer ? Is there such a technique ?
OR
Is it just as straightforward as : I have 4 CPUs , so let me put /*+PARALLEL(4)*/ and see how it works. If there is not much improvement I'll go with 6. If the CPUs are maxing out, then I'll go down to 5. Is that what most people do ?

On a Unix system, how would I check the resource utilization for such a parallel query. I believe the parallel operation is carried out by multiple THREADS and not multiple PROCESSES, so I can't look at the parallel operation "working" from the OS level. I know the statspack/trace reports do show %CPU , but looking at it from the OS shell will give a "overall" picture of the load on the system as a whole.
Is that possible or am I just in Science Fiction Land ??

Thanks.

July 13, 2005 - 12:40 pm UTC

What is the best technique to determine the degree of parallelism to use?

funny, I just wrote about this yesterday in my chapter on parallelism I sent out for tech editing.

My answer:

don't, let the database do it. automatic tuning (in 9i, the default in 10g) and adaptive multiuser. Just make things "parallel" or use the "parallel hint" without a degree and let the database do it. If you have the resources, it'll use them. If you have too many concurrent users to do a parallel 8 query, it'll downgrade it to parallel 4 or maybe just serial.

A reader, July 22, 2005 - 11:32 am UTC

Follow-up to May 2005 - checkpoints

Jonathan Lewis, September 20, 2005 - 6:30 pm UTC

I see on Mark Rittman's website a note about an enhancement in 10g Release 2 relating to:

"a change in the way that checkpoints are issued prior to a direct path read when using parallel query, so that now just the object being queried is checkpointed rather than the whole tablespace"

So that explains the TC enqueue in my post of May 2005, and explains why I couldn't see anything supporting the segment based checkpoint that I thought was happening.

why does automatic parallel query happen?

Tulip Antar, December 21, 2005 - 9:36 am UTC

Dear Sir,
A view based on 5 tables # rows 250000 below:
persons (system_no varchar2 (14), .....);
pk = system_no
vehicles (chassis_no varchar2 (15), mark varchar2 (4), form varchar2 (4), .....);
pk = (chassis_no, mark, form)
vehicle_license
(chassis_no varchar2 (15), mark varchar2 (4), form varchar2 (4), update_date date,

update_type varchar2 (1), .....);
pk = (chassis_no, mark, form)
hist_vehicle_license
(chassis_no varchar2 (15), mark varchar2 (4), form varchar2 (4), update_type varchar2 (1),

update_date date, .....);
pk = (chassis_no, mark, form)
ownership
(chassis_no varchar2 (15), mark varchar2 (4), form varchar2 (4), system_no varchar2 (14),

start_date date, end_date date, .....);
pk = (system_no, chassis_no, mark, form, start_date)
---------------------------------------------------------
views :
all_license_hist
as
select ...
from hist_vehicle_license
union all
select ...
from vehicle_license
/
v_all_car_data
as
SELECT ....
FROM vehicles v, all_license_hist a
WHERE v.chassis_no(+) = a.chassis_no
AND v.mark(+) = a.mark
AND v.form(+) = a.form
/
v_all_per_own
as
SELECT ...
FROM persons p, ownership o
WHERE p.system_no = o.system_no
/
v_who_owns_what_when
as
select ...
from v_all_car_data c, v_all_per_own o
where c.chassis_no = o.chassis_no
and c.mark = o.mark
and c.form = o.form
and update_date between o.start_date and nvl (o.end_date, sysdate)
and update_type not in ('1','4')
UNION
select ...
from v_all_car_data c, v_all_per_own o
where c.chassis_no = o.chassis_no
and c.mark = o.mark
and c.form = o.form
and start_date in ( Select MIN(start_date)
From ownership own
where own.chassis_no = o.chassis_no
and own.mark = o.mark
and own.form = o.form)
and update_type = '1'
UNION
select ....
from v_all_car_data c, v_all_per_own o
where c.chassis_no = o.chassis_no
and c.mark = o.mark
and c.form = o.form
and update_date >= o.start_date
and update_date < nvl (o.end_date, sysdate)
and update_type = '4'
/
/*-----------------------------------------------*/
A form runs the following query
select ...
from v_who_owns_what_when v1
where system_no = :1
and chassis_no =:2
and mark = :3
and form = :4
and update_date = (Select max(update_date)
from v_who_owns_what_when v2
where v1.chassis_no = v2.chassis_no
and v1.mark = v2.mark
and v1.form = v2.form
and v1.system_no = v2.system_no)
***********************************/
In ver 817 it was fine. In ver9i I see it slow and when I investigate through
oem I see an iteration of parallel query granules which
divide my statemnt to 4 qeuries automatically.
The plan for one of them
/********************************************************************************
Version: Oracle 9.0.1.1.1
SQL Statement:
SELECT /*+ ORDERED NO_EXPAND USE_NL(A2) INDEX(A2 "IDX0_VEHICLES") */
A1.C0,A1.C1,A1.C2,A1.C3,A1.C4,A1.C5,A1.C6,A1.C7,A1.C8,A1.C9,A1.C10,A1.C11,A1.C12,A1.C13,A1.C1

4,A1.C15,A1.C16,A1.C17,A1.C18,A1.C19,A1.C20,A1.C21,A1.C22,A1.C23,A1.C24,A1.C25,A1.C26,A1.C27,

A1.C28,A1.C29,A2.ROWID,A2."CHASSIS_NO",A2."MARK",A2."MODEL",A2."FORM",A2."MANUFACTURING_DATE"

,A2."MASTREP_TRAFFIC_UNIT" FROM (SELECT /*+ ORDERED NO_EXPAND USE_NL(A4) INDEX(A4

"IDX0_PERSONS") */ A3.C0 C0,A3.C1 C1,A3.C2 C2,A3.C3 C3,A3.C4 C4,A3.C5 C5,A3.C6 C6,A3.C7

C7,A3.C8 C8,A3.C9 C9,A3.C10 C10,A3.C11 C11,A3.C12 C12,A3.C13 C13,A3.C14 C14,A3.C15 C15,A3.C16

C16,A3.C17 C17,A3.C18 C18,A3.C19 C19,A3.C20 C20,A3.C21 C21,A3.C22 C22,A3.C23 C23,A3.C24

C24,A4.ROWID C25,A4."SYSTEM_NO" C26,A4."PERSON_NAME" C27,A4."ADDRESS"

C28,A4."MASTREP_TRAFFIC_UNIT" C29 FROM (SELECT /*+ ORDERED NO_EXPAND USE_NL(A6) INDEX(A6

"IDX_HIST_VEHICLE_LICENSE") */ A5.C0 C0,A5.C1 C1,A5.C2 C2,A5.C3 C3,A5.C4 C4,A5.C5 C5,A5.C6

C6,A6.ROWID C7,A6."CHASSIS_NO" C8,A6."UPDATE_TYPE" C9,A6."LICENSE_PERMISSION"

C10,A6."LICENSE_TYPE" C11,A6."PLATE_NO" C12,A6."END_DATE" C13,A6."MOTOR_NO" C14,A6."CC"

C15,A6."COLOR1" C16,A6."COLOR2" C17,A6."COLOR3" C18,A6."PLATE_NOX" C19,A6."MARK"

C20,A6."FORM" C21,A6."PERMISSION_END_DATE" C22,A6."REP_TRAFFIC_UNIT"

C23,A6."MASTREP_TRAFFIC_UNIT" C24 FROM (SELECT /*+ NO_EXPAND ROWID(A7) */ A7."SYSTEM_NO"

C0,A7."CHASSIS_NO" C1,A7."START_DATE" C2,A7."END_DATE" C3,A7."MARK" C4,A7."FORM"

C5,A7."MASTREP_TRAFFIC_UNIT" C6 FROM "TRAFFIC"."OWNERSHIP" PX_GRANULE(0, BLOCK_RANGE,

DYNAMIC) A7 WHERE A7."END_DATE" IS NULL AND A7."START_DATE" IS NOT NULL AND

TO_NUMBER(A7."MASTREP_TRAFFIC_UNIT")=0300) A5,"TRAFFIC"."HIST_VEHICLE_LICENSE" A6 WHERE

(A6."UPDATE_TYPE"='8' OR A6."UPDATE_TYPE"='9') AND TO_NUMBER(A6."MASTREP_TRAFFIC_UNIT")=0300

AND A5.C5=A6."FORM" AND A5.C4=A6."MARK" AND A5.C1=A6."CHASSIS_NO") A3,"TRAFFIC"."PERSONS" A4

WHERE A4."PERSON_NAME" LIKE '?C??%' AND TO_NUMBER(A4."MASTREP_TRAFFIC_UNIT")=0300 AND

A3.C0=A4."SYSTEM_NO") A1,"TRAFFIC"."VEHICLES" A2 WHERE

TO_NUMBER(A2."MASTREP_TRAFFIC_UNIT")=0300 AND A2."FORM"=A1.C21 AND A2."MARK"=A1.C20 AND

A2."CHASSIS_NO"=A1.C8
")=0300 AND A2."FORM"=A1.C21 AND A2."MARK"=A1.C20 AND A2."CHASSIS_NO"=A1.C8
-------
Optimizer Mode Used: COST ALL ROWS (optimizer: CHOOSE)
------
Total Cost: 566
-------
Execution Steps:
Step # Step Name
12 SELECT STATEMENT
11 TRAFFIC.VEHICLES TABLE ACCESS [BY INDEX ROWID]
10 NESTED LOOPS
8 NESTED LOOPS
5 NESTED LOOPS
2 GRANULE ITERATOR
1 TRAFFIC.OWNERSHIP TABLE ACCESS [BY ROWID RANGE]
4 TRAFFIC.HIST_VEHICLE_LICENSE TABLE ACCESS [BY INDEX ROWID]
3 TRAFFIC.IDX_HIST_VEHICLE_LICENSE INDEX [RANGE SCAN]
7 TRAFFIC.PERSONS TABLE ACCESS [BY INDEX ROWID]
6 TRAFFIC.IDX0_PERSONS INDEX [RANGE SCAN]
9 TRAFFIC.IDX0_VEHICLES INDEX [RANGE SCAN]
Step # Description Est. Cost Est. Rows Returned Est. KBytes Returned
1 This plan step retrieves a range of rows from table OWNERSHIP based on upper and

lower ROWID bounds. 554 4 0.211
2 This plan step represents an iteration over parallel query 'granules' which are the

units of data (for example, ROWID range granules or partition granules) distributed to

parallel query slave processes.
3 This plan step retrieves one or more ROWIDs in ascending order by scanning the

B*-tree index IDX_HIST_VEHICLE_LICENSE. 1 62 --
4 This plan step retrieves rows from table HIST_VEHICLE_LICENSE through ROWID(s)

returned by an index. 2 62 6.297
5 This plan step joins two sets of rows by iterating over the driving, or outer, row

set (the first child of the join) and, for each row, carrying out the steps of the inner row

set (the second child). Corresponding pairs of rows are tested against the join condition

specified in the query's WHERE clause. 562 1 0.154
6 This plan step retrieves one or more ROWIDs in ascending order by scanning the

B*-tree index IDX0_PERSONS. 1 225 --
7 This plan step retrieves rows from table PERSONS through ROWID(s) returned by an

index. 2 225 12.964
8 This plan step joins two sets of rows by iterating over the driving, or outer, row

set (the first child of the join) and, for each row, carrying out the steps of the inner row

set (the second child). Corresponding pairs of rows are tested against the join condition

specified in the query's WHERE clause. 564 1 0.212
9 This plan step retrieves one or more ROWIDs in ascending order by scanning the

B*-tree index IDX0_VEHICLES. 1 1,302 --
10 This plan step joins two sets of rows by iterating over the driving, or outer, row

set (the first child of the join) and, for each row, carrying out the steps of the inner row

set (the second child). Corresponding pairs of rows are tested against the join condition

specified in the query's WHERE clause. 566 1 0.255
11 This plan step retrieves rows from table VEHICLES through ROWID(s) returned by an

index. 2 1,302 55.945
12 This plan step designates this statement as a SELECT statement.
******************************/
In your book you described DIY parallel queries which I didn't do.
Why is this happening? Where can I get more information on this?

Tulip I M Antar, January 25, 2006 - 6:35 am UTC

Oh, it was an easy answer. No need for the details.
I used Enterprise Manager and unchecked the parallel option on each of these 5 tables.
It was great, instead of queries running in 45min, now they run in less then 30 sec.
But I still can't figure out what has made this option enabled on these tables only. The entire schema is not.
Could it be because during the creation of these table I used the nologging mode to load data?
I returned back the logging mode after I finished.
It never happened before Oracle ver 9.2

January 25, 2006 - 1:38 pm UTC

did you create indexes using parallel?

Prallel exectuion in 9i better then in 10gr2

Yoav, May 26, 2007 - 6:47 am UTC

Hi Tom,
Im executing the same query in DATA WAREHOUSE 9.2.0.6 database and in 10.2.0.3 database.
The elpased time in 9i is much better.
There are 3,709,722 rows in both cases.

The diffrence are that : 9206 is in 8 cpu unix machine,16k block size.
10203 is in 8 cpu linux machine + ASM , 32k block size.

SELECT COUNT (*)
FROM cm_order_dim

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.00 0 0 0 0
Execute 1 0.18 1.39 5 15 0 0
Fetch 2 0.10 8.35 0 0 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.29 9.75 5 15 0 1

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 53

Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE
0 SORT AGGREGATE
0 PARTITION HASH ALL PARTITION: 1 5
0 TABLE ACCESS FULL OBJ#(1534509) PARTITION: 1 5

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
db file sequential read 5 0.01 0.06
rdbms ipc reply 14 0.00 0.00
enqueue 3 0.00 0.00
process startup 16 0.05 0.80
latch free 4 0.00 0.00
PX Deq: Join ACK 13 0.00 0.00
PX Deq: Parse Reply 15 0.05 0.07
SQL*Net message to client 2 0.00 0.00
PX Deq: Execute Reply 620 0.16 8.16
PX Deq: Signal ACK 6 0.10 0.10
SQL*Net message from client 2 0.00 0.00

When runing the same query against 10.2.0.3 :

SELECT COUNT (*)
FROM cm_order_dim

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.01 0 15 0 0
Fetch 2 0.01 12.69 0 0 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.02 12.72 0 15 0 1

Misses in library cache during parse: 1
Optimizer mode: CHOOSE
Parsing user id: 75

Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE (cr=15 pr=0 pw=0 time=12710906 us)
16 PX COORDINATOR (cr=15 pr=0 pw=0 time=12853498 us)
0 PX SEND QC (RANDOM) :TQ10000 (cr=0 pr=0 pw=0 time=0 us)
0 SORT AGGREGATE (cr=0 pr=0 pw=0 time=0 us)
0 PX BLOCK ITERATOR PARTITION: 1 5 (cr=0 pr=0 pw=0 time=0 us)
0 TABLE ACCESS FULL CM_ORDER_DIM PARTITION: 1 5 (cr=0 pr=0 pw=0 time=0 us)

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
PX Deq: Join ACK 10 0.00 0.00
PX Deq: Parse Reply 10 0.00 0.00
SQL*Net message to client 2 0.00 0.00
PX Deq: Execute Reply 212 0.48 12.58
PX qref latch 6 0.00 0.00
PX Deq: Signal ACK 14 0.09 0.10
enq: PS - contention 1 0.00 0.00
SQL*Net message from client 2 0.00 0.00

1. Why the execution plans are not the same ?
2. I saw that the wait event "PX Deq: Execute Reply" is took 12.58 seconds in 10.2.0.3
while in 9.2.0.6 its took only 8.16 seconds.
I understood that this event is an Idel event. what i can do/check to make it run (hopefully) faster ?

Thanks

May 26, 2007 - 12:16 pm UTC

you are comparing apples with toaster ovens and worried about a 3 second elapsed time difference for a 9 second parallel query??????

I don't get it.

probably this query would run faster without parallel query at all.

Comparing apples with toaster ovens :(

Yoav, May 26, 2007 - 3:00 pm UTC

Hi Tom,

I disagree with you regarding the fruit to electic kitchen objects ...
Before we upgrade our DATA WAREHOUSE database from 9i to 10gR2 , we asked oracle support advice regarding those issues.

We asked oracle for their advice regarding machine type (The answer:LINUX) , file system type (The answer:ASM) , oracle verion (The answer:10.2.0.3) , best way to move from unix to linux (The answer:Transportable tablespace cross platrom) and more.

After we finished the prcoess , and start checking performance we found that alot of parallel queries are NOW suffering from performance decrease.

Actually , 3 second elapsed time is not worry me, im worry about statments that run 8 minute in 9ir2 , and now in 10gr2 , is are finised.
All i wanted to show in the example above, is the symptom.
I found that "PX Deq: Execute Reply" event is much higher
in 10gR2 then it was in 9ir2 , and wanted to ask your advice how to reduce it.

Could you please response ?

Thanks again.

May 26, 2007 - 3:08 pm UTC

I am talking about your MACHINES

you have entirely different machines - until you do 9i to 9i or 10g to 10g comparisons on these DIFFERENT BITS OF MACHINERY - you are comparing apples to fruit flies.

why are you blaming 10g when you completely and utterly changed hardware? Can you get a fair and balanced comparison here.

Support really told you to change platforms? Support advised you to change operating systems and hardware platforms??

the parallel execution server is waiting for a response from the other guy in the process. Could be your IO is painfully slow (different hardware), could be your cpus are that much slower. To reduce the "waiting for reply" you make the guy producing the reply go faster.

Look - if you want to say 10g is slower than 9i, you must hold other things constant - you cannot compare your linux to unix - entirely different hardware, different capabilities.

Parallel Response

Jonathan Lewis, May 26, 2007 - 3:34 pm UTC

Yoav, at first sight it looks as if your 9i and 10g systems were running that query at different degrees of parallelism.

The event "PX Deq: Parse Reply" is an indicator of how many PX slaves were used to run the query - in 9i you seemed to be using about 15, in 10g you seem to be using about 10.

Try the test again, and check how many query slaves get involved - a drop from 15 to 10 would explain the response time.

By the way, you seem to have a hash partitioned table with five partitions. For consistent sizing it is usually sensible to use a power of 2 for the number of partitions in a hash partitioned table.

Parallel Response

Yoav, May 27, 2007 - 8:24 am UTC

Hi Tom and Jonathan.

First , i would like to thank you for sharing your tremendous knowlage with oracle developers and DBAs.
Both machines have 8 cpu (The linux have 8 Dual-Core AMD cpus) , and the degree of parallelism is set
to DEFAULT (from dba_tables view) in the 9i and in 10g.
(see more details at the end of this session)

Regarding the number of five partitions in HASH , im aware for that im planing to modify it later.

"...in 9i you seemed to be using about 15, in 10g you seem to be using about 10" -
Im not sure i understood those values. is that related to "Times Waited" value on "PX Deq: Parse Reply" event ?

In 10g :
=========
I runned again the query , using Jonathan scripts ( http://www.jlcomp.demon.co.uk/faq/pq_proc.html )
found that all the 16 process were in use:

Name Status Pid Sid Parent OSUSER Schema
---- ---------- ----- ----- ------ ------------------------------ ----------
CHILD_WAIT PARENT_WAIT
------------------------------ ------------------------------
P000 IN USE 21 244 235 oracle CRM
direct path read PX Deq: Execute Reply
P001 IN USE 22 245 oracle CRM
direct path read PX Deq: Execute Reply
P002 IN USE 23 233 oracle CRM
direct path read PX Deq: Execute Reply
P003 IN USE 24 228 235 oracle CRM
direct path read PX Deq: Execute Reply
P004 IN USE 25 239 oracle CRM
direct path read PX Deq: Execute Reply
P005 IN USE 26 269 oracle CRM
direct path read PX Deq: Execute Reply
P006 IN USE 27 232 235 oracle CRM
direct path read PX Deq: Execute Reply
P007 IN USE 28 215 oracle CRM
direct path read PX Deq: Execute Reply
P008 IN USE 29 204 oracle CRM
direct path read PX Deq: Execute Reply
P009 IN USE 30 212 235 oracle CRM
direct path read PX Deq: Execute Reply
P010 IN USE 31 261 oracle CRM
direct path read PX Deq: Execute Reply
P011 IN USE 32 231 oracle CRM
direct path read PX Deq: Execute Reply
P012 IN USE 33 234 235 oracle CRM
direct path read PX Deq: Execute Reply
P013 IN USE 34 237 oracle CRM
direct path read PX Deq: Execute Reply
P014 IN USE 35 254 oracle CRM
direct path read PX Deq: Execute Reply
P015 IN USE 36 243 235 oracle CRM
direct path read PX Deq: Execute Reply

16 rows selected.

Tkprof for 10g :

SELECT COUNT (*)
FROM CM_ORDER_DIM

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.03 0 15 0 0
Fetch 2 0.00 12.78 0 0 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.01 12.82 0 15 0 1

Misses in library cache during parse: 0
Optimizer mode: CHOOSE
Parsing user id: 75

Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE (cr=15 pr=0 pw=0 time=12824197 us)
16 PX COORDINATOR (cr=15 pr=0 pw=0 time=12500041 us)
0 PX SEND QC (RANDOM) :TQ10000 (cr=0 pr=0 pw=0 time=0 us)
0 SORT AGGREGATE (cr=0 pr=0 pw=0 time=0 us)
0 PX BLOCK ITERATOR PARTITION: 1 5 (cr=0 pr=0 pw=0 time=0 us)
0 TABLE ACCESS FULL CM_ORDER_DIM PARTITION: 1 5 (cr=0 pr=0 pw=0 time=0 us)

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
PX Deq: Parse Reply 8 0.00 0.00
SQL*Net message to client 2 0.00 0.00
PX Deq: Execute Reply 212 0.55 12.67
PX qref latch 4 0.00 0.00
PX Deq: Signal ACK 16 0.09 0.10
enq: PS - contention 1 0.00 0.00
SQL*Net message from client 2 0.00 0.00

IN 9i
=======
from total of 64 process, 48 process are availble , and 16 were in use :

Name Status Pid Sid Parent OSUSER Schema
---- ---------- ----- ----- ------ ------------------------------ ----------
CHILD_WAIT PARENT_WAIT
------------------------------ ------------------------------
P000 IN USE 12 32 72 oracle CRM
direct path read PX Deq: Execute Reply
P001 IN USE 13 76 oracle CRM
PX Deq: Execution Msg PX Deq: Execute Reply
P002 IN USE 14 96 oracle CRM
direct path read PX Deq: Execute Reply
P003 IN USE 15 88 72 oracle CRM
direct path read PX Deq: Execute Reply
P004 IN USE 16 67 oracle CRM
direct path read PX Deq: Execute Reply
P005 IN USE 17 99 oracle CRM
PX Deq: Execution Msg PX Deq: Execute Reply
P006 IN USE 18 41 72 oracle CRM
direct path read PX Deq: Execute Reply
P007 IN USE 19 60 oracle CRM
direct path read PX Deq: Execute Reply
P008 IN USE 20 15 oracle CRM
direct path read PX Deq: Execute Reply
P009 IN USE 21 39 72 oracle CRM
direct path read PX Deq: Execute Reply
P010 IN USE 22 91 oracle CRM
direct path read PX Deq: Execute Reply
P011 IN USE 23 14 oracle CRM
direct path read PX Deq: Execute Reply
P012 IN USE 24 84 72 oracle CRM
direct path read PX Deq: Execute Reply
P013 IN USE 25 53 oracle CRM
direct path read PX Deq: Execute Reply
P014 IN USE 26 11 oracle CRM
PX Deq: Execution Msg PX Deq: Execute Reply
P015 IN USE 27 86 72 oracle CRM
PX Deq: Execution Msg PX Deq: Execute Reply
P016 AVAILABLE 28
P017 AVAILABLE 29
P018 AVAILABLE 30
P019 AVAILABLE 31
P020 AVAILABLE 32
P021 AVAILABLE 33
P022 AVAILABLE 34
P023 AVAILABLE 35
P024 AVAILABLE 36
P025 AVAILABLE 37
P026 AVAILABLE 38
P027 AVAILABLE 39
P028 AVAILABLE 40
P029 AVAILABLE 41
P030 AVAILABLE 42
P031 AVAILABLE 44
P032 AVAILABLE 49
P033 AVAILABLE 50
P034 AVAILABLE 51
P035 AVAILABLE 52
P036 AVAILABLE 53
P037 AVAILABLE 54
P038 AVAILABLE 55
P039 AVAILABLE 56
P040 AVAILABLE 57
P041 AVAILABLE 58
P042 AVAILABLE 46
P043 AVAILABLE 59
P044 AVAILABLE 60
P045 AVAILABLE 61
P046 AVAILABLE 62
P047 AVAILABLE 63
P048 AVAILABLE 64
P049 AVAILABLE 65
P050 AVAILABLE 66
P051 AVAILABLE 67
P052 AVAILABLE 68
P053 AVAILABLE 70
P054 AVAILABLE 71
P055 AVAILABLE 72
P056 AVAILABLE 73
P057 AVAILABLE 74
P058 AVAILABLE 75
P059 AVAILABLE 76
P060 AVAILABLE 77
P061 AVAILABLE 78
P062 AVAILABLE 81
P063 AVAILABLE 82

tkprof for 9i :
SELECT COUNT (*)
FROM
CM_ORDER_DIM
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.02 0.20 0 15 0 0
Fetch 2 0.13 11.07 0 0 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.15 11.27 0 15 0 1

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 53

Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE
0 SORT AGGREGATE
0 PARTITION HASH ALL PARTITION: 1 5
0 TABLE ACCESS FULL OBJ#(1535069) PARTITION: 1 5

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
PX Deq: Join ACK 14 0.00 0.00
PX Deq: Parse Reply 16 0.16 0.18
SQL*Net message to client 2 0.00 0.00
PX Deq: Execute Reply 615 0.20 10.67
PX Deq: Signal ACK 4 0.10 0.10
latch free 5 0.00 0.00
SQL*Net message from client 2 0.00 0.00

10g Parallel and cpu parameter
================================
SQL> show parameter parallel

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
fast_start_parallel_rollback string LOW
parallel_adaptive_multi_user boolean TRUE
parallel_automatic_tuning boolean FALSE
parallel_execution_message_size integer 16384
parallel_instance_group string
parallel_max_servers integer 64
parallel_min_percent integer 0
parallel_min_servers integer 0
parallel_server boolean FALSE
parallel_server_instances integer 1
parallel_threads_per_cpu integer 2
recovery_parallelism integer 0

SQL> show parameter cpu_count
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
cpu_count integer 8

Linux prompr shows , 8 cpu Dual-Core AMD
cat /proc/cpuinfo
......
processor : 7
vendor_id : AuthenticAMD
cpu family : 15
model : 65
model name : Dual-Core AMD Opteron(tm) Processor 8216
stepping : 2
cpu MHz : 1808.482
cache size : 1024 KB
....

9i Parallel and cpu parameter
=============================
sql > show parameter parallel

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
fast_start_parallel_rollback string LOW
log_parallelism integer 1
parallel_adaptive_multi_user boolean TRUE
parallel_automatic_tuning boolean TRUE
parallel_execution_message_size integer 16384
parallel_instance_group string
parallel_max_servers integer 64
parallel_min_percent integer 0
parallel_min_servers integer 0
parallel_server boolean FALSE
parallel_server_instances integer 1
parallel_threads_per_cpu integer 2
recovery_parallelism integer 0

sql > show parameter cpu_count
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
cpu_count integer 8

Also unix prompt show 8 cpus:
/software/oracle/admin/pudwh/udump >ioscan -fkn -C processor
Class I H/W Path Driver S/W State H/W Type Description
===================================================================
processor 0 0/10 processor CLAIMED PROCESSOR Processor
processor 1 0/11 processor CLAIMED PROCESSOR Processor
processor 2 0/12 processor CLAIMED PROCESSOR Processor
processor 3 0/13 processor CLAIMED PROCESSOR Processor
processor 4 2/10 processor CLAIMED PROCESSOR Processor
processor 5 2/11 processor CLAIMED PROCESSOR Processor
processor 6 2/12 processor CLAIMED PROCESSOR Processor
processor 7 2/13 processor CLAIMED PROCESSOR Processor

Could you please explain why there are total 64 process avalible in 9i and only 16
in 10g (parallel_max_servers=64 in init.ora)?

Thanks.

PX count

Jonathan Lewis, May 27, 2007 - 3:51 pm UTC

Yoav. Yes, I was using the PX Deq Parse reply as an indicator of how many slaves were called to exexcute the statement. In fact, the 10g tkprof plan (rowsource execution statistics) should have told me to distrust that figure (PX Co-ordinator reports 16 rows suggesting (but only suggesting, not guaranteeing) one result from each of 16 slaves).

The 16 vs 64 thing is possibly just a consequence of the idle time setting for the parallel slaves. The default is 5 minutes, so the number that are available is dependent on what's been going on recently. (Also, if you've changed the value, you need to know that it is set in minutes in 9i and hundredths of seconds in 10g - and this can lead to surprises when you upgrade).

A couple of further thoughts. Your parallel message size is an unusual 16KB - the default is about 2K or 4K (depending on the setting of parallel_automatic_tuning); has this been fiddled with, if so it is possible that it's had a surprise side-effect in 10g. Since you've got a 10046 trace going, why not run tkprof against the pNNN trace files as well - this may give you a clue about how the work is shared out and where the time is spent. You might also consider running a 10391 trace at level 2112 to get the granularity details and the timing sequence for the parallel execution.

Very well explained

Karthik, May 29, 2007 - 9:35 am UTC

Hello Tom,
Is there any overhead on setting the degree of parallelism to 8 (or any other value) but not really using it? The reason for asking this question, I have observed in my DB (on Oracle 9) the parallel_max_servers parameter is set to 5 and was thinking of increasing it to 8 (CPU count).

Im not aware how many queries do a full table scans done on these tables. So, I was thinking whether altering the table for parallel degree 8 but not really using will incur any overhead?

Example
alter table T1 parallel (degree 8);

Is there anyway I can findout whether the current degree of parallelism (of 5) is used efficently. We don't have partitioned tables.

Thanks

May 30, 2007 - 10:52 am UTC

show us the non-default parameters you have set yourself.

Parallel Query

Karthik, May 31, 2007 - 10:41 am UTC

Hello Tom,

select name, value from v$parameter where name like '%parallel%'

NAME     VALUE                            
log_parallelism                    1  
parallel_server                    FALSE                            
parallel_server_instances          1  
recovery_parallelism               0  
fast_start_parallel_rollback       false                            
parallel_min_percent               0  
parallel_min_servers               0  
parallel_max_servers               5  
parallel_instance_group               
parallel_execution_message_size    2152                             
parallel_adaptive_multi_user       FALSE                            
parallel_threads_per_cpu           2  
parallel_automatic_tuning          FALSE

I suppose the best way would be to set parallel_automatic_tuning to TRUE and alter the table to PARALLEL without specifying the degree. Am I right?

Thanks

May 31, 2007 - 10:52 am UTC

no, i want to know what YOU have set.

not what appears in v$parameter, but what you are explicitly SETTING YOURSELF.

show us the non-default parameters you have set yourself.

HOW to know what you have set?

A reader, May 31, 2007 - 1:19 pm UTC

Determined like this example?

SQL> alter session set query_rewrite_integrity = stale_tolerated;

Session altered.

SQL> select g.value "default", l.*
  2    from V$parameter l
  3         inner join
  4         v$system_parameter g
  5         on g.num = l.num
  6  where l.value <> g.value
  7  /

default
--------------------------------------------------------------------------------
---------------------------------------------------
       NUM NAME
  TYPE
---------- ---------------------------------------------------------------- ----
------
VALUE
--------------------------------------------------------------------------------
---------------------------------------------------
ISDEFAULT ISSES ISSYS_MOD ISMODIFIED ISADJ DESCRIPTION
--------- ----- --------- ---------- ----- -------------------------------------
---------------------------
UPDATE_COMMENT
--------------------------------------------------------------------------------
---------------------------------------------------
enforced
       705 query_rewrite_integrity
     2
STALE_TOLERATED
TRUE      TRUE  IMMEDIATE MODIFIED   FALSE perform rewrite using materialized vi
ews with desired integrity

http://download-east.oracle.com/docs/cd/B19306_01/server.102/b14237/dynviews_2012.htm

May 31, 2007 - 2:32 pm UTC

i don't know what you mean here. are you asking a question?

HOW to know?

A reader, May 31, 2007 - 3:05 pm UTC

Just trying to figure out how I'd answer your question: "what you are explicitly SETTING YOURSELF?" -- suppose I was experimenting with several session parameters over time, and hadn't spooled the entire session. I think I'd just use the preceding query to find out what session parameters are different than the system default parms.

May 31, 2007 - 3:45 pm UTC

look in your init.ora, I want your systems NON-DEFAULT parameter settings that are in place.

you can query v$parameter directly as well by the "isdefault" column.

I want to know what parameters you (your system, your instance, your configuration) set.

They will tell us why your parallel settings are as they are.

Parameters set by us

Karthik, June 01, 2007 - 7:19 am UTC

Hello Tom,

This is the only parameter related to parallel option which we have set to false.

fast_start_parallel_rollback='false'

Thanks

June 01, 2007 - 12:52 pm UTC

sigh,

select name, value from v$parameter where isdefault = 'FALSE';

just post that output...

Here is the output ...

Karthik, June 04, 2007 - 4:54 am UTC

NAME           VALUE               
processes                           530                 
timed_statistics                    TRUE                
shared_pool_size                    1577058304          
sga_max_size                        8315188976          
large_pool_size                     16777216            
java_pool_size                      50331648            
control_files                       /u02/oradata/ARSOMT3/control/c                   
db_block_size                       8192                
db_writer_processes                 5                   
db_keep_cache_size                  16777216            
db_cache_size                       5083496448          
db_cache_advice                     ON                  
compatible                          9.2.0               
log_archive_start                   TRUE                
log_archive_dest                    /u03/oradata/ARSOMT3/arch                        
log_archive_format                  %S.log              
log_buffer                          26214400            
db_file_multiblock_read_count       32                  
undo_management                     AUTO                
undo_tablespace                     undo1               
undo_retention                      10800               
fast_start_parallel_rollback        false               
max_enabled_roles                   50                  
db_domain                           world               
global_names                        FALSE               
instance_name                       ARSOMT3             
service_names                       ARSOMT3             
utl_file_dir                        /u01/app/remedy/menus                            
job_queue_processes                 10                  
cursor_sharing                      force               
background_dump_dest                /u01/app/oracle/ARSOMT3/bdump                    
user_dump_dest                      /u01/app/oracle/ARSOMT3/udump                    
core_dump_dest                      /u01/app/oracle/ARSOMT3/cdump                    
open_links                          40                  
sort_area_size                      60024000            
sort_area_retained_size             10024000            
db_nameARSOMT3             
open_cursors                        1500                
sql_trace                           FALSE               
optimizer_mode                      CHOOSE              
pga_aggregate_target                664000000           
workarea_size_policy                AUTO

June 05, 2007 - 8:13 am UTC

cursor_sharing force, I hate your system already.

the following is from Effective Oracle by Design:

Parallel Query

Parallel query is suitable for a certain class of large problems: very large problems that have no other solution. Parallel query is my last path of action for solving a performance problem; it¿s never my first course of action.

Parallel Query Settings

I will not discuss the physical setup of parallel query operations. That topic is well covered in both the Oracle Concepts Guide and Data Warehousing Guide. As I mentioned earlier, my current favorite way to set up parallelism in Oracle is using the automatic tuning option first introduced in Oracle8i Release 2 (version 8.1.6): PARALLEL_AUTOMATIC_TUNING = TRUE. With this setting, the other parallel settings are automatically set. Now, all I need to do is set the PARALLEL option on the table (not PARALLEL <N>, just PARALLEL) and Oracle will, when appropriate, parallelize certain operations on that table for me. The degree of parallelism (how many processes/threads will be thrown at a problem) will be decided for me and vary over time as the load on the system varies. I have found that, for most cases, this achieves my desired goal, which is usually to get the best performance, with the least amount of work, in a manner that is most manageable. Setting a single parameter is a great way to get there.

For the novice user wanting to play with parallel query for the first time, parallel automatic tuning is a good way to get started. As you develop an understanding of what parallel query does and how it does it, try tweaking some of the other parallel settings:
¿ PARALLEL_ADAPTIVE_MULTI_USER¿Controls whether the degree of parallelism should vary over time as the load on the system does; should the algorithm for assigning resources ¿adapt¿ to the increase in load.
¿ PARALLEL_EXECUTION_MESSAGE_SIZE¿Sets the size of the message buffers used to pass information back and forth between the processes executing the parallel query.
¿ PARALLEL_INSTANCE_GROUP¿Applies only to Oracle RAC configurations (Oracle Parallel Server, OPS, in Oracle8i and earlier). Allows you to restrict the number of instances that will be used to perform a parallel operation (as opposed to the number of processes an instance will use).
¿ PARALLEL_MAX_SERVERS¿Sets the maximum number of parallel query slaves (like dedicated servers but for parallel operations) your instance will ever have.
¿ PARALLEL_MIN_PERCENT¿Useful if you would like to receive an error message when you request a specific degree of parallelism but insufficient resources exist to satisfy that request. You might use this to ensure that a process that takes days unless it gets what it wants doesn¿t run unless it gets what it wants.
¿ PARALLEL_MIN_SERVERS¿Sets the number of servers to start when the instance starts and to keep started permanently. Otherwise, you may need to start the parallel processes in response to a query.
¿ PARALLEL_THREADS_PER_CPU¿Determines the default degree of parallelism and contributes to the adaptive parallelism algorithms, to determine when to back off on the amount of parallel resources.
¿ RECOVERY_PARALLELISM¿For crash recovery, sets how many parallel threads should be used. This setting can be used to speed up the recovery from an instance crash.
¿ FAST_START_PARALLEL_ROLLBACK¿Sets how many processes would be available to perform a parallel rollback after the recovery takes place. This would be useful on systems where many long-running transactions are constantly processing (which would need correspondingly long rollback times in the event of an instance crash).

Query

Anil, June 27, 2007 - 7:26 am UTC

I have a query which queries a big table.

SELECT C1, C2, C3
FROM T1
WHERE C3 = 'MIF'
OR C3 = 'FGA'
OR C3 = 'BMS'
OR C3 = 'BBT'

In DB it goes for a full table scan.

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=8091 Card=46659 Byte
          s=5645739)

   1    0   TABLE ACCESS* (FULL) OF 'T1' (Cost=8091 Card=46659 Bytes :Q648200
          =5645739)                                                    0



   1 PARALLEL_TO_SERIAL            SELECT /*+ NO_EXPAND ROWID(A1) */ A1."C1",A1
                                   ."C536870914",A1."C536870917",A1."C5

and in other it goes for index range.

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=699 Card=709 Bytes=8
          0117)

   1    0   INLIST ITERATOR
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'T1' (Cost=699 Card=7
          09 Bytes=80117)

   3    2       INDEX (RANGE SCAN) OF 'I1_C3' (NON-UNIQUE)
          (Cost=6 Card=709)

In development it's put in parallel mode. (alter table T1 parallel (degree 8); )

But still both are quite slow.

Any suggestions what could be done to improve the performance?

July 02, 2007 - 10:01 am UTC

"and in other" what the heck does that mean?

how many records in the table
how many records will query retrieve

Parallel Execution

Arindam Mukherjee, July 04, 2007 - 8:42 am UTC

Sir,

On reading your book "Effective Oracle by Design", I have split one procedure into three (3) procedures as that procedure takes long time to execute. So now the three procedures run simultaneously in parallel (DIY). Now the main question is how I can understand those three procedures are executed fully because after executing those three (before one), one procedure for summary calculation will be invoked. All are batch jobs.
Database is Oracle 10g, OS is Linux and Windows Server 2000

July 05, 2007 - 12:50 pm UTC

as they finish, have them log that fact, then anything else can see that they are "done"

or, as they finish, have them issue a "dbms_alert.signal", and the 4th job that runs after they are done will wait on that signal - when it gets it, it'll peek at that log table to see if 1, 2 or 3 (all) of the jobs have finished and then continue.

performance issue fixed by setting parallel_max_servers to zero

Anurag, July 05, 2007 - 8:31 pm UTC

This may not be a relevant followup, (and I'll post it as a new question in that case) but could you comment on our situation below please:

After a recent upgrade from 8i to 9i (yes, I know, not my decision) a query that had millisecond response earlier degraded to about FIVE SECONDS. This query is executed hundreds of times per minute, and the change brought the system to its knees. A DBA recommended setting parallel_max_servers to zero and the problem went away.

The query is an equijoin between a partitioned "big_table" having transactional data, and a "little_table" which holds master data. The plan was (a) Range scan local index on one partition of big_table (b) lookup by rowid from big_table (c) FTS on little_table (d) hash join result sets (b) and (c).

Same plan in 8i and 9i.

The only change made to the system was setting parallel_max_servers to zero. The DBA won't tell us why this resolved the issue.

Again, sorry for the vague post... if you'd like to take this up, I can provide more info. Thanks Tom!

July 06, 2007 - 12:29 pm UTC

someone enabled parallel queries against these objects.

the optimizer chose a parallel query

instead of fixing the problem (that someone enabled parallel on the table/index in question), the dba took the atom bomb to crack the walnut and disabled parallel query altogether.

Effective in the same way as removing a wart on your finger with an axe is, the wart is definitely gone after the procedure....

No PARALLEL?

Anurag, July 06, 2007 - 3:35 pm UTC

Atom bomb to crack open a walnut indeed :)

The create table and create index script has the NOPARALLEL clause, and the query does not have a PARALLEL hint.

http://orafaq.com/faqopq.htm#INVOKE

What else can we check to ensure parallel query is not invoked? NOPARALLEL hint in the query would be one way out, but I'd consider that a cop-out.

July 07, 2007 - 10:45 am UTC

but the link you pointed to has alter's ??? not sure what you mean.

the example you pointed me to shows exactly what I said has happened, someone said "PLEASE DO PARALLEL"

A reader, July 06, 2007 - 11:19 pm UTC

>>Effective in the same way as removing a wart on your finger with an axe is, the wart is definitely gone after the procedure....

Ha Ha Ha..

||

Anurag, July 07, 2007 - 10:51 am UTC

Sorry about the confusion. I was trying to see what all settings could influence the optimizer to select a parallel execution plan. According to orafaq, its either the PARALLEL keyword while creating the table / index; or a /*+ parallel(.,.) */ hint in the query itself. We don't have either of these - the create table and index scripts have NOPARALLEL and the query has no hints whatsoever. So I was just wondering what else could have caused the optimizer to pick a parallel execution plan.

July 07, 2007 - 10:54 am UTC

or the alters they did!

that page shows the ALTER commands I'm saying happened.

probably, the dba rebuilt the index using parallel and that set it. (it not only rebuilds in parallel but leaves parallel on)

99.9% of the time, it was that.

wow

Anurag, July 10, 2007 - 4:56 pm UTC

The DBA just got back to me. That's exactly what it was.

I want to suggest that they enable PARALLEL on the table, with the same DEGREE as the index, set parallel_max_servers back to its original value, and then we'll do a test run. Worth a shot? In general, should the PARALLELism of a table and its indexes be the same? (or would this be considered a ROT ;)

Thanks again.

July 10, 2007 - 8:16 pm UTC

in general - no. they are different segments, used for different purposes.

if we full scan, parallel on the table sets the degree.

if we index <anything>, parallel on the index sets the degree.

we do not index and full scan at the same time

parralell witn outer join

ghd, September 11, 2007 - 7:05 pm UTC

Hi Tom,

Thank you for your pervious response.

We have a problem with parralell table with outer join .


SQL> select * from v$version
  2  /

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.2.0 - 64bi
PL/SQL Release 10.2.0.2.0 - Production
CORE    10.2.0.2.0      Production
TNS for Solaris: Version 10.2.0.2.0 - Production
NLSRTL Version 10.2.0.2.0 - Production

create table foo_ACCOUNT(ACC_ACCOUNT_ID number, ACC_NAME varchar2(60))
PARALLEL ( DEGREE DEFAULT INSTANCES 1 );
insert into foo_ACCOUNT values ( 100, 'ACC 100');
insert into foo_ACCOUNT values ( 200, 'ACC 200');
insert into foo_ACCOUNT values ( 300, 'ACC 300');
insert into foo_ACCOUNT values ( 400, 'ACC 400');
insert into foo_ACCOUNT values ( 500, 'ACC 500');
insert into foo_ACCOUNT values ( 600, 'ACC 600');

create table foo_PROFILE_ELIGIBILITY(PRE_ACCOUNT_ID number, PRE_ACCOUNT_ID1 number, PRE_DESC varchar2(60))
PARALLEL ( DEGREE DEFAULT INSTANCES 1 );
insert into foo_PROFILE_ELIGIBILITY values ( 200, 200, 'PROF ELIG 200');
insert into foo_PROFILE_ELIGIBILITY values ( 300, 300, 'PROF ELIG 300');
insert into foo_PROFILE_ELIGIBILITY values ( 400, 400, 'PROF ELIG 400');

create table foo_NAMEINFO(NAM_ACCOUNT_ID number, NAM_ACCOUNT_ID1 number, NAM_REC_ID number
, NAM_CONTACT_TYPE varchar2(60), NAM_DESC varchar2(60))
PARALLEL ( DEGREE DEFAULT INSTANCES 1 );
insert into foo_NAMEINFO values ( 100, 100, 110, 'ACCOUNT HOLDER',  'NAMEINFO 100');
insert into foo_NAMEINFO values ( 300, 300, 330, 'ACCOUNT HOLDEr',  'NAMEINFO 300');
insert into foo_NAMEINFO values ( 500, 500, 550, 'ACCOUNT hhOLDER', 'NAMEINFO 500');


commit;


  1  select  t1.ACC_ACCOUNT_ID ,t2.PRE_ACCOUNT_ID
  2  from foo_ACCOUNT t1, foo_PROFILE_ELIGIBILITY t2
  3* where t1.ACC_ACCOUNT_ID = t2.PRE_ACCOUNT_ID(+)
SQL> /

ACC_ACCOUNT_ID PRE_ACCOUNT_ID
-------------- --------------
           300            300
           100
           500
           600
           200            200
           400            400

6 rows selected.


SQL> l
  1  select  t1.ACC_ACCOUNT_ID ,t2.PRE_ACCOUNT_ID,t3.NAM_ACCOUNT_ID
  2  from foo_ACCOUNT t1, foo_PROFILE_ELIGIBILITY t2
  3  ,  foo_NAMEINFO t3
  4  where t1.ACC_ACCOUNT_ID = t2.PRE_ACCOUNT_ID(+)
  5* and t2.PRE_ACCOUNT_ID = t3.NAM_ACCOUNT_ID(+)
SQL> /

ACC_ACCOUNT_ID PRE_ACCOUNT_ID NAM_ACCOUNT_ID
-------------- -------------- --------------
           300            300            300
           100
           500
           600
           200            200
           400            400

6 rows selected.

Now change and t2.PRE_ACCOUNT_ID = t3.NAM_ACCOUNT_ID(+) to and t2.PRE_ACCOUNT_ID1 = t3.NAM_ACCOUNT_ID(+) . Please note t2 column name change.


SQL> 5
  5* and t2.PRE_ACCOUNT_ID = t3.NAM_ACCOUNT_ID(+)
SQL> c/ID/ID1
  5* and t2.PRE_ACCOUNT_ID1 = t3.NAM_ACCOUNT_ID(+)
SQL> l
  1  select  t1.ACC_ACCOUNT_ID ,t2.PRE_ACCOUNT_ID,t3.NAM_ACCOUNT_ID
  2  from foo_ACCOUNT t1, foo_PROFILE_ELIGIBILITY t2
  3  ,  foo_NAMEINFO t3
  4  where t1.ACC_ACCOUNT_ID = t2.PRE_ACCOUNT_ID(+)
  5* and t2.PRE_ACCOUNT_ID1 = t3.NAM_ACCOUNT_ID(+)
SQL> /

ACC_ACCOUNT_ID PRE_ACCOUNT_ID NAM_ACCOUNT_ID
-------------- -------------- --------------
           300            300            300
           200            200
           400            400

SQL> 

Only 3 rows return . Should 6 rows return as previous SQL ?
Also it does not display 3 rows selected either.

Please advise solution for this.

Thank you

September 15, 2007 - 4:31 pm UTC

I cannot reproduce, yes, it should return three rows. Please contact support with your test case...

ops$tkyte%ORA10GR2> select  t1.ACC_ACCOUNT_ID ,t2.PRE_ACCOUNT_ID
  2  from foo_ACCOUNT t1, foo_PROFILE_ELIGIBILITY t2
  3  where t1.ACC_ACCOUNT_ID = t2.PRE_ACCOUNT_ID(+)
  4  /

ACC_ACCOUNT_ID PRE_ACCOUNT_ID
-------------- --------------
           400            400
           600
           200            200
           300            300
           500
           100

6 rows selected.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select  t1.ACC_ACCOUNT_ID ,t2.PRE_ACCOUNT_ID,t3.NAM_ACCOUNT_ID
  2  from foo_ACCOUNT t1, foo_PROFILE_ELIGIBILITY t2
  3  ,  foo_NAMEINFO t3
  4  where t1.ACC_ACCOUNT_ID = t2.PRE_ACCOUNT_ID(+)
  5  and t2.PRE_ACCOUNT_ID = t3.NAM_ACCOUNT_ID(+)
  6  /

ACC_ACCOUNT_ID PRE_ACCOUNT_ID NAM_ACCOUNT_ID
-------------- -------------- --------------
           100
           400            400
           200            200
           300            300            300
           500
           600

6 rows selected.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select  t1.ACC_ACCOUNT_ID ,t2.PRE_ACCOUNT_ID,t3.NAM_ACCOUNT_ID
  2  from foo_ACCOUNT t1, foo_PROFILE_ELIGIBILITY t2
  3  ,  foo_NAMEINFO t3
  4  where t1.ACC_ACCOUNT_ID = t2.PRE_ACCOUNT_ID(+)
  5  and t2.PRE_ACCOUNT_ID1 = t3.NAM_ACCOUNT_ID(+)
  6  /

ACC_ACCOUNT_ID PRE_ACCOUNT_ID NAM_ACCOUNT_ID
-------------- -------------- --------------
           400            400
           500
           100
           600
           300            300            300
           200            200

6 rows selected.

ops$tkyte%ORA10GR2> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.2.0 - Prod
PL/SQL Release 10.2.0.2.0 - Production
CORE    10.2.0.2.0      Production
TNS for Linux: Version 10.2.0.2.0 - Production
NLSRTL Version 10.2.0.2.0 - Production

Paralell table with outer join

ghd, September 11, 2007 - 9:33 pm UTC

Query return 6 rows when adding hint /*+ use_nl (t3) */


SQL> l
  1   select /*+ use_nl(t3) */
  2    t1.ACC_ACCOUNT_ID ,t2.PRE_ACCOUNT_ID,t3.NAM_ACCOUNT_ID
  3   from foo_ACCOUNT t1, foo_PROFILE_ELIGIBILITY t2
  4   ,  foo_NAMEINFO t3
  5   where t1.ACC_ACCOUNT_ID = t2.PRE_ACCOUNT_ID(+)
  6*  and t2.PRE_ACCOUNT_ID1 = t3.NAM_ACCOUNT_ID(+)
SQL> /

ACC_ACCOUNT_ID PRE_ACCOUNT_ID NAM_ACCOUNT_ID
-------------- -------------- --------------
           100
           500
           600
           200            200
           400            400
           300            300            300

6 rows selected.

SQL> 

The above sample is to show the issue with Paralell table with outer join.

we encounter this problem in our DW database.

parallel hint

Florin, October 04, 2007 - 8:10 am UTC

Hi Tom,

I have 2 questions related to the "paralell hints":

1. In the first example I used "parallel 4" but actually I see 8 processes. Why?

select /*+ parallel (PERFORMANCE_IND 4) */ distinct cycle_year from PERFORMANCE_IND;

2. In the second example:

select /*+ parallel (PERFORMANCE_IND 4) */ count(*) from PERFORMANCE_IND;

The execution plan goes by the index PERFORMANCE_IND_1IX (AGRREMENT_NO). Why not FTS?

Please note: the table is partitioned by (cycle_code,cycle_month) having 36 partitions and 17 GB size.

Many thanks in advance!
Florin

Parallel Hint showing absurd behavior

Anand, January 19, 2008 - 8:37 am UTC

Tom,
in what all scenario Parallel hint will behave opposite of its said functionality of performance improvement.

We are actually encountering one situation in our production environment (Oracle 9i CBO) where by removing teh Parallel Hint from some of the create table select ... command is increasing the performance.

thanks,
Anand

January 19, 2008 - 10:54 pm UTC

hah, the false belief that parallel is the silver bullet of "fast=true"

do you have access to either of efficient oracle by design or expert Oracle database architecture? I give many cases whereby parallel is slower than serial.

for parallel to be faster - you have to have a really really really big problem to solve AND a lot of extra resources that would otherwise be idle if we did not use them.

so, if you have a single cpu, not practical probably.
so, if you have few disks, not practical
so, if you have a query that takes a couple of seconds, probably serial is faster, the time to set up parallel would outweigh anything
so, if you have a couple of users running things and a couple of cpus/disks to run them on - probably not practical

and so on.

Parallel Hint showing absurd behavior

Anand, January 21, 2008 - 12:11 pm UTC

Thanks for your reply Tom. Could you please help me in resolving one of the query which is taking approx 40- 50 min on live... I'd tried all options (hints)... attaching Explain Plan for the same...

many thanks,
Anand

EP:

------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| TQ |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------------
| 0 | CREATE TABLE STATEMENT | | 16581 | 2720K| 9462 (1)| | | |
|* 1 | TABLE ACCESS BY INDEX ROWID | S_ASSET_XA | 1 | 32 | 2 (50)| | | |
|* 2 | INDEX RANGE SCAN | S_ASSET_XA_U1 | 18 | | 5 (20)| | | |
|* 3 | TABLE ACCESS BY INDEX ROWID | S_ASSET_XA | 1 | 32 | 2 (50)| | | |
|* 4 | INDEX RANGE SCAN | S_ASSET_XA_U1 | 18 | | 5 (20)| | | |
| 5 | LOAD AS SELECT | | | | | 22,01 | P->S | QC (RAND) |
|* 6 | FILTER | | | | | 22,00 | S->P | RND-ROBIN |

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|* 7 | TABLE ACCESS BY INDEX ROWID | S_ASSET_XA | 1 | 40 | 2 (50)| | | |
| 8 | NESTED LOOPS | | 16581 | 2720K| 9462 (1)| | | |
|* 9 | HASH JOIN | | 1183K| 144M| 2913 (1)| | | |
| 10 | TABLE ACCESS BY INDEX ROWID | S_PROD_INT | 2 | 26 | 2 (50)| | | |
|* 11 | INDEX RANGE SCAN | S_PROD_INT_M52 | 1 | | 2 (50)| | | |
| 12 | NESTED LOOPS | | 5326K| 584M| 2907 (1)| | | |
| 13 | NESTED LOOPS | | 5980K| 359M| 836 (1)| | | |
|* 14 | TABLE ACCESS FULL | S_PROD_INT | 2 | 52 | 2 (50)| | | |
|* 15 | TABLE ACCESS BY INDEX ROWID| S_ASSET | 2658K| 93M| 12085 (1)| | | |
|* 16 | INDEX RANGE SCAN | S_ASSET_U2 | 3986K| | | | | |
|* 17 | TABLE ACCESS BY INDEX ROWID | S_ASSET | 1 | 52 | 2 (50)| | | |

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|* 18 | INDEX UNIQUE SCAN | S_ASSET_P1 | 1 | | | | | |
|* 19 | INDEX RANGE SCAN | S_ASSET_XA_U1 | 18 | | 2 (50)| | | |
------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("S_ASSET_XA"."ATTR_NAME"='LineNumber')
2 - access("S_ASSET_XA"."ASSET_ID"=:B1)
3 - filter("S_ASSET_XA"."ATTR_NAME"='LineNumber')
4 - access("S_ASSET_XA"."ASSET_ID"=:B1)

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
6 - filter(TO_DATE('$LAST_RUN_DATE','DD-MM-YYYYHH24:MI:SS')<TO_DATE('$END_RUN_DATE','DD-MM-YYYYHH24:MI:SS'))
7 - filter("XA"."CHAR_VAL" IS NOT NULL AND "XA"."LAST_UPD">=TO_DATE('$LAST_RUN_DATE','DD-MM-YYYYHH24:MI:SS') AND
"XA"."LAST_UPD"<TO_DATE('$END_RUN_DATE','DD-MM-YYYYHH24:MI:SS') AND "XA"."ATTR_NAME"<>'LineNumber')
9 - access("PARENT"."PROD_ID"="PARPROD"."ROW_ID")
11 - access("PARPROD"."PROD_CATG_CD"='SE')
14 - filter("PROD"."NAME" LIKE '%CNF%')
15 - filter("SYS_ALIAS_2"."PAR_ASSET_ID" IS NOT NULL)
16 - access("SYS_ALIAS_2"."PROD_ID"="PROD"."ROW_ID")
17 - filter("PARENT"."SERIAL_NUM" IS NOT NULL AND "PARENT"."STATUS_CD"='Active')
18 - access("SYS_ALIAS_2"."PAR_ASSET_ID"="PARENT"."ROW_ID")
19 - access("SYS_ALIAS_2"."ROW_ID"="XA"."ASSET_ID")

44 rows selected.

January 21, 2008 - 9:14 pm UTC

hahahah, no, not really.

All I can say is - if it takes 40-50 minutes and is using indexes, it probably shouldn't be using any indexes, so I'd guess you used the old "indexes=> fast=true" thinking and hinted the heck out of the query (which, well, WE CANNOT SEE) to use indexes.

and that it is using parallel query means you must have anticipated it taking minutes (else you would not dream of using that), so how fast is fast here?

Parallel Hint for insert

chintu, May 22, 2008 - 11:13 am UTC

Hi Tom
I used parallel Hint for an insert but it doesn't seem to be using it. Please advice
INSERT /*+ PARALLEL(OJP,DEFAULT) */ INTO PRICING OJP ( STORE, ITEM,
328 WINNING_PRICE, WINNING_PRICE_TYPE, REGULAR_PRICE, CREATE_SIGN_FLAG,
329 COMMENTS, THEME, EVENT_NAME, CREATION_DATE_TIME, TYPE, ACTION_TYPE,
330 START_DATE, END_DATE, POS_CONSUME_DATE, SIGN_CONSUME_DATE,
331 FUTURE_SUBSCRIBER, SRC_TBL_LAST_UPDATE_BY, SRC_TBL_LAST_UPDATE_DATE )
332 VALUES
333 ( :B9 , :B8 , :B7 , :B6 , :B5 , :B4 , NULL, NULL, NULL, :B3 , 1, 'ADD', :B2 ,
334 :B1 , NULL, NULL, NULL, NULL, NULL )
335
336
337 call count cpu elapsed disk query current rows
338 ------- ------ -------- ---------- ---------- ---------- ---------- ----------
339 Parse 1 0.00 0.00 0 0 0 0
340 Execute 823979 80.19 158.51 25156 23912 3417277 823979
341 Fetch 0 0.00 0.00 0 0 0 0
342 ------- ------ -------- ---------- ---------- ---------- ---------- ----------
343 total 823980 80.19 158.51 25156 23912 3417277 823979
344
345 Misses in library cache during parse: 1
346 Optimizer goal: CHOOSE
347 Parsing user id: 998 (JDA_POS) (recursive depth: 1)
348
349 Rows Execution Plan
350 ------- ---------------------------------------------------
351 0 INSERT STATEMENT GOAL: CHOOSE
352
353 ********************************************************************************

May 23, 2008 - 8:07 am UTC

ummmmm

well........

ummm.....

you are inserting a single row there - what do you THINK could happen???

the problem is you are calling this slow by slow, inserting a single row at a time, in other words - you have written some very serial slow code and there isn't much you can do in the database to correct it, all of the work needs be done in the client that issues this stuff

they need to batch it up at the very least

they should be looking to do it in a single sql statement (external table!!! if possible)

chintu, May 23, 2008 - 10:18 am UTC

This insert is happening in a procedure which is getting called once for every item. The reason I can't use bulk insert is because there is lot other processing happening to get other attributes for the table before this insert will happen. Is there a way we can speed-up this kind of processing. Can you give me an example of how a external table can be used to speed it up if possible in this case.

May 23, 2008 - 6:11 pm UTC

.... The reason I can't use bulk insert is because there is lot other
processing happening to get other attributes for the table before this insert
will happen.
.....

that does not mean you cannot batch up the inserts - if you believe it prevents you from batching - tell us why. I can see an easy way to 'batch em up', instead of inserting - call a procedure, let the procedure save up an array, when array filled (say 100-500 rows) - it flushes. When you exit the loop that generates loop, you call procedure to flush final set.

but you know what, this takes 0.001 seconds per row... 0.001 seconds per row...

I seriously doubt this is your bottleneck - even if we improve performance by say 25%, you are only chopping off a couple of seconds.

what is "events in waitclass Other"?

Mike, July 18, 2008 - 10:39 pm UTC

Tom,
I have a large table with 80 million records. I wanted to test the parallel query on the table as comparision to non-parallel which took 2 hrs to do the full table scan.

select /*+ parallel(v, 4) */ * from v_bo v;

I found the 4 sessions had not been using resources, a couple hundreds physical and logcal reads. and found no db locks.
I used v$session_event
TIME
TOTAL TOTAL OUT TIME AVERAGE MAX
EVENT SID WAITS TIMEOUTS PCT WAITED WAIT WAIT
---------------------------- ----- -------- -------- ------ ---------- --------- ---------
PX Deq Credit: need buffer 7287 4 24 5.9 22.0
PX Deq: Execution Msg 7287 1 .4
PX Deq: Msg Fragment 7287 2 2 1.1 2.0
direct path read 7287 11
events in waitclass Other 7287 1819 185 10.2 93491 51.4 196.0
**************************** -------- -------- ----------
sum 1837 185 93517

Those parallel sessions were waiting on
"events in waitclass Other"
I could not find any on the metalink. what heck is that event?
TIA

July 19, 2008 - 6:36 am UTC

other just isn't classified. it is literally just "other", like miscellaneous.

...
select /*+ parallel(v, 4) */ * from v_bo v;
.....

here, the bottleneck will be.....

the client - you have a single thing fetching from this - it almost certainly doesn't make any sense to use parallel - the thing selecting cannot select faster than the rows would be produced, the query is far too 'simple'

select /*+ parallel(v) */ avg(x), sum(y) from v_bo v;

now that, that might benefit from it as the client will not be the bottleneck.

basically, your server wasn't doing very much because the client could not drive it very quickly.

parallel query

Deepak, August 23, 2008 - 11:00 am UTC

Hi Tom,

Have a doubt on parallel query. Though it is a very basic question but I am very confused. Please help.

We have a table which has 12 partitions and almost 10 million rows per partition.

In order to pick any one row from the table I issued the following query.

select * from mytable where rownum<2;

This query almost got stuck and never came out.

When I issued the following query...

select 1 from mytable where rownum<2;

Immediately it returned the result 1. Hope it is because of index.

Lastly we tried the following

select /*+ parallel (mt,12) */ * from mytable mt where rownum<2;

The query immediately returned a row.

Now my doubt is:

Am just finding any ONE row. It could have been returned by reading any block(s) from that table and should not have taken much time.

But having parallel hint it returned the row very fast. What could have been the reason.

Did the query really needed parallel hint?

Please help...

August 26, 2008 - 8:03 pm UTC

...
select * from mytable where rownum<2;

This query almost got stuck and never came out.
.....

that does not sound right, not at all - not unless you have a TON of partitions that USED to have data, but you DELETED (not truncated) it. Is that the case?

slow select

Deepak, August 27, 2008 - 6:08 am UTC

Hi Tom,

Thanks for the response. I could not understand how it can become slower if a lots of deletes have happened. Is it because of fragmentation? If yes, then in my case it is very less.

Also here I am asking oracle to select the first few rows which it could have returned by reading few blocks.

Also what was the role of parallel hint here. Why did that become so fast. Did it try to get the rows from more than one partitions?

What does oracle really do when do a select ... where rownum<3? Will it not fetch any of the available blocks as I am not specifying any other conditions. Am sorry if I am asking something very trivial. Please help in understanding this very basic thing.

August 27, 2008 - 9:47 am UTC

if you insert into a table lots of stuff

and then you delete all of it

and you issue "select * from t where rownum < 2" - it will full scan the entire set of blocks that HAD BEEN USED.

the parallel hint made the IO go apparently faster.

so, answer me this:

did you have lots of partitions that were full, but are now empty and emptied via a delete - so that they have lots of empty blocks (which we can clear out easily with a truncate or shrink space command)

slow select

Deepak, August 27, 2008 - 2:25 pm UTC

Hi Tom,

Thanks for the answer. To answer your question:

No we never delete rows as this is a warehouse table and it gets loaded in the rolling window fashion. We load data into a table and do exchange partition.

Tom I still do not understand why Oracle needs to read all the used blocks in this particular case unless it does not find any row after scanning almost all blocks.

Can it simply not stop at the moment it got the any row?

Hope I am not bothering you and wasting your time asking these silly questions.

August 29, 2008 - 1:03 pm UTC

I'll need to see a tkprof - I cannot understand why or how you are seeing what you say you see. It would not work that way.

If you did not delete lots of data, if the data is loaded nicely as you say it is, then select * from t where rownum < 2 should return instantly.

Parallel process is not getting used.

NT, June 30, 2009 - 2:45 pm UTC

Hi Tom,

I have a query with parallel set on table is 8 when I run on two database. it generated different plan.

first plan
=============
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ
|IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------------------
----------------------
| 0 | SELECT STATEMENT | | 1616 | 318K| 28153 (16)| 00:05:38 |
| | |
| 1 | PX COORDINATOR | | | | | |
| | |
| 2 | PX SEND QC (ORDER) | :TQ10010 | 1616 | 318K| 28153 (16)| 00:05:38 | Q1,10
| P->S | QC (ORDER) |
| 3 | SORT UNIQUE | | 1616 | 318K| 28153 (16)| 00:05:38 | Q1,10
| PCWP | |
| 4 | PX RECEIVE | | 1616 | 318K| 28153 (16)| 00:05:38 | Q1,10
| PCWP | |
| 5 | PX SEND RANGE | :TQ10009 | 1616 | 318K| 28153 (16)| 00:05:38 | Q1,09
| P->P | RANGE |
| 6 | SORT GROUP BY | | 1616 | 318K| 28153 (16)| 00:05:38 | Q1,09
| PCWP | |
| 7 | PX RECEIVE | | 1616 | 318K| 28151 (16)| 00:05:38 | Q1,09
| PCWP | |
| 8 | PX SEND HASH | :TQ10008 | 1616 | 318K| 28151 (16)| 00:05:38 | Q1,08
| P->P | HASH |
|* 9 | HASH JOIN | | 1616 | 318K| 28151 (16)| 00:05:38 | Q1,08
| PCWP | |
| 10 | PX RECEIVE | | 1616 | 239K| 28127 (16)| 00:05:38 | Q1,08
| PCWP | |
| 11 | PX SEND HASH | :TQ10007 | 1616 | 239K| 28127 (16)| 00:05:38 | Q1,07
| P->P | HASH |
|* 12 | HASH JOIN BUFFERED | | 1616 | 239K| 28127 (16)| 00:05:38 | Q1,07
| PCWP | |
| 13 | BUFFER SORT | | | | | | Q1,07
| PCWC | |
| 14 | PX RECEIVE | | 403 | 13299 | 12 (0)| 00:00:01 | Q1,07
| PCWP | |
| 15 | PX SEND HASH | :TQ10003 | 403 | 13299 | 12 (0)| 00:00:01 |
| S->P | HASH |
| 16 | TABLE ACCESS FULL | W_PRODUCT_D | 403 | 13299 | 12 (0)| 00:00:01 |
| | |
| 17 | PX RECEIVE | | 1616 | 187K| 28114 (16)| 00:05:38 | Q1,07
| PCWP | |
| 18 | PX SEND HASH | :TQ10006 | 1616 | 187K| 28114 (16)| 00:05:38 | Q1,06
| P->P | HASH |
|* 19 | HASH JOIN BUFFERED | | 1616 | 187K| 28114 (16)| 00:05:38 | Q1,06
| PCWP | |
| 20 | BUFFER SORT | | | | | | Q1,06
| PCWC | |
| 21 | PX RECEIVE | | 56 | 728 | 3 (34)| 00:00:01 | Q1,06
| PCWP | |
| 22 | PX SEND BROADCAST | :TQ10001 | 56 | 728 | 3 (34)| 00:00:01 |
| S->P | BROADCAST |
|* 23 | VIEW | index$_join$_005 | 56 | 728 | 3 (34)| 00:00:01 |
| | |
|* 24 | HASH JOIN | | | | | |
| | |
|* 25 | INDEX RANGE SCAN | W_ORDER_MD_M6 | 56 | 728 | 1 (0)| 00:00:01 |
| | |
| 26 | INDEX FAST FULL SCAN | W_ORDER_MD_P1 | 56 | 728 | 1 (0)| 00:00:01 |
| | |
|* 27 | HASH JOIN | | 3926 | 406K| 28111 (16)| 00:05:38 | Q1,06
| PCWP | |
| 28 | BUFFER SORT | | | | | | Q1,06
| PCWC | |
| 29 | PX RECEIVE | | 866 | 41568 | 25 (12)| 00:00:01 | Q1,06
| PCWP | |
| 30 | PX SEND HASH | :TQ10002 | 866 | 41568 | 25 (12)| 00:00:01 |
| S->P | HASH |
|* 31 | TABLE ACCESS FULL | W_POSITION_D | 866 | 41568 | 25 (12)| 00:00:01 |
| | |
| 32 | PX RECEIVE | | 13592 | 769K| 28086 (16)| 00:05:38 | Q1,06
| PCWP | |
| 33 | PX SEND HASH | :TQ10005 | 13592 | 769K| 28086 (16)| 00:05:38 | Q1,05
| P->P | HASH |
|* 34 | HASH JOIN | | 13592 | 769K| 28086 (16)| 00:05:38 | Q1,05
| PCWP | |
| 35 | BUFFER SORT | | | | | | Q1,05
| PCWC | |
| 36 | PX RECEIVE | | 1 | 39 | 2 (0)| 00:00:01 | Q1,05
| PCWP | |
| 37 | PX SEND BROADCAST | :TQ10000 | 1 | 39 | 2 (0)| 00:00:01 |
| S->P | BROADCAST |
|* 38 | TABLE ACCESS BY INDEX ROWID| W_LOV_D | 1 | 39 | 2 (0)| 00:00:01 |
| | |

|* 39 | INDEX RANGE SCAN | W_LOV_D_M1 | 8 | | 1 (0)| 00:00:01 |
| | |

| 40 | PX BLOCK ITERATOR | | 51M| 932M| 27863 (16)| 00:05:35 | Q1,05
| PCWC | |

| 41 | TABLE ACCESS FULL | W_ORDERITEM_F | 51M| 932M| 27863 (16)| 00:05:35 | Q1,05
| PCWP | |

| 42 | BUFFER SORT | | | | | | Q1,08
| PCWC | |

| 43 | PX RECEIVE | | 6347 | 309K| 24 (9)| 00:00:01 | Q1,08
| PCWP | |

| 44 | PX SEND HASH | :TQ10004 | 6347 | 309K| 24 (9)| 00:00:01 |
| S->P | HASH |

| 45 | TABLE ACCESS FULL | W_POSITION_D | 6347 | 309K| 24 (9)| 00:00:01 |
| | |

-----------------------------------------------------------------------------------------------------------------------------
----------------------

Predicate Information (identified by operation id):
---------------------------------------------------

9 - access("T31648"."PR_TEAM_POS_WID"="T36585"."ROW_WID")
12 - access("T31648"."PROD_WID"="T32069"."ROW_WID")
19 - access("T31648"."ORDER_ATTR_WID"="T43249"."ROW_WID")
23 - filter("T43249"."STATUS_CD_I"='Complete')
24 - access(ROWID=ROWID)
25 - access("T43249"."STATUS_CD_I"='Complete')
27 - access("T31648"."PR_TEAM_POS_WID"="T295933"."ROW_WID")
31 - filter("T295933"."X_DIVN_TYPE_CD_I"='Aljawal Outlet' OR "T295933"."X_DIVN_TYPE_CD_I"='Shopping Mall Outlet')
34 - access("T31648"."ACTION_TYPE_WID"="T36622"."ROW_WID")
38 - filter("T36622"."NAME"='Modified' OR "T36622"."NAME"='New')
39 - access("T36622"."TYPE"='DELTA_ACTION_CODE')

67 rows selected.

second plan
============
------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 13999 | 2870K| | 5100 (3)| 00:00:14 |
| 1 | SORT ORDER BY | | 13999 | 2870K| 6176K| 5100 (3)| 00:00:14 |
| 2 | HASH UNIQUE | | 13999 | 2870K| | 5100 (3)| 00:00:14 |
| 3 | HASH GROUP BY | | 13999 | 2870K| 6176K| 5100 (3)| 00:00:14 |
| 4 | NESTED LOOPS | | 13999 | 2870K| | 5097 (3)| 00:00:14 |
| 5 | NESTED LOOPS | | 13999 | 2214K| | 5096 (3)| 00:00:14 |
|* 6 | HASH JOIN | | 13999 | 1531K| | 5077 (3)| 00:00:14 |
| 7 | TABLE ACCESS FULL | W_PRODUCT_D | 542 | 22222 | | 10 (0)| 00:00:01 |
|* 8 | HASH JOIN | | 13999 | 970K| | 5066 (3)| 00:00:14 |
| 9 | TABLE ACCESS BY INDEX ROWID | W_ORDER_MD | 82 | 1066 | | 1 (0)| 00:00:01 |
|* 10 | INDEX RANGE SCAN | W_ORDER_MD_M6 | 82 | | | 1 (0)| 00:00:01 |
| 11 | TABLE ACCESS BY INDEX ROWID | W_ORDERITEM_F | 9549K| 610M| | 5063 (3)| 00:00:14 |
| 12 | NESTED LOOPS | | 13999 | 792K| | 5064 (3)| 00:00:14 |
|* 13 | TABLE ACCESS BY INDEX ROWID| W_LOV_D | 1 | 39 | | 1 (0)| 00:00:01 |
|* 14 | INDEX RANGE SCAN | W_LOV_D_M1 | 6 | | | 1 (0)| 00:00:01 |
|* 15 | INDEX RANGE SCAN | W_ORDERITEM_F_F41 | 9549K| | | 65 (16)| 00:00:01 |
| 16 | TABLE ACCESS BY INDEX ROWID | W_POSITION_D | 1 | 50 | | 1 (0)| 00:00:01 |
|* 17 | INDEX UNIQUE SCAN | W_POSITION_D_P1 | 1 | | | 1 (0)| 00:00:01 |
|* 18 | TABLE ACCESS BY INDEX ROWID | W_POSITION_D | 1 | 48 | | 1 (0)| 00:00:01 |
|* 19 | INDEX UNIQUE SCAN | W_POSITION_D_P1 | 1 | | | 1 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

6 - access("T31648"."PROD_WID"="T32069"."ROW_WID")
8 - access("T31648"."ORDER_ATTR_WID"="T43249"."ROW_WID")
10 - access("T43249"."STATUS_CD_I"='Complete')
13 - filter("T36622"."NAME"='Modified' OR "T36622"."NAME"='New')
14 - access("T36622"."TYPE"='DELTA_ACTION_CODE')
15 - access("T31648"."ACTION_TYPE_WID"="T36622"."ROW_WID")
17 - access("T31648"."PR_TEAM_POS_WID"="T36585"."ROW_WID")
18 - filter("T295933"."X_DIVN_TYPE_CD_I"='Aljawal Outlet' OR "T295933"."X_DIVN_TYPE_CD_I"='Shopping
Mall Outlet')
19 - access("T31648"."PR_TEAM_POS_WID"="T295933"."ROW_WID")

Parallel_max_server is set to 135 on first server and second server has 80. Following parameter set on the server
NAME TYPE VALUE
------------------------------------ ----------- ---------
fast_start_parallel_rollback string LOW
parallel_adaptive_multi_user boolean TRUE
parallel_automatic_tuning boolean FALSE
parallel_execution_message_size integer 2152
parallel_instance_group string
parallel_max_servers integer 80
parallel_min_percent integer 0
parallel_min_servers integer 0
parallel_server boolean TRUE
parallel_server_instances integer 2
parallel_threads_per_cpu integer 2
recovery_parallelism integer 0

First plan gives output in less than 2 mins and first plan never comes out after 30 min also.

On second server v$px_process and v$px_session doesn't show parallel query.

Please guide me why parallisum is not being used on another server.

Parallelize select of a query

A Reader, April 25, 2011 - 2:19 pm UTC

Hi Tom,
I have a strange problem. I am trying to parallelize a query, however most of the processing time is in the select
clause that does some data parsing. Since it looks like parallel hints mostly optimize I/O , I am not getting any performance improvement. Is there a way to ensure that the select part of the query will be parallelized too? I tried using java (multithreaded) and then call oracle - but after the I/O was done, the parsing (in select) is still single threaded.
Please suggest any alternatives..
thanks a lot!

April 25, 2011 - 3:04 pm UTC

... Since it looks like parallel hints mostly
optimize I/O , ...

false

your "parsing" is being done in parallel as well.

Show us the SQL, show us the PLAN and erase that java code, if you are doing stuff in java, you are doing it slow. We handle the parallelism for you.

details

A reader, April 26, 2011 - 12:01 pm UTC

Here are the query and the times I got -

SQL> set timing on;
SQL> SELECT /*+ parallel (a,8)*/count(1) FROM dghosh.amit a  WHERE dunsnumber_ext='007923311';
 
  COUNT(1)
----------
     66237
 
Executed in 5.359 seconds
 
SQL> SELECT count(1) FROM dghosh.amit a  WHERE dunsnumber_ext='007923311';
 
  COUNT(1)
----------
     66237
 
Executed in 69.89 seconds

---so far so good - it seems to run faster with parallel.
Now I added my function call that parses the BLOB returned in the select and verifies whether its valid or not:


SQL>  SELECT  app.udt_parser.Fn_Validate('AMIT',a.uidamsinterval,a.starttime,a.stoptime,a.spi,
a.intervalcount,a.valuecodes,a.dstparticipant) FROM dghosh.amit a  WHERE dunsnumber_ext='007923311'
  2  /
66237 rows selected
 
Executed in 1966.453 seconds
 
SQL> 

--now with parallel hint:
SQL>  SELECT /*+ parallel (a,8)*/ app.udt_parser.Fn_Validate('AMIT',a.uidamsinterval,a.starttime,a.stoptime,a.spi,a.intervalcount,a.valuecodes,a.dstparticipant) FROM dghosh.amit a  WHERE dunsnumber_ext='007923311'
  2  / 
66237 rows selected
 
Executed in 1504.796 seconds



------------
I see it runs a bit faster, but not a whole lot.

Explain plan:
for 

SELECT /*+ parallel (a,8)*/ app.udt_parser.Fn_Validate('AMIT',a.uidamsinterval,a.starttime,a.stoptime,a.spi,a.intervalcount,a.valuecodes,a.dstparticipant) FROM dghosh.amit a  WHERE dunsnumber_ext='007923311'

----------------------------------------
SELECT STATEMENT, GOAL = ALL_ROWS   3448 56746 7376980
 PX COORDINATOR     
  PX SEND QC (RANDOM) SYS :TQ10000 3448 56746 7376980
   PX BLOCK ITERATOR   3448 56746 7376980
    TABLE ACCESS FULL DGHOSH AMIT 3448 56746 7376980
---------------
It is a full table scan for the one without parallel. if it is running in parallel(including the parsing) , I am wondering why I am not seeing any significant perf improvement(as opposed to when I am just selecting unprocessed columns from the table).
We are running 10.2.0.5.0

thanks a lot for your reply.

April 26, 2011 - 2:51 pm UTC

is your function parallel enabled

is your function even remotely "optimal"

do you have 8 cores that all belong to you

parallel query

A reader, April 28, 2011 - 12:13 pm UTC

thank you so much for the parallel_enable tip.Nobody here was aware of that clause.
Our pipeline function was not parallel enabled. When I did that along with adding the session_id from your other example it worked beautifully! It is taking roughly 25% of the time it was taking originally.
The only question I have is, does my function really has to be a pipeline function in order to parallelize it? or can I apply it to other functions that are not pipeline too?

thank you so much again. you are sucha saviour!

April 28, 2011 - 12:27 pm UTC

there are three types of functions to consider:

select f(x) from t;

select agg(x) from t group by y;

select * from table( z(...) );

You can parallel enable 'normal' functions called in where/select lists. This restricts what your function can do (access to package global variables is restricted for example)

You can parallel enable a user defined aggregate function of your own making (as long as you wrote the supporting code!)

You can parallel enable a pipelined function - as you have.

parallel_query

A reader, April 28, 2011 - 2:57 pm UTC

thank you for your response. for the normal function to be parallelized, do I still need to have a cursor as an input?
I am trying normal parameterized input and it doesnt seem to parallelize.

thanks!

April 29, 2011 - 7:56 am UTC

I'm not understanding this question.

what is a "normal function" (that implies there are 'abnormal' functions - what are those)

Be a tad more explicit in what you are trying to do.

If you mean:

I'm trying to execute a pipelined function in parallel - like "select * from table( my_function( .... ) )" - and unless the inputs are a CURSOR - it won't go in parallel...

Then you are correct, think about it. How could we execute a function in parallel, if we cannot split up the inputs? We can take a cursor and have your function work on different bits of the result set. We cannot take your function that just takes the number "42" as an input and parallelize it - there is nothing to break into bits and send it to more than one copy of your routine.

http://docs.oracle.com/docs/cd/E11882_01/appdev.112/e10765/pipe_paral_tbl.htm#i1004978

pipeline functions with mixed cursor/parameters

Gary Schulte, April 29, 2011 - 3:13 pm UTC

> Then you are correct, think about it. How could we execute
> a function in parallel, if we cannot split up the inputs?
> We can take a cursor and have your function work on
> different bits of the result set. We cannot take your
> function that just takes the number "42" as an input and
> parallelize it - there is nothing to break into bits and
> send it to more than one copy of your routine.

If the inputs are a cursor and one or more scalar values, it should be only a matter of pushing those values into to each parallel thread. "42" makes perfect sense as an input to a parallel function in concert with a cursor

So, is there some provision to mix scalar values with a cursor when trying to parallelize a pipeline function?

TIA

May 01, 2011 - 8:15 am UTC

Yes there is, as long as the function has exactly ONE ref cursor with the partition clause - it can be parallelized

if you run this:

create table t1
as
select object_id id, object_name text
  from all_objects;
begin
    dbms_stats.set_table_stats
    ( user, 'T1', numrows=>10000000,numblks=>100000 );
end;
/
create table t2
as
select t1.*, 0 session_id
  from t1
 where 1=0;

pause


CREATE OR REPLACE TYPE t2_type
AS OBJECT (
 id         number,
 text       varchar2(30),
 session_id number
)
/
create or replace type t2_tab_type
as table of t2_type
/
pause

create or replace
function parallel_pipelined( l_cursor in sys_refcursor, l_some_number in number )
return t2_tab_type
pipelined
parallel_enable ( partition l_cursor by any )
is
    l_session_id number;
    l_rec        t1%rowtype;
begin
    select sid into l_session_id
      from v$mystat
     where rownum =1;
    loop
        fetch l_cursor into l_rec;
        exit when l_cursor%notfound;
        -- complex process here
        pipe row(t2_type(l_rec.id,to_char( l_some_number ) || '-' || substr( l_rec.text,1,20) ,
                         l_session_id));
    end loop;
    close l_cursor;
    return;
end;
/
pause

alter session enable parallel dml;
insert /*+ append */
into t2(id,text,session_id)
select *
from table(parallel_pipelined
          (CURSOR(select /*+ parallel(t1) */ *
                    from t1 ), 42
           ))

/
commit;
select session_id, count(*)
  from t2
 group by session_id;

you'll see in t2 afterwards something like this:

ops$tkyte%ORA11GR2> select session_id, count(*)
  2    from t2
  3   group by session_id;

SESSION_ID   COUNT(*)
---------- ----------
        11       8806
       132       8475
        68      10535
       135       9860
        69       9412
       130      10747
       197       6801
         9       7662

8 rows selected.

which indicates it was done in many parallel sessions.

Parallel Query - 11gr2

Rajeshwaran, July 11, 2013 - 8:29 am UTC

Tom:

I was reading about parallel query execution from docs. When running below query in 11.2.0.1 (table b1 is set to parallel(4) and b2 is set to noparallel). but I see 8 px session in v$px_session. Can you help me to understand why 8 px_session when requested DOP is only 4?

Session#1

rajesh@ORA11G> select userenv('sid')
  2  from dual ;

USERENV('SID')
--------------
            15
rajesh@ORA11G> @d:\script.sql

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------
SQL_ID  4h8xdzwcuf320, child number 0
-------------------------------------
select /*+ parallel(b1,4) noparallel(b2) */ b1.object_name,
b2.object_type from big_table b1,      big_table b2 where b1.id = b2.id

Plan hash value: 2664905929

------------------------------------------------------------------------------------------
| Id  | Operation               | Name      | Rows  | Bytes |    TQ  |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |           |       |       |        |      |            |
|   1 |  PX COORDINATOR         |           |       |       |        |      |            |
|   2 |   PX SEND QC (RANDOM)   | :TQ10002  |    10M|   442M|  Q1,02 | P->S | QC (RAND)  |
|*  3 |    HASH JOIN BUFFERED   |           |    10M|   442M|  Q1,02 | PCWP |            |
|   4 |     BUFFER SORT         |           |       |       |  Q1,02 | PCWC |            |
|   5 |      PX RECEIVE         |           |    10M|   143M|  Q1,02 | PCWP |            |
|   6 |       PX SEND HASH      | :TQ10000  |    10M|   143M|        | S->P | HASH       |
|   7 |        TABLE ACCESS FULL| BIG_TABLE |    10M|   143M|        |      |            |
|   8 |     PX RECEIVE          |           |    10M|   295M|  Q1,02 | PCWP |            |
|   9 |      PX SEND HASH       | :TQ10001  |    10M|   295M|  Q1,01 | P->P | HASH       |
|  10 |       PX BLOCK ITERATOR |           |    10M|   295M|  Q1,01 | PCWC |            |
|* 11 |        TABLE ACCESS FULL| BIG_TABLE |    10M|   295M|  Q1,01 | PCWP |            |
------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("B1"."ID"="B2"."ID")
  11 - access(:Z>=:Z AND :Z<=:Z)

Session#2

rajesh@ORA11G> select qcsid,sid,serial#,qcsid,server_group,server_set,degree,req_degree
  2  from v$px_session
  3  where qcsid = 15 ;

     QCSID        SID    SERIAL#      QCSID SERVER_GROUP SERVER_SET     DEGREE REQ_DEGREE
---------- ---------- ---------- ---------- ------------ ---------- ---------- ----------
        15         17        320         15            1          1          4          4
        15        142        436         15            1          1          4          4
        15         21         97         15            1          1          4          4
        15        144         23         15            1          1          4          4
        15         14        248         15            1          2          4          4
        15        141        676         15            1          2          4          4
        15         16        275         15            1          2          4          4
        15        148          4         15            1          2          4          4
        15         15        671         15

9 rows selected.

http://docs.oracle.com/cd/E11882_01/server.112/e25523/parallel002.htm#autoId2

July 16, 2013 - 3:46 pm UTC

see http://docs.oracle.com/cd/E11882_01/server.112/e25523/parallel002.htm#sthref862

Parallel Query

Breadcrumb

Question and Answer

You Asked

and Tom said...

Rating

Comments

More doubts

Parallel query

Parallel Query

Parallel Query

Another problem.

Inspite of Noparallel Hint Parallel Query is executed.

Parallel Query

cpu utilization

Quad Xeon in above post

Multiple CPUs and utilization

Windows is spreading the utilization across

parallel query

follow up to oracle.exe across multiple cpus

Spawning multiple Jobs for parallel processine

parallel and free lists

sqlloader

Parallel query problem

more parallel slaves,

follow up,

thanks

makes sense,

hash join,

Checkpoints

Segment Checkpoint

Parallel hint not used

Parallel processes

relation between parallel query and db_file_multiblock_read_count

Parallel query problem

relation between parallel query and db_file_multiblock_read_count

why?

parallel query

parallel query and direct io

How do you determine the degree of parallelism

Follow-up to May 2005 - checkpoints

why does automatic parallel query happen?

Prallel exectuion in 9i better then in 10gr2

Comparing apples with toaster ovens :(

Parallel Response

Parallel Response

PX count

Very well explained

Parallel Query

HOW to know what you have set?

HOW to know?

Parameters set by us

Here is the output ...

Query

Parallel Execution

performance issue fixed by setting parallel_max_servers to zero

No PARALLEL?

||

wow

parralell witn outer join

Paralell table with outer join

parallel hint

Parallel Hint showing absurd behavior

Parallel Hint showing absurd behavior

Parallel Hint for insert

what is "events in waitclass Other"?

parallel query

slow select

slow select

Parallel process is not getting used.

Parallelize select of a query

details

parallel query

parallel_query

pipeline functions with mixed cursor/parameters

Parallel Query - 11gr2

More to Explore

PL/SQL demos

PL/SQL docs