Thanks for the question, Nag.

Asked: October 18, 2001 - 1:04 pm UTC

Last updated: October 25, 2010 - 5:38 pm UTC

Version: 8.1.7

Viewed 10K+ times! This question is

You Asked

Tom

We have 10 cpu's on our production server, we have the following concerns, and missing pieces of knowledge before we start using parallel query on some select tables.(I have got the go ahead for using PQ ,but only after doing the research and proving the results.)

1. What is the max degree of parallelism for a single cpu.
2. How do we arrive at the ideal set up values for
parallel_max_servers
parallel_min_servers
3.What are the max values for the above parameters.
4.I have 10 cpu's and my parallel max servers is set at 40. Now if I give a parallel degree of 10 for table t( with full hint), how many cpu's will be used to privide the parallelism of 10.
5.How can we find out how many cpu's are being for a particular query.
6.One of the concerns is that if we start using parallel query, some of the other processes and applications running, will not get any cpu time. Iam not convinced, how can this be true or false.
7.How can I arrive at a degree of parallelism, where in my process uses only 4 cpu's out of 10 , and leave the other 6 alone.
8. 'A PQ degree of 1 will use 1 cpu, a pq degree of 2 will use 2 cpu'
this statement is false , can you explain why.

The answers to the above will be very crucial to us.

Thank you very much.

and Tom said...

1) 3,599

2) How many processes do you want going at the same time? Probably no more then 100. Min servers -- upto you totally. start at zero and let them go up from there.

3) max, 0..3599
min, 0..max

4) it'll use 10 processes -- the OPERATING system is responsible for scheduling the processes. It'll use whatever it feels like. Don't forget, when they are doing IO, they are not on a CPU -- a full scan is not a CPU bound thing, its IO bound. For many operations, don't expect your CPU monitors to light up like a christmas tree or anything.

5) system management tools. You can see in the database how many PROCESSES (v$session) but they aren't all active at the same time.

6) this can be true. You might want to get Jonathan Lewis's book "practical Oracle8i -- building efficient databases". I think he put it beautifully when he described parallel query as a "non scalable operation". It does not scale up with users (more users, less scalability). It gives you the ability to consume the ENTIRE machine with your one query. It is designed to use EVERYTHING, it is very selfish.

7) use a parallel hint with degree = 4. Again, you won't starve them on CPU as much as you'll hit the disks and light them up like crazy. You might be looking at the wrong resource.

8) See the above answers. While one process is reading the other could be reading as well (we are using 0 cpu at that point). We could be using 0, 1 or 2 -- its up to the SCHEDULER in the OS.

Rating

(72 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Parallel Query

George Spears, October 18, 2001 - 3:08 pm UTC

Tom,

Just a general comment on question 4. Assuming that the table degree of parallelism is set to 10, either 10 or 20 processes / threads may be used, 10 for the reading, and an additional 10 if the query is doing a sort operation. I would assume the behaviour in this circumstance is the same, regardless of the degree of parallelism being set at the table or in a hint.

Comments?

October 18, 2001 - 3:33 pm UTC

Ok, just to be pedantic, it could use more then 20 (i was keeping it simple and trying to point out a process != CPU)

Each step in the query that feeds another step can be done at the same time and in parallel as well. if the query were a full scan AND a sort, it could be the 10 scanners feed to the 10 sorters feed to the one coordinator.

If you look at:
</code> http://docs.oracle.com/docs/cd/A81042_01/DOC/server.816/a76965/c22paral.htm#12937 <code>

you can see that if I did a join of two tables each parallel 10, I might have upto 40 parallel query "slaves" and one coordinator (in that example).

A reader, October 18, 2001 - 3:10 pm UTC

Tom, kindly distinguish between

"a full scan is not a CPU bound thing, its IO
bound"

CPU bound and IO bound

1.Give examples of CPU bound operations and IO bound operations.
2.( how come an operation which is IO bound i.e. reading disks, does not use CPU).

3.When you say process
is 1 process= 1 CPU

4.Can you explain the role of SCHEDULER in the OS, how does it distribute cpu between queries.

October 18, 2001 - 3:25 pm UTC

1) when you SORT data in memory, that eats a ton of CPU.
when you read 5 gig of data from disk, that'll spin your disks.

(i'm at a loss to come up with "simpilier" examples -- things that read the disk = IO bound. things that process data = CPU)

2) the storage device does that, your SCSI controller gets the read request and does the work. There are drive technologies where this is not 100% true (like IDE drives) but you wouldn't use them on a database machine.

3) no, i thought that would be clear. A process is a process is a process. On a single CPU machine, you can have hundreds of processes running. On a 10 cpu machine you can have hundreds of processes. A cpu is a resource (like a printer, or disk space) that a process will reques an the OS will coordinate access to.

4) beyond the scope of asktom -- thats a whole class in computer science ;)

Linear performance gain

john, May 14, 2002 - 4:56 am UTC

Tom,

Many people from Oracle in their books say that one can get almost linear performance gain out of OPQ.
I've used every single thought that came on my mind
and couldn't receive even 50% gain.
I used one tablespace with 2 datafiles in 3 configurations, and those are:
1. different disks
2. different disks and on different controllers
3. different raid storages through one scsi controller

What am i missing to gain the result?

May 14, 2002 - 6:09 am UTC

operative word "CAN get".

You would optimally have separate devices (to reduce contention).
You would be doing something that can easily be sped up (full scans, index builds).
You would have more then 1 cpu.You would have sufficient ram.
You would have sufficient spare resources to begin with (eg: if the machine is overloaded doing serial processing -- PQ will make it run slower)

You will make sure that the PQ slaves do different types of work at the same time. For example -- I see people do index rebuilds in parallel with HUGE sort area sizes. Well -- each PQ slave tends to read a ton, sort a ton, read a ton, sort a ton. A smaller sort area size would help them get out of "sync" -- instead of all of them reading together then sorting then reading then sorting -- they will start to interleave so that some are reading, some are sorting (some using disk, some using cpu at the same time).

A reader, May 14, 2002 - 8:22 am UTC

We are using RULE base optimizer.

Can we take advantage of parellel query option in oracle.

Thanks.

May 16, 2002 - 9:24 am UTC

Yes, but it'll be using CBO *not* RBO as soon as you start using it. (hence you'll want statistics).

A reader, November 22, 2002 - 11:44 am UTC

Hi Tom,

What do you infer from the following info?

SQL> select name, value from v$sysstat where name like 'Parallel%';

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
Parallel operations not downgraded                                     4321
Parallel operations downgraded to serial                               2163
Parallel operations downgraded 75 to 99 pct                             544
Parallel operations downgraded 50 to 75 pct                            2063
Parallel operations downgraded 25 to 50 pct                            3960
Parallel operations downgraded 1 to 25 pct                             2006

6 rows selected.

SQL> sho parameter parallel

NAME                                 TYPE    VALUE
------------------------------------ ------- ------------------------------
fast_start_parallel_rollback         string  LOW
optimizer_percent_parallel           integer 0
parallel_adaptive_multi_user         boolean TRUE
parallel_automatic_tuning            boolean TRUE
parallel_broadcast_enabled           boolean FALSE
parallel_execution_message_size      integer 16384
parallel_instance_group              string
parallel_max_servers                 integer 200
parallel_min_percent                 integer 0
parallel_min_servers                 integer 0
parallel_server                      boolean FALSE
parallel_server_instances            integer 1
parallel_threads_per_cpu             integer 4
recovery_parallelism                 integer 0
SQL> 

Thanks.

November 22, 2002 - 12:46 pm UTC

such a large downgraded to serial indicates that lots of your operations were done serially instead of in parallel. That is, your system was maxed out -- tons of people trying to do parallel query that couldn't.

given that you have max server at 200 i would say that you have a database with lots of concurrent DW queries and are maxing it out.

A reader, November 22, 2002 - 1:28 pm UTC

Thanks. Now what would you say if my DBA or manager does not really care when I pointed this out?

This is a warhouse with about 800 GB data on a Sun10K domain (24 CPU). There are about 20-50 concurrent users all the time. We are using parallel_automatic_tuning. But the tables still have a DOP set. Will it help if we keep the DOP on tables to default and let oracle decide? Also, do you think the parallel_max_servers can be higher (to something like 800-1000) or since we have parallel_threads_per_cpu at 4, will the actual max_servers be 4*200=800?

How can I find more info on the details of parallel query? (what to look for in v$pq_systat etc..). Thank you for your time.

November 22, 2002 - 2:52 pm UTC

50 users -- parallel 4 (little) -- you are out of power.

I would say -- if performance is acceptable - it is doing what it is supposed to be doing (backing off on resources as resources become scarce).

I might remove the DOP on the objects and just make the parallel. Be prepared to put them back as they are (just in case).

200 parallel, active processes on a 24 cpu machine is already getting up there, I would not think HIGHER no.

A reader, November 23, 2002 - 9:37 am UTC

Oracle suggests parallel_max_servers as 2xDOPxmax_concurrent_users. How do you relate the no. of cpus to parallel_max_servers? (ie how do you say that a parallel_max_servers of 200 is high for a 24 CPU box? I could not find a relation between no of cpus and parallel max servers in Oracle Doc). How do we monitor the system load when we have 200 parallel slave processes running (at o/s level what to look for? Its solaris 2.8).

You said if performance is ok then it is doing the right thing. But performance is not OK. If we have more than 20 users, the system slows down. It has a scalability limitation which we want to identify and fix. Thank you very much for your kind help on this.

November 23, 2002 - 9:50 am UTC

(where is that guidance, I can't look at things out of context -- I would bet there is a caveat lurking somewhere near that sentence that says "of course, the number of CPU's had better be considered")

If you have 200 processes all going active on a 24 cpu box -- tell me -- what is going to happen? can you spell "backed up run queue". You can use vmstat, sar, top -- any number of OS utilities to monitor this.

As for the last paragraph -- I would expect it to slow down. You only have but 24 CPU's. You let users use lots more then one at a time. Parallel query is NOT A SCALABLE solution (i believe Jonathan Lewis said it best in his book Practical Oracle8i). PQ was designed to let a few users totally consume a machine. As you add more and more users -- you have less and less resources -- hence by definition I would expect things to run slower and slower.

Now the question is -- given the resources you have (24 cpus) -- and more then 20 users -- what is the fair and equitable ratio of USERS to CPUS. It is getting nearer and nearer to 1:1 as the number of users approaches and then exceeds the number of CPUs.

Think what would happen if you got 50 users doing parallel 10? 500 concurrently execution processes doing CPU grabbing and context switching all over the place.

This is exactly why connection pooling, connection concentration, connection multiplexing was invented -- to reduce the amount of concurrently executing stuff as the machine gets swamped.

20 users doing parallel query on a 24 cpu machine cannot reasonably expect the performance that same machine gives 1 user, 5 users.

What you should be looking for is a predicable, near linear change in response times as concurrency goes up and demand for these shared resources increases.

A reader, November 23, 2002 - 1:13 pm UTC

Thank you for the fast response.

We have plans of moving out to a bigger machine. But till then, we have to somehow improve things on this machine. We are planning to use dbms_resource_manager to limit unimportant users from hogging the resources. Isn't it the right thing to do? Any caveats?

Also I am just wondering, to support a DW system with about 1TB of data and around 30-50 concurrent users, what would be the ideal system config? Should something like 48 cpus and 96GB memory (that is what we are planning to upgrade to) would be enough?. It sounded like an overkill to me first, but I realise now may be I am wrong. As you said parallel query is not a scalable solution. Curious to know first hand experience from anyone running a really busy DWH out there!

Your valuable insights are apprecited as always!

November 23, 2002 - 1:37 pm UTC

using the dbms_resource manager would be a way to prioritize yes, you could limit the DOP by user groups -- making some users parallizable and others -- not.

You SHOULD also look at just POT -- plain old tuning. Parallel query is great at "big things" -- well, are there frequently asked questions of your DW that you could pre-answer via materialized views and using query rewrite knock off the need for parallel query all over the place (rhetorical question, I'm confident the answer is YES).

Have you read the data warehousing guide?
</code> http://docs.oracle.com/docs/cd/A87860_01/doc/server.817/a76994/toc.htm <code>

look at MVs, partitioning, star transformations and other traditional DW features in order to reduce or entirely REMOVE the need for parallel query.

If by 30-50 concurrent users you really mean 30-50 concurrent DW Queries all wanting to use parallel operations for "fast=true", you will need an truly big machine. You could consider that instead of a multi-million dollar piece of hardware, you look at commidity servers and use something like 9iRAC to spread the load out.

A reader, November 24, 2002 - 12:53 pm UTC

You are very correct in your observations... No doubt, a lot can be achived by making use of the good old tuning techniques. We are infact using partitioning, star schema and MVs. But we can sure do a lot more.

A very interesting point you have made regarding RAC. If I were the one to make the decision, I would have considered it seriously. But I feel, it will take few more years before it will be accepted across corporate america as the platform of choice for deploying critical applications.

Thanks again for your time.

November 24, 2002 - 1:05 pm UTC

just like SMP in the early 90's -- I remember the project I was working on, having heated discussions over whether to go with a 2 cpu vs 1 cpu machine. Not even a consideration anymore....

PQ is a big headache

Peter Tran, April 01, 2003 - 8:20 pm UTC

We had a situation where we installed oracle 9i and it had as default:

parallel_max_servers=40
parallel_min_servers=0

We used the same base code that we had been developing against our 8.1.7 database to test against this new 9i instance. Performance was abysmal.

We couldn't figure out why until we reset the values to 0. When we disabled the PQ parameters, performance returned to acceptable (i.e. 8i) level.

Does ORACLE documentation even mention that PQ is a non-scalable solution?

If you set both parallel_max_servers and parallel_min_servers to 0, does that mean PQ is forever turned off? Can you enable PQ with hints when these two values are set to 0?

Other than the trial and error with the init.ora parameters, what other methods can be used to determine that enabling PQ is not appropriate? We looked at a bunch of reports, but nothing said "Disable PQ!" Is there any reports or query you can recommend running against the V$ tables that would indicate that enabling PQ is bad?

Thanks and keep up the good work!

April 01, 2003 - 8:45 pm UTC

Umm, you had to have had parallel on your tables there then. You ASKED us to do this. You changed the init.ora settings, after having set all or some of your tables to parallel -- and well, the system performed differently.

And yes, parallel query (unless you are using the parallel automatic stuff) is by design "non scalable". It's very design allows -- is supposed to -- let a single user completely nail the machine.

If you have max set to zero, its off. You have parallel set on your tables. If you want to use it with hints -- you would put the default DOP (degree of parallelism) back on the tables -- set max to whatever and hint when you wanted to use it.

Benchmarking, understanding how it works, understanding what you are asking the system to do and for how many users.

PQ is not a big headache -- no more then a hammer is to a carpenter. It is a tool and all tools, when inappropriately applied cause headaches. Ever try to put a nail in with a stapler?

You're right

Peter Tran, April 09, 2003 - 9:20 pm UTC

Okay, I stand corrected. PQ isn't a headache. It only is when you don't understand what's going on underneath the hood.

When thing that puzzled us is we had a situation where the parallel_min_servers and parallel_max_servers were set to the same values (e.g. min=0 and max=40) on both 8i and 9i instances (assuming all other configuration parameters being equal), we noticed using the same table/index creation script (with the parallel keyword), the performance on the 9i instance was much worse. Using the same code base to benchmark the difference, we noticed a 5x performance degradation with the 9i instance. If we set the parallel_min_servers=0 (affectively disabling PQ), the 9i instance performance matched that of the 8i instance.

From your reply above, we could achieve the same result by removing the parallel keyword from our table/index creation script without having to disable PQ for the 9i instance.

I noticed that 8i has the optimizer_percent_parallel parameter that has been deprecated in 9.0.1. If this setting is left at the default 0 on the 8i instance, does this mean that this affectively disables PQ on an 8i instance even when the parallel_min_servers > 0?

From the ORACLE 8i documentation:

</code> http://download-west.oracle.com/docs/cd/A87860_01/doc/server.817/a76961/ch1125.htm#66146 <code>

<quote>
OPTIMIZER_PERCENT_PARALLEL specifies the amount of parallelism that the optimizer uses in its cost functions.

The default of 0 means that the optimizer chooses the best serial plan. A value of 100 means that the optimizer uses each object's degree of parallelism in computing the cost of a full table scan operation. Low values favor indexes, and high values favor table scans.
</quote>

Thank you for your help.

A reader, July 02, 2003 - 2:55 pm UTC

Hi Tom,

We are running RULE base optimizer. Can we use Parallel Query or any other parallel stuff in Rule base Optimizer?

Thanks

July 02, 2003 - 3:12 pm UTC

parallel requires the use of the CBO.

according to James Morle´s book

pingu_san, July 03, 2003 - 6:52 am UTC

Hi

I am reading James Morle´s Scaling Oracle 8i book he states that

a CPU can only run a process

This was true before several years ago, is it still true now days???

July 03, 2003 - 9:11 am UTC

add "at a time"

a cpu can only run a process at a time. true they are getting into pipelining and such but in general "a cpu can only run a process at a time".

the registers and all have to be setup for that process, then a context switch to another process and so on.

Why parallel query ?

Arun Gupta, January 05, 2004 - 4:24 pm UTC

Tom,
We have a production OLTP database. We have set parallel_max_servers=5 to speed up some DBA operations on staging tables. However, all the OLTP tables and indexes have the NOPARALLEL option specified. During the day, we see lot of application user's queries being parallelized even though they are being fired against noparallel tables/indexes. We do not understand what is causing this. There is no parallel hint in any of the queries. The only objects with degree>1 are in the staging area. Application users do not have access to this area. We do not want OLTP queries to be parallelized.
Thanks.

January 06, 2004 - 7:58 am UTC

versions?

show parameter parallel output?

select table_name, degree from user_tables where degree <> 1 or instances <> 1

and please verify NONE of your tables pop up.

Why parallel query ?

Arun Gupta, January 06, 2004 - 3:33 pm UTC

We are on Oracle version 9.2.0.4. The outputs are:
SQL> show parameter parallel;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
fast_start_parallel_rollback         string      LOW
log_parallelism                      integer     1
parallel_adaptive_multi_user         boolean     FALSE
parallel_automatic_tuning            boolean     FALSE
parallel_execution_message_size      integer     2152
parallel_instance_group              string
parallel_max_servers                 integer     5
parallel_min_percent                 integer     0
parallel_min_servers                 integer     0
parallel_server                      boolean     FALSE
parallel_server_instances            integer     1

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
parallel_threads_per_cpu             integer     2
recovery_parallelism                 integer     0
SQL>

SQL> select table_name, degree from user_tables where degree <> 1 or instances <> 1
  2  ;

no rows selected

SQL>

SQL> ed
Wrote file afiedt.buf

  1* select table_name, degree from dba_tables where degree <> 1 or instances <> 1
  2  /

no rows selected

SQL>

Thanks

January 06, 2004 - 4:07 pm UTC

well, you said SOME of the tables were parallel no? where are they?

I have (given the information I have) nothing postive for you here -- All I can say is "something is amiss"...

can you tell me how you are seeing that the queries are being done in parallel during the day -- give me an example output that you are seeing that leads you down that path.

Why parallel query ?

Arun Gupta, January 06, 2004 - 5:28 pm UTC

Between the time I posted the question and generated the output, my colleague DBA switched all tables to noparallel to eliminate the problem. We are still seeing the parallel queries. I ran the following script from Metalink:
SELECT
decode (px.qcinst_id,
null,
username,
' - '||lower(substr(s.program,
length(s.program)-4,
4) ) ) "Username",
decode(px.qcinst_id,
null,
'QC',
'(Slave)') "QC/Slave",
to_char(px.server_set) "Slave Set",
to_char(s.sid) "SID",
decode(px.qcinst_id,
null,
to_char(s.sid),
px.qcsid) "QC SID",
px.req_degree "Requested DOP",
px.degree "Actual DOP"
FROM
v$px_session px,
v$session s
WHERE
px.sid=s.sid(+) and
px.serial#=s.serial#
order by 5,1 desc;

The results show us that the Requested DOP is 8 but the Actual DOP is 5, which is correct since we have 5 parallel servers. The only place 8 appears is that one of tables referenced in the query has 8 partitions. Though not likely, is it possible that presence of partitions is causing parallelism?
Thanks

January 06, 2004 - 8:40 pm UTC

WHAT SQL is actually being done in parallel?

Arun Gupta, January 08, 2004 - 8:57 am UTC

The DOP on all indexes is 1. The difference is that for two indexes, the value of instances is DEFAULT. For all other indexes, the value of instances=1.

Table Index Degree Instances
T_RESP T_RESP_NBR_CTL_DOC_INDEX 1 DEFAULT
T_RESP T_RESP_PR_DCN_INDEX 1 DEFAULT

January 08, 2004 - 2:17 pm UTC

that is what is doing it then.


consider:

ops$tkyte@ORA817DEV> create table t ( x int not null );
 
Table created.
 
ops$tkyte@ORA817DEV> create index t_idx on t(x);
 
Index created.
 
ops$tkyte@ORA817DEV>
ops$tkyte@ORA817DEV> exec dbms_stats.set_table_stats( user, 'T', numrows => 1000000, numblks => 100000 );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA817DEV>
ops$tkyte@ORA817DEV> set autotrace traceonly explain
ops$tkyte@ORA817DEV> select count(*) from t where x > 0;
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=13)
   1    0   SORT (AGGREGATE)
   2    1     INDEX (FAST FULL SCAN) OF 'T_IDX' (NON-UNIQUE) (Cost=1 Card=50000 Bytes=650000)
 
 
 
ops$tkyte@ORA817DEV> alter index t_idx parallel ( degree 1 instances default );
 
Index altered.
 
ops$tkyte@ORA817DEV> select count(*) from t where x > 0;
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=13)
   1    0   SORT (AGGREGATE)
   2    1     SORT* (AGGREGATE) :Q357000
   3    2       INDEX* (FAST FULL SCAN) OF 'T_IDX' (NON-UNIQUE) (Cost=1 Card=50000 Bytes=650000) :Q357000
 
 
   2 PARALLEL_TO_SERIAL            SELECT /*+ PIV_SSF */ SYS_OP_MSR(COUNT(*)) F
                                   ROM (SELECT 0 FROM :I."T"."T_IDX"  PX_GRANUL
                                   E(0, BLOCK_RANGE, DYNAMIC) A2 WHERE A2."X">0
                                   ) A1
 
   3 PARALLEL_COMBINED_WITH_PARENT
 
<b>
ops$tkyte@ORA817DEV> alter index t_idx parallel ( degree 1 instances 1 );
 
Index altered.
 </b>
ops$tkyte@ORA817DEV> select count(*) from t where x > 0;
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=13)
   1    0   SORT (AGGREGATE)
   2    1     INDEX (FAST FULL SCAN) OF 'T_IDX' (NON-UNIQUE) (Cost=1 Card=50000 Bytes=650000)


ops$tkyte@ORA817DEV> set autotrace off

parallel automatic tuning parameter

Vinnie, January 08, 2004 - 9:41 am UTC

Tom,

If I set the automatic tuning parameter=TRUE
& create a table using the parallel clause
what can I except the effect on other users when
a single user is quering this table constantly?

Does is make a difference if that table has a BLOB?

thanks

January 08, 2004 - 2:42 pm UTC

you can expect to see the DOP go up and down on this table as the load goes up and down (this is the purpose of the auto part -- to increase/decrease the resources used by PQ in reaction to the current load)

I cannot "quantify" what will happen to the other users -- not quite possible.

blobs over 4k are stored out of line, so they will not necessarily impact full table scans.

Arun Gupta, January 08, 2004 - 4:05 pm UTC

What is "instances" and why a value of default causes parallel query? We have the database mounted by single instance only, no RAC.
Thanks...

January 08, 2004 - 8:15 pm UTC

instances = number of instances to parallelize the query over.

In Oracle -- you can have more than one database instance open and mount a single database (set of files).

So, you could have 4 computers in a cluster -- each running Oracle -- against a single database.

Now, along comes query -- we can parallelize the query in one of three ways:

a) on a single machine, in parallel
b) across ALL FOUR machines -- each machine doing serial
c) across ALL FOUR machines -- each machine further doing its part in parallel

Someone at your site explicitly set this, explicitly set the instances. This trips off parallel query capabilities and with parallel automatic tuning, well -- the rest as they say is history

Just like default degree kicks in PQ, default instances does as well. The query plan was what keyed me to ask "and what about your indexes" since it is showing a parallel index scan -- not a parallel full scan of a table.

Arun Gupta, January 08, 2004 - 8:31 pm UTC

It sure looks like someone set the wrong parameters. It was the index in the plan which gave things away.

I was told that the indexes were altered to noparallel, but obviously that information was incorrect. I just set both the indexes to noparallel and the degree and instances were both set to 1. The parallel queries disappeared.

Though of academic interest now, I cannot reproduce the example you worked out. I always get INDEX(RANGE SCAN) and unless I set both table and index to parallel with degree>1, the parallel query just does not appear in the plan. I am working on some more combinations to see if I can reproduce the problem.

Thanks a lot for your help. I was sure that if anyone can help, it would be you.

Is it true that Parallel query is meaningfull only if table is partitioned?

Tony, March 11, 2004 - 7:08 am UTC

March 11, 2004 - 1:23 pm UTC

No, not at all. Parallel query predates partitioning by many releases

PQ = 7.1.6
Partitioning = 8.0

find the process which spawned p00* processes

A reader, March 17, 2004 - 6:32 am UTC

Hi

How can I find the parent process which spawned parallel slave processes? Can I rely on v$session.audsid? It seems that the PQ coordinator and slave processes share same audsid

March 17, 2004 - 8:14 am UTC

v$px_ views

v$px_session in particular has the "session identifier of the parallel coordinator"

How many degree of parallelism?

Tony, April 19, 2004 - 2:30 am UTC

Oracle9i doc says "UPDATE, MERGE, and DELETE operations are parallelized by partition or subpartition. Updates, merges, and deletes can only be parallelized on partitioned
tables. Update, merge, and delete parallelism are not possible within a partition, nor on a nonpartitioned table."

The following doubts come to my mind after reading the above.
1. Does it mean that we can have 5 degree of parallelism if there are 5 partitions for a table? What if I apply more parallelism?
2. What if I apply parallelism for non-partitoned tables and indexes? Will it work or not?

Please clear my doubts.

April 19, 2004 - 6:49 am UTC

9iR2 does not have that implementation restriction:

</code> http://docs.oracle.com/docs/cd/B10501_01/server.920/a96531/ch2_9ir2.htm#74432 <code>

in 9ir1 and above, the DOP was limited by the number of partitions, yes. If you had 5 partitions, DOP=5 would be the max. They gave a partition to each "update" PQ process.

if you tried to do parallel DML (PDML) on a non-partitioned table, it would have (in 9ir1 and before) done it serially. So, it would work.

How to stop using parallel slaves

Sami, October 10, 2005 - 11:08 pm UTC

Environment:
9.2.0.6 - 3 Node RAC,Sun OS 2.9

I don't know how optimizer decides to use PARALLEL SLAVES  to run such a small query.


One of my non-resource intensive query(look at the statistics) is started using parallel slave unnecessarily even though
1) any of the tables and indexes involved in the query does not have parallel degree
2) no parallel hint in SQL statement

I am thinking it may be because of parallel_automatic_tuning is set to TRUE. Since it is static parameter I cannot do anything now to stop using parallel slaves for this small query.

The reasons why I don't want this query to use parallel slaves are

1) This query runs very often & it consumes all possible 50 parallel slaves (max limit).

2) Because of the above, our replication Push Job fails which requires parallel slaves.

ORA-12012: error on auto execute of job 61
ORA-23386: replication parallel push cannot create slave processes
ORA-06512: at "SYS.DBMS_DEFER_SYS", line 1716
ORA-06512: at "SYS.DBMS_DEFER_SYS", line 1804
ORA-06512: at line 1

I cannot modify the query to put NO PARALLEL hint either.


Mu question:

Is there any work around to stop using parallel slaves for this small query?

-----------------------------------------------

SQL> show parameter parallel
 
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
fast_start_parallel_rollback         string      LOW
log_parallelism                      integer     1
parallel_adaptive_multi_user         boolean     TRUE
parallel_automatic_tuning            boolean     TRUE
parallel_execution_message_size      integer     4096
parallel_instance_group              string
parallel_max_servers                 integer     50
parallel_min_percent                 integer     0
parallel_min_servers                 integer     0
parallel_server                      boolean     TRUE
parallel_server_instances            integer     3
parallel_threads_per_cpu             integer     2
recovery_parallelism                 integer     0

SQL> set autotrace on
SQL> get res.sql
  1* SELECT DISTINCT RES_REGION.REGION FROM RES_ISSUE RESI,RES_REGION,(SELECT RES_PK FROM RES_ISSUE MINUS (SELECT RESI.RES_PK FROM RES_ISSUE RESI,RES_RESTRICTED_CNTRY,CNTRY WHERE RESI.RES_PK=RES_RESTRICTED_CNTRY.RES_PK AND RES_RESTRICTED_CNTRY.CNTRY_PK=CNTRY.CNTRY_PK AND CNTRY.CNTRY_CODE='GB')) RES_FILTERED_LIST WHERE RESI.TIER <= 100 AND RESI.DELETE_DOC=0 AND RES_FILTERED_LIST.RES_PK=RESI.RES_PK AND RESI.RES_PK=RES_REGION.RES_PK AND RESI.FILE_FMT='PDF' ORDER BY RES_REGION.REGION ASC
SQL> /
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=341 Card=5 Bytes=215
          )
 
   1    0   SORT (UNIQUE) (Cost=273 Card=5 Bytes=215)
   2    1     HASH JOIN (Cost=145 Card=33618 Bytes=1445574)
   3    2       HASH JOIN (Cost=126 Card=18862 Bytes=546998)
   4    3         TABLE ACCESS (FULL) OF 'RES_ISSUE' (Cost=57 Card=18862 Bytes=301792)
   5    3         VIEW (Cost=70 Card=37884 Bytes=492492)
   6    5           MINUS
   7    6             SORT (UNIQUE)
   8    7               INDEX (FAST FULL SCAN) OF 'P_RES_ISSUE' (UNIQUE) (Cost=6 Card=37884 Bytes=227304)
   9    6             SORT (UNIQUE)
  10    9               NESTED LOOPS* (Cost=17 Card=439 Bytes=10097)   :Q55028000
  11   10                 NESTED LOOPS* (Cost=17 Card=439 Bytes=7463)  :Q55028000
  12   11                   INDEX* (FAST FULL SCAN) OF 'RES_RESTRICTED :Q550280_CNTRY_CHK1' (NON-UNIQUE) (Cost=17 Card=107992 Bytes=107992000)
  13   11                   INDEX* (RANGE SCAN) OF 'CNTRY_CHK1' (NON-U :Q550280NIQUE)  00
  14   10                 INDEX* (UNIQUE SCAN) OF 'P_RES_ISSUE' (UNIQU :Q550280 E)     00
  15    2       INDEX (FAST FULL SCAN) OF 'RES_REGION_CHK1' (NON-UNIQUE) (Cost=19 Card=49824 Bytes=697536)
  10 PARALLEL_TO_SERIAL            SELECT /*+ ORDERED NO_EXPAND USE_NL(A2) INDE  X(A2 "P_RES_ISSUE") */ A1.C0,A1.C1,A
  11 PARALLEL_COMBINED_WITH_PARENT
  12 PARALLEL_COMBINED_WITH_PARENT
  13 PARALLEL_COMBINED_WITH_PARENT
  14 PARALLEL_COMBINED_WITH_PARENT

 Statistics
 ----------------------------------------------------------
           0  recursive calls
           0  db block gets
        1769  consistent gets
           0  physical reads
           0  redo size
         608  bytes sent via SQL*Net to client
         651  bytes received via SQL*Net from client
           2  SQL*Net roundtrips to/from client
           4  sorts (memory)
           0  sorts (disk)
           5  rows processed

October 11, 2005 - 6:39 am UTC

sorry - but one of your tables or indexes is in fact parallel - please check again.

did someone rebuild an index in parallel recently? that would do it.

paralle_min_percent-does not seem to work properly

Sami, October 10, 2005 - 11:40 pm UTC

Related to previous post.

Why did our job failed with "cannot create slave processes".
We have defined
parallel_min_percent integer 0
parallel_min_servers integer 0

So when it reaches max number of parallel slaves (parallel_max_servers=50) limit, it should start using serial operation... right? Why is it throwing the below error?

ORA-12012: error on auto execute of job 61
ORA-23386: replication parallel push cannot create slave processes
ORA-06512: at "SYS.DBMS_DEFER_SYS", line 1716
ORA-06512: at "SYS.DBMS_DEFER_SYS", line 1804
ORA-06512: at line 1

Which parameter sets parallel query enable ?

Parag Jayant Patankar, October 11, 2005 - 8:36 am UTC

Hi Tom,

Which parameter/parameters set ups parallel query option enabled ? Sorry I am confused, as I was thinking by setting only parallel_automatic_tuning will only set up parallel query option enabled.

As I was running dbms_stats parallelly for following parameters

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
fast_start_parallel_rollback string LOW
log_parallelism integer 1
parallel_adaptive_multi_user boolean FALSE
parallel_automatic_tuning boolean FALSE
parallel_execution_message_size integer 2152
parallel_instance_group string
parallel_max_servers integer 5
parallel_min_percent integer 0
parallel_min_servers integer 0
parallel_server boolean FALSE
parallel_server_instances integer 1
parallel_threads_per_cpu integer 2
recovery_parallelism integer 0

and found that in Unix parallel background processes were enabled

79062 2.2 0.0 59832 31656 - A 15:57:56 0:53 ora_p003_atpp06
169004 2.2 0.0 59812 31560 - A 15:57:56 0:51 ora_p002_atpp06
28314 0.7 0.0 61732 31912 - A 15:57:56 0:17 ora_p001_atpp06
11910 0.6 0.0 644 228 - A Sep 28 215:13 /usr/sbin/syncd
92082 0.5 0.0 61732 31612 - A 15:57:56 0:13 ora_p000_atpp06

Q2. on server I am having 2 CPUs, then what should be my
parallel_threads_per_cpu ?

regards & thanks
pjp

October 11, 2005 - 3:22 pm UTC

parallel_max_servers was sufficient.

you should let parallel_threads_per_cpu default to 2.

parallel

dost, October 11, 2005 - 5:51 pm UTC

Tom,

Please comment on following
SQL> show parameter parallel

NAME                                 TYPE                  VALUE
------------------------------------ --------------------- ------------------------------
fast_start_parallel_rollback         string                LOW
log_parallelism                      integer               1
parallel_adaptive_multi_user         boolean               TRUE
parallel_automatic_tuning            boolean               TRUE
parallel_execution_message_size      integer               4096
parallel_instance_group              string
parallel_max_servers                 integer               96
parallel_min_percent                 integer               0
parallel_min_servers                 integer               0
parallel_server                      boolean               FALSE
parallel_server_instances            integer               1
parallel_threads_per_cpu             integer               2
recovery_parallelism                 integer               0

System = SunOS
Node = sclpdb05
Release = 5.8
KernelID = Generic_117350-11
Machine = sun4u
BusType = <unknown>
Serial = <unknown>
Users = <unknown>
OEM# = 0
Origin# = 1
NumCPU = 56

October 12, 2005 - 6:51 am UTC

it is a list of parameters? what is the question?

parallel

dost, October 12, 2005 - 2:32 pm UTC

Question is :

Is parallel_max_server parameter is right?
Please suggest on parameters if you feel we need to change the settings
Oracle 9.2.0.5 64 bit on sun os

October 13, 2005 - 9:20 am UTC

depends - do you want to *use* parallel query, are you OLTP or what - what do you *need* to do.

parallel processing

Jerry Robinson, October 29, 2005 - 8:36 pm UTC

Tom, you will never know the incredible impact your advice has helped the most critical parts of my customer's mission.
thanks,

-Jerry

Donald, November 07, 2005 - 11:02 am UTC

Tom
Need your expert advice....
I have a table A partitioned by week.each parition has 7 days worth of data.set to parallel degree instances default
each partition has 60 million rows ..local paritioned unique index on (cust_no,item_no,date)

table A
------
cust_no
item_no
date,
sales,
cost,
returns,
units

table B partitioned by week..local paritioned unique index on (cust_no,item_no,week)..each partition will have 11 millon rows
TABLE B
------
cust_no
item_no
week,
sales,
cost,
returns,
units

i need to populate table b from table a as follows. I basically need to sum up the 7 days of data and load it for a week(yyyyww)
for example if iam loading for nov 13,2005 through nov 19 2005 t

insert into table b
(
cust_no
item_no
week,
sales,
cost,
returns,
units
)
select
cust_no
item_no
200545 week,
sum(sales),
sum(cost),
sum(returns),
sum(case when date = :value2 then units else 0 end)
from table a
where a.date between :value1 and:value2
group by
cust_no,
item_no

value 1 will be nov 13 and value 2 will be nov 19...the select statement will summarize 60 mill rows to 11 million...
The process takes more than 1 1/2 to complete..any ideas to speed it up...I was thinking of parallel dml...like setting table b
to parallel(degree default instances) and insert /*+ parallel / into table bselect /*+ parallel */ from table a??
Should i mention a degree instead?

November 08, 2005 - 9:16 pm UTC

Have you determined what part is "slow" - does the query take 1.4 hours and the insert part 0.1 or vice versa?

you'd need to know that before setting about tuning it (is it index maintainence or query runtime that is the problem)

and before using pdml, make sure you understand the implications - no space reused, writes above the high water mark, WILL generate redo unless you use nologging and if you do that make sure you really understand what that implies!

default degree of parallelism - let the system figure it out based on your available resources.

parallel_automatic_tuning

Candba, December 08, 2005 - 5:01 pm UTC

Tom ,
I am currently reading your book and i started with the parallel processing chapter..I would like to use the parallel processing option for some of my dba activities like you suggested. We recently upgraded to 9i and i am in the process of moving data from dict managed tablespaces to locally managed tablespaces . I will be using alter table move/index rebuild approach and this is what I intend to do..:
a. set parallel_automatic_tuning=true
b. move all the tables/rebuild indexes using parallel clause

Now if I leave this parameter set to true , do you see any negative impact ? ours is a datawarehouse running on solaris 2.8/oracle 9204 and I dont want our loading processes or users reports start eating up all of cpu ?
We dont have any table/index with DOP > 1 .

Thanks for your help as always.

December 09, 2005 - 1:30 am UTC

after you rebuild the indexes, un-set the parallel on them (to avoid parallel query kicking in unexpectedly!)

they will not start eating your cpu unless you change the parallel setting on tables/indexes - just make sure it is not "set" and then they would have to use parallel hints in order to use parallel.

Thanks , Tom

A reader, December 09, 2005 - 10:03 am UTC

high CPU on ora_p003

J, March 16, 2006 - 2:32 pm UTC

I have very high CPU time on ora_p003 using ps-ef, and it is still growing:

oracle 9739 1 0 Nov 27 ? 670:32 ora_p004_prod
oracle 9708 1 5 Nov 27 ? 21344:40 ora_p003_prod

I checked out in v$session:
USERNAME OWNERID STATUS PROCESS LOGON_TIME
XXXX 65609 KILLED 9708 03/02/2006 11:40:12 AM
XXXX 2147483644 KILLED 192:3732 03/02/2006 9:27:29 AM

where process 9708 are child process from (192:3732), both are all marked as killed.

The wait event on process 9708 is PX Deq: Table Q Normal

Can I do server kill command on process 9708, ora_p003_prod? Will it cause any problem since it is oracle process?

How can I prevent this happen in future? Thanks in advance for any suggestion!

ORA-23386: replication parallel push cannot create slave processes

Tariq, May 25, 2006 - 7:43 am UTC

I am working on Real time Replication. In my database I have already set the max parallel server 10.
How can I resolve this problem. Can you please put light on it.

May 25, 2006 - 1:47 pm UTC

contact support.

I question anyone that thinks they need real time replication, or replication really....

pdml on non-partitioned table

Chris Ellis, September 13, 2006 - 2:28 am UTC

clarification on followup given from Tony's comments on April 19th, 2004

The documentation still says (in only certain places) that it's not possible on non-partitioned tables:

(sections entitled 'Rules for Parallelizing UPDATE, MERGE, and DELETE')

9ir2 - </code> http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96520/tuningpe.htm#66441


10gr2 -

http://download-west.oracle.com/docs/cd/B19306_01/server.102/b14223/usingpe.htm#i1007553 <code>

I'm happily using it though in 9ir2 :)

September 13, 2006 - 7:26 am UTC

it is a new 9i feature, pdml on non-partitioned objects. I'll file a doc bug, thanks

sometimes DML runs in parallel,

A reader, October 19, 2006 - 6:06 pm UTC

In our application the developer has used parallel hint in his update statement. The tables involved in the update statement also had parallel degree DEFAULT.

The update job runs slow whenever we see PX Deq: Execute Reply as the topmost wait event.

Later we thought of altering the degree value to 1 (the hint in the SQL still exists).

We rebooted the database and the Update statement started running with single session (no parallel slaves) and was completing the job on time.

Today, we are seeing the same update statement is executing in parallel and the top wait event today is PX Deq: Execute reply. The job is running for more than 12 hours.

We know the culprit is the parallilism in this case. But the question is how come oracle runs with parallel slaves sometimes and without on someother days? The big question is WHY IT RUNS IN PARALLEL TODAY AND WHY NOT YESTERDAY. What to answer to the poeple who ask this question?

Without changing the code is there a way to make this update statement run without any parallel slaves (I can't bring down the parallel_max_servers to a small number since this is required for other statements).

Thanks,

October 20, 2006 - 4:06 am UTC

hmm, how did you come to the conclusion that

a) because I ssee PX deq
b) that is the cause

that is not the cause, that is a symptom. that is simply your coordinator waiting for the execution servers running in parallel to finish.

it runs in parallel because - you HAVE IT HINTED TO DO THAT. I cannot answer the "why not yesterday", because I don't know exactly what you are doing/did.

You get what you ask for sometimes.

Parallel Processes Consuming All TEMP

A reader, November 02, 2006 - 6:47 pm UTC

I am using parallel 4 to build a new table based on three base tables. The base table sizes are 15M (narrow width), 140M (narrow), and 130M (very wide). The resulting table will be about 10M (very wide) -- basically, I am taking some operational data and overlaying with demographic data for our customers.

The create table operation fails running out of TEMP extents. Our DBA says we already have a 44GB. I'm expecting full scans on all three tables, with the largest table containing about 150GB of data.

The only parallel execution parameter we have set is max servers = 4 -- the DBA team is not experienced with parallel execution, so they are reluctant to change the parameters. Do you know if I can look into any of the settings specifically related to not being able to extend TEMP?

Thanks. I understand the question is not so concise, and I'm not able to provide an example, unless a listing of the table stats will help.

November 02, 2006 - 9:06 pm UTC

depending on the query plan - and depending on the pga memory settings, you could easily need more than 44gb of temp for this operation.

Unless you want it to run really really slow that is.

look at the plan, look at your pga settings, if you are hashing to disk.... well, you could easily exceed 44gb.

A reader, November 03, 2006 - 1:22 pm UTC

Let's say:
Table A = 15M rows but narrow (< 2GB)
Table B = 150M rows but narrow (< 14GB)
Table C = 130M rows but wide (> 150GB)

This is the short version of the query plan:
Nested Loops
  Hash Join
     Hash Join - A
     Hash Join - B
  Index Range - C

SQL> show parameter pga

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target                 big integer 2147483648

Would you be able to make anything out of this?

November 04, 2006 - 12:05 pm UTC

got a dbms_xplan.display output for that?

A reader, November 05, 2006 - 9:00 pm UTC

I left out the predicate information:
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost | TQ |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2711K| 2774M| | 2054K| | | |
|* 1 | HASH JOIN | | 2711K| 2774M| 24G| 2054K| 77,07 | P->S | QC (RAND) |
|* 2 | HASH JOIN | | 2714K| 344M| 1026M| 101K| 77,05 | P->P | HASH |
|* 3 | HASH JOIN | | 2714K| 277M| 127M| 34198 | 77,03 | P->P | HASH |
|* 4 | TABLE ACCESS FULL | CUSTOMERS | 9007K| 94M| | 3378 | 77,01 | P->P | HASH |
|* 5 | HASH JOIN | | 4067K| 372M| | 26349 | 77,02 | P->P | HASH |
|* 6 | TABLE ACCESS FULL| DATE_DIM | 61 | 732 | | 3 | 77,00 | S->P | BROADCAST |
| 7 | TABLE ACCESS FULL| FACT | 140M| 10G| | 26309 | 77,02 | PCWP | |
| 8 | TABLE ACCESS FULL | OVERLAY_KEYS | 130M| 3246M| | 11207 | 77,04 | P->P | HASH |
| 9 | TABLE ACCESS FULL | OVERLAY_DATA | 130M| 114G| | 330K| 77,06 | P->P | HASH |
--------------------------------------------------------------------------------------------------------------------

November 06, 2006 - 9:42 am UTC

I see 140m, 130m, 130m records - not what you quoted above. Is that right?

A reader, November 06, 2006 - 3:07 pm UTC

The one 130M I left out because I wasn't thinking clearly. It's a table that maps the overlay ids to the customer keys. It should be very small, but "yes" you are right.

November 06, 2006 - 3:14 pm UTC

but - it is NOT very small?

are those estimated cardinalities for all rows *near* correct or not?

A reader, November 06, 2006 - 5:57 pm UTC

I completed the operation with noparallel and the resulting dataset was about 7.5MM records, so the date filter on the fact is a bit off. The customer dim also has about 14MM records, but the big ones are correct.

I left out the mapping table because I thought it might be relatively small (5.5GB) compared to the 150+GB of the overlay. In hindsight, I should have presented it either way, so you have more information to work with.

A reader, November 09, 2006 - 2:36 pm UTC

I am instill interested in how to tune it to run in parallel :)

It would be great to hear your thoughts on it.

November 09, 2006 - 2:43 pm UTC

no, i won't. you won't answer me, I cannot possible make up stuff for you.

A reader, November 10, 2006 - 4:08 am UTC

Sorry, I thought I answered.

The cardinalities are near correct, yes.

November 10, 2006 - 8:53 am UTC

sorry, thought this was a different thread where I've asked a dozen times that question and never get an answer - indeed you did above.

As for whether this can go "faster in parallel", will really be a function of your hardware

do you have sufficient cpu's (2xcpu would be the degree of parallelism to consider if you are the only one on the machine)

do you have sufficient IO bandwidth for that many processes to read and write at the same time?

A reader, November 10, 2006 - 3:00 pm UTC

No need for the apology as I understand you are answering dozens of questions each day -- I'm extremely impressed by your throughput. Thanks for it, anyway :)

We have a 6 cpu machine, so setting max parallel servers to 4 is a bit conservative -- I'm trying to figure out what settings we want to change our environment to for this parallel operation to work, and maximize other parallel operations.

Disk i/o has been pretty good -- too bad I don't know enough to give you hard numbers. I've been at different shops and this one seems to have the best disk i/o setup.

Based on your line of questioning, does the temp tb problem come down to how fast parallel servers can handle the orchestration of data processing between the different slaves? After the data gets orchestrated, the temp tb gets cleared of that data, and should allow for the next "batch". If disk i/o throughput is not high enough, then the orchestration gets backed up, and more temp tb is needed. Is that how it works?

November 10, 2006 - 3:02 pm UTC

about the IO, do you have the necessary "multiple paths" to it or would everyone pretty much be hitting the same pieces of hardware.

A reader, November 13, 2006 - 3:27 pm UTC

That's one area I may not be able to answer. I know they have the database layed out on 28 disks, but I'm not sure how the space is allocated to the different areas of the database. Is there a query that I can ask the DBAs to run for this information?

November 14, 2006 - 4:14 am UTC

ask the DBA's about this, take your query - the parallel one and ask them "do we have the hardware to make this thing scale - is the data striped in such a manner as to make this scalable?"

A reader, November 14, 2006 - 2:20 pm UTC

This is the same DBA team that doesn't want me to use parallel because it will affect other processes, though the database has options to mitigate against that. I had to do a bunch of analysis showing the benefits of parallel execution for them to bump up parallel_max_servers to 4 on a 6 cpu machine.

When I reported my problem to them, they said: we already allocated 44GB to TEMP and don't want to increase it more just for your process.

I'm not the first to run into a conservative DBA team, and I won't be the last.

I'll try to find out as much as possible.

Thanks.

A reader, November 16, 2006 - 3:18 am UTC

Here's the answer I got:
"disks are stripped and mirrored. So, all the data is spread across multiple disks"

November 16, 2006 - 3:10 pm UTC

unfortunately that does not answer "is there sufficient hardware resources that won't be swamped by PARALLEL processing"

A reader, November 17, 2006 - 2:46 pm UTC

The problem is, I don't have confidence that they know how to answer your question. Keep in mind that they are a DBA team that has limited experience with supporting parallel execution. In fact, the questions I'm asking you should be asked by them because some things you ask me, I don't know because I'm not familiar with the underlying system architecture.

The line between a developer and DBA is quite grey, so I'm very open to learning what I need to learn to do my job better.

How can I go about determining if our hardware will support this type of operation?

As always, thanks for your input.

November 20, 2006 - 2:25 am UTC

do you know about disk subsystems in general, it is way beyond the scope of what I can go into here.

A reader, November 21, 2006 - 6:10 pm UTC

Unfortunately, that area is one of my weaknesses. Can you recommend some reading or websites where I can learn more about this?

DOP

Yoav, December 03, 2006 - 3:50 pm UTC

Hi Tom,
In our data warehouse staging database(9.2.0.6) on HP-UX (8 CPU) , i found that we have the following DOP : 1,4,7,8,'DEFAULT' on our tables.
DEGREE CNT
---------- ----------
1 554
4 1
7 8
8 1
DEFAULT 495

1. What is different between degree '1' and degree 'DEFAULT' ?

I runed the following queries :

select name ,value from V$PARAMETER WHERE NAME LIKE '%parallel%';
NAME val
---------------------------------------- ----------
log_parallelism 1
parallel_server FALSE
parallel_server_instances 1
recovery_parallelism 0
fast_start_parallel_rollback LOW
parallel_min_percent 0
parallel_min_servers 0
parallel_max_servers 16
parallel_instance_group
parallel_execution_message_size 16384
parallel_adaptive_multi_user TRUE
parallel_threads_per_cpu 2
parallel_automatic_tuning TRUE

select name, value from v$sysstat where name like 'Parallel%';
NAME VALUE
---------------------------------------------------------------- ----------
Parallel operations not downgraded 9295
Parallel operations downgraded to serial 3628
Parallel operations downgraded 75 to 99 pct 7
Parallel operations downgraded 50 to 75 pct 4
Parallel operations downgraded 25 to 50 pct 1040
Parallel operations downgraded 1 to 25 pct 1

Its clear that large percent of operations were done serially instead of in parallel.
Do you think its good idea to alter all the tables to degree 'DEFAULT' for a while
and check how its affect on the "Parallel operations downgraded"?

3. If parallel_automatic_tuning=TRUE is it overwrite the value of parallel_max_servers ?
4. Is there a "rule of thumb" regarding the value of DOP and parallel_max_servers ?
Thank You.

disable parallel query

Jerry Ford, September 11, 2007 - 7:49 pm UTC

Tom,

How to disable parallel query execution.

I am executing a remote query from local database. One table used in remote query's indexes have DOP set to 6.
I have no control over remote database schema.

I tried with "alter session disable parallel query' from local database but no effect on remote database. Still the remote query is running in parallel.

Is there any command to direct remote database to ignore the invoking of parallel query execution while we execute from local database.

Thank you

disable parallel query

Jerry Ford, September 18, 2007 - 2:01 pm UTC

Tom,

When you get a chance could you please respond to the query regarding "disable parallel query execution" right above.

Thank you

parallel_max_servers settings in 10g and parallel Hints

John, March 30, 2009 - 4:39 pm UTC

Tom,

Database version : 10.2.0.4

We have migrated the database to a new server which has 128 CPUs available.

This is reporting database where jobs are using parallelism as part the queries/DMLs.

Questions:

a) We would like to advatage of more CPUs available.
Whats would be the optimal value for PARALLEL_MAX_SERVER parameter with 128 CPUs availability ? How to tune this parameter ?

Currently set to 50 as with old sever setting. Also, PARALLEL_AUTOMATIC_TUNING is deprecated in 10g.

b) Our database DMLs use /*+ PARALLE(table, degree) */ Hint.

For example,
SELECT /*+ parallel(a,4) parallel(c,4) parallel(d,4)*/

Can we just specify /*+ PARALLEL(table) */ by excluding DOP. Will Oracle able to figure the DOP in this case ?
Is it a good practice ? Any known issue with this ?

Would like to take you opinion.Thanks

March 30, 2009 - 6:06 pm UTC

a) unset it and let them all default, that is best. Have as FEW things in your init.ora as possible.

b) Yes, that would be best - let it figure out the degree. It'll do that based on the available resources at runtime.

parallel_max_servers in 10g

John, March 31, 2009 - 12:36 am UTC

Thanks Tom!

Is parallel_adaptive_multi_user=TRUE is replacement for parallel_automatic_tuning in 10g ?

>> a) unset it and let them all default, that is best. Have as FEW things in your init.ora as possible. 

Do you mean i can just reset this parameter to have it default.

alter system reset parallel_max_servers scope=spfile sid='*'; 


Are there any other parameter along with this need to be reset?


SQL>  show parameter parallel

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
fast_start_parallel_rollback         string      LOW
parallel_adaptive_multi_user         boolean     TRUE
parallel_automatic_tuning            boolean     FALSE
parallel_execution_message_size      integer     2152
parallel_instance_group              string
parallel_max_servers                 integer     50
parallel_min_percent                 integer     0
parallel_min_servers                 integer     0
parallel_server                      boolean     FALSE
parallel_server_instances            integer     1
parallel_threads_per_cpu             integer     2
recovery_parallelism                 integer     0
SQL>

March 31, 2009 - 8:02 am UTC

unset (reset, get rid of) every parameter you can. all of them. look at

select name || '=' || value from v$parameter where isdefault = 'FALSE';

to see what you've set and unset them.

If you have cpu_count at 128 (you need your SGA to be large enough and processes set high enough) you should see:

ops$tkyte%ORA10GR2> select name || '=' || value from v$parameter where name like '%parallel%' or name = 'cpu_count';

NAME||'='||VALUE
-------------------------------------------------------------------------------
cpu_count=128
parallel_server=FALSE
parallel_server_instances=1
recovery_parallelism=0
fast_start_parallel_rollback=LOW
parallel_min_percent=0
parallel_min_servers=0
parallel_max_servers=1009
parallel_instance_group=
parallel_execution_message_size=2148
parallel_adaptive_multi_user=TRUE
parallel_threads_per_cpu=2
parallel_automatic_tuning=FALSE

13 rows selected.

parallel_max_servers settings

John, April 01, 2009 - 2:07 pm UTC

Tom,

I reset the parallel_max_servers and bounced the instance.

Now its showing value as 585. Is this correct

Also, will there be any performance gain by setting parallel_execution_message_size to 4K or 8K ?
I have 100GB of memory available.

SQL> select name || '=' || value from v$parameter where name like '%parallel%' or name = 'cpu_count' ;

NAME||'='||VALUE
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
cpu_count=128
parallel_server=FALSE
parallel_server_instances=1
recovery_parallelism=0
fast_start_parallel_rollback=LOW
parallel_min_percent=0
parallel_min_servers=0
parallel_max_servers=585
parallel_instance_group=
parallel_execution_message_size=2152
parallel_adaptive_multi_user=TRUE
parallel_threads_per_cpu=2
parallel_automatic_tuning=FALSE

April 01, 2009 - 4:47 pm UTC

please, just let everything default. defaults are good. non-default values make you 'different', harder to diagnose issues.

continuation from the previous post

John, April 01, 2009 - 2:34 pm UTC

Tom:

It was shown 585 because processes value was set to 600.

I bumped the PROCESSES to 1000. Now it is showing 985.

Just curious, how parallel_max_servers value is determined by oracle ? any formula.

SQL> select name || '=' || value param_value from v$parameter where name like '%parallel%' or name = 'cpu_count' ;

PARAM_VALUE
------------------------------------------------------------
cpu_count=128
parallel_server=FALSE
parallel_server_instances=1
recovery_parallelism=0
fast_start_parallel_rollback=LOW
parallel_min_percent=0
parallel_min_servers=0
parallel_max_servers=985
parallel_instance_group=
parallel_execution_message_size=2152
parallel_adaptive_multi_user=TRUE
parallel_threads_per_cpu=2
parallel_automatic_tuning=FALSE

13 rows selected.

April 01, 2009 - 4:48 pm UTC

documentation tells us what this is based on:

http://docs.oracle.com/docs/cd/B19306_01/server.102/b14237/initparams153.htm#REFRN10158

Any more detail?

Stewart Bryson, May 06, 2009 - 10:12 am UTC

Any chance of finding the exact formula used for parallel_max_servers? I often find clients with manually-specified values of this parameter which are way too low, usually set to the number of CPU's, or the nuber of CPU's X DOP, which is usually 2. When I explain that this value is too low, they ask me "What should it be?". I usually say "Well... let's let the database decide"... but it would be easier if I could explain to them what the database would decide if given a chance.

May 11, 2009 - 9:58 am UTC

it changes from release to release.

it is a function of

... 9i
Derived from the values of CPU_COUNT, PARALLEL_AUTOMATIC_TUNING, and PARALLEL_ADAPTIVE_MULTI_USER
....

... 10g+
Derived from the values of CPU_COUNT, PARALLEL_THREADS_PER_CPU, and PGA_AGGREGATE_TARGET
......

Oracle9i Enterprise Edition Release 9.2.0.8.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.8.0 - Production

ops$tkyte%ORA9IR2> show parameter cpu

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
cpu_count                            integer     4
parallel_threads_per_cpu             integer     2
ops$tkyte%ORA9IR2> show parameter pga

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target                 big integer 25165824
ops$tkyte%ORA9IR2> show parameter parallel_max

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
parallel_max_servers                 integer     5







Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

ops$tkyte%ORA10GR2> show parameter cpu

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
cpu_count                            integer     4
parallel_threads_per_cpu             integer     2
ops$tkyte%ORA10GR2> show parameter pga

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target                 big integer 24M
ops$tkyte%ORA10GR2> show parameter parallel_max

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
parallel_max_servers                 integer     25





Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

ops$tkyte%ORA11GR1> show parameter cpu

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
cpu_count                            integer     4
parallel_threads_per_cpu             integer     2
resource_manager_cpu_allocation      integer     4
ops$tkyte%ORA11GR1> show parameter pga

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target                 big integer 24M
ops$tkyte%ORA11GR1> show parameter parallel_max

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
parallel_max_servers                 integer     80

How about you have them explain why they picked the number they picked - and when they say "we picked that number magically from space".....

Found the answer.

Stewart Bryson, May 06, 2009 - 11:02 am UTC

I found the answer in the 10g Upgrade Companion. It seems to be:

CPU_COUNT x PARALLEL_THREADS_PER_CPU x (2 if PGA_AGGREGATE_TARGET > 0; otherwise 1) x 5

Just curious... what is your opinion on this value? I've found it to be high on occasion... but just right in certain circumstances. If I recall correctly, you've often advocated for CPU_COUNT X 2.

May 11, 2009 - 10:08 am UTC

I advocate letting every single parameter we can default - every single one. It would take a really really good reason to change from the defaults.

and the formula is obviously wrong :)

cpu_count = 4
parallel threads/cpu = 2
pga_aggregate_target = 24m > 0

4*2*2*5 = 80. But it was 25 in 10gr2.

That doesn't look right

Doug Burns, May 11, 2009 - 11:12 am UTC

I'm so used to seeing parallel_max_servers=80 on 4 cpu boxes that the 25 doesn't look right. I've just tested on a 10.2.0.3 and a 10.2.0.4 here, both with all default parameters and I'm getting values that match the formula.

SQL> show parameter cpu_count

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
cpu_count                            integer     2
SQL> show parameter parallel_threads_per_cpu

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
parallel_threads_per_cpu             integer     2
SQL> show parameters pga_agg

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target                 big integer 399M
SQL> show parameter parallel_max_servers

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
parallel_max_servers                 integer     40

(Oh, and for the record, this is one default parameter that I've seen bring several servers to their knees because it's way too high for most people's storage infrastructure. Just my opinion, though!)

May 11, 2009 - 6:54 pm UTC

ops$tkyte%ORA10GR2> select name || '=' || value from v$parameter where isdefault = 'FALSE';

NAME||'='||VALUE
-------------------------------------------------------------------------------
sessions=300
sga_target=536870912
control_files=/home/ora10gr2/oradata/ora10gr2/control01.ctl, /home/ora10gr2/ora
data/ora10gr2/control02.ctl, /home/ora10gr2/oradata/ora10gr2/control03.ctl

db_block_size=8192
compatible=10.2.0.1.0
db_create_file_dest=/home/ora10gr2/oradata/ora10gr2
db_recovery_file_dest=/home/ora10gr2/oradata/fbra
db_recovery_file_dest_size=10737418240
undo_management=AUTO
undo_tablespace=UNDOTBS
undo_retention=5000
db_domain=
dispatchers=(protocol=tcp)
job_queue_processes=10
db_name=ora10gr2
open_cursors=300
os_authent_prefix=OPS$
pga_aggregate_target=25165824

18 rows selected.


ops$tkyte%ORA10GR2> select name || '=' || value from v$parameter where name like '%cpu%' or name like 'parallel_max%' or name like 'pga_agg%';

NAME||'='||VALUE
-------------------------------------------------------------------------------
cpu_count=4
parallel_max_servers=25
parallel_threads_per_cpu=2
pga_aggregate_target=25165824

did it with as few init.ora's as I can

sys%ORA10GR2> startup force pfile=initora10gr2.ora
ORACLE instance started.

Total System Global Area  536870912 bytes
Fixed Size                  1268484 bytes
Variable Size             150996220 bytes
Database Buffers          377487360 bytes
Redo Buffers                7118848 bytes
Database mounted.
Database opened.
sys%ORA10GR2> select name || '=' || value from v$parameter where name like '%cpu%' or name like 'parallel_max%' or name like 'pga_agg%';

NAME||'='||VALUE
-------------------------------------------------------------------------------
cpu_count=4
parallel_max_servers=25
parallel_threads_per_cpu=2
pga_aggregate_target=25165824

sys%ORA10GR2> !cat initora10gr2.ora
*.compatible='10.2.0.1.0'
*.control_files='/home/ora10gr2/oradata/ora10gr2/control01.ctl','/home/ora10gr2/oradata/ora10gr2/control02.ctl','/home/ora10gr2/oradata/ora10gr2/control03.ctl'
*.db_block_size=8192
*.db_name='ora10gr2'
*.os_authent_prefix='OPS$'
*.pga_aggregate_target=25165824
*.sga_target=512m
*.undo_management='AUTO'
*.undo_retention=5000
*.undo_tablespace='UNDOTBS'

Trying to see the difference here ...

Doug Burns, May 11, 2009 - 7:15 pm UTC

This is from a 10.2.0.4 on Windows at home (I was going to say Vista, but could do without any additional abuse ;-))

SYSTEM@TEST1020> select name || '=' || value from v$parameter where isdefault =
'FALSE';

NAME||'='||VALUE
--------------------------------------------------------------------------------

processes=150
nls_language=ENGLISH
nls_territory=UNITED KINGDOM
sga_target=373293056
control_files=C:\ORACLE\PRODUCT\10.2.0\ORADATA\TEST1020\CONTROL01.CTL, C:\ORACLE\PRODUCT\10.2.0\ORADATA\TEST1020\CONTROL02.CTL, C:\ORACLE\PRODUCT\10.2.0\ORADATA\TEST1020\CONTROL03.CTL
db_block_size=8192
compatible=10.2.0.3.0
db_file_multiblock_read_count=16
undo_management=AUTO
undo_tablespace=UNDOTBS1
remote_login_passwordfile=EXCLUSIVE
db_domain=
dispatchers=(PROTOCOL=TCP) (SERVICE=TEST1020XDB)
job_queue_processes=10
audit_file_dest=C:\ORACLE\PRODUCT\10.2.0\ADMIN\TEST1020\ADUMP
background_dump_dest=C:\ORACLE\PRODUCT\10.2.0\ADMIN\TEST1020\BDUMP
user_dump_dest=C:\ORACLE\PRODUCT\10.2.0\ADMIN\TEST1020\UDUMP
core_dump_dest=C:\ORACLE\PRODUCT\10.2.0\ADMIN\TEST1020\CDUMP
db_name=TEST1020
open_cursors=300
pga_aggregate_target=122683392
_awr_flush_threshold_metrics=TRUE

22 rows selected.

SYSTEM@TEST1020> select name || '=' || value from v$parameter where name like '%
cpu%' or name
  2  like 'parallel_max%' or name like 'pga_agg%';

NAME||'='||VALUE
--------------------------------------------------------------------------------

cpu_count=2
parallel_max_servers=40
parallel_threads_per_cpu=2
pga_aggregate_target=122683392

(Skipped the startup - just did it)

SYSTEM@TEST1020> show parameter spfile

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string      C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\SPFILETEST1020.ORA
SYSTEM@TEST1020> host type C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\SPFILETEST1020.ORA

(Stripped strange binary chars ...)

test1020.__db_cache_size=184549376
test1020.__java_pool_size=4194304
test1020.__large_pool_size=4194304
test1020.__shared_pool_size=167772160
test1020.__streams_pool_size=4194304
*._awr_flush_threshold_metrics=TRUE
*.audit_file_dest='C:\oracle\product\10.2.0\admin\TEST1020\adump'
*.background_dump_dest='C:\oracle\product\10.2.0\admin\TEST1020\bdump'
*.compatible='10.2.0.3.0'
*.control_files='C:\oracle\product\10.2.0\oradata\TEST1020\control01.ctl','C:\oracle\product\10.2.0\oradata\TEST1020\control02.ctl','C:\oracle\product\10.2.0\oradata\TEST1020\control03.ctl'
*.core_dump_dest='C:\oracle\product\10.2.0\admin\TEST1020\cdump'
*.db_block_size=8192
*.db_domain=''
*.db_file_multiblock_read_count=16
*.db_name='TEST1020'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=TEST1020XDB)'
*.job_queue_processes=10
*.nls_language='ENGLISH'
*.nls_territory='UNITED KINGDOM'
*.open_cursors=300
*.pga_aggregate_target=122683392
*.processes=150
*.remote_login_passwordfile='EXCLUSIVE'
*.sga_target=370147328
*.undo_management='AUTO'
*.undo_tablespace='UNDOTBS1'
*.user_dump_dest='C:\oracle\product\10.2.0\admin\TEST1020\udump'

I've seen that 80/40 parallel_max_servers so many times now ... and I have the docs on my side ;-)

I don't know, I'm baffled. (But tired too ... nearly bedtime)

Earlier examples ...

Doug Burns, May 11, 2009 - 7:17 pm UTC

... were from two different 10.2's on Solaris, with no special parallel parameter set as far as I'm aware.

Compatible?

Stewart Bryson, May 12, 2009 - 11:17 am UTC

The obvious difference I see is the COMPATIBLE parameter.

That's what I was thinking, as well as processes/sessions

A reader, May 12, 2009 - 1:38 pm UTC

Will probably play around with it a bit later

May 13, 2009 - 10:10 am UTC

face palm....

processes, of course that has to be considered.

idle> select name || '=' || value from v$parameter where name like '%cpu%' or name like 'parallel_max%' or name like 'pga_agg%' or name = 'processes';

NAME||'='||VALUE
-------------------------------------------------------------------------------
processes=40
cpu_count=4
parallel_max_servers=25
parallel_threads_per_cpu=2
pga_aggregate_target=25165824

idle> alter system set processes = 150 scope=spfile;

System altered.

idle> startup force;
ORACLE instance started.

Total System Global Area  536870912 bytes
Fixed Size                  1268484 bytes
Variable Size             134219004 bytes
Database Buffers          394264576 bytes
Redo Buffers                7118848 bytes
Database mounted.
Database opened.
idle> select name || '=' || value from v$parameter where name like '%cpu%' or name like 'parallel_max%' or name like 'pga_agg%' or name = 'processes';

NAME||'='||VALUE
-------------------------------------------------------------------------------
processes=150
cpu_count=4
parallel_max_servers=80
parallel_threads_per_cpu=2
pga_aggregate_target=25165824

Easily done

Doug Burns, May 13, 2009 - 11:01 am UTC

Well, that's what I *would* say ... Jonathan Lewis picked me up on it once!

http://oracledoug.com/serendipity/index.php?/archives/847-Hotsos-Day-0.html

LOL

Parallel Query dubts

A reader, July 01, 2009 - 12:17 pm UTC

Hi Tom,

We have a query which joins 3 huge tables (500 million,350 million, 30 million) followed by a BIG Group BY for which we have set degree and instances as DEFAULT on all the instances of 5 node cluster.

We have 8 core CPU's on each node with parallel_threads_per_cpu set to 2 and parallel_max_servers set to 100 against each node.

When we ran this query we observed that there are 263 processes (PQ slaves) working for ths query on this 5 node cluster.

My assumptions:

We have 5 node RAC and each has 8 core CPU, which essentially means 40 CPU's with parallel_threads_per_cpu = 2 means that the system is able to support concurrently 40 * 2 = 80 ACTIVE PROCESSES at one point of time.

If we have more than 80 ACTIVE processes on a 40 CPU machine we are doing context switching, making a process to sleep and providing time to another process. Is there any wait event which justifies that in order to provide CPU rsource to another process the context switch happened.

I also understand that slaves doing Full table scan is an I/O bound operation but when these slaves feed to the parent process to do a hash join CPU comes into picture. Finally these hash joined sets are GROUPED BY and depending on the PGA_TARGET_ALLOCATED we can do it in memory or spill on to temp tablespace for one pass/multi pass executions.

Now 263 Processes is n't a lot (some doing full table scan, some participating in hash join and further passed on to do Sort operation) checked from GV$session_longops.

Why Oracle opened up so many PQ slaves when we don't have enough CPU power as we did let oracle decide by stting default to degree and instances parameter.

Can you please provide me your suggestions. How do we estimate the no of processes should be ACTIVE on a 40 CPU box so that there is no context switching. I want to learn about this concept in great detail.

Thanks

July 06, 2009 - 7:10 pm UTC

... If we have more than 80 ACTIVE processes on a 40 CPU machine we are doing
context switching, making a process to sleep and providing time to another
process. ...

not necessarily - some are reading, some are sorting, some are writing. 80 is no the limit, it is a good starting point.

the hash join has the same precise "memory issues" as the group by - when you write:

I also understand that slaves doing Full table scan is an I/O bound operation
but when these slaves feed to the parent process to do a hash join CPU comes
into picture. Finally these hash joined sets are GROUPED BY and depending on
the PGA_TARGET_ALLOCATED we can do it in memory or spill on to temp tablespace
for one pass/multi pass executions.

it sounds like you think that maybe hash join is different from group by in that regards - it is not.

Why Oracle opened up so many PQ slaves when we don't have enough CPU power as
we did let oracle decide by stting default to degree and instances parameter.

tell me, did you observe large run queues for the cpu, were you 100% cpu bound during this operation, what did you observe physically on the machine.

Parallel Sorting

Deepak, August 04, 2009 - 7:11 am UTC

Hi Tom,
Our query becomes 200 times slower after applying ORDER BY claase. And also the sorts are happening in memory only (no spill-over to disks). Can parallel query be used to make the sorting faster? How can we make a soring faster other than increasing the PGA_AGGREGATE_TARGET value. Will partitioning help? Please help.

August 04, 2009 - 3:21 pm UTC

define 200 times, 200 times what exactly? 0.001 seconds or 100 seconds (times 200)

... How can we make a soring faster other than
increasing the PGA_AGGREGATE_TARGET value. ...

think about this please, you just said "sort are happening in memory" already. How could increasing memory solve anything here? You already have sufficient ram to do it in memory, adding more - what could that possibly do?

... Will partitioning help?...

it totally and entirely depends, you give us NOTHING to work with here.

There are an infinite number of solutions to the set of all problems.

Unfortunately, you haven't given us a problem to work with - so we have no clue what solution might be appropriate for you.

Restricting Parallel Slaves for a Query

Rob, December 04, 2009 - 7:26 pm UTC

As you mentioned in "Followup October 18, 2001 - 3pm Central time zone:", Oracle might use more parallel slaves than asked (for example with hint PARALLEL (table_name, 2)). My question is , is there a way I can ensure that query not use more than specified number of parallel slaves?

We have tuned process and we want to give it a boost. It has 10 similar queries that executes at same time. I want to use no more than 20 parallel processes with equal ditribution, is there a way to achieve this? I am seeing each process taking 4 parallel processes right now.

Thanks in advance.

December 07, 2009 - 2:05 pm UTC

... We have tuned process and we want to give it a boost. It has 10 similar queries
that executes at same time. I want to use no more than 20 parallel processes
with equal ditribution, is there a way to achieve this? I am seeing each
process taking 4 parallel processes right now.
...

I did not follow that at all.

"we have tuned process" - singular.

"it has ten similar queries" - how can a *process* have similar queries? and what is the relevance of them? and why is 10 important?

"no more than 20 parallel processes
with equal ditribution ... I am seeing each
process taking 4 parallel processes right now." - Now I am totally lost. processes do not use parallel execution servers, queries do.

I don't know what you mean by "20 parallel processes"

I'm not at all sure what you want to accomplish or what your architecture here looks like.

Limiting PQ Processes

Rob, December 08, 2009 - 1:20 pm UTC

Sorry Tom, I should have been more clear.

I have an application process that runs 10 queries (same queries) using 10 application threads in parallel on different data sets (number of rows in
each data set is same). It is 10 by design that cannot be changed. This is a complex query and takes almost same amount of time when run everyday as
data volume is same everyday. Volume of each data set has increased now (that will last for few weeks) and our process is taking more time which is
expected.

I want to give each of this query a boost in performance by running each of them with parallelism. My database host (server) can safely handle 20
parallel queries (20 parallel query slaves, I meant p00x processes in oracle). So my question is, is there any way I can restrict each of the query
to just use 2 parallel query slaves? If I give a hint of parallel(table_name, 2), it ends up taking 4 parallel query processes for each query and total of 40 PQ processes and
my database server host becomes unresponsive.

Your help is really appreciated.

December 10, 2009 - 1:21 pm UTC

... It is 10 by design that cannot be changed....

that is silly, what kind of developer would not make something like that parameterized?? But I digress.

... I want to give each of this query a boost in performance by running each of
them with parallelism ...

as long as you realize that by doing this you

a) might decrease response time
b) not change response time
c) massively INCREASE response time.

you can limit the parallel_max_servers to 20.

but I would not count on anything going any faster here, if 20 is all you can do, you have a smallish machine - and I doubt your IO can deal with everyone hitting it at the same time either.

Limiting PQ Processes

Rob, December 18, 2009 - 6:11 pm UTC

Thanks Tom.

Setting the max limit to 20 was the first thing I tried but it ended up with first 5 application processes taking the all 20 pq slaves and my overall throughput had no gain.

December 18, 2009 - 6:18 pm UTC

you would have to step down your degree of parallel - for it to work - you have many times processes feeding processes feeding processes (mostly just processes feeding processes feeding the query coordinator). Ask for parallel two and you'll many times see a pair of scanners feeding a pair of sorters feeding a query coordinator - by design. it wouldn't make sense to have two scanners feeding "nothing" at that point.

what it sounds like is

you want Oracle do to the parallel and you should stop running N of your own processes. Let ONE THING be parallel and let is consume the machine. Instead of running 10 of your threads, run ONE using parallel against all of your data.

Or, run more of your threads, against smaller slices of data, using non-parallel query.

It doesn't really make sense to do both at the same time, use one or the other.

parallel_max_servers Vs v$px_process_sysstat.server highwater

Vijay, January 18, 2010 - 1:02 pm UTC

Hi Tom,

I believe, "parallel_max_servers" represents max limit for number of Parallel Query(PQ) slaves that can be allocated in an instance. Is this correct?

Assuming that above is the case, in our scenario, total PQ slave count should not exceed 600. For your information, in my case, "parallel_max_servers" parameter has been set to 200 on each node of 3-node RAC database(10.1.0.5).

SQL> ed
Wrote file afiedt.buf

  1* select inst_id,name,value from gv$parameter where name in ('parallel_threads_per_cpu','parallel
SQL> /

   INST_ID NAME                           VALUE
---------- ------------------------------ ----------------------------------------------------------
         1 parallel_max_servers           200
         1 parallel_threads_per_cpu       2
         2 parallel_max_servers           200
         2 parallel_threads_per_cpu       2
         3 parallel_max_servers           200
         3 parallel_threads_per_cpu       2

6 rows selected.

Also, Adaptive Multi-user algorithm has been enabled on all three instances:

SQL> select name,value from gv$parameter where name like 'parallel_adaptive_multi_user';

NAME                           VALUE
------------------------------ ---------------------------------------------------------------------
parallel_adaptive_multi_user   TRUE
parallel_adaptive_multi_user   TRUE
parallel_adaptive_multi_user   TRUE

SQL> 

However, if I look at v$px_process, I could hardy see that total PQ slave presence(both idle+busy) going beyond 35% of "parallel_max_servers" setting.

SQL> select inst_id,status,count(*) from gv$px_process group by inst_id,status;

   INST_ID STATUS      COUNT(*)
---------- --------- ----------
         1 IN USE             3
         1 AVAILABLE         30
         2 IN USE             3
         2 AVAILABLE         57
         3 IN USE             3
         3 AVAILABLE         53

6 rows selected.

SQL> 

Note that, I have been observing above consistently.

Any idea, why it's the case?

Secondly, of late, we have been noticing that nightly batch jobs are suffering due to slave downgrad issue and thus missing agreed Service Level Agreements(SLAs). It's been proven as well.

So, in order to circumvent the problem, we have decided to increased "parallel_threads_per_cpu" (from 2 to 4) to increase throughput. We have also "tested" this change, and indeed it's working charm.

The technical management group, however, wants us to decrease setting for parallel_max_servers (still unsure by what degree), to avoid the system being crippled to halt (resource-wise). Their point of contention it, this change may excessively spawn of PQ slaves on an already very heavily loaded system (88% to 90% CPU and memory consumption). But, I fear that, if we do so then it "may" nullify the effect of "parallel_threads_per_cpu" change. 

Looking forward to you expert feedback in this regard?

Finally, if I look at column v$px_process_sysstat."Servers Highwater", I could see that PQ slave allocation has breached the limit set by "parallel_max_servers" at some point in the past. 

SQL> select * from gv$px_process_sysstat where statistic like '%Servers Highwater%';

   INST_ID STATISTIC                           VALUE
---------- ------------------------------ ----------
         3 Servers Highwater                     273
         2 Servers Highwater                     231
         1 Servers Highwater                     198

SQL> 

Oracle manual says, 

<quote>

Servers HWM - Maximum number of concurrent PX server processes

If this number is equal to the PARALLEL_MAX_SERVERS initialization parameter, then consider increasing the parameter.
This could allow you to increase your throughput, especially if your system is under-utilized and the V$SYSSTAT 
statistic "Parallel operations downgraded to serial" is large.

<quote>

This definition has caused even more confusion:

--> Firstly, in our case, v$px_process_sysstat.server highwater(at some point) > parallel_max_servers! Why?
--> I cannot certainly think of increasing parallel_max_servers, as system is already bit heavily loaded (resource-wise, CPU, memory and all) right now.

Looking forward to your expert feedback.

Kind Regards,
Vijay

January 19, 2010 - 3:47 pm UTC

parallel_max_servers:

http://docs.oracle.com/docs/cd/E11882_01/server.112/e10820/initparams180.htm#REFRN10158

yes, your definition was correct.

...
However, if I look at v$px_process, I could hardy see that total PQ slave
presence(both idle+busy) going beyond 35% of "parallel_max_servers" setting.
..
Any idea, why it's the case?
....

well, because that is the MAX (200), not the minimum, not the average - the max. If you have only a handful of users going at a time - it'll use a handful of resources. the adaptive multi user is designed to try to be "fair" - to spread out the resources for many users (like pga_aggregate_target). Also, you have parallel threads / cpu set to 2, unless you have triple digit numbers of cpu's, you won't get very high numbers there.

... Their point of contention it, this change may
excessively spawn of PQ slaves on an already very heavily loaded system (88% to
90% CPU and memory consumption). ..

they could have a point, if you start doing something very cpu intensive and do 4 of them per cpu at the same time - you'll swamp the machine. If the 4 things per cpu spend a lot of time doing IO and some time doing CPU, you'll be OK. The only answer is "it depends", it depends on what they are doing.

as for the highwater, that is easy:

ops$tkyte%ORA10GR1> show parameter parallel_max

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
parallel_max_servers                 integer     109
ops$tkyte%ORA10GR1> alter system set parallel_max_servers = 6;

System altered.

ops$tkyte%ORA10GR1> show parameter parallel_max

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
parallel_max_servers                 integer     6
ops$tkyte%ORA10GR1> select * from v$px_process_sysstat where statistic like '%Servers High%';

STATISTIC                           VALUE
------------------------------ ----------
Servers Highwater                       8

you guys have been testing different values right?

A reader, September 16, 2010 - 4:31 pm UTC

Parallel operations downgraded too much

REAC, October 23, 2010 - 4:22 pm UTC

Hi Tom,

I have a problem with that...i think, this is te result of the query to GV$SYSSTAT

queries parallelized 12448
DML statements parallelized 11
DDL statements parallelized 13
DFO trees parallelized 12505
Parallel operations not downgraded 0
Parallel operations downgraded to serial 256486
Parallel operations downgraded 75 to 99 pct 1267
Parallel operations downgraded 50 to 75 pct 0
Parallel operations downgraded 25 to 50 pct 11238
Parallel operations downgraded 1 to 25 pct 0
PX local messages sent 83419761
PX local messages recv'd 83419755
PX remote messages sent 0
PX remote messages recv'd 0

I'm in a new job and the last DBA have this configuration,

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
fast_start_parallel_rollback string LOW
log_parallelism integer 1
parallel_adaptive_multi_user boolean TRUE
parallel_automatic_tuning boolean TRUE
parallel_execution_message_size integer 4096
parallel_instance_group string
parallel_max_servers integer 490
parallel_min_percent integer 0
parallel_min_servers integer 4
parallel_server boolean FALSE
parallel_server_instances integer 1
parallel_threads_per_cpu integer 4
recovery_parallelism integer 0

The CPU_COUNT show me = 8, but the but the platform manager tells me that the server has 2 dual core processors each, which also has me confused.

The application runs 700 more or less concurrent user sessions, but are really active around 10% - 20%, but the cpu stays usually around 60% - 70%. Most movements are related to 5 large tables that are partitioned.

Since I'm the new DBA, I would start with the optimization and the use of Parallel, I hope you can help me with this.

Many thanks in advance,

October 25, 2010 - 5:38 pm UTC

cpu_count is set by the OS - we just ask the OS. Why don't you ask the OS how many cpu's it thinks it has? I don't know what OS you are on, so I cannot tell you "how" to do that - but your platform manager should be able to help you.

with 70 to 140 concurrently active users - using parallel query with at most 8 cpu's isn't going to be very practical.

You'd need either A LOT MORE cpu or A LOT LESS concurrency.

Live SQL

Share and learn SQL and PL/SQL; free access to the latest version of Oracle Database!

Dev Gym

Classes, workouts and quizzes on Oracle Database technologies. Expertise through exercise!

Parallel Query concerns

Breadcrumb

Question and Answer

You Asked

and Tom said...

Rating

Comments

Parallel Query

Linear performance gain

PQ is a big headache

You're right

according to James Morle´s book

Why parallel query ?

Why parallel query ?

Why parallel query ?

parallel automatic tuning parameter

Is it true that Parallel query is meaningfull only if table is partitioned?

find the process which spawned p00* processes

How many degree of parallelism?

How to stop using parallel slaves

paralle_min_percent-does not seem to work properly

Which parameter sets parallel query enable ?

parallel

parallel

parallel processing

parallel_automatic_tuning

Thanks , Tom

high CPU on ora_p003

ORA-23386: replication parallel push cannot create slave processes

pdml on non-partitioned table

sometimes DML runs in parallel,

Parallel Processes Consuming All TEMP

DOP

disable parallel query

disable parallel query

parallel_max_servers settings in 10g and parallel Hints

parallel_max_servers in 10g

parallel_max_servers settings

continuation from the previous post

Any more detail?

Found the answer.

That doesn't look right

Trying to see the difference here ...

Earlier examples ...

Compatible?

That's what I was thinking, as well as processes/sessions

Easily done

Parallel Query dubts

Parallel Sorting

Restricting Parallel Slaves for a Query

Limiting PQ Processes

Limiting PQ Processes

parallel_max_servers Vs v$px_process_sysstat.server highwater

Parallel operations downgraded too much

More to Explore

Live SQL

Dev Gym