Question and Answer

Tom Kyte

Thanks for the question, Von.

Asked: April 30, 2005 - 1:05 pm UTC

Last updated: September 24, 2013 - 4:53 pm UTC

Version: 9.2

Viewed 50K+ times!

You Asked

Hi Tom
From a developer perspective, what's the best way to determine the degree of parallelism for SQL statements (SELECT, INSERT): parallel DML, defining the degree at the table level?
What factors do I need to account for in determining the degree of parallelism?
Thanks
Venkat

and Tom said...

I would suggest you don't...

That you use the automatic parallel tuning.

Here is a short quote from my book Effective Oracle by Design on this topic:



Parallel Query

Parallel query is suitable for a certain class of large problems: very large problems that have no other solution. Parallel query is my last path of action for solving a performance problem; it's never my first course of action.

Parallel Query Settings

I will not discuss the physical setup of parallel query operations. That topic is well covered in both the Oracle Concepts Guide and Data Warehousing Guide. As I mentioned earlier, my current favorite way to set up parallelism in Oracle is using the automatic tuning option first introduced in Oracle8i Release 2 (version 8.1.6): PARALLEL_AUTOMATIC_TUNING = TRUE. With this setting, the other parallel settings are automatically set. Now, all I need to do is set the PARALLEL option on the table (not PARALLEL <N>, just PARALLEL) and Oracle will, when appropriate, parallelize certain operations on that table for me. The degree of parallelism (how many processes/threads will be thrown at a problem) will be decided for me and vary over time as the load on the system varies. I have found that, for most cases, this achieves my desired goal, which is usually to get the best performance, with the least amount of work, in a manner that is most manageable. Setting a single parameter is a great way to get there.
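As a minimal sketch of that setup (the table name T is a placeholder, and SCOPE = SPFILE assumes an spfile is in use):

```sql
-- Enable automatic parallel tuning; the other parallel_* parameters
-- are then derived automatically (takes effect at the next restart).
ALTER SYSTEM SET parallel_automatic_tuning = TRUE SCOPE = SPFILE;

-- PARALLEL, not PARALLEL <N>: let Oracle pick and vary the degree.
ALTER TABLE t PARALLEL;
```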

For the novice user wanting to play with parallel query for the first time, parallel automatic tuning is a good way to get started. As you develop an understanding of what parallel query does and how it does it, try tweaking some of the other parallel settings:

PARALLEL_ADAPTIVE_MULTI_USER Controls whether the degree of parallelism should vary over time as the load on the system does; that is, whether the algorithm for assigning resources should "adapt" to increases in load.
PARALLEL_EXECUTION_MESSAGE_SIZE Sets the size of the message buffers used to pass information back and forth between the processes executing the parallel query.
PARALLEL_INSTANCE_GROUP Applies only to Oracle RAC configurations (Oracle Parallel Server, OPS, in Oracle8i and earlier). Allows you to restrict the number of instances that will be used to perform a parallel operation (as opposed to the number of processes an instance will use).
PARALLEL_MAX_SERVERS Sets the maximum number of parallel query slaves (like dedicated servers but for parallel operations) your instance will ever have.
PARALLEL_MIN_PERCENT Useful if you would like to receive an error message when you request a specific degree of parallelism but insufficient resources exist to satisfy that request. You might use this to ensure that a process that would take days without the requested resources does not run at all unless it gets them.
PARALLEL_MIN_SERVERS Sets the number of servers to start when the instance starts and to keep started permanently. Otherwise, you may need to start the parallel processes in response to a query.
PARALLEL_THREADS_PER_CPU Determines the default degree of parallelism and contributes to the adaptive parallelism algorithms, to determine when to back off on the amount of parallel resources.
RECOVERY_PARALLELISM For crash recovery, sets how many parallel threads should be used. This setting can be used to speed up the recovery from an instance crash.
FAST_START_PARALLEL_ROLLBACK Sets how many processes would be available to perform a parallel rollback after the recovery takes place. This would be useful on systems where many long-running transactions are constantly processing (which would need correspondingly long rollback times in the event of an instance crash).
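To see what values these parameters currently have on an instance, the same SHOW PARAMETER idiom that appears later in this thread works:

```sql
-- From SQL*Plus: list all parallel-related initialization parameters
SHOW PARAMETER parallel

-- Equivalent data dictionary query
SELECT name, value
  FROM v$parameter
 WHERE name LIKE 'parallel%'
    OR name IN ('recovery_parallelism', 'fast_start_parallel_rollback');
```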



Comments

noparallel at table level

Eugene, May 02, 2005 - 2:45 pm UTC

Hi Tom,
I am using 9r2 on Windows.
I'd like to set up parallel query. What I am going to do is set
PARALLEL_AUTOMATIC_TUNING = TRUE. All my tables are created with the
NOPARALLEL clause. Could I still use the /*+ PARALLEL */ hint to run
queries in parallel?


Thanks,
Eugene



Tom Kyte
May 02, 2005 - 7:55 pm UTC

yes

I don't agree with

The Human Fly, May 03, 2005 - 2:41 am UTC

Hi Tom, you said it is possible to run queries in parallel using the PARALLEL hint when the table is NOPARALLEL, provided PARALLEL_AUTOMATIC_TUNING is set to TRUE.

Here is an excerpt from Metalink:

When PARALLEL_AUTOMATIC_TUNING is set to true, Oracle determines the default values for parameters that control parallel execution. In addition to setting this parameter, you must specify the PARALLEL clause for the target tables in the system. Oracle then tunes all subsequent parallel operations automatically.

and I did testing on my database.

jaffar@PRIMEDB> show parameter automatic;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------
parallel_automatic_tuning            boolean     TRUE

TABLE_NAME                     DEGREE
------------------------------ ----------
TESTP                                   1

jaffar@PRIMEDB> explain plan for select /*+ PARALLEL*/ * from testp;

Explained.

jaffar@PRIMEDB> select * from table(dbms_xplan.display) dual;

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------

--------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes | Cost  |
--------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |    10 |    20 |     2 |
|   1 |  TABLE ACCESS FULL   | TESTP |    10 |    20 |     2 |
--------------------------------------------------------------

Note: cpu costing is off

9 rows selected.

Why don't my queries run in parallel mode?


Tom Kyte
May 03, 2005 - 1:13 pm UTC

ops$tkyte@ORA10G> drop table t;
Table dropped.
                                                                                                          
ops$tkyte@ORA10G> create table t ( x int );
Table created.
                                                                                                          
ops$tkyte@ORA10G> exec dbms_stats.set_table_stats( user, 'T', numrows=>1000000, numblks =>100000 );
PL/SQL procedure successfully completed.
                                                                                                          
ops$tkyte@ORA10G> show parameter parallel;
                                                                                                          
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
fast_start_parallel_rollback         string      LOW
parallel_adaptive_multi_user         boolean     TRUE
parallel_automatic_tuning            boolean     TRUE
parallel_execution_message_size      integer     4096
parallel_instance_group              string
parallel_max_servers                 integer     20
parallel_min_percent                 integer     0
parallel_min_servers                 integer     0
parallel_server                      boolean     FALSE
parallel_server_instances            integer     1
parallel_threads_per_cpu             integer     2
recovery_parallelism                 integer     0

ops$tkyte@ORA10G> @plan "select /*+ parallel(t) */ count(*) from t"
ops$tkyte@ORA10G> delete from plan_table;
7 rows deleted.
                                                                                                          
ops$tkyte@ORA10G> explain plan for &1;
old   1: explain plan for &1
new   1: explain plan for select /*+ parallel(t) */ count(*) from t
                                                                                                          
Explained.
                                                                                                          
ops$tkyte@ORA10G> select * from table(dbms_xplan.display);
                                                                                                          
PLAN_TABLE_OUTPUT
----------------------------------------------
Plan hash value: 545668572
                                                                                                          
---------------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  |... |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |     1 |... |      |            |
|   1 |  SORT AGGREGATE        |          |     1 |... |      |            |
|   2 |   PX COORDINATOR       |          |       |... |      |            |
|   3 |    PX SEND QC (RANDOM) | :TQ10000 |     1 |... | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE     |          |     1 |... | PCWP |            |
|   5 |      PX BLOCK ITERATOR |          |  1000K|... | PCWC |            |
|   6 |       TABLE ACCESS FULL| T        |  1000K|... | PCWP |            |
---------------------------------------------------------------------------
                                                                                                          
13 rows selected.
                                                                                                          
ops$tkyte@ORA10G> spool off
 

For The Human Fly

Muhammad Riaz Shahid, May 03, 2005 - 4:44 am UTC

Try SELECT /*+ parallel(TableName,n) */ * FROM TableName
where n = 2, 3, ...

Tom Kyte
May 03, 2005 - 1:44 pm UTC

or just parallel( tablename )

Ignoring PARALLEL hints

Suvendu, August 06, 2005 - 8:28 am UTC

Hi Tom,

Here is the step what I did:

a) The statement is like so;
insert /*+ append parallel ( t,4) */ into abc_partition_tab
(
col1, col2, col3…
) select /*+ parallel (x, 4) */ col1, col2, col3… from abc_external_tab;

b) After running it, I checked the HP-UX top command; it showed 4 parallel processes using 10-60% CPU for about 10 seconds. Then they all vanished from the list.
c) But v$session still showed 4 parallel process sessions executing 2 separate statements (referencing v$SQL), like so:

Process : P000 , P001
INSERT /*+ APPEND */ INTO …..

Process : P002, P004
SELECT /*+ Q5244000 NO_EXPAND FULL(A1) */ A1."COF_CLIENT_ID"….


But still, there is a session (sqlplus) running the original statement:

insert /*+ append parallel(i,4) */ into abc_partition_tab…



All of this suggests that no PARALLEL insertion is going on. As I understand it, the parallel processes in the initial top output were used for the full table scan of the external table.

So, my questions:
1) If parallel processes P000 and P001 are not showing any parallel hint, why does the sqlplus session's statement show one: insert /*+ append parallel(i,4) */ into abc_partition_tab…

2) Why is it not using parallel insertion?
3) Does the APPEND hint prevent parallel insertion?

I don't have any explicit PARALLEL parameter settings on the database; the degree of parallelism is derived from the number of CPUs. The database is Oracle 9.2 on HP-UX.

Hope my question is clear. Correct me if I'm wrong.

Everyone on this thread will be happy if you elaborate more, with some good examples.


Thanking you,

Suvendu






Tom Kyte
August 06, 2005 - 9:44 am UTC

1) I don't know what you mean. But the parallel execution servers' SQL is generated for them; you and I don't have any control over it. Whatever it is -- it is "correct".

2) You probably didn't enable parallel DML (PDML)? You got parallel query but not PDML; PDML must be enabled explicitly.

3) There is at least a double negative in there. When using parallel direct path insert, you must be in "append" mode.


"everyone will be happy" hmm, speaking for the masses?

Assuming BIG_TABLE is "parallel" (alter table big_table parallel), then:


ops$tkyte@ORA10GR1> alter session enable parallel dml;
Session altered.
 
ops$tkyte@ORA10GR1> insert /*+ append */ into UNIFORM_TEST 
  2 select * from big_table;
10000000 rows created.


is all you need; the degree of parallelism will be derived from your system settings (defaults based on the number of CPUs).
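One hedged way to verify the insert really ran as parallel DML is to check v$pq_sesstat from the same session afterwards:

```sql
-- LAST_QUERY should be nonzero for 'DML Parallelized' if PDML kicked in
SELECT statistic, last_query, session_total
  FROM v$pq_sesstat
 WHERE statistic IN ('Queries Parallelized', 'DML Parallelized');
```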

Unclear on PDML and parallel settings

A reader, August 17, 2005 - 5:47 am UTC

Hi Tom

I'm not sure I understand the answer correctly.

My understanding is that the best approach to making SELECT, INSERT, DELETE etc. work in parallel is to set:
PARALLEL_AUTOMATIC_TUNING = TRUE
and then change the tables enabling the parallel option (not parallel <n>).

But, later in the answer, the following query is suggested:

alter session enable parallel dml;
and
insert /*+ append */ into UNIFORM_TEST
2 select * from big_table;


I am now confused about what needs to be done so that all SELECT, DELETE and INSERT statements will run in parallel for whoever connects to the DB.



Tom Kyte
August 17, 2005 - 1:36 pm UTC

alter session enable parallel dml;

is a prerequisite for parallel DML; without it, it won't matter WHAT parameters you have set.


For SELECT, the parallel parameters in the init.ora, plus PARALLEL on the table or a parallel hint, would be enough.
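Putting the two cases side by side (UNIFORM_TEST and BIG_TABLE stand in for your own tables):

```sql
-- Parallel query: parallel parameters plus a parallel table/hint suffice.
SELECT /*+ parallel(t) */ COUNT(*) FROM big_table t;

-- Parallel DML additionally requires the session-level switch:
ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ append */ INTO uniform_test
SELECT * FROM big_table;

COMMIT;  -- after a direct path insert, commit before re-reading the table
```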


http://docs.oracle.com/docs/cd/B10501_01/server.920/a96524/c20paral.htm#365
you might want to read that and the chapters in the data warehousing guide as well


I seriously DOUBT that:
...all SELECT, DELETE and
INSERT statements will run in parallel for whoever connects to the DB....

is something you actually want to have happen in real life. parallel execution is a tool, one that needs to be used carefully.

integrity constraints with append hint

Jagannath Dalvi, August 19, 2005 - 5:07 pm UTC

Tom,

The APPEND hint will ignore integrity constraints.
Is that the same behaviour for PARALLEL as well?

Tom Kyte
August 20, 2005 - 4:46 pm UTC

no it won't.

NO IT WON'T.

ops$tkyte@ORA10G> create table t ( x int check (x > 0) );
Table created.
 
ops$tkyte@ORA10G> create table t2 ( x int primary key );
Table created.
 
ops$tkyte@ORA10G> insert into t2 values ( 1 );
1 row created.
 
ops$tkyte@ORA10G> insert /*+ APPEND */ into t select 1 from dual;
1 row created.
 
ops$tkyte@ORA10G> select * from t;
select * from t
              *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel

that shows we did a direct path; now:
 
ops$tkyte@ORA10G> commit;
Commit complete.
 
ops$tkyte@ORA10G> insert /*+ APPEND */ into t select 0 from dual;
insert /*+ APPEND */ into t select 0 from dual
*
ERROR at line 1:
ORA-02290: check constraint (OPS$TKYTE.SYS_C0014018) violated
 
that clearly shows integrity constraints are *not* ignored. What you may have meant is that direct path is disabled for a specific type of constraint:
 
ops$tkyte@ORA10G>
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> alter table t add constraint t_fk foreign key(x) references t2;
 
Table altered.
 
ops$tkyte@ORA10G> insert /*+ APPEND */ into t select 1 from dual;
 
1 row created.
 
ops$tkyte@ORA10G> select * from t;
 
         X
----------
         1
         1
 
ops$tkyte@ORA10G>

we did not use direct path because of the foreign key
 

One more thing...

Jagannath Dalvi, August 19, 2005 - 5:17 pm UTC

I am trying to use the PARALLEL hint for INSERT, where the data insertion goes into ONE single partition. As per Oracle, even if I use the PARALLEL hint to insert data into one partition, the DB will not open multiple sessions (it will be serial).

Is there any way to force database to use parallel sessions while inserting as well?


Thanks,

Jagan


Tom Kyte
August 20, 2005 - 4:48 pm UTC

not following you here?

if you use parallel direct path insert (remember to alter the session to enable parallel dml!), Oracle will use parallel execution servers, each in its own session, to do the work in parallel.

Constraints and parallel DML

Jagannath Dalvi, August 22, 2005 - 10:35 am UTC

Reg. constraints: yes, I meant referential integrity constraints; sorry for not being very clear about the problem.

About the second problem of parallel DML’s, we are using 9.2.0.4 and there is a bug:
<<<
http://metalink.oracle.com/metalink/plsql/ml2_documents.showDocument?p_database_id=NOT&p_id=2929556.8
(Bug 2929556  ORA-372 ORA-1110 inserting into partition with enable PARALLEL DML set)
>>>

To overcome this, I was using some workarounds provided by Oracle:
<<<
http://metalink.oracle.com/metalink/plsql/showdoc?db=Bug&id=2929556
WORKAROUND:
-----------
Disable the PARALLEL DML
>>>

Because of the volume of data, the client is not ready to make all partitions read/write, so I cannot alter the session to enable parallel DML. I have just enabled parallel query but, as expected, it only works for SELECTs.

My question is: with all these constraints, is there a way to force parallelism for inserts? Or are there any other suggestions to improve performance?

Once again, thanks a lot for your help.

Jagan


Tom Kyte
August 23, 2005 - 4:02 am UTC

but still, you said it very wrong: they are NOT IGNORED, they are very much obeyed.


You can use conventional path SQL and break your inserts up, of course. If you have referential integrity enabled, direct path is not going to happen.

Thanks

Jagannath Dalvi, August 23, 2005 - 9:35 am UTC

I ended up doing BULK INSERTS in chunks of rows.
Let's see once we migrate to 10g!

Thanks,

Jagan

Help on performance

JD, September 19, 2005 - 12:51 pm UTC

Tom,

I have some performance issues inserting data into partitioned tables. I have changed the code from traditional FOR LOOP inserts to INSERT ... SELECT using PARALLEL hints (because of rule-based optimization).

The daily process refreshes data from live (OLTP) tables into so-called warehouse tables. One table in particular takes almost 80% of the entire process; it has monthly partitions from 1999 to date (700M+ rows). Unfortunately, there are 6 GLOBAL indexes on the table (used for reporting purposes) along with LOCAL indexes. I have traced the session and run Trace Analyzer, which shows the indexes are taking most of the time (about 70%):
Non-idle Wait (secs): 5222.307 (out of approximately 7200 secs) ==> (70%)
I/O Wait Event: db file sequential read
SQL example:
INSERT /*+ PARALLEL(target_tab,12) */ INTO target_table target_tab (columns….)
SELECT /*+ PARALLEL(source_tab,12) */ columns…
FROM target_table target_tab, source_table source_tab
WHERE conditions to check primary keys.


Could you please suggest better approach for this?

I am planning to remove the GLOBAL indexes and create them as LOCAL. If I go this way, would there be any performance impact on reports (the data selection criteria may span months, and the partitioning is based on date/month)?


Thanks,

JD


Tom Kyte
September 19, 2005 - 1:45 pm UTC

did you enable PDML before doing that? Otherwise it was a conventional path, serial insert (perhaps with a parallel query).

With PDML, that insert would use APPEND (direct path) and would be very different.

You cannot just willy-nilly remove global indexes -- you might make the load go faster but absolutely KILL retrieval. There was a design reason to use global indexes; you have to find out what it was.

Lets say you have a table with 100 partitions, partitioned by date.

You have a GLOBAL index on last_name, range partitioned into 50 partitions.


You query: select * from t where last_name = :x;

we can find the EXACT single index partition to range scan, scan it and find the row(s).


Now, you make that global index on last_name a local one......

Now, that query has to

INDEX RANGE SCAN 100 INDEXES in order to find the row(s)!! Your retrieval would die.
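A sketch of that trade-off (T, LAST_NAME, and the index name are illustrative, and the two CREATE statements are alternatives, not both at once):

```sql
-- Global (here nonpartitioned) index: a query on last_name alone
-- probes a single index structure.
CREATE INDEX t_name_idx ON t (last_name);

-- Alternative: LOCAL index, equipartitioned with the table; a query on
-- last_name alone must range scan one index partition per table partition.
-- CREATE INDEX t_name_idx ON t (last_name) LOCAL;
```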

PDML is not enabled

JD, September 19, 2005 - 2:15 pm UTC

Thanks for the quick reply.

PDML is not enabled because of the 9.2.0.4 bug (mentioned above on this page). Only parallel query is enabled. But you are right, this query is not using parallel for the insert; it is only selecting with a parallel hint.

About the indexes, we have exactly the same situation: there is one GLOBAL index on name and, as you said, any search using that index would die if it were made local :(

Is there a way out in 9.2.0.4, or do I have to wait until the next migration?

Thanks,

JD

Tom Kyte
September 19, 2005 - 4:48 pm UTC

have you tried using APPEND instead of a conventional path insert?

Thanks

JD, September 21, 2005 - 9:49 am UTC

I will try APPEND as well.



How about global partition index?

JD, September 23, 2005 - 4:28 pm UTC

Would it be better if I replaced the local index with a globally partitioned index?

The table is partitioned by date/month and the (local) index is on the name column.
If I partitioned the index globally based on the distribution of the first letter in the name column, would that create any performance problem on data retrieval?


Tom Kyte
September 23, 2005 - 8:54 pm UTC

you do not make choices between global and local based on load speeds,

their use is driven by RETRIEVAL needs, not much else.




Sorry.. please ignore the above post… I meant to say….

JD, September 23, 2005 - 4:31 pm UTC

Would it be better if I replaced the GLOBAL index with a globally partitioned index?

The table is partitioned by date/month and the (GLOBAL) index is on the name column.
If I partitioned the index globally based on the distribution of the first letter in the name column, would that create any performance problem on data retrieval?


Tom Kyte
September 23, 2005 - 9:01 pm UTC

it'll still be a global index, so what was the motivation?

You'll either have 1 index partition to maintain or N index partitions to maintain.

What about Parallel PLSQL?

Andrew, September 27, 2005 - 7:18 pm UTC

I don't suppose Oracle has an option to parallelize PL/SQL?

For example, I have a 12-CPU box and I want to do:
BEGIN
    FOR rec IN (SELECT t.*, t.rowid rid FROM very_large_table t) LOOP
        do_some_very_expensive_PLSQL_here();
    END LOOP;
END;

If I do the above as is, in serial, it takes 12 hours. If I manually break it into 12 equal-sized chunks and run them in 12 different SQL*Plus sessions, it runs in 1 hour, e.g. the cursor from above can become:

select t.*, t.rowid rid from very_large_table t where MOD(object_id,12) = &thread_num

So my question, I guess, is:
Are there any better alternatives to "manually" coding parallelism into the process?

(Note that the do_some_very_expensive_PLSQL_here() function is actually a number of quite complex functions that do DML on a number of different tables, so the functions cannot simply be put into a single SELECT /*+ PARALLEL */ statement.)

Tom Kyte
September 27, 2005 - 8:38 pm UTC

from my latest book (see homepage for details if you are interested):


Procedural Parallelism

I would like to discuss two types of procedural parallelism:

    *    Parallel pipelined functions, which is a feature of Oracle.
    *    "Do-it-yourself (DIY) parallelism," which is the application to your own applications of the same techniques that Oracle applies to parallel full table scans. DIY parallelism is more of a development technique than anything built into Oracle directly.

Many times you'll find that applications (typically batch processes) designed to execute serially will look something like the following procedure:

Create procedure process_data
As
Begin
    For x in ( select * from some_table )
        Perform complex process on X
        Update some other table, or insert the record somewhere else
    End loop
end

In this case, Oracle's parallel query or PDML won't help a bit (in fact, parallel execution of the SQL by Oracle here would likely only cause the database to consume more resources and take longer). If Oracle were to execute the simple SELECT * FROM SOME_TABLE in parallel, it would provide this algorithm no apparent increase in speed whatsoever. If Oracle were to perform in parallel the UPDATE or INSERT after the complex process, it would have no positive effect (it is a single-row UPDATE/INSERT, after all).

There is one obvious thing you could do here: use array processing for the UPDATE/INSERT after the complex process. However, that isn't going to give you a 50 percent or greater reduction in runtime, and often that is what you are looking for. Don't get me wrong, you definitely want to implement array processing for the modifications here, but it won't make this process run two, three, four, or more times faster.
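A sketch of that array processing in PL/SQL (SOME_TABLE, SOME_OTHER_TABLE, and the batch size of 500 are placeholders, and the "complex process" is elided):

```sql
DECLARE
    CURSOR c IS SELECT * FROM some_table;
    TYPE tab_t IS TABLE OF some_table%ROWTYPE;
    l_rows tab_t;
BEGIN
    OPEN c;
    LOOP
        FETCH c BULK COLLECT INTO l_rows LIMIT 500;   -- array fetch
        EXIT WHEN l_rows.COUNT = 0;
        -- ... perform the complex process on each element of l_rows ...
        FORALL i IN 1 .. l_rows.COUNT                 -- array insert
            INSERT INTO some_other_table VALUES l_rows(i);
    END LOOP;
    CLOSE c;
END;
/
```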

Now, suppose this process runs at night on a machine with four CPUs, and it is the only activity taking place. You have observed that only one CPU is partially used on this system, and the disk system is not being used very much at all. Further, this process is taking hours, and every day it takes a little longer as more data is added. You need to reduce the runtime many times over (it needs to run four or eight times faster), so incremental percentage increases will not be sufficient. What can you do?

There are two approaches you can take. One approach is to implement a parallel pipelined function, whereby Oracle will decide on appropriate degrees of parallelism (assuming you have opted for that, which would be recommended). Oracle will create the sessions, coordinate them, and run them, very much like the previous example with parallel DDL where, by using CREATE TABLE AS SELECT or INSERT /*+ APPEND */, Oracle fully automated parallel direct path loads for us. The other approach is DIY parallelism. We'll take a look at both approaches in the sections that follow.


Parallel Pipelined Functions

We'd like to take that very serial process PROCESS_DATA from earlier and have Oracle execute it in parallel for us. To accomplish this, we need to turn the routine "inside out." Instead of selecting rows from some table, processing them, and inserting them into another table, we will insert into another table the results of fetching some rows and processing them. We will remove the INSERT at the bottom of that loop and replace it in the code with a PIPE ROW clause. The PIPE ROW clause allows our PL/SQL routine to generate table data as its output, so we'll be able to SELECT from our PL/SQL process. The PL/SQL routine that used to procedurally process the data becomes a table, in effect, and the rows we fetch and process are the outputs. We've seen this many times throughout this book every time we've issued the following:

Select * from table(dbms_xplan.display);

That is a PL/SQL routine that reads the PLAN_TABLE; restructures the output, even to the extent of adding rows; and then outputs this data using PIPE ROW to send it back to the client. We're going to do the same thing here in effect, but we'll allow for it to be processed in parallel.

We're going to use two tables in this example: T1 and T2. T1 is the table we were reading previously, and T2 is the table we need to move this information into. Assume this is some sort of ETL process we run to take the transactional data from the day and convert it into reporting information for tomorrow. The two tables we'll use are as follows:

ops$tkyte@ORA10G> create table t1
  2  as
  3  select object_id id, object_name text
  4    from all_objects;
Table created.

ops$tkyte@ORA10G> begin
  2      dbms_stats.set_table_stats
  3      ( user, 'T1', numrows=>10000000,numblks=>100000 );
  4  end;
  5  /
PL/SQL procedure successfully completed.

ops$tkyte@ORA10G> create table t2
  2  as
  3  select t1.*, 0 session_id
  4    from t1
  5   where 1=0;
Table created.

We used DBMS_STATS to "trick" the optimizer into thinking that there are 10,000,000 rows in that input table and that it consumes 100,000 database blocks. We want to simulate a big table here. The second table, T2, is simply a copy of the first table's structure with the addition of a SESSION_ID column. That column will be useful to actually "see" the parallelism that takes place.

Next, we need to set up object types for our pipelined function to return. The object type is simply a structural definition of the "output" of the procedure we are converting. In this case, it looks just like T2:

ops$tkyte@ORA10G> CREATE OR REPLACE TYPE t2_type
  2  AS OBJECT (
  3   id         number,
  4   text       varchar2(30),
  5   session_id number
  6  )
  7  /
Type created.

ops$tkyte@ORA10G> create or replace type t2_tab_type
  2  as table of t2_type
  3  /
Type created.


And now for the pipelined function, which is simply the original PROCESS_DATA 
procedure rewritten. The procedure is now a function that produces rows. It accepts as an input the data to process in a ref cursor. The function returns a T2_TAB_TYPE, the type we just created. It is a pipelined function that is PARALLEL_ENABLED. The partition clause we are using says to Oracle, "Partition, or slice up, the data by any means that works best. We don't need to make any assumptions about the order of the data."


You may also use hash or range partitioning on a specific column in the ref cursor. That would involve using a strongly typed ref cursor, so the compiler knows what columns are available. Hash partitioning would just send equal rows to each parallel execution server to process, based on a hash of the column supplied. Range partitioning would send nonoverlapping ranges of data to each parallel execution server, based on the partitioning key. For example, if you range partitioned on ID, each parallel execution server might get ranges 1..1000, 1001..20000, 20001..30000, and so on (ID values in that range).
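For illustration only, a hash-partitioned variant of the declaration might look like this (the package and strongly typed ref cursor are assumptions, not part of the original example; the body would be the same as the PARTITION BY ANY version shown in the text):

```sql
CREATE OR REPLACE PACKAGE refcur_pkg AS
    -- strongly typed, so the compiler knows the ID column exists
    TYPE t1_rc IS REF CURSOR RETURN t1%ROWTYPE;
END;
/

CREATE OR REPLACE FUNCTION parallel_pipelined_hash( l_cursor IN refcur_pkg.t1_rc )
RETURN t2_tab_type
PIPELINED
PARALLEL_ENABLE ( PARTITION l_cursor BY HASH(id) )
IS
BEGIN
    -- fetch/process/pipe rows exactly as in the BY ANY implementation
    RETURN;
END;
/
```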

Here, we just want the data split up. How the data is split up is not relevant to our processing, so our definition looks like this:

ops$tkyte@ORA10G> create or replace
  2  function parallel_pipelined( l_cursor in sys_refcursor )
  3  return t2_tab_type
  4  pipelined
  5  parallel_enable ( partition l_cursor by any )

We'd like to be able to see what rows were processed by which parallel execution servers, so we'll declare a local variable L_SESSION_ID and initialize it from V$MYSTAT:

  6
  7  is
  8      l_session_id number;
  9      l_rec        t1%rowtype;
 10  begin
 11      select sid into l_session_id
 12        from v$mystat
 13       where rownum =1;

Now we are ready to process the data. We simply fetch out a row (or rows, as we could certainly use BULK COLLECT here to array process the ref cursor), perform our complex process on it, and pipe it out. When the ref cursor is exhausted of data, we close the cursor and return:

 14      loop
 15          fetch l_cursor into l_rec;
 16          exit when l_cursor%notfound;
 17          -- complex process here
 18          pipe row(t2_type(l_rec.id,l_rec.text,l_session_id));
 19      end loop;
 20      close l_cursor;
 21      return;
 22  end;
 23  /
Function created.

And that's it. We're ready to process the data in parallel, letting Oracle figure out based on the resources available what the most appropriate degree of parallelism is:

ops$tkyte@ORA10G> alter session enable parallel dml;
Session altered.

ops$tkyte@ORA10G> insert /*+ append */
  2  into t2(id,text,session_id)
  3  select *
  4  from table(parallel_pipelined
  5            (CURSOR(select /*+ parallel(t1) */ *
  6                      from t1 )
  7             ))
  8  /
48250 rows created.

ops$tkyte@ORA10G> commit;
Commit complete.

Just to see what happened here, we can query the newly inserted data out and group by SESSION_ID to see first how many parallel execution servers were used, and second how many rows each processed:

ops$tkyte@ORA10G> select session_id, count(*)
  2    from t2
  3   group by session_id;

SESSION_ID   COUNT(*)
---------- ----------
       241       8040
       246       8045
       253       8042
       254       8042
       258       8040
       260       8041
6 rows selected.

Apparently, we used six parallel execution servers for the SELECT component of this parallel operation, and each one processed about 8,040 records.
As you can see, Oracle parallelized our process, but we underwent a fairly radical rewrite of our process. This is a long way from the original implementation. So, while Oracle can process our routine in parallel, we may well not have any routines that are coded to be parallelized. If a rather large rewrite of your procedure is not feasible, you may well be interested in the next implementation: DIY parallelism.



Do-It-Yourself Parallelism

Say we have that same process as in the preceding section: the serial, simple procedure. We cannot afford a rather extensive rewrite of the implementation, but we would like to execute it in parallel. What can we do? My approach many times has been to use rowid ranges to break the table up into some number of ranges that don't overlap (yet completely cover the table).

This is very similar to how Oracle performs a parallel query conceptually. If you think of a full table scan, Oracle processes that by coming up with some method to break the table into many "small" tables, each of which is processed by a parallel execution server. We are going to do the same thing using rowid ranges. In early releases, Oracle's parallel implementation actually used rowid ranges itself.

We'll use a BIG_TABLE of 1,000,000 rows, as this technique works best on big tables with lots of extents, and the method I use for creating rowid ranges depends on extent boundaries. The more extents used, the better the data distribution. So, after creating the BIG_TABLE with 1,000,000 rows, we'll create T2 like this:

big_table-ORA10G> create table t2
  2  as
  3  select object_id id, object_name text, 0 session_id
  4    from big_table
  5   where 1=0;
Table created.

We are going to use the job queues built into the database to parallel process our procedure. We will schedule some number of jobs. Each job is our procedure slightly modified to just process the rows in a given rowid range.
Note    In Oracle 10g, you could use the scheduler as well for something so simple.  In order to make the example 9i compatible, we'll use the job queues here.
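For reference, a 10g scheduler version of one job submission might look like the following sketch. The job name is made up for illustration, and SERIAL is the procedure built below; DBMS_JOB remains the 9i-compatible route used in this example:

```sql
-- Sketch only: a 10g DBMS_SCHEDULER equivalent of one DBMS_JOB submission.
-- PROCESS_CHUNK_1 is a made-up job name; serial(1) assumes job id 1 is in JOB_PARMS.
begin
    dbms_scheduler.create_job
    ( job_name   => 'PROCESS_CHUNK_1',
      job_type   => 'PLSQL_BLOCK',
      job_action => 'begin serial( 1 ); end;',
      enabled    => true );
end;
/
```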

To efficiently support the job queues, we'll use a parameter table to pass inputs to our jobs:

big_table-ORA10G> create table job_parms
  2  ( job        number primary key,
  3    lo_rid  rowid,
  4    hi_rid  rowid
  5  )
  6  /
Table created.

This will allow us to just pass the job ID into our procedure, so it can query this table to get the rowid range it is to process. Now, for our procedure. The code in bold is the new code we'll be adding:

big_table-ORA10G> create or replace
  2  procedure serial( p_job in number )
  3  is
  4      l_rec        job_parms%rowtype;
  5  begin
  6      select * into l_rec
  7        from job_parms
  8       where job = p_job;
  9
 10      for x in ( select object_id id, object_name text
 11                   from big_table
 12                  where rowid between l_rec.lo_rid
 13                                  and l_rec.hi_rid )
 14      loop
 15          -- complex process here
 16          insert into t2 (id, text, session_id )
 17          values ( x.id, x.text, p_job );
 18      end loop;
 19
 20      delete from job_parms where job = p_job;
 21      commit;
 22  end;
 23  /
Procedure created.

As you can see, it is not a significant change. Most of the added code was simply to get our inputs and the rowid range to process. The only change to our logic was the addition of the predicate on lines 12 and 13.
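If you want to smoke-test the procedure on a single range before scheduling everything, a sketch like this works. The rowid components in angle brackets are placeholders you would take from DBA_EXTENTS and the object's data_object_id:

```sql
-- Sketch: exercise SERIAL on one hand-built rowid range first.
insert into job_parms ( job, lo_rid, hi_rid )
values ( 1,
         dbms_rowid.rowid_create( 1, <data_object_id>, <fno>, <lo_block>, 0 ),
         dbms_rowid.rowid_create( 1, <data_object_id>, <fno>, <hi_block>, 10000 ) );

exec serial( 1 )
```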

Now let's schedule our jobs. We'll use a rather complex query using analytics to divide the table. The innermost query on lines 19 through 26 breaks the data into eight groups in this case. The first sum on line 22 is computing a running total of the sum of blocks; the second sum on line 23 is the total number of blocks. If we integer divide the running total by the desired "chunk size" (the total size divided by 8 in this case), we can create groups of files/blocks that cover about the same amount of data. The query on lines 8 through 28 finds the high and low file numbers and block numbers by GRP and returns the distinct entries. It builds the inputs we can then send to DBMS_ROWID to create the rowids Oracle wants. We take that output and, using DBMS_JOB, submit a job to process each rowid range:

big_table-ORA10G> declare
  2          l_job number;
  3  begin
  4  for x in (
  5  select dbms_rowid.rowid_create
            ( 1, data_object_id, lo_fno, lo_block, 0 ) min_rid,
  6         dbms_rowid.rowid_create
            ( 1, data_object_id, hi_fno, hi_block, 10000 ) max_rid
  7    from (
  8  select distinct grp,
  9         first_value(relative_fno) 
              over (partition by grp order by relative_fno, block_id
 10           rows between unbounded preceding and unbounded following) lo_fno,
 11         first_value(block_id    ) 
              over (partition by grp order by relative_fno, block_id
 12           rows between unbounded preceding and unbounded following) lo_block,
 13         last_value(relative_fno) 
              over (partition by grp order by relative_fno, block_id
 14           rows between unbounded preceding and unbounded following) hi_fno,
 15         last_value(block_id+blocks-1) 
              over (partition by grp order by relative_fno, block_id
 16           rows between unbounded preceding and unbounded following) hi_block,
 17         sum(blocks) over (partition by grp) sum_blocks
 18    from (
 19  select relative_fno,
 20         block_id,
 21         blocks,
 22         trunc( (sum(blocks) over (order by relative_fno, block_id)-0.01) /
 23                (sum(blocks) over ()/8) ) grp
 24    from dba_extents
 25   where segment_name = upper('BIG_TABLE')
 26     and owner = user order by block_id
 27         )
 28         ),
 29         (select data_object_id 
               from user_objects where object_name = upper('BIG_TABLE') )
 30  )
 31  loop
 32          dbms_job.submit( l_job, 'serial(JOB);' );
 33          insert into job_parms(job, lo_rid, hi_rid)
 34          values ( l_job, x.min_rid, x.max_rid );
 35  end loop;
 36  end;
 37  /
PL/SQL procedure successfully completed. 

That PL/SQL block would have scheduled up to eight jobs for us (fewer if the table could not be broken into eight pieces due to insufficient extents or size). We can see how many jobs were scheduled and what their inputs are as follows:

big_table-ORA10G> select * from job_parms;

       JOB LO_RID             HI_RID
---------- ------------------ ------------------
       172 AAAT7tAAEAAAAkpAAA AAAT7tAAEAAABQICcQ
       173 AAAT7tAAEAAABQJAAA AAAT7tAAEAAABwICcQ
       174 AAAT7tAAEAAABwJAAA AAAT7tAAEAAACUICcQ
       175 AAAT7tAAEAAACUJAAA AAAT7tAAEAAAC0ICcQ
       176 AAAT7tAAEAAAC0JAAA AAAT7tAAEAAADMICcQ
       177 AAAT7tAAEAAADaJAAA AAAT7tAAEAAAD6ICcQ
       178 AAAT7tAAEAAAD6JAAA AAAT7tAAEAAAEaICcQ
       179 AAAT7tAAEAAAEaJAAA AAAT7tAAEAAAF4ICcQ
8 rows selected.

big_table-ORA10G> commit;
Commit complete.

That commit released our jobs for processing. We have JOB_QUEUE_PROCESSES set to 0 in the parameter file, so all eight started running and shortly finished. The results are as follows:
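While the jobs are running, their progress can be watched from the USER_JOBS view (a sketch; THIS_DATE is non-null only while a job is actually executing):

```sql
-- Sketch: jobs still listed here have not yet completed and been deleted.
select job, what, this_date, next_date, failures
  from user_jobs;
```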

big_table-ORA10G> select session_id, count(*)
  2    from t2
  3   group by session_id;

SESSION_ID   COUNT(*)
---------- ----------
       172     130055
       173     130978
       174     130925
       175     129863
       176     106154
       177     140772
       178     140778
       179      90475
8 rows selected.

It's not as evenly distributed as the Oracle built-in parallelism in this case, but it's pretty good. If you recall, earlier we saw how many rows were processed by each parallel execution server and, using the built-in parallelism, the row counts were very close to each other (they were off only by one or two). Here we had a job that processed as few as 90,475 rows and one that processed as many as 140,778. Most of them processed about 130,000 rows in this case.

Suppose, however, that you do not want to use the rowid processing; perhaps the query is not as simple as SELECT * FROM T and involves joins and other constructs that make using the rowid impractical. You can use the primary key of some table instead. For example, say you want to break that same BIG_TABLE into ten pieces to be processed concurrently by primary key. You can do that easily using the NTILE built-in analytic function. The process is rather straightforward:

big_table-ORA10G> select nt, min(id), max(id), count(*)
  2    from (
  3  select id, ntile(10) over (order by id) nt
  4    from big_table
  5         )
  6   group by nt;

        NT    MIN(ID)    MAX(ID)   COUNT(*)
---------- ---------- ---------- ----------
         1          1     100000     100000
         2     100001     200000     100000
         3     200001     300000     100000
         4     300001     400000     100000
         5     400001     500000     100000
         6     500001     600000     100000
         7     600001     700000     100000
         8     700001     800000     100000
         9     800001     900000     100000
        10     900001    1000000     100000

10 rows selected.

Now you have ten nonoverlapping primary key ranges, all of nice equal size, that you can use to implement the same DBMS_JOB technique as shown earlier to parallelize your process.
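A sketch of what that primary-key-range variant might look like, assuming a parameter table holding ID ranges instead of rowids (SERIAL_BY_PK and JOB_PARMS_PK are made-up names paralleling the earlier rowid example):

```sql
-- Sketch only: primary-key-range version of the SERIAL procedure.
create table job_parms_pk
( job    number primary key,
  lo_id  number,
  hi_id  number
);

create or replace
procedure serial_by_pk( p_job in number )
is
    l_rec  job_parms_pk%rowtype;
begin
    select * into l_rec
      from job_parms_pk
     where job = p_job;

    for x in ( select object_id id, object_name text
                 from big_table
                where id between l_rec.lo_id and l_rec.hi_id )
    loop
        -- complex process here
        insert into t2 ( id, text, session_id )
        values ( x.id, x.text, p_job );
    end loop;

    delete from job_parms_pk where job = p_job;
    commit;
end;
/
```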
 

JOB_QUEUE_PROCESSES set to 0 ?

Arul Ramachandran, September 27, 2005 - 10:53 pm UTC

"We have JOB_QUEUE_PROCESSES set to 0 in the parameter file, so all eight started running and shortly finished"

I guess it's a typo. I think you meant 8 ?

Tom Kyte
September 28, 2005 - 9:20 am UTC

indeed, thanks

How to calculate parallel_max_servers value?

sami, September 28, 2005 - 10:02 pm UTC

We have 8 CPUs on each node (Sun-Fire-V490 -  3 Node RAC)
Oracle 9.2.0.6, Sun OS 2.9

Anytime we look at the number of parallel processes, it is close to 50 on all 3 nodes. Only 5 to 10 tables and indexes have a parallel degree of 4 or 5. We are not using any parallel hints in the application/queries.

From Statspack Report:
======================
                                 Current      Maximum     Initial
Resource Name                  Utilization  Utilization  Allocation      Limit
------------------------------ ------------ ------------ ---------- ----------
parallel_max_servers                     52          101        101        101
          -------------------------------------------------------------

SQL> show parameter parallel
 
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
fast_start_parallel_rollback         string      LOW
log_parallelism                      integer     1
parallel_adaptive_multi_user         boolean     TRUE
parallel_automatic_tuning            boolean     TRUE
parallel_execution_message_size      integer     4096
parallel_instance_group              string
parallel_max_servers                 integer     100
parallel_min_percent                 integer     0
parallel_min_servers                 integer     0
parallel_server                      boolean     TRUE
parallel_server_instances            integer     3
parallel_threads_per_cpu             integer     2
recovery_parallelism                 integer     0

Questions
=========
1) Since we have 8 CPUs and parallel_threads_per_cpu=2, is there any advantage to having more than 16 parallel processes?

2) We set parallel_max_servers=100; is that the right value for an 8 CPU machine? In other words, how do we determine the correct value for parallel_max_servers?

Sometimes the number of parallel processes exceeds 100 and we get errors in Oracle Advanced Replication, which uses parallel degree 1.
 

Tom Kyte
September 29, 2005 - 6:47 am UTC

1) I'd say 16 is much closer to correct than 100, yes.

2) let it default. use the auto tuning stuff. set FEWER, not MORE parameters. in 9i, give parallel automatic tuning (not needed in 10g) and parallel adaptive multi user a chance.
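In 9i, the "fewer parameters" approach boils down to something like this sketch (the parameter fragment and table name are illustrative only; test on your own system):

```sql
-- init.ora fragment: let Oracle derive the remaining parallel settings
--   parallel_automatic_tuning    = true
--   parallel_adaptive_multi_user = true

-- then mark candidate tables PARALLEL with no explicit degree
-- (BIG_TABLE is a placeholder name):
alter table big_table parallel;
```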


having 100 processes all running simultaneously with 8 cpus must "squash" that machine.


(please *test*)

Thanks for the detailed information

JD, September 30, 2005 - 11:36 am UTC

Thanks Tom for the detailed information about pipelined functions and the DIY technique. This is exactly what I was looking for. I wish I could have explained my problem in the first posting!

.. JD

Ik, October 10, 2005 - 6:17 am UTC

Tom,

Question regarding parallel ddl.

We run on 9i R2.

In the DDL given below:

CREATE TABLE t PARALLEL 4 AS SELECT /*+ PARALLEL(tab,4) */ * FROM t1 partition (p_t1) tab;

In this, do we need to specify a parallel hint for the select statement? We are not seeing a difference in plan and assuming that the hint is not relevant for CTAS. Could you confirm this for us please?

The query selects data from a single partition. I believe that from 9i onwards, PARALLEL operations are supported on queries accessing single partitions as well. Am I correct here? Do I need to alter the table and set it as parallel explicitly to be able to fire a parallel query?

Thanks,



Tom Kyte
October 10, 2005 - 8:59 am UTC

http://docs.oracle.com/docs/cd/B10501_01/server.920/a96520/tuningpe.htm#19664

excellent chapter to read.


the parallel on the create table part was sufficient.


I believe you mean "parallel DML modification (update, delete, merge) operations are supported without the need to physically partition the table"

parallel QUERY worked with or without partitioning and did not parallelize only on partition boundaries.


To have parallel plans generated - parallel query needs to be configured (init.ora) and you either hint it or have the base tables be "parallel" or use a sql statement that is parallelized like that CTAS.
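One quick way to check whether a given statement will actually get a parallel plan is a sketch like this (T is a placeholder table name):

```sql
explain plan for
select /*+ parallel(t 4) */ count(*) from t;

select * from table( dbms_xplan.display );
-- a parallel plan shows steps such as PX COORDINATOR / PX SEND / PX BLOCK ITERATOR
```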

Try using parallel_index hint

JD, October 10, 2005 - 3:55 pm UTC


Ik, October 11, 2005 - 5:20 am UTC

Tom,

Sorry for not being clear :

I wanted to know how different is

CREATE TABLE t PARALLEL 4 AS SELECT * from t1 partition (p1)

vs

CREATE TABLE t PARALLEL 4 AS SELECT /*+ PARALLEL(tab,4) */ * from t1 partition (p1) tab

My question was; if i omit the parallel hint in the select statement - would the select statement then run in serial?


One more question is: would this select (if in parallel) bypass the buffer cache? Direct reads?

Thanks,





Tom Kyte
October 11, 2005 - 6:48 am UTC

The parallel on the create table (not the hint) was sufficient to create the table in parallel.

That select has the opportunity to bypass the buffer cache and do direct reads, yes.
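To see whether a query actually did direct reads, a sketch: compare the session statistic before and after running it.

```sql
-- "physical reads direct" climbing after the query indicates direct reads
select n.name, s.value
  from v$statname n, v$mystat s
 where n.statistic# = s.statistic#
   and n.name in ( 'physical reads', 'physical reads direct' );
```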

Thanks very much

Ik, October 11, 2005 - 12:00 pm UTC


Parallel query in 10g

John Calvin, December 13, 2005 - 10:31 am UTC

In Oracle 10g R2, parallel_automatic_tuning is deprecated.
In your initial response to this you had recommended using this parameter to setup parallel query.

What do you recommend in 10g R2?

Thanks.

Tom Kyte
December 13, 2005 - 12:58 pm UTC

http://docs.oracle.com/docs/cd/B19306_01/server.102/b14223/usingpe.htm#i1007254

parallel adaptive multi user does what we want.

parallel_min_percent

Libo Zhang, March 13, 2006 - 11:28 am UTC

Hi Tom,

I have a need to create a global partitioned index on a large partitioned IOT table, with about 800million rows and size of 146G.

I used options, parallel 16, nologging, compress , and created a 100G dedicated LMT temp tablespace for it. However, during the index creation I did not see any parallelism kicking in. I only see one active session, which means it is creating the index serially which took about 8 hours to complete.

Someone suggested I use parallel_min_percent=50 instead of 0. Setting PARALLEL_MIN_PERCENT to 0 means that if Oracle cannot get enough resources while creating the index, it will execute the operation serially rather than in parallel.

Now, my questions are:
1. Will parallel_min_percent help, since I already have parallel_automatic_tuning=true set?
2. If I set parallel_min_percent = 50, I still need enough resources for a DOP of 8 out of 16; otherwise, Oracle will run serially. What value of parallel_min_percent will give Oracle the flexibility to kick in whatever parallelism it has the resources for? Will that be 1? If 1 is the magic number, can we set it at the instance level and take care of this issue once and for all?

3. Is setting parallel_min_percent to 0 the same as setting it to 100? My logic is: if I set parallel_min_percent = 75, then Oracle needs resources for at least 12 out of 16 of the DOP, right?

4. Say I set this value to 50, and Oracle has enough resources for a DOP of 8 out of 16, so it runs with a DOP of 8. If Oracle later has enough resources to run with a DOP of 12 out of 16, will it adjust from 8 to 12, since parallel_automatic_tuning is set to TRUE?

Thanks for your help.
Libo Zhang
libo.zhang@reuters.com


Direct read

A reader, April 07, 2006 - 11:10 am UTC

Tom:

I have a query like below:

SELECT /*+ ordered  parallel_index(PT PT_TABLE_IX1,20,1)  */ 
Col1, col2, col3, col4
FROM TEMPORARY_TABLE tmp,  PARTITIONED_TABLE PT, LOOK_UP_TABLE1, LOOK_UP_TABLE2
where <Join condition>


There will be 60000*30 records in the TEMPORARY_TABLE which needs to be scanned through the entire PARTITIONED_TABLE.

The above parallelism works, but with some issues:

1. The explain plan shows the table 'PT' is accessed by INDEX ROWID
2. TEMPORARY TABLE is broadcast to all parallel processes and then each of them seems to be running the query.


I have two questions:

1. The trace and explain plan show a lot more consistent gets (logical reads). However, I need a direct read (eliminating the buffered read) so that I can reduce contention on the buffer pool. How can we ensure that Oracle does a direct read rather than a logical read?

2. I tried to use the ROWID(PT) hint, but to no avail. Is it possible to force the direct read? If so, how? (Does the table access through INDEX ROWID suggest that it is a physical index read rather than a logical read? In the trace file I do not see waits on direct read (it is not the top wait), but rather huge waits on sequential reads. I want it the other way round: huge waits on direct read and fewer on sequential and scattered reads.)
 

Please see the output of the below queries(I am running the current one with less DOP and the TEMPORARY TABLE with 99999 rows)


select p.server_name,
sql.sql_text
from v$px_process p, v$sql sql, v$session s
WHERE p.sid = s.sid
and p.serial# = s.serial#
and s.sql_address = sql.address
and s.sql_hash_value = sql.hash_value
/




SERV
----
SQL_TEXT
--------------------------------------------------------------------------------
P024
SELECT /*+ ordered  parallel_index(PT PT_TABLE_IX1,20,1)  */ 
Col1, col2, col3, col4
FROM TEMPORARY_TABLE tmp,  PARTITIONED_TABLE PT, LOOK_UP_TABLE1, LOOK_UP_TABLE2
where <Join condition>


SERV
----
SQL_TEXT
--------------------------------------------------------------------------------

P025
SELECT /*+ ordered  parallel_index(PT PT_TABLE_IX1,20,1)  */ 
Col1, col2, col3, col4
FROM TEMPORARY_TABLE tmp,  PARTITIONED_TABLE PT, LOOK_UP_TABLE1, LOOK_UP_TABLE2
where <Join condition>






SQL> SELECT dfo_number, tq_id, server_type, process, num_rows, bytes
FROM v$pq_tqstat
ORDER BY dfo_number DESC, tq_id, server_type DESC, process;

DFO_NUMBER      TQ_ID SERVER_TYP PROCESS      NUM_ROWS      BYTES
---------- ---------- ---------- ---------- ---------- ----------
         1          0 Producer   QC             199998    3598832
         1          0 Consumer   P024            99999    1799416
         1          0 Consumer   P025            99999    1799416
         1          1 Producer   P024             1103      45698
         1          1 Producer   P025              376      15598
         1          1 Consumer   QC               1479      61296

6 rows selected.


Thanks,




 

A reader, April 10, 2006 - 1:13 pm UTC

Tom:

Not sure whether you got a chance to look at my previous posting, or whether something is wrong with my understanding of parallel concepts. I would like to know about direct reads and how to achieve them.

Thanks,

Tom Kyte
April 11, 2006 - 10:55 am UTC

I do not respond to each and every review/followup - sometimes I'm just going over them too fast to give a meaningful answer.


the logical reads likely come from any index access you are doing, direct reads are for "big" things, not small things (and likely for small things you would much prefer to go to the cache than to disk over and over)

A reader, April 11, 2006 - 11:00 am UTC

Thanks Tom.

Alex, April 13, 2006 - 9:31 am UTC

Hi,

I'm having problems getting Oracle to use my parallel hints on complex queries. I thought I'd ask some general questions to see if I can figure this out on my own.

When I enable parallel at the table level, the optimizer will use a PQ plan. When I use a hint for the same table, same PQ level, it will not. Assuming that I am using the correct command and that PQ would be beneficial, do you know what could cause the difference?

I ran one of your tests above for table T and did select /*+ parallel(t) */ count(*) from t and it worked fine. So I don't know what the problem could be because

1)I know I have PQ enabled because the test worked.
2)I know PQ would be helpful in my complex query because at the table level, it will use PQ.

I'm hoping you can provide me with some ideas, some things to check for. Thanks a lot.

Tom Kyte
April 14, 2006 - 11:37 am UTC

it would be because your parallel hints were not "valid" - there would have been something preventing them from being used.

for example:

create table t ( x int );
create index t_idx on t(x);

select /*+ INDEX( T T_IDX ) */ * from t order by x;

note: updated the above sample query because the first one was wrong :) Thanks Alberto for the email.... I used to have "where x > 0" which of course could use the index - meant "order by x" which cannot use the index!


that will refuse totally to use the index - why? because X is nullable and an entirely null entry in a b*tree index is not made, so the index isn't indexing all of the rows.

There must be something invalid about your hint.
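Continuing that example, a sketch of one way to make the hint usable: once X is declared NOT NULL, every row is present in the index, so an index full scan can satisfy the ORDER BY and the hint can be honored.

```sql
alter table t modify x not null;

-- the index now covers all rows, so this can use T_IDX for the ordering
select /*+ INDEX( T T_IDX ) */ * from t order by x;
```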

Alex, April 14, 2006 - 12:04 pm UTC

Thank you sir. A couple of quick follow up questions though.

1) I understand your index example, but what I don't understand with parallel is how it could affect results and cause the hint to be an invalid option. Your index example makes sense because Oracle couldn't get the correct answer by using that index, but with PQ, it's just more threads retrieving the same results, right?

2)Can I just use select /*+ parallel */ from tbl t1, tbl2 t2... and let the optimizer choose what table would be best to use PQ or do we have to specify a table?

Thanks.

Tom Kyte
April 14, 2006 - 12:49 pm UTC

1) you have to use well formed hints that can be used and are so specific that nothing else could be considered.

2) to be well formed, the parallel hint wants a tablespec.
http://docs.oracle.com/docs/cd/B19306_01/server.102/b14200/sql_elements006.htm#SQLRF50801


Transaction Free Lists

Naresh, April 17, 2006 - 9:28 am UTC

Hello Tom,

From the 9i DW Guide, Chapter on "Using Parallel Execution":

One of the parameters that impact 'Default Degree of Parallelism' is:

For parallel DML operations with global index maintenance, the minimum number of transaction free lists among all the global indexes to be updated. The minimum number of transaction free lists for a partitioned global index is the minimum number across all index partitions. This is a requirement to prevent self-deadlock.

What are the transaction free lists - are they ITL slots, or Free Lists (for free blocks)?

Thanks,
Naresh.

Tom Kyte
April 17, 2006 - 10:03 am UTC

from note 1029850.6


.....
The Transaction Free Lists:
===========================

Implicitly allocated by Oracle as needed. A Transaction Free List is a free
list dedicated to one transaction and only one. There are a minimum of 16
transactions free lists per segment, and this number increases as long as it
is necessary, until it reaches the limit of the Segment Header block size.

A transaction needs to allocate a Tx Free Lists entry when:

o It releases space in a block (DELETE or UPDATE)
o And if it has not already allocated one.

..........

Documentation Bug ?

Jonathan Lewis, April 17, 2006 - 1:55 pm UTC

The quoted text from the DW Guide says the degree of parallelism is:

"For parallel DML operations with global index maintenance, the minimum number of transaction free lists among all the global indexes to be updated."

I suspect this should read minimum value of INITRANS across the global indexes, (i.e. interested transaction list not transaction free list). The point about deadlocks is that if you have INITRANS = N but N+1 parallel slaves all try to delete a row from the same index block, and the block is full, then the N+1th update cannot take place until the global transaction commits - but the global transaction won't commit until the N+1th slave has completed its update.

Possible documentation bug ? I think I checked this once, but a long time ago and I can't find the test case at present.



rownum effects

Naresh, April 18, 2006 - 8:04 am UTC

hi Tom,

I have a script like below to populate a table from another table, and it includes a rownum to generate a synthetic primary key.

alter table cnv_resource parallel 4;
alter table cnv_res2 parallel 4;
alter session enable parallel dml; -- probably no use since cnv_res2 is not partitioned

  1  insert /*+ PARALLEL (cnv_resource, 4) */
  2   into cnv_res2(RESOURCE_ID, RESOURCE_VALUE, RESOURCE_STATUS)
  3  select /*+ PARALLEL (cnv_resource, 4) */
  4   100000000 + rownum, RESOURCE_VALUE, RESOURCE_STATUS
  5* from cnv_resource
SQL> set timing on
SQL> /

5545961 rows created.

Elapsed: 00:00:42.68

Execution Plan
----------------------------------------------------------
   0      INSERT STATEMENT Optimizer=CHOOSE (Cost=1 Card=82 Bytes=21648)
   1    0   COUNT
   2    1     TABLE ACCESS* (FULL) OF 'CNV_RESOURCE' (Cost=1 Card=82 B :Q165820
          ytes=21648)                                                  00

   2 PARALLEL_TO_SERIAL            SELECT /*+ NO_EXPAND ROWID(A1) */ A1."RESOUR
                                   CE_VALUE",A1."RESOURCE_STATUS" FROM

Repeated tests showed:
3.24 minutes without parallel
40 seconds with parallel
Second execution without parallel 2.04 minutes
Second execution with parallel 38 seconds
Third execution with parallel 42 seconds (this one was without APPEND to generate a plan)

The parallel was always faster, though I was expecting the rownum to serialize everything, including the scan.

QUESTIONs:

1) Is my interpretation of the plan below correct?

a) This part means that a parallel scan was done for the cnv_resource Table.

   2    1     TABLE ACCESS* (FULL) OF 'CNV_RESOURCE' (Cost=1 Card=82 Bytes=21648) :Q16582000
   2 PARALLEL_TO_SERIAL            SELECT /*+ NO_EXPAND ROWID(A1) */ A1."RESOURCE_VALUE",A1."RESOURCE_STATUS" FROM

b) This part means that the generation of the synthetic key with rownum and the insert itself was serial.

   0      INSERT STATEMENT Optimizer=CHOOSE (Cost=1 Card=82 Bytes=21648)
   1    0   COUNT

So the speedup seems to have been due only to the parallel scan. It seems from this that the step of reading from disk (or FS buffers) into program memory (PGA?) is the major time consumer?


2) If I run only the select in parallel, it takes about 2 minutes. For the insert (with or without APPEND), it takes about 40 seconds. What could be the reason for the difference? Does it have to do with sending output to the screen? (I have "set autotrace trace".)

Thanks,
Naresh
 

Tom Kyte
April 18, 2006 - 10:00 am UTC

the count step was where the rownum was assigned - by default parallel insert like that will be direct pathed.

Documentation Bug - revisited.

Jonathan Lewis, April 18, 2006 - 1:08 pm UTC

Following an email from Tom, I did a couple of quick tests on a 9.2.0.6 database with 8KB blocks, using a table with 256 partitions a couple of global indexes, setting parallel_max_servers = 128.

The maximum value I was allowed to set freelists to on the indexes was 99. When I tried to delete with a degree of parallel 128, I got an actual degree of 101. When I set freelists to 95, I got an actual degree of 105. It looks as if an 8KB segment header block has an implicit limit of 200 slots for freelists, and the degree is limited to 200 minus your setting of freelists (You get SEG freelists up to your setting in the block dump, then XCT freelists).

So it looks as if the documentation is correct - but perhaps only relevant in extreme cases.


As a further test (of my original assumption that the limit was set by initrans) I found that I could get a deadlock message by running with a degree of parallelism greater than the value of MAXTRANS, e.g. degree 6 when one of the indexes has MAXTRANS set to 4 resulted in:

ORA-12801: error signaled in parallel query server P003
ORA-12829: Deadlock - itls occupied by siblings at block 20491 of file 9

So even though I'm wrong (about INITRANS), I ought to be right. I would have thought that risks of ITL space problems are much more likely than risks of XCT freelist problems. Indexes and tables treat INITRANS very differently, which means an index block could be full even before its value of INITRANS is reached - which is why I would hope to base the critical test on INITRANS.



Tom Kyte
April 18, 2006 - 3:34 pm UTC

I loved that last paragraph :)

<quote>
So even though I'm wrong (about INITRANS), I ought to be right.
</quote>

thanks for the followup, really appreciate it.

HOW CAN WE VERIFY

jasdeep, May 12, 2006 - 2:53 pm UTC

How can we verify whether a parallel insert has been used or not?
Will it show in the explain plan?
Can you give an example?

Tom Kyte
May 12, 2006 - 9:35 pm UTC

just try to query the table :) it'll fail if it was parallel

How can we verify

jasdeep, May 15, 2006 - 11:24 am UTC

Fail ?
why ?

Tom Kyte
May 15, 2006 - 11:26 am UTC

the query will fail with:

ops$tkyte@ORA10GR2> select * from t;
select * from t
              *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel
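As a follow-on sketch, the ORA-12838 clears as soon as the parallel/direct-path transaction commits:

```sql
commit;

-- the table is readable again after the commit
select count(*) from t;
```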
 

How to check if executed parallely or not

Shalini, July 06, 2006 - 4:01 am UTC

Tom,

We have following settings in our database.
SQL> show parameter parallel

NAME                                 TYPE        VALUE
------------------------------------ ----------- ----------
fast_start_parallel_rollback         string      LOW
optimizer_percent_parallel           integer     0
parallel_adaptive_multi_user         boolean     TRUE
parallel_automatic_tuning            boolean     TRUE
parallel_broadcast_enabled           boolean     FALSE
parallel_execution_message_size      integer     4096
parallel_instance_group              string
parallel_max_servers                 integer     100
parallel_min_percent                 integer     0
parallel_min_servers                 integer     5
parallel_server                      boolean     FALSE
parallel_server_instances            integer     1
parallel_threads_per_cpu             integer     2
recovery_parallelism                 integer     0

And we are inserting into a table, which has 32 partitions and 1024 subpartitions, with the following SQL (through an overnight job):
INSERT /*+ APPEND */ into ... 

Will it be executing in parallel ?

We have come across a very strange scenario where we hit the maximum number of DML locks when running the above job, which in turn fires the INSERT. I read on Metalink that PDML on partitions uses many more DML locks, and hence wanted to confirm whether the above situation is PDML. Unfortunately the job completes within seconds, too quickly for us to monitor, and it runs overnight...

Thank you,
With Regards,
Shalini. 

Tom Kyte
July 08, 2006 - 9:36 am UTC

depends on the settings of the underlying tables. Are they parallel enabled? Did you enable parallel dml in your session?

query v$sql_plan to see the plan that has been used for the query.
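A sketch of both checks (the sql_text filter below is illustrative; match it to your actual statement):

```sql
-- 1) parallel DML is off by default; it must be enabled per session
--    before the INSERT for PDML to even be considered:
ALTER SESSION ENABLE PARALLEL DML;

-- 2) after the insert has run, look for PX row sources in its plan;
--    operations such as PX SEND / PX RECEIVE, or LOAD AS SELECT with
--    a parallel-to-serial (P->S) tag, indicate parallel execution:
SELECT id, operation, options, object_name
  FROM v$sql_plan
 WHERE hash_value = ( SELECT hash_value
                        FROM v$sql
                       WHERE sql_text LIKE 'INSERT /*+ APPEND */%'
                         AND rownum = 1 )
 ORDER BY id;
```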

Max parallel processes

Fat geek, July 10, 2006 - 5:22 pm UTC

We recently put a new system into production using parallel_automatic_tuning, as recommended. We have the following settings (the server has 8 Itanium CPUs).

SQL> show parameter parallel

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
fast_start_parallel_rollback         string      LOW
log_parallelism                      integer     1
parallel_adaptive_multi_user         boolean     TRUE
parallel_automatic_tuning            boolean     TRUE
parallel_execution_message_size      integer     4096
parallel_instance_group              string
parallel_max_servers                 integer     80
parallel_min_percent                 integer     0
parallel_min_servers                 integer     32
parallel_server                      boolean     FALSE
parallel_server_instances            integer     1
parallel_threads_per_cpu             integer     2
recovery_parallelism                 integer     0

We've been having some slowdowns and I ran a query (see below) to see what was waiting and got a response showing that one session had spawned some 60 parallel processes while all the others were running in serial (tables are set up with degree default). 

I wonder if you get diminishing returns from a large number of parallel processes handling one query like this. I saw in a response in this thread that you thought that a max of 16 parallel threads would be more appropriate for 8 cpus than 100. Our Oracle (i.e. from Oracle NZ) consultant said we should have a max of 160 (no of cpus x threads per cpu x 10), we currently have 80.

We are using resource manager to stop any loads occurring during the day from hogging the CPU, and I'm inclined to also set a limit of 8 parallel processes per session. Is this reasonable, or should I just let Oracle do its thing?

You said 16 seemed a better parallel_max_servers than 100 for an 8 CPU machine. Does 80 then sound like too many?

SQL>   @whats_waiting.sql

TO_CHAR(SYSDATE,'D
------------------
11 JUL 06 08:59:12


PX_SER PX_STATUS      X_SID      P_SID OSUSER   CHILD_WAIT                                    PARENT_WAIT
------ --------- ---------- ---------- -------- --------------------------------------------- ---------------------------------------------
P000   IN USE            64         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P001   IN USE            49         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P002   IN USE            68         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P003   IN USE            59         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P004   IN USE            70         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P005   IN USE            58         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P006   IN USE            37         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P007   IN USE            69         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P008   IN USE            87         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P009   IN USE            79         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P010   IN USE            85         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P011   IN USE            62         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P012   IN USE            78         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P013   IN USE            90         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P014   IN USE            86         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P015   IN USE            53         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P016   IN USE            46         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P017   IN USE            39         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P018   IN USE           105         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P019   IN USE            57         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P020   IN USE            81         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P021   IN USE           116         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P022   IN USE            47         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P023   IN USE            94         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P024   IN USE           117         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P025   IN USE            77         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P026   IN USE           100         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P027   IN USE           114         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P028   IN USE            50         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P029   IN USE            96         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P030   IN USE            80         20 edrp     PX Deq: Table Q Normal                        resmgr:waiting in end wait
P031   IN USE            82         20 edrp     PX Deq: Table Q Normal                        resmgr:waiting in end wait
P032   IN USE            61         20 edrp     PX Deq: Table Q Normal                        resmgr:waiting in end wait
P033   IN USE            99         20 edrp     PX Deq: Table Q Normal                        resmgr:waiting in end wait
P034   IN USE           119         20 edrp     resmgr:waiting in end wait                    resmgr:waiting in end wait
P035   IN USE            63         20 edrp     PX Deq: Table Q Normal                        resmgr:waiting in end wait
P036   IN USE            93         20 edrp     PX Deq: Table Q Normal                        resmgr:waiting in end wait
P037   IN USE            83         20 edrp     PX Deq: Table Q Normal                        resmgr:waiting in end wait
P038   IN USE            65         20 edrp     PX Deq: Table Q Normal                        resmgr:waiting in end wait
P039   IN USE            60         20 edrp     PX Deq: Table Q Normal                        resmgr:waiting in end wait
P040   IN USE            45         20 edrp     PX Deq: Table Q Normal                        resmgr:waiting in end wait
P041   IN USE            44         20 edrp     PX Deq: Table Q Normal                        resmgr:waiting in end wait
P042   IN USE            36         20 edrp     PX Deq: Table Q Normal                        resmgr:waiting in end wait
P043   IN USE            88         20 edrp     PX Deq: Table Q Normal                        resmgr:waiting in end wait
P044   IN USE            31         20 edrp     PX Deq: Table Q Normal                        resmgr:waiting in end wait
P045   IN USE           106         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P046   IN USE           109         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P047   IN USE            34         20 edrp     direct path read                              resmgr:waiting in end wait
P048   IN USE            55         20 edrp     direct path read                              resmgr:waiting in end wait
P049   IN USE            67         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P050   IN USE            41         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P051   IN USE            28         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P052   IN USE            89         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P053   IN USE           111         20 edrp     direct path read                              resmgr:waiting in end wait
P054   IN USE            71         20 edrp     direct path read                              resmgr:waiting in end wait
P055   IN USE            24         20 edrp     direct path read                              resmgr:waiting in end wait
P056   IN USE            54         20 edrp     direct path read                              resmgr:waiting in end wait
P057   IN USE           107         20 edrp     direct path read                              resmgr:waiting in end wait
P058   IN USE            43         20 edrp     PX Deq: Execution Msg                         resmgr:waiting in end wait
P059   IN USE           103         20 edrp     direct path read                              resmgr:waiting in end wait
Serial                   20          0 edrp     resmgr:waiting in end wait
Serial                   21          0 ollivs   resmgr:waiting in end wait
Serial                   28          0 edrp     resmgr:waiting in end wait
Serial                   31          0 edrp     resmgr:waiting in end wait
Serial                   34          0 edrp     resmgr:waiting in check2
Serial                   36          0 edrp     resmgr:waiting in end wait
Serial                   38          0 edrp     resmgr:waiting in check2
Serial                   41          0 edrp     resmgr:waiting in check2
Serial                   43          0 edrp     resmgr:waiting in end wait
Serial                   54          0 edrp     resmgr:waiting in check2
Serial                   55          0 edrp     resmgr:waiting in check2
Serial                   61          0 edrp     resmgr:waiting in end wait
Serial                   62          0 edrp     resmgr:waiting in end wait
Serial                   65          0 edrp     resmgr:waiting in end wait
Serial                   66          0 edrp     resmgr:waiting in end wait
Serial                   67          0 edrp     resmgr:waiting in check2
Serial                   71          0 edrp     resmgr:waiting in check2
Serial                   73          0 edrp     resmgr:wait in actses run
Serial                   82          0 edrp     resmgr:waiting in check2
Serial                   86          0 edrp     resmgr:waiting in end wait
Serial                   88          0 edrp     resmgr:waiting in end wait
Serial                   89          0 edrp     resmgr:waiting in end wait
Serial                  102          0 edrp     resmgr:waiting in end wait
Serial                  103          0 edrp     resmgr:waiting in check2
Serial                  106          0 edrp     resmgr:waiting in check2
Serial                  107          0 edrp     resmgr:waiting in check2
Serial                  109          0 edrp     resmgr:waiting in check2
Serial                  111          0 edrp     resmgr:waiting in end wait

88 rows selected.


TO_CHAR(SYSDATE,'D
------------------
11 JUL 06 09:01:45

select to_char(sysdate,'DD MON YY HH24:MI:SS') from dual;
set lines 155
set pages 1000
column OSUSER format a8
column CHILD_WAIT format a45
column PARENT_WAIT format a45
        select 'Serial' as px_server
        , null as px_status
        , w.sid as x_sid
        , 0 as p_sid
             , v.osuser
             , w.event as child_wait
        , ' ' as parent_wait
        from  v$lock l
            , v$session v
            , v$session_wait w
        where v.sid = w.sid(+)
   and   v.sid = l.sid(+)
        and   nvl(l.type,'PS') = 'PS'
   and   (v.status = 'ACTIVE')
   and   w.event not like 'PX%'
   and   osuser !='oracle'
UNION
        select x.server_name as px_server
             , x.status as px_status
             , x.sid as x_sid
             , w2.sid as p_sid
             , v.osuser
             , w1.event as child_wait
             , w2.event as parent_wait
        from  v$px_process x
            , v$lock l
            , v$session v
            , v$session_wait w1
            , v$session_wait w2
        where x.sid <> l.sid(+)
        and   to_number (substr(x.server_name,2)) = l.id2(+)
        and   x.sid = w1.sid(+)
        and   l.sid = w2.sid(+)
        and   x.sid = v.sid(+)
        and   nvl(l.type,'PS') = 'PS'
        order by 1,2
/
select to_char(sysdate,'DD MON YY HH24:MI:SS') from dual;

 

Tom Kyte
July 11, 2006 - 7:37 pm UTC

if you have 8 cpus, anything over parallel 16 probably does not make sense.

why 80???? what drove you to that really high number on so few cpus?

parallel_max_servers

Fat Geek, July 16, 2006 - 12:19 am UTC

The reason we set it to 80 was that our Oracle consultant (i.e. employee of Oracle) set it to 80 on our dev box (which has 4 cpus) as he said it should be 10 x no of cpus x threads per cpu.

When we created prod, the parameters were based on those on dev and I decided to keep it at 80 to begin with rather than the recommended 160 as it seemed a little high to me.


Tom Kyte
July 16, 2006 - 9:43 am UTC

think about what would happen on a 4 cpu machine with 80 things going on at the same time.

I know that is what it would default to - but 80 concurrent things just cannot "happen" on so few CPUs.


create index in parallel

Yoav, August 04, 2006 - 8:28 am UTC

Hi Tom,
We have an HP-UX machine with 8 CPUs.
I want to create an index, using the parallel option, on
a table with 60 million records.

In one session i run:
CREATE INDEX parallel_test ON t(update_date)
LOGGING
TABLESPACE INDEX05_TS
PCTFREE 5
INITRANS 5
MAXTRANS 255
STORAGE (
INITIAL 50M
NEXT 50M
MINEXTENTS 1
MAXEXTENTS 2147483645
PCTINCREASE 0
FREELISTS 5
FREELIST GROUPS 1
BUFFER_POOL DEFAULT
)
PARALLEL 4;

In a second session i run:
sql > select sid,serial#,qcsid,qcserial#,degree
2 from v$px_session;

no rows selected

select name,value from v$parameter where name like '%parallel%'

NAME VALUE
------------------------------ ---------
parallel_server FALSE
parallel_server_instances 1
recovery_parallelism 0
fast_start_parallel_rollback LOW
parallel_broadcast_enabled FALSE
parallel_adaptive_multi_user FALSE
parallel_threads_per_cpu 2
parallel_automatic_tuning FALSE
optimizer_percent_parallel 0
parallel_min_percent 0
parallel_min_servers 0
parallel_max_servers 0
parallel_instance_group
parallel_execution_message_siz 2152

(The database version is 8.1.7.4 and we are using optimizer_mode = rule)

I expected to get a few rows back from v$px_session, but it didn't return any rows.
Can you please explain why?

Regards

Tom Kyte
August 04, 2006 - 8:45 am UTC

parallel_max_servers 0


that sort of says it all doesn't it :)

that is how to disable parallel operations....
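To see rows in v$px_session, the instance must be allowed to start slaves at all; a minimal sketch (the value 16 is illustrative):

```sql
-- raise parallel_max_servers in the init.ora (on 8.1.7 this
-- generally requires an instance restart), e.g.:
--   parallel_max_servers = 16

-- then re-run the CREATE INDEX ... PARALLEL 4 in one session and,
-- while it is running, check for slave sessions from another session:
SELECT sid, serial#, qcsid, qcserial#, degree
  FROM v$px_session;
```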

ORA-12838 Cannot read/modify an object after a parallel modification...

Pete Beer, August 17, 2006 - 4:30 pm UTC

It seems that some responders don't realize that this error after parallel dml is only related to the current transaction. A commit (or rollback) will allow querying.

Parallel DML might not be something you want to do if you need to verify results before committing.
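A minimal sketch of the point above:

```sql
ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ APPEND */ INTO t SELECT 1 FROM dual;

SELECT * FROM t;   -- raises ORA-12838: the modified object cannot
                   -- be read within the same transaction

COMMIT;            -- (or ROLLBACK) ends the transaction

SELECT * FROM t;   -- now succeeds
```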

Direct Path and in parallel

Arindam Mukherjee, August 20, 2006 - 4:15 am UTC

Respected Mr. Tom,

I read the threads on your site like exam papers, and truly I feel I know nothing about Oracle. On reading this current thread, please let me ask you TWO questions that seem absolutely enthralling to me. On August 19, 2005, under "integrity constraints with append hint" in this thread, your answer was as follows (edited for brevity).

Case – 1
----------

ops$tkyte@ORA10G> create table t ( x int check (x > 0) );
Table created.
ops$tkyte@ORA10G> insert /*+ APPEND */ into t select 1 from dual;
1 row created.

ops$tkyte@ORA10G> select * from t;
select * from t
*
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel

that shows we did a direct path, now:

Question 1:
**********
Did we really make it parallel by “alter session enable parallel dml;”?


Case – 2
----------

ops$tkyte@ORA10G> create table t2 ( x int primary key );
Table created.
ops$tkyte@ORA10G> insert into t2 values ( 1 );
1 row created.
ops$tkyte@ORA10G> alter table t add constraint t_fk foreign key(x) references
t2;

ops$tkyte@ORA10G> insert /*+ APPEND */ into t select 1 from dual;

1 row created.

ops$tkyte@ORA10G> select * from t;

X
----------
1
1
We did not direct path because of the foreign key.

Question 2
*********
Why do you say that we did not direct path because of the foreign key, when we used the hint /*+ APPEND */?



Tom Kyte
August 20, 2006 - 8:14 am UTC

well, I said it because it is "true" - you cannot do a direct path insert using INSERT /*+ APPEND */ in the presence of a foreign key constraint - the append hint is silently ignored.

http://docs.oracle.com/docs/cd/B19306_01/server.102/b14223/usingpe.htm#sthref2167

Concrete Proof

Arindam Mukherjee, August 20, 2006 - 8:52 am UTC

Respected Mr. Tom,

Thanks a lot for pointing out a cut-and-dried restriction on PDML in the Oracle documentation. How do you remember all these things? You are really GREAT.

Parallel DDL but no parallel DML

Dave, August 21, 2006 - 1:27 pm UTC

Tom

I've got an insert statement that I can't get to perform parallel DML. I'm inserting into a parallel (degree 8) table. The table is totally "flat", no constraints or indexes.

ALTER SESSION ENABLE PARALLEL DML;

ALTER SESSION SET WORKAREA_SIZE_POLICY = MANUAL;

ALTER SESSION SET HASH_AREA_SIZE = 10000000;

INSERT
/*+ append
parallel (bl,8) */
INTO
brp_lines bl
SELECT
/*+ parallel (bt1,8)
parallel (bt2,8) */
<stuff>
FROM
big_table1 bt1,
big_table2 bt2
WHERE
<joins>

Run time 4 minutes

Although 8 parallel slaves are set up at run time to perform the insert, none of them perform any I/O; all of it is done by the master process.

When I run a CTAS using the same sql ie

CREATE TABLE brp_lines
PARALLEL (DEGREE 8)
TABLESPACE brp_lines
NOLOGGING
AS
SELECT
/*+ parallel (bt1,8)
parallel (bt2,8) */
<stuff>
FROM
big_table1 bt1,
big_table2 bt2
WHERE
<joins>

Run time 40 seconds

It performs exactly as I expect (2 big parallel selects, a big hash join and 8 slaves doing the write)

My question is, are there any obvious settings that would so dramatically influence the execution plan of the ITAS (insert as select)?

The reason I have tried to adopt the ITAS approach is because in real life BT1 and BT2 are very large tables (600m and 800m rows) partitioned on Year/Month. When I ran the CTAS as a serial operation they performed a nice partitionwise join, but as soon as I tried to apply parallel select to the partitioned tables I ended up with full table scans and a massive resulting hash join. I was trying to use ITAS and loop around doing one insert statement per year/month to emulate the partitionwise nature.

I have a full example if required but I thought maybe there was something stupidly obvious that I'm missing

Thanks Tom

Tom Kyte
August 27, 2006 - 9:59 am UTC

got plans? (format them so we can read them please)
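For reference, one way to produce a readable plan (assuming a current plan table and dbms_xplan, available from 9i on; the <stuff> and <joins> placeholders are from the statement above):

```sql
EXPLAIN PLAN FOR
INSERT /*+ append parallel (bl,8) */ INTO brp_lines bl
SELECT /*+ parallel (bt1,8) parallel (bt2,8) */ <stuff>
  FROM big_table1 bt1, big_table2 bt2
 WHERE <joins>;

SELECT * FROM TABLE(dbms_xplan.display);
```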

parallel DML

A reader, October 13, 2006 - 12:18 am UTC

Hi Tom,

from metalink Note:50416.1

This style(direct path write) of write request is typically used for:

Direct load operations (eg: Create Table as Select (CTAS) may use this)
Parallel DML operations

Do all types of parallel DML cause a direct load? I understand that the default mode for a parallel insert is a direct load (correct me if I am wrong, please), but what about the update command when used in parallel - does it also do a direct load?

thanks


Tom Kyte
October 13, 2006 - 7:05 am UTC

updates/deletes do not direct path - they read/write into the "middle" of tables. Only things that can write above the high water mark can direct path.

Parallel ctas

Ramprasad, November 16, 2006 - 8:21 am UTC

Hi Tom,

My parallel CTAS does not seem to run in parallel.

    create table wsee_reformat2_tb
    tablespace work2_ts
    parallel (degree 32)
    nologging
    pctfree 0
    as
    (select /*+ use_hash(a,b,c) parallel(a,32) parallel(b,8) parallel(c, 8) */
       a.src_id,
       a.dataprep_seqno,
       a.abt_clink,
       a.abt_alink,
       a.abt_biz_link,
       a.BUREAU_ID,
       a.rms_INDIV_ID,
       lpad(b.biz_id,10,'0') biz_id ,
       lpad(c.hh_id,16,'0') hh_id,
       a.rms_seqno
    from rms_admin.reformat1_tb a,
         rms_admin.xref_biz_tb b,
         rms_admin.xref_hh_tb c
    where a.abt_biz_link = b.abt_biz_link(+)
     and   a.abt_alink = c.abt_alink(+))


SQL> select plan_table_output from table(dbms_xplan.display('PLAN_TABLE',null,'ALL'))


PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

-------------------------------------------------------------------------------------------------------------
| Id  | Operation              |  Name         | Rows  | Bytes |TempSpc| Cost  |  TQ    |IN-OUT| PQ Distrib |
-------------------------------------------------------------------------------------------------------------
|   0 | CREATE TABLE STATEMENT |               |   773M|   108G|       |   147K|        |      |            |
|   1 |  LOAD AS SELECT        |               |       |       |       |       | 25,04  | P->S | QC (RAND)  |
|*  2 |   HASH JOIN OUTER      |               |   773M|   108G|  3111M|   147K| 25,04  | PCWP |            |
|*  3 |    HASH JOIN OUTER     |               |   773M|    88G|  2466M| 90389 | 25,02  | P->P | HASH       |
|   4 |     TABLE ACCESS FULL  | REFORMAT1_TB  |   773M|    68G|       |  8690 | 25,00  | P->P | HASH       |
|   5 |     TABLE ACCESS FULL  | XREF_HH_TB    |   277M|  7413M|       | 24934 | 25,01  | P->P | HASH       |
|   6 |    TABLE ACCESS FULL   | XREF_BIZ_TB   |    75M|  2005M|       | 13763 | 25,03  | P->P | HASH       |

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("A"."ABT_BIZ_LINK"="B"."ABT_BIZ_LINK"(+))
   3 - access("A"."ABT_ALINK"="C"."ABT_ALINK"(+))

PX Slave SQL Information (identified by operation id):
------------------------------------------------------


PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
   1 - CREATE TABLE :Q41325004 TABLESPACE "WORK2_TS" PCTFREE 0 NOLOGGING AS SELECT
              C0,C1,C2,C3,C4,C5,C6,C7,C8,C9 FROM (SELECT /*+ ORDERED NO_EXPAND USE_HASH(A2) */ A1.C3 C0,A1.C4 C1,A1.C5
              C2,A1.C1 C3,A1.C0 C4,A1.C8 C5,A1.C9 C6,LPAD(A2.C1,10,'0') C7,LPAD(A1.C7,16,'0') C8,A1.C6 C9 FROM :Q41325002
              A1,:Q41325003 A2 WHERE A1.C0=A2.C0(+))

   3 - SELECT /*+ ORDERED NO_EXPAND USE_HASH(A2) */ A1.C5 C0,A1.C0 C1,A2.C0 C2,A1.C1 C3,A1.C2 C4,A1.C3
              C5,A1.C4 C6,A2.C1 C7,A1.C6 C8,A1.C7 C9 FROM :Q41325000 A1,:Q41325001 A2 WHERE A1.C0=A2.C0(+)

   4 - SELECT /*+ NO_EXPAND ROWID(A1) */ A1."ABT_ALINK" C0,A1."SRC_ID" C1,A1."DATAPREP_SEQNO"
              C2,A1."ABT_CLINK" C3,A1."RMS_SEQNO" C4,A1."ABT_BIZ_LINK" C5,A1."BUREAU_ID" C6,A1."RMS_INDIV_ID" C7 FROM
              "RMS_ADMIN"."REFORMAT1_TB" PX_GRANULE(0, BLOCK_RANGE, DYNAMIC)  A1

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

   5 - SELECT /*+ NO_EXPAND ROWID(A1) */ A1."ABT_ALINK" C0,A1."HH_ID" C1 FROM "RMS_ADMIN"."XREF_HH_TB"
              PX_GRANULE(0, BLOCK_RANGE, DYNAMIC)  A1

   6 - SELECT /*+ NO_EXPAND ROWID(A1) */ A1."ABT_BIZ_LINK" C0,A1."BIZ_ID" C1 FROM "RMS_ADMIN"."XREF_BIZ_TB"
              PX_GRANULE(0, BLOCK_RANGE, DYNAMIC)  A1


Note: cpu costing is off


Please correct me if I am doing something wrong.

Thanks,
Ram 

Tom Kyte
November 16, 2006 - 3:19 pm UTC

why do you think it is not?

Parallel CTAS

Ramprasad, November 17, 2006 - 7:46 am UTC

Hi Tom,

When the CTAS runs, each slave builds its own TEMPORARY data segment. Upon completion, one TEMPORARY data segment is chosen as the segment into which all the other TEMPORARY segments are merged, and the merged TEMPORARY data segment is then written to the base segment. During this step, shouldn't the writes be parallel, depending on the PARALLEL clause in the CTAS? I only see a single slave writing the merged TEMPORARY segment to the actual base segment. Could you please verify whether this should be the case?

Thank you,
Ram

Tom Kyte
November 17, 2006 - 8:11 am UTC

the rows are fed back from the parallel execution servers to the query coordinator and written out - the temp segments you see used by the parallel execution servers are for the intermediate hash joins

Duplicating rows in a table with 1.1 billion records

raj, November 20, 2006 - 6:49 pm UTC

I need to select from a table that has 1.1 billion rows and insert back into the same table, modifying just one of the key column values, but only for the records that exist in another small table.

Basically, there is a small driver table with 1 million records in total, and for those 1 million records there are 100 million records in the BIG table. Now I have to duplicate those 100 million records, replacing one key column value with the value from the small table.

In the following script, the GROUP_NR column in the BIG table is replaced with the NEW_GROUP_NR value from the DRIVER TABLE.

DECLARE
   CURSOR c1 IS
      SELECT ac_nr, old_group_nr, new_group_nr
        FROM driver_table;

   drv_count NUMBER := 0;
BEGIN
   FOR rec IN c1 LOOP
      drv_count := drv_count + 1;

      INSERT INTO big_table
      SELECT AC_NR,
             rec.new_group_nr,
             STATUS_CD,
             METHOD_TYPE,
             ANYT_CODE,
             ZONE,
             ZONE_CODE,
             UNIT_TYPE,
             PORT_TYPE,
             PCPPA,
             COUNTRY,
             CURRENCY,
             PZTYPECD
        FROM big_table
       WHERE AC_NR = rec.ac_nr
         AND GROUP_NR = rec.old_group_nr;

      IF MOD(drv_count, 10000) = 0 THEN
         COMMIT;
      END IF;
   END LOOP;

   COMMIT;
END;
/

The query uses the INDEX RANGE scan on the BIG TABLE.

We have tested this process with just 5000 records from
the driver table which processed 80000 records in the
BIG_TABLE and inserted 80000 records into the BIG_TABLE.
Based on the timing, the process will take more than 12 hours to process 1 million driver records and to insert 100 million records into the BIG table.


Can you please suggest anything else to improve this process?

The following is the table structure and the index information.

DESC big_table
------------
AC_NR NOT NULL CHAR(10)
GROUP_NR NOT NULL CHAR(7)
STATUS_CD NOT NULL CHAR(1)
METHOD_TYPE NOT NULL CHAR(2)
ANYT_CODE NOT NULL NUMBER(3)
ZONE NOT NULL CHAR(6)
ZONE_CODE NOT NULL CHAR(2)
UNIT_TYPE NOT NULL CHAR(2)
PORT_TYPE NOT NULL CHAR(1)
PCPPA NOT NULL NUMBER(18,6)
COUNTRY NOT NULL CHAR(2)
CURRENCY NOT NULL CHAR(3)
PZTYPECD CHAR(2)

INDEX
---------
AC_NR
GROUP_NR
STATUS_CD
METHOD_TYPE
ANYT_CODE
ZONE

The table has 16 PARTITIONS (RANGE PARTITION)

The table is not created with PARALLEL.

SELECT degree FROM dba_tables
WHERE table_name = 'BIG_TABLE';

DEGREE
----------
1

But the index is created with 2 degree of parallelism.

SELECT degree FROM dba_indexes
WHERE index_name = 'BIG_TAB_IND';

DEGREE
----------
2

At the instance level, the following are the PARALLEL parameters setup.

SQL > SHOW PARAMETER PARALLEL

fast_start_parallel_rollback string LOW
log_parallelism integer 1
parallel_adaptive_multi_user boolean FALSE
parallel_automatic_tuning boolean FALSE
parallel_execution_message_size integer 2152
parallel_instance_group string
parallel_max_servers integer 5
parallel_min_percent integer 0
parallel_min_servers integer 0
parallel_server boolean FALSE
parallel_server_instances integer 1
parallel_threads_per_cpu integer 2
recovery_parallelism integer 0

The server is the HP-UX server (SUPERDOME server) with 20 CPUs.

Thanks
raj..

Tom Kyte
November 22, 2006 - 2:54 pm UTC

curious - why are you writing code for what should, could, and must be done with a single sql statement?


if you stop writing procedural code when you don't need to, you'll be more efficient and so will the database.
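A sketch of the single-statement approach being suggested (the hints and degree are illustrative, not prescriptive):

```sql
ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ APPEND */ INTO big_table
SELECT /*+ PARALLEL(b, 8) */
       b.ac_nr,
       d.new_group_nr,          -- the one column being replaced
       b.status_cd, b.method_type, b.anyt_code,
       b.zone, b.zone_code, b.unit_type, b.port_type,
       b.pcppa, b.country, b.currency, b.pztypecd
  FROM big_table b, driver_table d
 WHERE b.ac_nr    = d.ac_nr
   AND b.group_nr = d.old_group_nr;

COMMIT;
```

One hash join of the driver table to the big table replaces a million index range scans, and the direct-path (APPEND) insert writes above the high water mark instead of maintaining space in the middle of the table.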

v

Bhagat Singh, January 24, 2007 - 3:08 am UTC

Hello Tom
I tried the simulation from your new book on parallel enabled functions. As shown in your example, several processes should be spawned to do the necessary processing, but when I run the same I always get only two child processes. Is there anything that needs to be adjusted before the simulation?

select session_id,count(*) from t2 group by session_id;

SESSION_ID COUNT(*)
---------- ----------
24 14055
50 14017

Chi H, February 01, 2007 - 7:11 pm UTC

Tom,
We have parallel_adaptive_multi_user=true in our 9.2.0.6 database with parallel_max_servers=12. If my table has degree=1 (I didn't specify degree for the table), anytime I query with the /*+ parallel (t) */ hint, the engine tries to use the largest DOP possible. Whenever these queries run, they take up all parallel servers, and nothing is left for other processes.

Of course, we can explicitly declare the DOP by:
1) alter table degree=n
2) /*+ parallel (t, n) */

I want to avoid this and just have parallel_automatic_tuning and parallel_adaptive_multi_user, and allow the Oracle engine to figure out how to make the queries "play nice" without my intervention. How can we configure our database to do that?
Tom Kyte
February 02, 2007 - 10:37 am UTC

with adaptive multi-user - it would not be the case that they take up all of the parallel execution servers - it tries to be fair.


can you be more specific

Chi H, February 02, 2007 - 6:31 pm UTC

I saw an explanation on oracle-l:
http://www.freelists.org/archives/oracle-l/01-2007/msg00899.html

I tried testing it. Here's trying to be more specific:
1) I have a table that doesn't have DOP defined
2) I have max servers = 12
3) I have adaptive multi user = true

When I run
select /*+ parallel (t) */ count(*)
from t;

Oracle allocates all 12 parallel servers no matter the system load. We could have 20 sessions logged in and running queries, but parallel execution doesn't try to be fair. It will allocate all servers to my one query.

Maybe this behavior is different in 10g?

Sorry I couldn't come up with a test case for you. It's hard to do since there are parallel processes involved, and I'd have to build a whole simulation of multiple processes running.
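One way to observe, from the session that ran the query, how many slaves it was actually granted (assuming v$pq_sesstat, which reports per-session PX statistics):

```sql
SELECT statistic, last_query, session_total
  FROM v$pq_sesstat
 WHERE statistic IN ('Queries Parallelized', 'Server Threads');
```

If "Server Threads" shows 12 for the last query, that query really did receive every available slave.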

Parallel process

Rajesh, February 20, 2007 - 11:00 pm UTC

Hi Tom,

SUBJECT: Parallel slave sessions are ACTIVE while the master session is INACTIVE.

In our application, we have parallel execution turned on, and the settings in init.ora are as below:

parallel_min_servers = 20
parallel_max_servers = 40

The application is an OLAP application in which users run Segments and Campaigns.
What we have observed is that when a user runs a Segment, which fires queries against the database,
parallel slaves are spawned for the query.

However, when the user cancels the Segment through the application UI, the master session becomes INACTIVE
while the session status of the parallel slaves is still ACTIVE, though they actually do nothing.

On executing the query below for parallel slave status

select
x.server_name
, x.status as x_status
, x.pid as x_pid
, x.sid as x_sid
, w2.sid as p_sid
, v.osuser
, v.schemaname
, w1.event as child_wait
, w2.event as parent_wait
from v$px_process x
, v$lock l
, v$session v
, v$session_wait w1
, v$session_wait w2
where x.sid <> l.sid(+)
and to_number (substr(x.server_name,2)) = l.id2(+)
and x.sid = w1.sid(+)
and l.sid = w2.sid(+)
and x.sid = v.sid(+)
and nvl(l.type,'PS') = 'PS'
and x.server_name in ('P017','P020','P021')
order by 1,2

I get the result as

SERVER_NAME X_STATUS X_PID X_SID P_SID OSUSER SCHEMANAME CHILD_WAIT PARENT_WAIT
P017 IN USE 30 53 25 n071946 SIEBEL PX Deq Credit: send blkd SQL*Net message from client
P020 IN USE 41 22 25 n071946 SIEBEL PX Deq Credit: send blkd SQL*Net message from client
P021 IN USE 43 13 25 n071946 SIEBEL PX Deq Credit: send blkd SQL*Net message from client

which shows that Parent Wait is basically an idle event.

On executing the query "select * from v$session where sid=25" to get the master session details, I get the result below:

SADDR SID SERIAL# AUDSID PADDR USER# USERNAME COMMAND OWNERID TADDR LOCKWAIT
00000408BF633008 25 17 249619 00000408C0650320 25 SIEBEL 3 2147483644 null null

STATUS SERVER SCHEMA# SCHEMANAME OSUSER PROCESS MACHINE TERMINAL PROGRAM TYPE
INACTIVE DEDICATED 25 SIEBEL n071946 7432:7032 ACCOUNT-01\WSAPP1361 WSAPP1361 nqsserver.exe

When we kill the SIDs of the slave sessions, no work is lost, i.e. end users don't experience any problem or
get an error message saying that the session has been killed.

The questions I have are:

1) Why aren't the parallel slave sessions becoming INACTIVE and actually available? Rather than "IN USE",
the parallel slaves should be in "AVAILABLE" status so that another query can use these slaves for processing.
2) Is there any setting to be done to solve this?

Thanks,
Rajesh

Parallel_enable in a package ??

Maverick, April 16, 2007 - 3:30 pm UTC

Tom,
I have a question regarding PARALLEL_ENABLE option in procedures/packages.

I currently have a package PKG_A with a procedure 'TEST'.
When a client calls this procedure, I am trying to do some kind of logging, and for this logic I'm calling another procedure [Pkg_B.GetInfo] within this 'TEST' procedure.
I want both of these calls to execute in parallel [Pkg_A.TEST should continue executing while Pkg_B.GetInfo does some other things].

Eg:
      Pkg_A.TEST->
               Begin
                  pkg_B.getInfo(param1,parm2);
                  
                  check_some_other_info
                  --and lots of other logic involved.

               End;

      

My question is: if I declare the Pkg_B.GetInfo procedure with the PARALLEL_ENABLE option,
will it run in parallel when Pkg_A.TEST is called, or will there be some problems?

eg:
          Create or replace Package Pkg_B
          as
             Procedure GetInfo(param1 in Integer,
                               param2 in Integer)
             PARALLEL_ENABLE;
          end;

          Create or replace Package Body Pkg_B
          as
             Procedure GetInfo(param1 in Integer,
                               param2 in Integer)
             PARALLEL_ENABLE
             is
              Begin
                  Test(param1*param2);
              End;
          end;
 
        


Also, am I using the correct syntax? [Do I have to use PARALLEL_ENABLE in both the spec and the body?]

Any help is greatly appreciated.
Thanks,
Tom Kyte
April 17, 2007 - 9:29 am UTC

that is not what 'parallel enable' does, plsql is not a multi-threaded programming language (and even though java is, java in the database is not really multi-threaded either, it is a non-preemptive multitasking implementation)

parallel enable would allow your plsql package to be invoked in a parallel query, that is all.
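As an aside, here is a minimal sketch of what PARALLEL_ENABLE is actually for (the function name, tax rate, and table are made-up examples, not from this thread):

```sql
-- Sketch: PARALLEL_ENABLE declares a function safe for parallel execution
-- servers to call; it does not make the call asynchronous.
create or replace function f_tax( p_amount in number ) return number
    parallel_enable   -- each PX server may invoke this independently
is
begin
    return p_amount * 0.07;
end;
/

-- The function can then appear in a parallelized query:
select /*+ parallel(t) */ sum( f_tax(sal) ) from emp t;
```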


plsql is a "linear", sequential programming language.

calling a "logging routine" should not impact your runtime measurably as far as the end user is concerned, keep it 'linear'

if you had something to invoke that took a long time and the end user didn't need to wait for it to complete, we would use AQ (advanced queues)

Thanks

Maverick, April 17, 2007 - 10:00 am UTC

Thanks for responding, Tom. The reason I wanted parallel processing is that my logging process queries some data dictionary objects, and I don't want users to wait until it finishes logging [thinking that querying data dictionary objects might affect performance] before it goes on to the actual processing.

I read in some places that using dbms_job we can enable parallel processing. But I really do not want to use dbms_job for this scenario... I am not rescheduling or anything of that sort.

If Advanced Queues is the only way, can you show me an example of how to approach for my problem using AQ's?

Thanks a bunch
Tom Kyte
April 17, 2007 - 10:19 am UTC

using AQ would be very similar to using dbms_job, it does not seem you want loosely coupled processing here.

why not code your logging routine and measure its impact - that is, figure out if you have a problem to solve before solving the problem.

I can do that..but

A reader, April 17, 2007 - 12:50 pm UTC

Tom, sure, I can do that... I can try it and see if it's a problem at all. But that needs a lot of time spent on testing, trying to simulate many users accessing this procedure to find out how long it takes.
I was thinking that if I put a solution in place even if there is no problem, I'd save myself a lot of time.
That's the reason.

But what if there is an effect on performance? can you suggest any other approach to it from the top of your head?

Thanks,
Tom Kyte
April 18, 2007 - 11:05 am UTC

no, that is not the way we approach problems, you don't design a fix for a theory.

the fix adds a level of complexity you just don't want to get into. unless and until you need it.

You don't have to simulate many users, you can take many measurements before hand.
http://asktom.oracle.com/tkyte/runstats.html
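Before reaching for runstats, even a crude timing of the logging call alone can answer the question. A minimal sketch (log_routine is a hypothetical stand-in for the real logging procedure):

```sql
-- Sketch: dbms_utility.get_time returns hundredths of a second, so the
-- difference divided by 100 is elapsed seconds. "log_routine" is a
-- placeholder name, not from this thread.
declare
    l_start number;
begin
    l_start := dbms_utility.get_time;
    for i in 1 .. 1000
    loop
        log_routine;
    end loop;
    dbms_output.put_line
    ( (dbms_utility.get_time - l_start)/100 || ' seconds for 1,000 calls' );
end;
/
```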



Bump for a Review above

Mark Brady, August 13, 2007 - 12:44 pm UTC

<blockquote>
1) Why aren't the Parallel Slave sessions becoming INACTIVE and actually be available rather "IN USE" status i.e. parallel slaves should be in "AVAILABLE" status so that another query can use these slaves for processing.
</blockquote>

I have a PQ that shows the parent as inactive, but the 5 slaves still show STATUS = "ACTIVE". Are they actually active? I didn't see any other indication that they were.


Enabling PDML in a session

Sri, February 18, 2008 - 5:47 pm UTC

Tom,

If you want your inserts, updates and deletes to be parallelized, is enabling PDML at the session level mandatory in 10g as well? I ask because, without enabling PDML at the session level, I see that updates are parallelized when given a hint.
The version of the database is 10.1.0.4.

Following is a small experiment to show that.

SQL> connect abc
Connected.
SQL> select sid from v$mystat where rownum = 1;

       SID                                                                      
----------                                                                      
       300                                                                      

SQL> select pdml_enabled from v$session where sid = 300;

PDM                                                                             
---                                                                             
NO                                                                              



SQL> create table myobj as select * from all_objects where 1 = 0;

Table created.

SQL> select count(*) from myobj;

  COUNT(*)                                                                      
----------                                                                      
         0                                                                      

SQL> exec dbms_stats.set_table_stats(user,'MYOBJ',numrows=>2000000,numblks=>20000)

PL/SQL procedure successfully completed.

SQL> set autotrace on explain statistics



SQL> update /*+ parallel(myobj) */ myobj
  2  set object_id=mod(object_id,10);

0 rows updated.


Execution Plan
----------------------------------------------------------                      
   0      UPDATE STATEMENT Optimizer=CHOOSE (Cost=108 Card=2000000 Byt          
          es=26000000)                                                          
                                                                                
   1    0   UPDATE OF 'MYOBJ'                                                   
   2    1     PX COORDINATOR                                                    
   3    2       PX SEND* (QC (RANDOM)) OF ':TQ10000' (Cost=108 Card=20 :Q1000   
          00000 Bytes=26000000)                                                 
                                                                                
   4    3         PX BLOCK* (ITERATOR) (Cost=108 Card=2000000 Bytes=26 :Q1000   
          000000)                                                               
                                                                                
   5    4           TABLE ACCESS* (FULL) OF 'MYOBJ' (TABLE) (Cost=108  :Q1000   
          Card=2000000 Bytes=26000000)                                          
                                                                                


   3 PARALLEL_TO_SERIAL                                                         
   4 PARALLEL_COMBINED_WITH_CHILD                                               
   5 PARALLEL_COMBINED_WITH_PARENT                                              


Statistics
----------------------------------------------------------                      
         45  recursive calls                                                    
          3  db block gets                                                      
         10  consistent gets                                                    
          0  physical reads                                                     
        636  redo size                                                          
        809  bytes sent via SQL*Net to client                                   
        721  bytes received via SQL*Net from client                             
          3  SQL*Net roundtrips to/from client                                  
          2  sorts (memory)                                                     
          0  sorts (disk)                                                       
          0  rows processed                                                     

SQL> show parameter parallel

NAME                                 TYPE        VALUE                          
------------------------------------ ----------- ------------------------------ 
fast_start_parallel_rollback         string      LOW                            
parallel_adaptive_multi_user         boolean     TRUE                           
parallel_automatic_tuning            boolean     FALSE                          
parallel_execution_message_size      integer     65536                          
parallel_instance_group              string                                     
parallel_max_servers                 integer     8                              
parallel_min_percent                 integer     0                              
parallel_min_servers                 integer     4                              
parallel_server                      boolean     FALSE                          
parallel_server_instances            integer     1                              
parallel_threads_per_cpu             integer     2                              

NAME                                 TYPE        VALUE                          
------------------------------------ ----------- ------------------------------ 
recovery_parallelism                 integer     0   
                           
SQL> select pdml_enabled from v$session where sid=300;

PDM                                                                             
---                                                                             
NO                                                                              


                                                                          

SQL> spool off

Can you please comment on this.

Thanks & regards

Tom Kyte
February 18, 2008 - 6:48 pm UTC

3 PARALLEL_TO_SERIAL


you have a parallel full scan that is serialized to an update.

versus, a parallel update

consider the differences in plans:

ops$tkyte%ORA10GR2> create table t as select * from all_objects where rownum <= 1;

Table created.

ops$tkyte%ORA10GR2> exec dbms_stats.set_table_stats( user, 'T', numrows => 1000000, numblks => 1000000 );

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> select pdml_enabled from v$session where sid = (select sid from v$mystat where rownum=1);

PDM
---
NO

ops$tkyte%ORA10GR2> delete from plan_table;

0 rows deleted.

ops$tkyte%ORA10GR2> explain plan for update /*+ parallel(t) */ t set object_id = object_id+1;

Explained.

ops$tkyte%ORA10GR2> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
Plan hash value: 1378397380

-------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes | Cost (%CPU)| Time
-------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT      |          |  1000K|    12M| 42448   (1)| 00:03:2
|   1 |  UPDATE               | T        |       |       |            |
|   2 |   PX COORDINATOR      |          |       |       |            |
|   3 |    PX SEND QC (RANDOM)| :TQ10000 |  1000K|    12M| 42448   (1)| 00:03:2
|   4 |     PX BLOCK ITERATOR |          |  1000K|    12M| 42448   (1)| 00:03:2
|   5 |      TABLE ACCESS FULL| T        |  1000K|    12M| 42448   (1)| 00:03:2
-------------------------------------------------------------------------------

12 rows selected.

ops$tkyte%ORA10GR2> commit;

Commit complete.

ops$tkyte%ORA10GR2> alter session enable parallel dml;

Session altered.

ops$tkyte%ORA10GR2> delete from plan_table;

6 rows deleted.

ops$tkyte%ORA10GR2> explain plan for update /*+ parallel(t) */ t set object_id = object_id+1;

Explained.

ops$tkyte%ORA10GR2> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
Plan hash value: 2037160838

-------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes | Cost (%CPU)| Time
-------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT      |          |  1000K|    12M| 42448   (1)| 00:03:2
|   1 |  PX COORDINATOR       |          |       |       |            |
|   2 |   PX SEND QC (RANDOM) | :TQ10000 |  1000K|    12M| 42448   (1)| 00:03:2
|   3 |    UPDATE             | T        |       |       |            |
|   4 |     PX BLOCK ITERATOR |          |  1000K|    12M| 42448   (1)| 00:03:2
|   5 |      TABLE ACCESS FULL| T        |  1000K|    12M| 42448   (1)| 00:03:2
-------------------------------------------------------------------------------

12 rows selected.



also consider this:

ops$tkyte%ORA10GR2> connect /
Connected.
ops$tkyte%ORA10GR2> select pdml_enabled from v$session where sid = (select sid from v$mystat where rownum=1);

PDM
---
NO

ops$tkyte%ORA10GR2> update /*+ parallel(t) */ t set object_id = object_id+1;

1 row updated.

ops$tkyte%ORA10GR2> select count(*) from t;

  COUNT(*)
----------
         1

ops$tkyte%ORA10GR2> commit;

Commit complete.

ops$tkyte%ORA10GR2> alter session enable parallel dml;

Session altered.

ops$tkyte%ORA10GR2> select pdml_enabled from v$session where sid = (select sid from v$mystat where rownum=1);

PDM
---
YES

ops$tkyte%ORA10GR2> update /*+ parallel(t) */ t set object_id = object_id+1;

1 row updated.

ops$tkyte%ORA10GR2> select count(*) from t;
select count(*) from t
                     *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel



Parallel Processing

Sri, February 24, 2008 - 6:30 am UTC

Tom,

Oracle Datawarehousing guide has following statement in chapter 24.
quote:
"Note that the parallel execution coordinator and the parallel execution servers can only service one statement at a time. A parallel execution coordinator cannot coordinate, for example, a parallel query and a parallel DML statement at the same time."

What does that mean?

For example, if we have "insert into t select * from t2", does it mean that either only the select portion or only the insert portion will be parallelized, and not both? I'm confused here.

Please help

Thanks & regards


Tom Kyte
February 24, 2008 - 11:57 am UTC

it means it is either

a) processing your update (which will include a read/query component)

b) processing your select

Just like a dedicated server can only do one thing at a time.
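Note that the INSERT ... SELECT in the question is a single statement, so one coordinator services it and both the read and write components can be parallelized within that one operation. A sketch (table names are the questioner's placeholders; parallel DML must be enabled for the insert side to go parallel):

```sql
-- Sketch: one statement, one coordinator; the select side and the
-- insert side are parts of the same parallel operation.
alter session enable parallel dml;

insert /*+ parallel(t) */ into t
select /*+ parallel(t2) */ * from t2;
```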

New Oracle formula to calculate PARALLEL_MAX_SERVERS in 10gR2

Yoav, March 06, 2008 - 4:49 am UTC

From July 11, 2006 :
"if you have 8 cpus, anything over parallel 16 probably does not make sense.
why 80???? what drove you to that really high number on so few cpus? "

I found this paper on metalink https://metalink.oracle.com/metalink/plsql/docs/Upgrade_Companion.pdf page 18:
"the default value of parallel_max_servers has changed in Oracle Database 10g:
cpu_count * parallel_threads_per_cpu * (2 if pga_aggregate_target > 0; otherwise 1) * 5"
which for us works out to 8 * 2 * 2 * 5 = 160.
In a data warehouse database, do you still think I should stick with parallel_max_servers=16 and not 160 as
Oracle suggests?
Thank You.


Tom Kyte
March 06, 2008 - 8:23 am UTC

yes, i do - based on number of cpu's, the power of two

Parallel parameters

Chintu, May 23, 2008 - 12:52 am UTC

Should parallel_min_servers = parallel_max_servers * parallel_min_percent / 100?

If so what is the impact of the following

parallel_max_servers 128
parallel_min_percent 10
parallel_min_servers 8


Tom Kyte
May 23, 2008 - 9:05 am UTC

well, they are all documented - that'll tell you the impact of setting each one.

"should" - if they always should, it would be impossible to do otherwise, so they do not have to.

I let as many things as possible default, like all of them.

Chintu, May 23, 2008 - 9:52 am UTC

Tom, what do you mean by

"should" - if they always should, it would be impossible to do otherwise
Tom Kyte
May 23, 2008 - 4:19 pm UTC

You wrote:

...
Should
parallel_min_servers = parallel_max_servers*parallel_min_percent/100
....

I answered

If it SHOULD be, it WOULD be - we would not permit anything else. Just hooked on semantics here - you asked "should x=f(y)". I said "if x SHOULD equal f(y) always, we would make sure it did and you would not have any say in it"

meaning, it could, it doesn't have to

export in parallel

LindaC, June 07, 2008 - 10:19 pm UTC

I need to move 500GB of data from a 9.2 database on an AIX server to an 11.1.0.6.0 database on Solaris 5.10. I want to do it in a parallel manner to speed things up. I will do it from the 11g server via a connect string to the 9.2 database.
Can it be done? How do I do it?
Tom Kyte
June 09, 2008 - 11:46 am UTC

you will have to "do it yourself"

you would run multiple export sessions on the host database (9i) exporting different bits of the database (say schema by schema, many schemas at the same time)

A reader, June 09, 2008 - 3:26 pm UTC

Thank you for your feedback.
You mention on this website: "sort area size will affect index creation, you might want it a larger then 256k"

But I read the following: these parameters (hash_area_size and sort_area_size) should not be set for releases 9iR2 or higher, as the automatic memory manager is used for 9iR2 and 10gR1 based configurations.
So I have not configured them for the import or for the index creation script. The indexes are nologging and parallel, as you suggested.
I will run many index scripts at the same time, for different indexes.




Tom Kyte
June 09, 2008 - 3:51 pm UTC

if you are using automatic pga memory management and you are using a dedicated server connection, you need not set sort/hash area size, they'll be derived based on current workload and your pga aggregate target settings.

if you use shared server in 9i - then we use sort/hash area size regardless of whether automatic memory management is on. so there you would set them.

if you use shared server in 10g and above - we use automatic memory management so you need not set them if you are using that.
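For the cases Tom lists where sort_area_size still applies (or for a one-off large index build), the manual alternative looks like this (the 100MB figure is an arbitrary example, not a recommendation):

```sql
-- Sketch: switch one session to manual workarea sizing and give sorts
-- a larger area. 104857600 bytes = 100MB, chosen only for illustration.
alter session set workarea_size_policy = manual;
alter session set sort_area_size = 104857600;
```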


Import Buffer or recordlength

LindaC, June 09, 2008 - 3:46 pm UTC

Which one should I use in the import that I will submit in parallel myself: buffer or recordlength? I have modified the parameters in the database to make it bigger during the import.

Tom Kyte
June 09, 2008 - 3:58 pm UTC

... I have modified the parameters in the databae to make
it bigger during the import.

..

that doesn't make sense to me, sorry - what parameters have you modified in the database for import???

Log_checkpoint_interval

LindaC, June 09, 2008 - 3:55 pm UTC

Is it true that for import I can improve the speed by "augmenting" log_checkpoint_interval to something bigger than the redo log size? (OS: Solaris, 512.) The redo log size is 600MB; I am thinking of changing log_checkpoint_interval to 1433600.
Tom Kyte
June 09, 2008 - 4:25 pm UTC

you would only play with this stuff if you are waiting on redo oriented events

and since you'll use commit=n, you won't be.

where did you get that advice from by the way - I'd be curious to see it in context.

buffer or recordlength

LindaC, June 09, 2008 - 9:38 pm UTC

Thank you for your feedback.
Answering your question of "what parameters have you modified in the database for import?":
I have modified these parameters in the init.ora.
I'm not sure about this one: _serial_direct_read=TRUE, and I'm also not sure about the "parallel" parameters or the fast* parameter.
After doing the import without indexes or constraints, I will create the indexes first and then the constraints, enabling them after the parallel index creation.
The tables were altered before the export to make them "parallel".
db_file_multiblock_read_count=128
_serial_direct_read=TRUE
db_cache_size=1073741824
large_pool_size=524288000
shared_pool_size=800M
log_buffer=524288000
pga_aggregate_target=1024M
WORKAREA_SIZE_POLICY=AUTO
SGA_TARGET=4096M
MEMORY_TARGET=6144M
MEMORY_MAX_TARGET=6144M
#LOG_CHECKPOINT_INTERVAL=1433600---->removed because
# commit=n as you said
##########################################
# Parallelism
##########################################
parallel_min_percent=0
# parallel_server=false
# parallel_server_instances=1
parallel_min_servers=2
parallel_max_servers=16
parallel_threads_per_cpu=2
parallel_execution_message_size=4096
parallel_adaptive_multi_user=TRUE
# parallel_automatic_tuning=TRUE
###########################################
# Redo Log and Recovery
###########################################
fast_start_mttr_target=300

Tom Kyte
June 09, 2008 - 9:54 pm UTC

I'm not sure what you are doing or why.

leave underscore parameters ALONE, just stay away.

buffer or recordlength for import

LindaC, June 09, 2008 - 9:49 pm UTC

Also, I would like to be sure whether it is OK to use both parameters for the import.
Tom Kyte
June 09, 2008 - 9:55 pm UTC

sure, it is "ok"

underscore parameters

LindaC, June 09, 2008 - 10:17 pm UTC

Thank you Tom, for the feedback, I will stay away from the underscore parameters and will comment this on the init.ora.
Really appreciate your help.

Bye.

Create index in parallel only one query running parallel

LindaC, June 11, 2008 - 2:52 pm UTC

The following parameters for parallelism are set in the init.ora:

parallel_min_percent=0
# parallel_server=false
# parallel_server_instances=1
parallel_min_servers=2
parallel_max_servers=16
parallel_threads_per_cpu=2
parallel_execution_message_size=4096
parallel_adaptive_multi_user=TRUE
# parallel_automatic_tuning=TRUE

I modified the 4 scripts, each creating 10 indexes, so that the index creation is PARALLEL and NOLOGGING.

When I watched the run using a third-party utility, I saw that only one query was using parallelism, taking all 16 parallel servers.
Each of the 4 scripts started with "alter session force parallel ddl;" before the index creation.

I don't know what I did wrong. First, I don't know how to set the parallel parameters in the init.ora so as to take the defaults - did I set more parameters than necessary? If so, please tell me which parameters.
Also, I don't know why one query is "eating" all the parallel processes while the others get none.
Tom Kyte
June 11, 2008 - 8:51 pm UTC

...
When I watched the run using a third-party utility, I saw that only one query
was using parallelism, taking all 16 parallel servers.
Each of the 4 scripts started with "alter session force parallel ddl;" before
the index creation.

....

you set adaptive multi user on, it would seem you started one index, it consumed the resources and the others said "oh well, maybe next time, lets just go it on our own"

as designed.

If you wanted to run many of them at the same time, you would use an explicit parallel degree on them, and use a number less than 16
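Tom's suggestion sketched out (the degree of 4 is an arbitrary choice so that four concurrent builds fit within parallel_max_servers=16; index, table, and column names are placeholders):

```sql
-- Sketch: give each concurrent index build an explicit, modest degree
-- rather than letting the first statement grab all 16 servers.
create index t_idx1 on t(x) parallel 4 nologging;
create index t_idx2 on t(y) parallel 4 nologging;
```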

Parallelism parameters

LindaC, June 13, 2008 - 11:06 am UTC

8 cpu's 4 gb of RAM - Oracle version 11g

How should I set the parallel* parameters?

There is no automatic_tuning for parallelism; what is the substitute parameter, if there is one?

Tom Kyte
June 13, 2008 - 11:40 am UTC

you don't need to set anything, it is already set by default, the defaults should be used unless you have a really good reason to override them.

Parallelism ver 9 and 10g

LindaC, June 16, 2008 - 10:33 am UTC

If in version 1g I don't have to set anything, what about versions 9 and 10?
What parameters do I have to set to implement parallelism in the database?

What about alter session force parallel query, dml and ddl? Can this command be used in 9i and 10g?
Tom Kyte
June 16, 2008 - 1:37 pm UTC

I suppose 1g means "11g"

If you have access to my book "Expert Oracle Database Architecture", I cover parallelism

The data warehousing guide does as well - suggest you read through it for sure.


10g sets it up by default as well.

PDML on a non-partitioned table with bitmap indexes

Pushkar, June 23, 2008 - 11:50 am UTC

Hi Tom,

In the followup dated February 18, 2008, among others, you have shown that use of the PARALLEL hint while PDML is enabled results in PDML. However, the "Restrictions on Parallel DML" section in Chapter 21 (Using Parallel Execution) of the Oracle 9i Data Warehousing Guide Release 2 (9.2) specifies that PDML cannot happen on a non-partitioned table if it has a bitmap index.
So, I did a test:

SQL> select * from v$version ;

BANNER                                                                          
----------------------------------------------------------------                
Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production                
PL/SQL Release 9.2.0.8.0 - Production                                           
CORE 9.2.0.8.0 Production                                                       
TNS for HPUX: Version 9.2.0.8.0 - Production                                    
NLSRTL Version 9.2.0.8.0 - Production       

SQL> create table  table_test
  2  as     select object_id   ,
  3                mod( object_id, 5) mod_value
  4         from   all_objects ;

Table created.

SQL> create bitmap index table_test_b1
  2         on           table_test(mod_value) ;

Index created.

SQL> desc table_test;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 OBJECT_ID                                 NOT NULL NUMBER
 MOD_VALUE                                          NUMBER

SQL> select sid,serial#, program, pdml_status, pddl_status, pq_status
  2  from   v$session
  3  where  osuser = 'psingh' ;

       SID    SERIAL# PROGRAM              PDML_STA PDDL_STA PQ_STATU           
---------- ---------- -------------------- -------- -------- --------           
       302      20075 SQLPLUS.EXE          DISABLED ENABLED  ENABLED            

SQL> insert /*+ parallel(test,2,3) */ into table_test test
  2  select object_id, mod(object_id,5)
  3  from all_objects
  4  where rownum = 1 ;

1 row created.

SQL> select *
  2  from table_test
  3  where rownum = 1;

 OBJECT_ID  MOD_VALUE                                                           
---------- ----------                                                           
       195          0                                                           

SQL> commit;

Commit complete.

SQL> alter session enable parallel dml ;

Session altered.

SQL> select sid, serial#, program, pdml_status, pddl_status, pq_status
  2  from   v$session
  3  where  osuser = 'psingh' ;

       SID    SERIAL# PROGRAM              PDML_STA PDDL_STA PQ_STATU           
---------- ---------- -------------------- -------- -------- --------           
       302      20075 SQLPLUS.EXE          ENABLED  ENABLED  ENABLED            

SQL> insert /*+ parallel(test, 2, 3) */ into table_test test
  2  select object_id, mod(object_id, 5)
  3  from   all_objects
  4  where  rownum = 1 ;

1 row created.

SQL> select *
  2  from   table_test
  3  where  rownum = 1 ;
from   table_test
       *
ERROR at line 2:
ORA-12838: cannot read/modify an object after modifying it in parallel 


SQL> commit ;

Commit complete.

SQL> select *
  2  from   table_test
  3  where  rownum = 1 ;

 OBJECT_ID  MOD_VALUE                                                           
---------- ----------                                                           
       195          0      

On the basis of the above test, in contrast to what is specified in the document, can I safely conclude that the PARALLEL hint with PDML enabled results in PDML even on a non-partitioned table with a bitmap index?
Moreover, is it true that whenever PDML is going on, Oracle creates new sessions for each parallel server, e.g. in the above test case, does it create 6 (2*3) new sessions?

Thanks in advance.

Tom Kyte
June 23, 2008 - 1:49 pm UTC

you just did a direct path load (parallel implies APPEND) but not parallel.


ops$tkyte%ORA9IR2> drop table table_test;

Table dropped.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> create table  table_test
  2  as     select object_id   ,
  3                mod( object_id, 5) mod_value
  4         from   all_objects ;

Table created.

ops$tkyte%ORA9IR2> create bitmap index table_test_b1
  2         on           table_test(mod_value) ;

Index created.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> insert /*+ parallel(test,2,3) */ into table_test test select object_id, mod(object_id,5) from all_objects where rownum = 1 ;

1 row created.

ops$tkyte%ORA9IR2> select * from table_test where rownum = 1;

 OBJECT_ID  MOD_VALUE
---------- ----------
        30          0

ops$tkyte%ORA9IR2> commit;

Commit complete.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> alter session enable parallel dml ;

Session altered.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> select a.name, b.value from v$statname a, v$mystat b where a.statistic# = b.statistic# and lower(a.name) like '%parallel%';

NAME                                               VALUE
--------------------------------------------- ----------
queries parallelized                                   0
DML statements parallelized                            0
DDL statements parallelized                            0
DFO trees parallelized                                 0
Parallel operations not downgraded                     0
Parallel operations downgraded to serial               0
Parallel operations downgraded 75 to 99 pct            0
Parallel operations downgraded 50 to 75 pct            0
Parallel operations downgraded 25 to 50 pct            0
Parallel operations downgraded 1 to 25 pct             0

10 rows selected.

ops$tkyte%ORA9IR2> insert /*+ parallel(test, 2, 3) */ into table_test test select object_id, mod(object_id, 5) from   all_objects where  rownum = 1 ;

1 row created.

ops$tkyte%ORA9IR2> select a.name, b.value from v$statname a, v$mystat b where a.statistic# = b.statistic# and lower(a.name) like '%parallel%';

NAME                                               VALUE
--------------------------------------------- ----------
queries parallelized                                   0
DML statements parallelized                            0
DDL statements parallelized                            0
DFO trees parallelized                                 0
Parallel operations not downgraded                     0
Parallel operations downgraded to serial               0
Parallel operations downgraded 75 to 99 pct            0
Parallel operations downgraded 50 to 75 pct            0
Parallel operations downgraded 25 to 50 pct            0
Parallel operations downgraded 1 to 25 pct             0

10 rows selected.
<b>
No parallel, but we did a direct path load...
</b>
ops$tkyte%ORA9IR2> select * from   table_test where  rownum = 1 ;
select * from   table_test where  rownum = 1
                *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel


ops$tkyte%ORA9IR2> commit ;

Commit complete.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> drop index table_test_b1;

Index dropped.

<b>get rid of index...</b>

ops$tkyte%ORA9IR2> select a.name, b.value from v$statname a, v$mystat b where a.statistic# = b.statistic# and lower(a.name) like '%parallel%';

NAME                                               VALUE
--------------------------------------------- ----------
queries parallelized                                   0
DML statements parallelized                            0
DDL statements parallelized                            0
DFO trees parallelized                                 0
Parallel operations not downgraded                     0
Parallel operations downgraded to serial               0
Parallel operations downgraded 75 to 99 pct            0
Parallel operations downgraded 50 to 75 pct            0
Parallel operations downgraded 25 to 50 pct            0
Parallel operations downgraded 1 to 25 pct             0

10 rows selected.

ops$tkyte%ORA9IR2> insert /*+ parallel(test, 2, 3) */ into table_test test select object_id, mod(object_id, 5) from   all_objects where  rownum = 1 ;

1 row created.

ops$tkyte%ORA9IR2> select a.name, b.value from v$statname a, v$mystat b where a.statistic# = b.statistic# and lower(a.name) like '%parallel%';

NAME                                               VALUE
--------------------------------------------- ----------
queries parallelized                                   0
DML statements parallelized                            1
DDL statements parallelized                            0
DFO trees parallelized                                 1
Parallel operations not downgraded                     0
Parallel operations downgraded to serial               0
Parallel operations downgraded 75 to 99 pct            0
Parallel operations downgraded 50 to 75 pct            0
Parallel operations downgraded 25 to 50 pct            0
Parallel operations downgraded 1 to 25 pct             1

10 rows selected.

<b>and we get parallel...</b>




Parallel query is implemented via parallel execution servers, which under Unix show up as pNNN processes; when active, each will have a session in v$session - yes. But the pNNN processes hang around; we don't have to create and recreate them for every statement.
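As a sketch of how to observe this (using the standard dynamic performance views; run from a second session while a parallel statement is executing), you can see the pNNN server pool and the sessions they map to:

```sql
-- The pool of parallel execution server processes (pNNN on Unix)
select server_name, status, pid, spid
  from v$px_process;

-- Which servers are working for which query coordinator session (QCSID)
select qcsid, sid, serial#, degree, req_degree
  from v$px_session
 order by qcsid, sid;
```

When no parallel statement is running, v$px_process still shows any idle servers kept around from earlier work.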

Execution time and Parallel processing

A reader, June 24, 2008 - 10:05 am UTC

Hi Tom

First of all, a ton of thanks would be too small for your wonderful explanations and your patience in answering so many queries from a world of Oracle programmers.

I would be very thankful if you could help me with your suggestions for my issue.

I am using Oracle 9iR2. I am trying to run 2 different procedures in parallel. Both procedures process huge amounts of data with operations such as inserts with some transformations. Both procedures use a transaction temporary table in common and a session temporary table (while populating this table I have a column containing different values).

I have no transaction control statements in this (as I am using temporary tables).

When I run these 2 procedures sequentially I see the timings below.

proc1 = 7 mins 27 secs
proc2 = 5 mins 41 secs


After running, when I try to roll back the processing sequentially I see the timings below.

to rollback proc1 changes = 4.32
to rollback proc2 changes = 1.31

Now when I run these two procedures exactly in parallel I see the timings below

proc1= 10 mins 21 secs
proc2=  9 mins 41 secs

and the rollback times are

to rollback proc1 changes = 8 mins and 45 secs
to rollback proc2 changes = 4 mins and  9 secs

I looked at the parallel parameter settings on the db server:

SQL> show parameter parallel

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
fast_start_parallel_rollback         string      LOW
log_parallelism                      integer     1
parallel_adaptive_multi_user         boolean     FALSE
parallel_automatic_tuning            boolean     FALSE
parallel_execution_message_size      integer     2148
parallel_instance_group              string
parallel_max_servers                 integer     5
parallel_min_percent                 integer     0
parallel_min_servers                 integer     0
parallel_server                      boolean     FALSE
parallel_server_instances            integer     1
parallel_threads_per_cpu             integer     2
recovery_parallelism                 integer     0

My questions
============
1. Why does this behaviour happen on a database like Oracle, which supports parallel processing?

2. Is there any parameter I need to change in the above list of parameter values?

3. Why does the rollback operation take a varying amount of time depending on whether the procedures were run in parallel or sequentially?

4. Can I change the INITRANS parameter of all the tables which are modified in these procedures?

5. In Oracle, tables have a default value of 1 for INITRANS and indexes have a default of 2. Why is it 2 for indexes?

Please advise if there is any way to get the execution times of the procedures down when running them in parallel.


Thanks in advance

Badri

Tom Kyte
June 24, 2008 - 11:20 am UTC

1) it is called "contention", you must be contending with yourself for some resource

o memory
o cpu
o disk IO
o enqueues (locks)

2) you are doing your own parallelism here! None of our parallel parameters apply; you just have two different user sessions doing work. It is not "parallel" in the Oracle sense, it is "do it yourself" parallel

3) see #1, contention. A rollback is a logical "undo". For every row you insert, rollback will delete; for every row you delete, rollback will insert; for every row you update, rollback will un-update.


4) sure, but initrans is already "2" and you have "2", so it won't do anything

5) that changed, initrans now defaults to "2" for everything.
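You can verify the current settings from the dictionary; a quick sketch (T is a hypothetical table name):

```sql
-- INI_TRANS is the dictionary column behind the INITRANS setting
select table_name, ini_trans from user_tables  where table_name = 'T';
select index_name, ini_trans from user_indexes where table_name = 'T';
```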



In order to get something running "fast" in parallel, you have to ensure

a) you have sufficient resources - CPU, memory, IO - for both to run at the same time without exceeding the amount of any one resource

b) you have split the workload in a manner such that the two jobs do not contend with each other - that is, one of them works on the "top half" of the table and the other on the "bottom half" (if you were reading from a staging table, it would be nice if one of them read the first half and the other the second half so they do not contend with each other)
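For "do it yourself" parallel, one classic way to split a staging table into non-overlapping pieces is by rowid range, so each job scans only its own blocks. A sketch, assuming a staging table named STAGE owned by the current user:

```sql
-- One low/high rowid pair per extent of STAGE; hand the first half of
-- the ranges to one job and the second half to the other.
select dbms_rowid.rowid_create(1, o.data_object_id, e.relative_fno,
                               e.block_id, 0)                     lo_rid,
       dbms_rowid.rowid_create(1, o.data_object_id, e.relative_fno,
                               e.block_id + e.blocks - 1, 32767)  hi_rid
  from user_objects o, dba_extents e
 where o.object_name  = 'STAGE'
   and e.owner        = user
   and e.segment_name = 'STAGE'
 order by e.relative_fno, e.block_id;

-- Each job then reads only its own ranges, e.g.:
-- select ... from stage where rowid between :lo_rid and :hi_rid;
```

Because the ranges cover disjoint sets of blocks, the two jobs never read the same blocks from the staging table.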


You mentioned bitmap indexes, I hope you have none during this time - they will lock you up tighter than a drum.


One approach to see what is being contended for would be to run the processes serially with 10046 level 12 tracing, and then run them together with the same tracing enabled.

Then use tkprof to see what is different between the two runs.
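A sketch of enabling that trace in each session (the event syntax is standard; proc1 stands for the procedure under test):

```sql
alter session set timed_statistics = true;
-- level 12 = both bind values and wait events in the trace
alter session set events '10046 trace name context forever, level 12';

exec proc1

alter session set events '10046 trace name context off';
-- then, from the OS: tkprof <tracefile>.trc report.txt sys=no
```

Comparing the wait event sections of the serial and concurrent reports shows where the extra time went.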

Detrimental impact of parallelism

A reader, September 12, 2008 - 11:01 pm UTC

Hi Tom,

In a DW environment (10g), do you think that the parallel clause (without a DOP) on dimension tables can have a detrimental impact on overall performance (by creating a parallel execution coordinator and servers)? Or will Oracle know (based upon stats) that, these being small tables, no parallelism should be performed? The DBA is saying that every table should be created with the parallel clause.

Regards,

Default parallel clause

A reader, September 15, 2008 - 4:25 pm UTC

Hi Tom,

Could you please help regarding having a default parallel clause (without a DOP) for all tables in a DW environment?

Thanks
Tom Kyte
September 16, 2008 - 10:20 pm UTC

what help do you need exactly?

parallel dop

A reader, September 17, 2008 - 6:31 am UTC

Hi Tom,

Just wanted to know: for small tables (a few hundred rows) created with the parallel clause, will Oracle start creating parallel execution (especially load) processes, which may have more overhead than actual benefit? Or will Oracle NOT do parallel execution (based upon stats) because there is not much benefit in doing parallelism?

Regards
Tom Kyte
September 17, 2008 - 9:23 am UTC

in order to do a parallel load, you actually have to ask for it explicitly - so yes, it would.

Yes, it will consider parallel execution if you say "consider parallel execution".

Detrimental impact

A reader, September 17, 2008 - 4:23 pm UTC

Hi Tom,
Thanks for the response.
Do you think that, for a DW, loading small tables in parallel can have a detrimental impact on the overall performance of the machine? Oracle will create additional processes that are of no benefit as far as the loading is concerned, and may impact the overall performance of other jobs.

Regards,
Tom Kyte
September 17, 2008 - 7:28 pm UTC

it can have

a) a positive impact
b) a negative impact
c) no impact whatsoever either way.


Use your knowledge to do a parallel direct path load - or not. You have to explicitly ask for it after all.
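A minimal sketch of explicitly asking for a parallel direct path load (table names here are hypothetical):

```sql
alter session enable parallel dml;

-- PARALLEL on DML implies APPEND (direct path) once PDML is enabled
insert /*+ append parallel(t, 4) */ into t
select * from staging;

-- T cannot be read again in this session until the load is committed
-- (ORA-12838 otherwise)
commit;
```

Leaving out any of the three pieces - the session enable, the hint, or the commit before re-reading - changes the behaviour, as the test cases earlier in this thread show.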

parallel_automatic_tuning

A reader, November 19, 2008 - 5:34 am UTC

Hi Tom,
Going through your books and this site, the importance of parallel_automatic_tuning was something I started looking into. But as per the 10g documentation ( http://download.oracle.com/docs/cd/B19306_01/server.102/b14231/whatsnew.htm#sthref55 ), this parameter is deprecated. I also noticed that there is some improvement in performance if it is set to TRUE in 10g, which is a bit confusing. Should we use this parameter or not in 10g?
Regards,
Tom Kyte
November 24, 2008 - 10:36 am UTC

yes, it is deprecated as of 10gR1:

... Oracle Database provides defaults for the parallel execution initialization parameters that are adequate and tuned for most situations. The PARALLEL_AUTOMATIC_TUNING initialization parameter is now redundant and has been deprecated. ...

define "some improvement" - you should be able to tell us "what is different" between it being on or off.

Serialized update

Ian, November 19, 2008 - 12:33 pm UTC

Hi Tom

I have a table that I am updating in parallel. This was working in my test system, but since the system was refreshed from production last week it no longer works. If I create an exact copy of the table using exp/imp to an indexfile, it does update in parallel.

This is the plan of the one that does work:

---------------------------------------------------------------------------------------------------
Operation            |  Name                 | Rows  | Bytes | Cost  |  TQ    |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------
UPDATE STATEMENT     |                       |    15M|   519M| 25300 |        |      |            |
 INDEX MAINTENANCE   | XXX_TASK_ASSIGNMENTS  |       |       |       | 90,04  | P->S | QC (RAND)  |
  UPDATE             | XXX_TASK_ASSIGNMENTS  |       |       |       | 90,03  | P->P | RANGE      |
   HASH JOIN         |                       |    15M|   519M| 25300 | 90,02  | P->P | HASH (BLOCK|
    TABLE ACCESS FULL| JTF_TASKS_B           |  8681K|   190M| 12500 | 90,00  | P->P | HASH       |
    TABLE ACCESS FULL| XXX_TASK_ASSIGNMENTS  |    15M|   187M| 10705 | 90,01  | P->P | HASH       |
---------------------------------------------------------------------------------------------------


And this is the one that does not work. JTF_TASK_ASSIGNMENTS does not have any triggers.

-------------------------------------------------------------------------------------------------
Operation            |  Name                 | Rows  | Bytes | Cost  |  TQ    |IN-OUT| PQ Distrib
-------------------------------------------------------------------------------------------------
UPDATE STATEMENT     |                       |    15M|   519M| 25300 |        |      |           
 UPDATE              | JTF_TASK_ASSIGNMENTS  |       |       |       |        |      |           
  HASH JOIN          |                       |    15M|   519M| 25300 | 83,02  | P->S | QC (RAND) 
   TABLE ACCESS FULL | JTF_TASKS_B           |  8681K|   190M| 12500 | 83,00  | P->P | HASH      
   TABLE ACCESS FULL | JTF_TASK_ASSIGNMENTS  |    15M|   187M| 10705 | 83,01  | P->P | HASH      
-------------------------------------------------------------------------------------------------


My question is: can you think of any reason why it is serializing? The parallel update takes 36 minutes; the serial update takes 5+ hours. This is the update statement:

update
(select /*+ parallel(a,4) parallel(b,4) */
        a.actual_start_date
       ,a.actual_end_date
       ,b.scheduled_start_date
       ,b.scheduled_end_date
 from jtf.xxx_task_assignments a
     ,jtf.jtf_tasks_b b
 where a.task_id = b.task_id
)
set actual_start_date = scheduled_start_date
   ,actual_end_date = scheduled_end_date


Thanks and Regards

Ian
Tom Kyte
November 24, 2008 - 10:49 am UTC

can you give us the dbms_metadata.get_ddl for both pairs of tables.

Re question above

Ian, November 19, 2008 - 12:40 pm UTC

Tom - forgot to say - yes, I am doing alter session enable parallel dml; for both of them.

Regards

Ian

Bad hair day

Ian, November 21, 2008 - 6:35 am UTC

Tom

Regarding the above question - version is 9.2.0.7

Regards

Ian

Metadata

Ian, November 25, 2008 - 9:47 am UTC

Tom

They are identical apart from the table names. This is the bit after the column definitions:
   ) PCTFREE 10 PCTUSED 70 INITRANS 10 MAXTRANS 255 NOCOMPRESS LOGGING
  STORAGE(INITIAL 268435456 NEXT 4194304 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 4 FREELIST GROUPS 4 BUFFER_POOL KEEP)
  TABLESPACE "JTFD" 


Regards

Ian
Tom Kyte
November 25, 2008 - 12:37 pm UTC

but that gives me nothing to play with - nothing to test with.

ok, compare everything in the dictionary for the two sets of tables - statistics, parallel settings, everything.

Double checked everything

Ian, November 25, 2008 - 2:32 pm UTC

Tom

I backed up the stats from the original table, then updated the rows in the stattab table to XXX instead of JTF, then restored the stats into the new table. So the stats are identical. I have double-checked the table definition - degree, instances, everything. All identical.

One thing I didn't mention was that the original table had two new columns added with an alter table command - but not in the same SQL*Plus session. The columns were added, then we exited SQL*Plus; then, in a new session, we ran explain plan on the update.

So to replicate this exactly, I have recreated the xxx table without the two new columns using the output from dbms_metadata. I am in the process of populating it with all 15 million rows from the original table; then I will add the two new columns to see if that is the problem.

Sorry about not giving you a nice clear test case - I can't replicate the problem at the moment!

Will let you know.

Thanks and Regards

Ian

Parallel Query: v$SQL.PX_SERVERS_EXECUTIONS

Suvendu, December 10, 2008 - 6:35 am UTC

Hi Tom,

I did the test case as you demonstrated earlier in this thread. From this test case, I was looking to use the new column of V$SQL in 10.2 to check whether the query ran in parallel mode, but after doing the test case, it's NOT showing such info.

Could you please direct me to what I'm missing?

Thanks,
Suvendu


-----------

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - 64bit Production
PL/SQL Release 11.1.0.6.0 - Production
CORE 11.1.0.6.0 Production
TNS for Solaris: Version 11.1.0.6.0 - Production
NLSRTL Version 11.1.0.6.0 - Production

SQL> show parameter parallel

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
fast_start_parallel_rollback         string      LOW
parallel_adaptive_multi_user         boolean     TRUE
parallel_automatic_tuning            boolean     TRUE
parallel_execution_message_size      integer     4096
parallel_instance_group              string
parallel_io_cap_enabled              boolean     FALSE
parallel_max_servers                 integer     160
parallel_min_percent                 integer     0
parallel_min_servers                 integer     0
parallel_server                      boolean     FALSE
parallel_server_instances            integer     1
parallel_threads_per_cpu             integer     2
recovery_parallelism                 integer     0
SQL> drop table t;

Table dropped.

SQL> create table t (x int);

Table created.

SQL> exec dbms_stats.set_table_stats('TEST','T',numrows=>1000000,numblks=>100000);

PL/SQL procedure successfully completed.

SQL> explain plan for select /*+ parallel(t) myQuery */ count(1) from t;

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------
Plan hash value: 3026717286

--------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |     1 |  1520   (1)| 00:00:19 |        |      |            |
|   1 |  SORT AGGREGATE        |          |     1 |            |          |        |      |            |
|   2 |   PX COORDINATOR       |          |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM) | :TQ10000 |     1 |            |          |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE     |          |     1 |            |          |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR |          |  1000K|  1520   (1)| 00:00:19 |  Q1,00 | PCWC |            |
|   6 |       TABLE ACCESS FULL| T        |  1000K|  1520   (1)| 00:00:19 |  Q1,00 | PCWP |            |
--------------------------------------------------------------------------------------------------------

13 rows selected.

SQL> l
1 select PX_SERVERS_EXECUTIONS, sql_text from v$sql
2* where sql_text like '%myQuery%'
SQL> /

PX_SERVERS_EXECUTIONS SQL_TEXT
--------------------- ------------------------------------------------------------
0 select PX_SERVERS_EXECUTIONS, sql_text from v$sql where sql_
text like '%myQuery%'

0 explain plan for select /*+ parallel(t) myQuery */ count(1)
from t


SQL>

Tom Kyte
December 10, 2008 - 9:58 am UTC

you did not actually ever RUN the query.

you did an explain plan on it???

After running the query, there is no update on PX_SERVERS_EXECUTIONS column

Suvendu, December 11, 2008 - 1:44 am UTC

Hi Tom,

After running the query, there is still no update to the PX_SERVERS_EXECUTIONS column. Below is the tkprof output of the query.

Could you please post a test case for it?


Thanks,
Suvendu



SQL ID : 0wkzfj4vkzba8
select /*+ parallel(t) myQuery */ count(1)
from
t


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.01          0          0          0           0
Execute      1      0.00       0.00          0          2          0           0
Fetch        2      0.00       0.00          0          3          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.00       0.01          0          5          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 28

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=5 pr=0 pw=0 time=0 us)
      1   PX COORDINATOR (cr=5 pr=0 pw=0 time=0 us)
      1    PX SEND QC (RANDOM) :TQ10000 (cr=3 pr=0 pw=0 time=0 us)
      1     SORT AGGREGATE (cr=3 pr=0 pw=0 time=0 us)
      0      PX BLOCK ITERATOR (cr=3 pr=0 pw=0 time=0 us cost=1520 size=0 card=1000000)
      0       TABLE ACCESS FULL T (cr=3 pr=0 pw=0 time=0 us cost=1520 size=0 card=1000000)

********************************************************************************


SQL> select PX_SERVERS_EXECUTIONS, sql_text from v$sql
2 where sql_text like '%myQuery%'
3 /

PX_SERVERS_EXECUTIONS SQL_TEXT
--------------------- --------------------------------------------------
0 select /*+ parallel(t) myQuery */ count(1) from t
Tom Kyte
December 11, 2008 - 7:40 am UTC

table was empty.
we knew that when we read the segment header.
so - it said "yeah, whatever, no parallel really needed, we'll just do serial"

ops$tkyte%ORA11GR1> create table t (x int, y char(200));

Table created.

ops$tkyte%ORA11GR1> insert /*+ append */ into t
  2  select level, 'x' from dual connect by level <= 1000000;

1000000 rows created.

ops$tkyte%ORA11GR1> exec dbms_stats.gather_table_stats( user, 'T' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR1> set serveroutput off
ops$tkyte%ORA11GR1> select /*+ parallel(t) myQuery */ count(1) from t;

  COUNT(1)
----------
   1000000

ops$tkyte%ORA11GR1> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  0wkzfj4vkzba8, child number 0
-------------------------------------
select /*+ parallel(t) myQuery */ count(1) from t

Plan hash value: 3126468333

--------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |       |  1116 (100)|          |        |      |            |
|   1 |  SORT AGGREGATE        |          |     1 |            |          |        |      |            |
|   2 |   PX COORDINATOR       |          |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM) | :TQ10000 |     1 |            |          |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE     |          |     1 |            |          |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR |          |  1000K|  1116   (1)| 00:00:14 |  Q1,00 | PCWC |            |
|*  6 |       TABLE ACCESS FULL| T        |  1000K|  1116   (1)| 00:00:14 |  Q1,00 | PCWP |            |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - access(:Z>=:Z AND :Z<=:Z)


23 rows selected.

ops$tkyte%ORA11GR1> select PX_SERVERS_EXECUTIONS, sql_text from v$sql where sql_text like '%myQuery%';

PX_SERVERS_EXECUTIONS SQL_TEXT
--------------------- ------------------------------
                    8 select /*+ parallel(t) myQuery
                       */ count(1) from t

                    0 select PX_SERVERS_EXECUTIONS,
                      sql_text from v$sql where sql_
                      text like '%myQuery%'



TKprof output...

Suvendu, December 11, 2008 - 9:08 am UTC

Tom,
Thanking you so much for answering me.

As you said, from the segment header we came to know that there are no actual rows, so it continued in serial mode. But from my earlier reply, the tkprof output also suggests that the query was EXECUTED in parallel mode, when there were ONLY statistics.

I'm sure, I'm going to learn more new stuff. Please, correct me if I'm wrong.

Thanking you,
Suvendu

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=5 pr=0 pw=0 time=0 us)
      1   PX COORDINATOR (cr=5 pr=0 pw=0 time=0 us)
      1    PX SEND QC (RANDOM) :TQ10000 (cr=3 pr=0 pw=0 time=0 us)
      1     SORT AGGREGATE (cr=3 pr=0 pw=0 time=0 us)
      0      PX BLOCK ITERATOR (cr=3 pr=0 pw=0 time=0 us cost=1520 size=0 card=1000000)
      0       TABLE ACCESS FULL T (cr=3 pr=0 pw=0 time=0 us cost=1520 size=0 card=1000000)


Tom Kyte
December 11, 2008 - 9:19 am UTC

that was the plan - we just didn't feel the need at runtime to actually do anything in parallel - it has that choice. So the PX column in v$sql was dead accurate - it told you "we didn't use any".

The 3 reads you see were the reads of the segment header, which told the coordinator "you are done, nothing to see here, return the answer". The coordinator never needed to shoot anything down to any execution servers.

If you actually put data in there - it then says "ok, time to fire up the execution servers - we have some real work to do"

process level information

Matt, December 11, 2008 - 4:28 pm UTC

Hi Tom,

Thanks a lot for creating this site; it's wonderful and I learn a lot from it. I have just bought your book Expert One-on-One (2nd edition); it is very good and you have explained many topics in great detail. After reading about parallel processing and how it works, I have some questions:
1. Is it possible to find some information for each process? Information like why only some processes are used for a hash join or sort and not others?
select /*+ parallel (b) */ a.*, b.*
from a, b
where a.some_column = b.some_column

2. What factors should I look at when selecting the table for parallel processing? For example, if I have a big table x joining with a very big table y (40 times x), should I use parallel (x) or parallel (y) in the select clause? And does it change if the size of table x grows 10 times or the size of y grows 100 times? Or should I just use parallel?

3. If I have a table over a dblink, does it process the same way?
ex.
select /*+ parallel (a) */ a.*, b.*
from a, b@db_link_table
where a.some_column = b.some_column

Thanks,
Tom Kyte
December 11, 2008 - 9:19 pm UTC

1) in a parallel query, the execution servers would all be doing the same thing - it would not be the case that "some use a hash join and some do not".

I'm not sure what you mean.


2) The factors are: how many resources do you have to completely take over and use? Do you have the CPUs, do you have the memory, do you have the IO? If so, just parallel them all; if not, do not parallel anything.

3) the query that is executed at the remote site can be run in parallel, but all of the results will be sent back to you over a single serial pipe from the remote site.

So, the query could use parallel query, turn a billion rows into 100 rows, and then send those back - that would be "useful".

But if the query runs in parallel and turns a billion rows into a billion rows (e.g., it was just a full scan), and then sends a billion rows over the single dblink - probably not useful.
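A sketch of making the remote site do the reduction so only a small result crosses the link (the DRIVING_SITE hint is real; the table and link names are hypothetical):

```sql
-- Join and aggregate at the remote site; ship back only the summary rows
select /*+ driving_site(b) */ b.region, count(*) cnt
  from small_local a, big_remote@rdb b
 where a.id = b.id
 group by b.region;
```

Without the hint, the local site would drive and the remote rows would be dragged across the link before being joined and aggregated.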

information at process level.

Matt, December 12, 2008 - 6:05 pm UTC

Thanks Tom for your reply. I did not know that over a dblink the data would be sent through one pipe, so for larger data sets it is always good to run the query on the remote db and then export/import, correct?

For #2, on selecting from the big table or the very big table, I still have a question, and here is the scenario:

I have a table X, which has 2 million distinct account numbers and is 25 MB in size, and a table Y, which has ~5.4 billion rows and is ~3.4 TB in size. Account number is the primary key on both tables.

The DB is 10.0.2, there are 8 CPUs, and I am the only user, so I can use the maximum of the box:

Total System Global Area 4294967296 bytes
Fixed Size 2147872 bytes
Variable Size 1522448864 bytes
Database Buffers 2768240640 bytes
Redo Buffers 2129920 bytes

When I run the query considering X (with its 2M account numbers) in parallel mode, it takes 8 hours to return 175 million rows, but if I use Y in the query, it takes ~50 hours to do the same.

And this is not the end: things change for another very big table Z with ~120 million rows. If I join Z with X and select X in parallel, it takes more time than selecting Z in parallel.

Though the type of query is different, both large tables have account number as their primary key, and that is the only column used for joining with table X.

Please see the explain plan for X joined with Y, with parallel(Y) in the select statement:

INSERT STATEMENT CHOOSECost: 7,561,068 Bytes: 23,814,000 Cardinality: 441,000
21 LOAD AS SELECT C_CHARGE
20 PX COORDINATOR
19 PX SEND QC (RANDOM) SYS.:TQ10004 Cost: 7,561,068 Bytes: 23,814,000 Cardinality: 441,000
18 HASH GROUP BY Cost: 7,561,068 Bytes: 23,814,000 Cardinality: 441,000
17 PX RECEIVE Cost: 7,561,068 Bytes: 23,814,000 Cardinality: 441,000
16 PX SEND HASH SYS.:TQ10003 Cost: 7,561,068 Bytes: 23,814,000 Cardinality: 441,000
15 HASH GROUP BY Cost: 7,561,068 Bytes: 23,814,000 Cardinality: 441,000
14 HASH JOIN Cost: 7,560,972 Bytes: 499,947,228 Cardinality: 9,258,282
4 PX RECEIVE Cost: 1,566 Bytes: 339,115,491 Cardinality: 30,828,681
3 PX SEND HASH SYS.:TQ10001 Cost: 1,566 Bytes: 339,115,491 Cardinality: 30,828,681
2 PX BLOCK ITERATOR Cost: 1,566 Bytes: 339,115,491 Cardinality: 30,828,681
1 INDEX FAST FULL SCAN INDEX (UNIQUE) CLM_PK Cost: 1,566 Bytes: 339,115,491 Cardinality: 30,828,681
13 PX RECEIVE Cost: 7,453,667 Bytes: 27,018,549,007 Cardinality: 628,338,349
12 PX SEND HASH SYS.:TQ10002 Cost: 7,453,667 Bytes: 27,018,549,007 Cardinality: 628,338,349
11 HASH JOIN Cost: 7,453,667 Bytes: 27,018,549,007 Cardinality: 628,338,349
8 BUFFER SORT
7 PX RECEIVE Cost: 746 Bytes: 29,494,972 Cardinality: 2,268,844
6 PX SEND BROADCAST LOCAL SYS.:TQ10000 Cost: 746 Bytes: 29,494,972 Cardinality: 2,268,844
5 INDEX FAST FULL SCAN INDEX (UNIQUE) BAN_PK Cost: 746 Bytes: 29,494,972 Cardinality: 2,268,844
10 PX BLOCK ITERATOR Cost: 7,447,075 Bytes: 163,991,895,000 Cardinality: 5,466,396,500 Partition #: 20 Partitions accessed #1 - #22
9 TABLE ACCESS FULL TABLE CHARGE_HISTORY Cost: 7,447,075 Bytes: 163,991,895,000 Cardinality: 5,466,396,500 Partition #: 20 Partitions accessed #1 - #22

And this is when I put the small table X in the select statement with parallel(X), joined with the big table Y:

Plan
INSERT STATEMENT CHOOSECost: 16,354,996 Bytes: 386,431,509 Cardinality: 8,221,947
17 LOAD AS SELECT C_CHARGE
16 PX COORDINATOR
15 PX SEND QC (RANDOM) SYS.:TQ10002 Cost: 16,354,996 Bytes: 386,431,509 Cardinality: 8,221,947
14 HASH GROUP BY Cost: 16,354,996 Bytes: 386,431,509 Cardinality: 8,221,947
13 HASH JOIN Cost: 16,348,210 Bytes: 386,431,509 Cardinality: 8,221,947
4 PX RECEIVE Cost: 1,566 Bytes: 339,115,491 Cardinality: 30,828,681
3 PX SEND HASH SYS.:TQ10000 Cost: 1,566 Bytes: 339,115,491 Cardinality: 30,828,681
2 PX BLOCK ITERATOR Cost: 1,566 Bytes: 339,115,491 Cardinality: 30,828,681
1 INDEX FAST FULL SCAN INDEX (UNIQUE) CLM_PK Cost: 1,566 Bytes: 339,115,491 Cardinality: 30,828,681
12 PX RECEIVE Cost: 116 Bytes: 8,310 Cardinality: 277
11 PX SEND HASH SYS.:TQ10001 Cost: 116 Bytes: 8,310 Cardinality: 277
10 TABLE ACCESS BY LOCAL INDEX ROWID TABLE CHARGE_HISTORY Cost: 116 Bytes: 8,310 Cardinality: 277 Partitions accessed #1
9 NESTED LOOPS Cost: 16,264,057 Bytes: 20,088,168,696 Cardinality: 558,004,686
6 PX BLOCK ITERATOR
5 INDEX FAST FULL SCAN INDEX (UNIQUE) BAN_PK Cost: 54 Bytes: 12,089,274 Cardinality: 2,014,879
8 PARTITION RANGE ITERATOR Cost: 4 Cardinality: 277 Partition #: 16
7 INDEX RANGE SCAN INDEX (UNIQUE) CHARGE_HISTORY_PK Cost: 4 Cardinality: 277 Partition #: 16

From the cost of the query it looks better for parallel(big), but the real execution time is less for parallel(small). What am I missing here? Please advise.
Tom Kyte
December 12, 2008 - 6:17 pm UTC

... so for larger data it is always good to run the query on remote db and then export import, correct ?
....

export? import? That would be the slowest approach known to me. Not a chance.

information at process level

Matt, December 15, 2008 - 2:16 pm UTC

So, what do you suggest for larger data? I always learn a better way of doing things from your notes. Please advise.

Thanks,
Tom Kyte
December 29, 2008 - 10:47 am UTC

I don't know what you mean.

I would just

insert into t select ....;


to do a one time copy of data from A to B.
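For a one-time copy between two databases, that single INSERT can be done directly over a database link. A minimal sketch, assuming a link named remote_db and a table T on both sides (both names hypothetical):

```sql
-- Pull the data straight from the remote database in one statement;
-- /*+ append */ requests a direct path insert on the local side.
insert /*+ append */ into t
select * from t@remote_db;

commit;
```

This avoids dumping the data to a file and re-loading it, which is what makes export/import the slow path.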

parallel insert

Badri, January 15, 2009 - 8:35 am UTC

Hi Tom

A simple 'Thanks a lot' is not sufficient for the kind of guidance you offer to the world of Oracle Programmers. Coming to the point,

When we are loading/inserting rows into a table, if we opt to use the parallel option, then I have read that for:

(i) SQL*Loader: multiple sessions, depending on the DOP, are created and the load is performed into the table simultaneously.
(ii) Inserts with a parallel hint: different transactions are created in the same session and the records are inserted simultaneously (please correct me if I am wrong on this point).

My doubt is how the locking issue is handled by Oracle in these 2 scenarios. Definitely Oracle has a solution to overcome the locking conflict in these 2 cases; I am just eager to know how it happens.

Will really appreciate if you can please clear this.

Thanks in advance
Badri
Tom Kyte
January 16, 2009 - 6:10 pm UTC

basically, all parallel operations like that are done in separate threads or processes - regardless of whether sqlldr did it or you used a parallel direct path load.

You have a session that coordinates the others typically - you would see separate sessions show up in v$session for all the others.

Parallel inserts are done using direct path inserts (no buffer cache, just write to disk) and are done above the high water mark. Each of the parallel processes allocates and writes to its own extent(s), so they are not writing to the same areas of the table. As they write this data, they maintain mini-indexes on the data they are loading - when they are done, they merge their indexes into the "larger overall index". Now, since Oracle is used to doing things multi-user, all of this happens "concurrently".

What locking conflict were you envisioning being an issue?
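While such a parallel operation is running, the coordinator/slave relationship described above can be observed from another session. A sketch, assuming you have access to the V$ views:

```sql
-- One row per active parallel execution server; QCSID identifies
-- the query coordinator session that spawned it.
select qcsid, sid, serial#, server_set, degree, req_degree
  from v$px_session
 order by qcsid, server_set, sid;
```

QCSID groups the slaves under their coordinator; comparing DEGREE with REQ_DEGREE shows whether the requested degree was actually granted.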

parallel insert

Badri, January 19, 2009 - 8:37 am UTC

Hi Tom,

Once again 'THANKS A LOT' for such an informative reply. The idea was to get the concept clear on what exactly happens in the background when we do inserts using sqlldr parallel and parallel direct path hints. Based on your reply I was testing a scenario as shown below.
(Please tolerate if I am testing your patience, for the sake of me knowing oracle better :). )

I opened 2 sessions from my system.

Version: Oracle 9iR2.

Session 1:
==========
create table a(x int);
(Table Created.)
insert /*+ append parallel(a 4) */ into a select /*+ parallel(b 4) */ level from dual b connect by level <1000;
(999 rows created.)

With this status of Session 1 if I open Session 2 and do the below insert for 9999 rows,

Session 2:
==========
insert /*+ append parallel(a 4) */ into a select /*+ parallel(b 4) */ level from dual b connect by level <10000;
(session hangs)

after issuing the commit in Session 1, we can see the message 9999 rows created in Session 2.

Please note that in the insert statement I have mentioned the append(for insert above high water mark) and parallel(for 4 simultaneous sessions to do the job) hints in both Sessions.

1. Why does Session 2 hang when Session 1 has not committed, if Session 2 is going to insert records in parallel into its own extents (also above the high water mark)?
2. Will parallel load really work when we insert data using the direct load method?


I am absolutely sure that I am well on my way to get the concept clear and know something new in this region.

Thanks in advance
Badri
Tom Kyte
January 19, 2009 - 9:29 pm UTC

1) because a direct path load requires a table lock. If you direct path into a table, only one person at a time can do it.

2) ummm, yes, you just demonstrated it. A single user can use parallel direct path loading.
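The exclusive table lock behind answer #1 can be seen directly in V$LOCK. A sketch, assuming the test table is named A as in the example above:

```sql
-- Session 1: direct path insert, left uncommitted
insert /*+ append */ into a
select level from dual connect by level < 1000;

-- Any other session: look for the TM (table) lock.
-- LMODE = 6 is an exclusive lock; a conventional insert takes mode 3.
select sid, type, id1, lmode, request
  from v$lock
 where type = 'TM'
   and id1 = (select object_id
                from user_objects
               where object_name = 'A');
```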

parallel insert

Badri, January 20, 2009 - 5:35 am UTC

Thanks Tom for replying.

#2 is understood and fine.

#1 Breaking your reply into 2 parts for understanding sake,

a)because a direct path load requires a table lock.

As per my understanding, any type of insert or DML statement requires a lock on the table. Why did you mention this point specifically for the direct path load?

b)If you direct path into a table, only one person at a time can do it.

Well, this reply is for a person doing the load manually. If I am correct, what happens when the server opens multiple sessions, depending on the DOP of the table, to do the direct path load in parallel? Will there be any difference in the parallel load process when it is done manually versus automatically by the server?

As per your previous reply can I say,

When a parallel direct path load happens, the DB server opens multiple parallel sessions depending on the DOP, and these sessions are coordinated by the one session that opened them. Each session writes to its own extents above the high water mark (from this I understand that preformatted data blocks are created in fresh extents), their work is merged by the coordinating session, and the preformatted data blocks are moved into the table's storage in the physical data file.

My doubt is that if the sessions are using their own fresh extents, then they need not be writing data to the same areas of the table. If that is the case, how does Session 2 hang in my example above?

Got a bit confused, requesting you to throw some more light on this.

Thanks in advance
Badri
Tom Kyte
January 20, 2009 - 7:23 am UTC

a direct path load can be done by one and only one person/thing at a time. It gets an exclusive lock on the table (segment really). If I do a direct path load into a table (segment), YOU must wait until I am done. I can do it in parallel (I can use 100 sessions if I want to) to load the table, but you cannot touch that table (segment) (modify it) until I commit. I own that table (segment) when I direct path load into.


It is the way it works. One and only one person can direct path load a segment.


If you use partitioning, I can direct path load partition 1 whilst you direct path load partition 2.
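That partition-level concurrency can be sketched as follows (table, partition, and staging names are hypothetical):

```sql
-- Session 1: direct path load into partition P2008
insert /*+ append */ into sales partition (p2008)
select * from staging where yr = 2008;

-- Session 2, at the same time: direct path load into partition P2009.
-- The exclusive lock is taken per segment (here, per partition), so
-- neither session blocks the other.
insert /*+ append */ into sales partition (p2009)
select * from staging where yr = 2009;
```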

parallel insert

Badri, January 20, 2009 - 12:00 pm UTC

Thanks a lot once again for your patience in explaining this yet again upon my request.

Badri

Parallel parameters / Degree of tables in OLTP database

Avi, January 22, 2009 - 1:15 am UTC

Dear Tom,

In one of our telecom billing databases I see parallel query related wait events quite often. This is an OLTP environment with at least 250+ users accessing the database at a time.
Earlier, I often found that setting NOPARALLEL on tables/indexes improved overall DB performance (unless a degree was specified in the select itself for a full table scan), but in those databases PARALLEL_ADAPTIVE_MULTI_USER and PARALLEL_AUTOMATIC_TUNING were set to false.
Now in this database I'm not sure whether degree 2 or 4 will help, so I'm planning to change it to NOPARALLEL, but in this thread you said to set PARALLEL_ADAPTIVE_MULTI_USER and PARALLEL_AUTOMATIC_TUNING to true and the degree of tables/indexes to PARALLEL, i.e., the default.
Please suggest what values I should keep for these parameters and for the degree of tables/indexes in this scenario, as we don't have a test setup for this DB.


Server  : HP-UX 11.11
DB Version : 8.1.7.4
Memory  : 16GB
CPU  : 8

SQL>select degree,count(*) from dba_tables group by degree;

DEGREE       COUNT(*)
---------- ----------
         1       4871
         2          2
         4         15
   DEFAULT          2

SQL> select event,count(*) from v$session_wait group by event order by 2;

EVENT                                     COUNT(*)
--------------------------------------- ----------
SQL*Net message to client                        1
direct path read                                 1
direct path write                                1
smon timer                                       1
pmon timer                                       1
db file scattered read                           3
latch free                                       4
PX Deq Credit: send blkd                         4
db file sequential read                          5
PX Deq: Execution Msg                            6
rdbms ipc message                                8
SQL*Net message from client                    358

SQL> sho parameter parallel

NAME                                 TYPE    VALUE
------------------------------------ ------- ------------------------------
fast_start_parallel_rollback         string  LOW
optimizer_percent_parallel           integer 0
parallel_adaptive_multi_user         boolean TRUE
parallel_automatic_tuning            boolean TRUE
parallel_broadcast_enabled           boolean FALSE
parallel_execution_message_size      integer 4096
parallel_instance_group              string
parallel_max_servers                 integer 4
parallel_min_percent                 integer 0
parallel_min_servers                 integer 2
parallel_server                      boolean FALSE
parallel_server_instances            integer 1
parallel_threads_per_cpu             integer 2
recovery_parallelism                 integer 0

SQL>select sum(bytes)/1024/1024 from dba_segments;

SUM(BYTES)/1024/1024
--------------------
          720099.023


SQL>select * from v$license;

SESSIONS_MAX SESSIONS_WARNING SESSIONS_CURRENT SESSIONS_HIGHWATER  USERS_MAX
------------ ---------------- ---------------- ------------------ ----------
           0                0              369                402          0


Tom Kyte
January 22, 2009 - 8:49 am UTC

Parallel query in OLTP does not make sense at all. Parallel query works well when a single user wants to monopolize the entire machine. Do you actually want parallel query enabled at all????

Typo

Avi, January 22, 2009 - 1:27 am UTC

Dear Tom,

By mistake I pasted the init parameter values of another DB in my earlier post. Please ignore that; the original values are as follows:

NAME TYPE VALUE
------------------------------------ ------- ------------------------------
fast_start_parallel_rollback string LOW
optimizer_percent_parallel integer 0
parallel_adaptive_multi_user boolean TRUE
parallel_automatic_tuning boolean TRUE
parallel_broadcast_enabled boolean FALSE
parallel_execution_message_size integer 8192
parallel_instance_group string
parallel_max_servers integer 16
parallel_min_percent integer 0
parallel_min_servers integer 2
parallel_server boolean FALSE
parallel_server_instances integer 1
parallel_threads_per_cpu integer 2
recovery_parallelism integer 0

Thanks a lot.

Avi, January 23, 2009 - 5:40 am UTC

Dear Tom,
Many thanks for the reply. Actually I'm also in favour of setting the degree to 1, i.e., NOPARALLEL. I write the parallel degree in the SQL itself if needed. But here at my client, people insist on having degree > 1 on tables.
Now I'll ask them to disable the degree of parallelism on tables/indexes, as this has the stamp of Tom too. :)
Thanks Again.
Tom Kyte
January 23, 2009 - 9:03 am UTC

please don't use "the stamp of tom"

I get really nervous when people do things like that...

Thanks

Avi, February 02, 2009 - 7:15 am UTC

Dear Tom,
You also get nervous????? Sounds strange to me. Anyway, I'll take your word for it.
Thanks a ton.
Tom Kyte
February 02, 2009 - 11:13 am UTC

Why does it sound strange.

I get nervous because people many times hear what they want to hear - and then walk around saying "but so and so says X, X is therefore good".

And so and so didn't actually ever say X
Or so and so said X, but in a very specific context

You never want to say "so and so said X, so therefore X is true, we don't need to think about it"


for any so and so.

SQL history of parallel degree

Jeff, February 03, 2009 - 12:16 am UTC

As a DBA, I want to understand the impact of parallel executions on database performance.

1. How can I get the parallel degrees that were used by SQL statements during recent days (such as past two days)?

2. I queried V$SQL, but got big numbers in the PX_SERVERS_EXECUTIONS column for some queries in our 10.2.0.3 database. Why?

PARSING_SCHEMA_NAME PX_SERVERS_EXECUTIONS
SYS 3813
SYS 814
DBSNMP 739
DBSNMP 739
DBSNMP 712
SYS 283
SYSMAN 214
SYS 86
DBSNMP 74
BI 61
BI 61

Thanks a lot.
Tom Kyte
February 03, 2009 - 10:15 am UTC

PX_SERVERS_EXECUTIONS:
Total number of executions performed by Parallel eXecution Servers. The value is 0 when the statement has never been executed in parallel.


As for #2, why? Because there were some queries executed in parallel - look at the queries themselves to determine "why" they might have been executed.


As for #1, unless the parallel degree is set on a segment or the hint was used with an explicit degree (query text would have the degree in it), the degree will fluctuate and we really don't track that over time (and a single query can have more than "one" degree associated with it - multiple steps...)
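If AWR is licensed and in use, the active session history gives an approximate after-the-fact view: count the distinct slave sessions sampled under each query coordinator. A sketch (approximate by nature, since ASH is sampled data, and requires the Diagnostics Pack):

```sql
-- Rough indication of the degree of parallelism actually used per SQL
-- statement over the last two days: distinct PX slave sessions sampled
-- for each coordinator session.
select sql_id,
       qc_session_id,
       count(distinct session_id) as px_sessions_sampled
  from dba_hist_active_sess_history
 where qc_session_id is not null
   and sample_time > systimestamp - interval '2' day
 group by sql_id, qc_session_id
 order by px_sessions_sampled desc;
```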


Need help in understanding the PARALLEL hint

Manjunath, April 07, 2009 - 7:41 am UTC

Hi Tom,

I had a few doubts about using PARALLEL operations through hints. My application is a data warehouse and we use PARALLEL hints extensively in the materialized view definitions. As the MVs are all refreshed as the last activity after the daily load, it is safe to assume there is no user activity at the time PARALLEL is at play.

Oracle Version: 10.2.0.4
cpu_count = 16
RAM = 40 Gigs
SGA_TARGET = 16 Gigs
PGA_AGGREGATE_TARGET = 1 Gig

The related parameters are sized as: (all at their defaults)

fast_start_parallel_rollback         string      LOW
parallel_adaptive_multi_user         boolean     TRUE
parallel_automatic_tuning            boolean     FALSE
parallel_execution_message_size      integer     2152
parallel_instance_group              string
parallel_max_servers                 integer     320
parallel_min_percent                 integer     0
parallel_min_servers                 integer     0
parallel_server                      boolean     FALSE
parallel_server_instances            integer     1
parallel_threads_per_cpu             integer     2
recovery_parallelism                 integer     0


Typically, the way we write our hint is as below (this is from an existing query in a materialized view DDL):

SELECT
/*+ PARALLEL (A 4) PARALLEL (B 4)  PARALLEL (C 4)  PARALLEL (E 4)  PARALLEL (F 4)  PARALLEL (H 4)  PARALLEL (I 4)  PARALLEL (J 4)*/
DISTINCT
--will contain about 20 fields from all FROM tables
FROM
EDW.FACT1 A,
EDW.DMV1 B,
EDW.DIM1 C,
EDW.DMV2 E,
EDW.FACT2 F,
EDW.DIM2 H,
EDW.DIM3 I,
(SELECT
/*+ PARALLEL (X 4) */
SUM (NVL(PROJ_STOCK_QTY,0)) PROJECT_STOCK_QTY,
SUM (NVL(NON_PROJ_STOCK_QTY,0)) NON_PROJECT_STOCK_QTY,
SUM (NVL(RECEIVING_OUT,0)) RECEIVING_OUT_QTY,
SUM (NVL(INSP_STOCK_QTY,0)) INSP_STOCK_QTY,
SUM (NVL(RESTRICTED_STOCK_QTY,0)) RESTRICTED_STOCK_QTY,
SUM (NVL(BLOCKED_STOCK_QTY,0)) BLOCKED_STOCK_QTY,
SUM (NVL(RETURNS_STOCK_QTY,0)) RETURNS_STOCK_QTY,
MATERIAL_NO MATERIAL_NO
FROM EDW.DIM4 X
WHERE 
X.CURRENT_FLG = 1
GROUP BY 
MATERIAL_NO ) J
WHERE 
--Join and Filter clauses here
;


My questions are:

1. Should we blindly include all tables and the result set (PARALLEL (J 4)) in the hint? Some of those tables are quite small by warehouse standards (fewer than 100K rows). The explain plan indicates that not all of the hints are being used.

2. If I have to selectively choose tables for the PARALLEL hint, should I just go for the largest of the tables that are being full table scanned? Any rule of thumb in this regard?

3. The job to refresh the materialized views will be running in serial mode (one after the other), at least for now. I am already using atomic_refresh=false. My parallel_max_servers is 320. So, is it safe to assume I can increase the degree to 32, provided I keep the number of tables to 4-5? I would definitely do detailed tests before increasing beyond the current 4, but my question is a theoretical one. Is that allocation correct?

4. My PGA_AGGREGATE_TARGET is currently too small (at 1G). I have to recommend increasing it to a higher number. I have more than 1% multi-pass executions even for operations that require only 512K. Will this reason alone be sufficient for others to understand the ramifications, or is there anything else that will help them?

5. Can you please help me in understanding the below output?

SQL> SELECT * FROM V$PX_PROCESS_SYSSTAT WHERE STATISTIC LIKE 'Buffers%';

STATISTIC                           VALUE
------------------------------ ----------
Buffers Allocated                  427895
Buffers Freed                      427535
Buffers Current                       360
Buffers HWM                          1157



Thanks and regards
Manjunath

Tom Kyte
April 13, 2009 - 12:13 pm UTC

rather than say "parallel N", just say "parallel please",

http://docs.oracle.com/docs/cd/B19306_01/server.102/b14223/usingpe.htm#sthref2226

use the default degree of parallelism (and use atomic refresh => FALSE on your dbms_mview.refresh)

the "rule of thumb" would be to let the optimizer decide what it wants to do and what degree of parallelism it wants to use based on statistics.
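A minimal sketch of that approach (the table and MV names below are hypothetical):

```sql
-- PARALLEL with no number: let the optimizer pick the degree.
alter table edw.fact1 parallel;

-- Non-atomic refresh: truncate + direct path insert instead of
-- delete + conventional insert.
begin
   dbms_mview.refresh(list           => 'MY_MV',
                      method         => 'C',
                      atomic_refresh => false);
end;
/
```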


as for ... Will this reason alone be sufficient for the
others to understand the ramifications or is there anything else that will help
them? ....

ummm, you have 40gb of RAM; using memory in a data warehouse is sort of something you want to do. Why do they not innately understand "we should use the memory on the machine"? 1gb of pga for a warehouse is teeny tiny.


If you are sorting to disk frequently, especially multi-pass operations, you really want to utilize memory a bit more. That is obvious.



5) http://docs.oracle.com/docs/cd/B19306_01/server.102/b14237/dynviews_2034.htm#REFRN30191

you use message buffers to pass data between server processes. You currently have 360 of them in use in your system - at one point you had 1,157 of them in use concurrently. You've allocated 427,895 of them over time and have freed them 427,535 times (which, if you add 360, is the number of allocated buffers....)

Parallel function not 'piping'

Mark Rynbeek, May 14, 2009 - 11:19 am UTC

Hi Tom

I'd like to get back to the parallel_pipelined function example. I've seen odd behaviour with a function that I made, where the results did not pipe back one by one but came back in one go.

Here's an example (based on yours) of how to reproduce it:

create table t0
as
select rownum r from all_objects where rownum < 15
/

create table t1
as
select rownum id, object_name text
from all_objects,t0;

analyze table t1 compute statistics;


create or replace
function parallel_pipelined( l_cursor in sys_refcursor )
return t2_tab_type
pipelined
parallel_enable ( partition l_cursor by any )
is
l_session_id number;
l_rec t1%rowtype;
begin
select sid into l_session_id
from v$mystat
where rownum =1;
loop
fetch l_cursor into l_rec;
exit when l_cursor%notfound;
if mod(l_rec.id,100000)=0 then
pipe row(t2_type(l_rec.id,to_char(sysdate,'hh24:mi:ss'),l_session_id));
end if;
end loop;
close l_cursor;
return;
end;
/

set arraysize 1


select *
from table
(parallel_pipelined
(CURSOR (select /*+ parallel(t1) */ * from t1))
)
/

ID TEXT SESSION_ID
---------- ------------------------------ ----------
300000 15:41:06 130
100000 15:41:03 138
400000 15:41:06 138
200000 15:41:05 140
500000 15:41:07 170
600000 15:41:08 170
700000 15:41:08 170

So the difference with yours is that the PIPE ROW is conditional..
Could you tell me why the pipe is not 'dripping back' the results? I get them all in one go after 6 seconds. If you remove the if-clause with the mod, the piping works (you get all rows back, of course).

Many thanks
Mark


Tom Kyte
May 14, 2009 - 12:07 pm UTC

how can they "drip back" if they all have to fetch over a ton of data - they spend a considerable amount of time looping over data - and then return it. So they all happen to have to iterate over a lot of stuff first (you wait) and then they return.


ops$tkyte%ORA11GR1> create or replace
  2      function parallel_pipelined( l_cursor in sys_refcursor )
  3      return t2_tab_type
  4      pipelined
  5      parallel_enable ( partition l_cursor by any )
  6  is
  7      l_session_id number;
  8      l_rec        t1%rowtype;
  9     begin
 10         select sid into l_session_id
 11           from v$mystat
 12          where rownum =1;
 13         loop
 14             fetch l_cursor into l_rec;
 15             exit when l_cursor%notfound;
 16             if mod(l_rec.id,100000)=0 then
 17             pipe row(t2_type(l_rec.id,to_char(sysdate,'hh24:mi:ss') || ' looped ' || l_cursor%rowcount || ' times',l_session_id));
 18             end if;
 19         end loop;
 20         close l_cursor;
 21         return;
 22    end;
 23  /

Function created.

ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> set arraysize 1
ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> select *
  2  from table
  3     (parallel_pipelined
  4       (CURSOR (select /*+ parallel(t1) */ * from t1))
  5      )
  6  /

        ID
----------
TEXT
-------------------------------------------------------------------------------
SESSION_ID
----------
    700000
11:46:04 looped 84563 times
       143

    800000
11:46:04 looped 99815 times
       147

    900000
11:46:06 looped 120694 times
       135

    400000
11:46:01 looped 46274 times
       170

    200000
11:46:00 looped 25984 times
       148

    600000
11:46:03 looped 80665 times
       148

    500000
11:46:02 looped 63968 times
       139

    100000
11:45:59 looped 10389 times
       138

    300000
11:46:01 looped 36643 times
       138


9 rows selected.


They each do a lot of work (you wait)
Then they each return a single row.

There is nothing to "drip" in your logic there.

return more data from the procedures and you'll see some "dripping" as you call it.

More pipe

Mark Rynbeek, May 14, 2009 - 12:43 pm UTC

Thanks Tom, I see your point.

But the timestamp of the actual PIPE ROW is in the result ( using to_char(sysdate,'hh24:mi:ss') )
So it turns out that there were seconds between the piping.. enough to have results sent one by one.

Like in your output session 148 finds results with 3 seconds in between:

200000
11:46:00 looped 25984 times
148

600000
11:46:03 looped 80665 times
148

So I would expect to see that first result earlier than the second.

cheers!
Mark
Tom Kyte
May 14, 2009 - 12:54 pm UTC

there is JUST ONE ROW per execution server - what could possibly drip? The overhead of parallel coordination is affecting you; the message buffering between the execution servers and the query coordinator is affecting you. The rows will not go one by one from the execution servers to the coordinator - they fill a message buffer.

return *more* data and you'll see "dripping", right now you have parallel query message buffering in between the execution servers and the query coordinator.

Mark, May 15, 2009 - 5:12 am UTC


Ok, that clears things up, thanks!

Just one more thing - would there be any way (a HINT?) to influence this query message buffering in between the execution servers and the query coordinator? Some setting to make buffer size =1 ?
Tom Kyte
May 15, 2009 - 1:45 pm UTC

not that I am aware of.

Pipelined function not being parallelized

Ike Wiggins, May 21, 2009 - 2:03 pm UTC

Okay, we thought we had a brilliant idea of using unions to call pipelined functions so our code executes in parallel....

The problem is that each function is executed sequentially :-( We wanted the functions themselves to be executed concurrently. Even if we were to pass a cursor to the functions and use the parallel_enable (partition by any) option, I don't think it would make a difference because they would still be processed serially; does that make sense? Do you know of a workaround for this?

Sample query for reference.
INSERT INTO xxpa_pfp_temp_lines xpt
-- Find Current Rev Budget
(SELECT *
FROM TABLE(get_rev_budget(p_as_of_date, l_project_info.project_id)))
UNION ALL
(SELECT *
FROM TABLE(get_orig_rev_budget(p_as_of_date, l_project_info.project_id)))
UNION ALL
-- Find Current Cost Budget
(SELECT *
FROM TABLE(get_cost_budget(p_as_of_date, l_project_info.project_id)))
UNION ALL
(SELECT *
FROM TABLE(get_orig_cost_budget(p_as_of_date, l_project_info.project_id)))
UNION ALL
-- Find Current Actual Units and Qty reported (from Payroll/Timecard)
(SELECT *
FROM TABLE(get_hr_units(p_as_of_date, l_project_info.project_id)))
UNION ALL
-- Find Current Actual Expenditures/Cost
(SELECT *
FROM TABLE(get_actual_expenditures(p_as_of_date, l_project_info.project_id)))
UNION ALL
-- Find Committed cost
(SELECT *
FROM TABLE(get_committed_cost_amt(p_as_of_date, l_project_info.project_id)))
UNION ALL
-- Find PCIM Known Risks (KR)
(SELECT *
FROM TABLE(get_pcim_data(p_as_of_date, l_project_info.project_id, l_pfp_header_id)));

Result execution:
12:29:48: Cached 23 Expenditure Type IDs..
12:29:48: ====Finished Initialization=====
12:29:48: Generating for PFP Year Month: 200904
12:29:48: Starting Rev Budget..
12:29:49: Rev Budget completed for Project..
12:29:49: Starting Original Rev Budget..
12:29:49: Original Rev Budget completed for Project..
12:29:49: Starting Cost Budget..
12:29:49: Cost Budget completed for Project..
12:29:49: Starting Original Cost Budget..
12:29:49: Original Cost Budget completed for Project..
12:29:49: Starting HR Units..
12:29:54: HR Units completed for Project..
12:29:54: Starting Actual Expenditures..
12:31:02: Expenditures completed for Project..
12:31:02: Starting Commitments..
12:31:24: Commitments completed for Project..
12:31:24: Starting PCIM Known Risks..
12:31:27: PCIM Known Risks completed for Project..
12:31:27: ====Finished Get Data=====

Thanks,

Ike Wiggins
http://bipublisher.blogspot.com


Tom Kyte
May 23, 2009 - 1:22 pm UTC

you don't have any query to parallelize here (a parallel pipelined function is a function - a single function - that we execute in parallel)

can you write a single function that takes a cursor as input that could be parallelized?

else, schedule N jobs - each that does their insert - and using job chaining (dbms_scheduler) you could have a finish job run

eg:

your code shedules the N insert routines and the last chained job
the N inserts run
the last chained job runs


INSTEAD OF

your code does an insert into as above
runs more code (this code would move into the last chained job)
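The chained-job idea can be sketched with DBMS_SCHEDULER. All chain and program names below are hypothetical; each PROG_* would be a stored-procedure program created beforehand with DBMS_SCHEDULER.CREATE_PROGRAM:

```sql
begin
   dbms_scheduler.create_chain(chain_name => 'budget_load_chain');

   -- one step per insert routine, plus a final step
   dbms_scheduler.define_chain_step('budget_load_chain', 'rev',    'PROG_REV_BUDGET');
   dbms_scheduler.define_chain_step('budget_load_chain', 'cost',   'PROG_COST_BUDGET');
   dbms_scheduler.define_chain_step('budget_load_chain', 'finish', 'PROG_FINISH');

   -- the insert steps start together; finish runs only after both complete
   dbms_scheduler.define_chain_rule('budget_load_chain', 'TRUE',
                                    'START rev, cost');
   dbms_scheduler.define_chain_rule('budget_load_chain',
                                    'rev completed and cost completed',
                                    'START finish');
   dbms_scheduler.define_chain_rule('budget_load_chain',
                                    'finish completed', 'END');

   dbms_scheduler.enable('budget_load_chain');
end;
/
```

The inserts run concurrently as independent jobs, and the finish step acts as the rendezvous point.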

Pipelined function not being parallelized

Ike Wiggins, May 27, 2009 - 7:42 pm UTC

Followup question: Would I get parallelism if I create a function that:
1. is pipelined and parallel enabled
2. partitions the rows by any
3. takes in the unioned pipelined functions as a cursor


Example:

SELECT * FROM
TABLE(SUPER_FUNCTION(CURSOR(
SELECT * FROM TABLE(get_rev_budget(...)
UNION ALL
SELECT *
FROM TABLE(get_orig_rev_budget(..))
)))

Ike Wiggins
http://bipublisher.blogspot.com
Tom Kyte
May 28, 2009 - 7:12 am UTC

The individual get functions are not parallelizable -

select * from table(f()) union all select * from table(g())

In general - f and g "share" a common session, a common session state - application context values, global variables in packages - they cannot just be executed in separate parallel query slaves as that would not be true anymore.

I gave you my idea, use dbms_jobs, have a rendezvous at the end of the process to gather up the outputs.

Parallelism

JAIME, June 29, 2009 - 8:08 pm UTC

Hi Tom,

We had Oracle tune our data warehouse a long time ago.

When I came to my company I read the tuning report that was done.
The whole report was based on 4 queries run with different parallel degrees; he showed how a single query improved with parallelism as he added more data to the table.
As he added data to the table, the queries improved as parallelism was increased.

So finally he recommended, for tables:
Table size DEGREE
XMB -> degree N
XMB+YMB -> degree N +Z ............

I cannot understand his decisions: from trying 4 queries, more or less, he made decisions for the whole database and how the degree should be set for each table, without even considering whether the table was partitioned, or how many sessions could be on the database at some point in time with the new degrees on the tables.

By the way, he recommended a parallel degree of 16 for the big tables.
Maybe I am mistaken, but if we had cpu_count = 24 (when the report was done), possibly having 33 sessions for one query (1 coordinator + 16 + 16) with 24 CPUs is not a good idea.

1) Could you please say whether he should have taken more care with the decisions about parallel degree on the database tables.

2) Was the decision of degree 16 on a 24-CPU machine a good idea?

Tom Kyte
July 06, 2009 - 6:38 pm UTC

you would really have to ask the person that made the recommendation "why". It does not compute to me at all.


the total degree of parallelism can easily be higher than the number of cpus - you do things OTHER than cpu work. If you want the cpus to be driven to capacity, you will often have a higher number of worker processes than cpus.

DIY parallelism and commit

Andrey Boychuk, March 06, 2010 - 6:37 am UTC

Dear Tom,
Thank you for your recent presentations in Moscow. They were interesting and very helpful.

We use your DIY parallelism widely. It works fine when we can split our task into many nearly independent pieces of work; in case of a crash we simply run the process again and it processes the remaining pieces.

But sometimes these pieces are not logically independent, e.g., in order to switch the business date we have to process several sets of tables, and logically it should look like one transaction.

At present we solve the problem using locks to manage the parallel processes: the jobs finish their data processing and then wait on a lock. The main process analyzes the results of each job and signals them whether to commit or roll back. But it looks complex, and we still have the possibility of a crash between the release of the lock and the actual commit. In that case we would run the process again, but for some time our database would be in a logically inconsistent state.

Is there any solution that allows us to use something like two phase commit inside database?

P.S. I hope that I've answered your "why do we need this transactional behaviour" question :-)


Tom Kyte
March 09, 2010 - 10:38 am UTC

would job chains (available in dbms_scheduler) be of use to you? It sounds like it.

DIY parallelism and commi

Andrey, March 11, 2010 - 2:24 am UTC

DBMS_SCHEDULER will help us implement the business logic in a simpler way. We will even be able to simplify our recovery procedure.
But it solves only 90+% of our task.
The great thing would be not to write this recovery procedure at all and to commit or roll back the results of the parallel tasks simultaneously.
Tom Kyte
March 11, 2010 - 8:30 am UTC

that is called parallel dml and has existed for a long long time.
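A minimal parallel DML sketch (table names T and S are hypothetical); note that it must be explicitly enabled per session:

```sql
-- Parallel DML is disabled by default and must be enabled per session.
alter session enable parallel dml;

insert /*+ parallel(t) */ into t
select /*+ parallel(s) */ * from s;

-- After a parallel (direct path) insert, the session must commit or
-- roll back before it can even query table T again.
commit;
```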

DIY parallelism and commit

Andrey, March 12, 2010 - 12:31 pm UTC

Probably my English is unclear.

What we need:

Task 1: Move some part of the data from table 1 and table 2 to history tables then clear tables.
Task 2: Move the contents of table 3 and table 4 to history tables with some complex transformation then update table 3 and 4

Task N: process table X

We cannot use parallel DML because of its restrictions (these tables have triggers and self-referential foreign keys). Moreover, sometimes we have to implement our logic in PL/SQL because of its complexity.

At present we run these tasks in parallel using jobs. When each task finishes, it waits for the other tasks to finish and then for a command to commit or roll back. So we have implemented something like a two-phase commit. It would be great to use some built-in solution.

Thanks a lot.

Tom Kyte
March 12, 2010 - 4:03 pm UTC

oh, you mean:

flashback table t1, t2, t3, .... to <sometime ago>;

if you want to 'rollback' after many unrelated transactions?
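
If a point-in-time "undo" of several tables is what is wanted, the usual shape of that (hypothetical table names; note that FLASHBACK TABLE requires row movement) would be something like:

-- row movement must be enabled before a table can be flashed back
alter table t1 enable row movement;
alter table t2 enable row movement;

-- roll several tables back to a common point in time in one statement
flashback table t1, t2 to timestamp systimestamp - interval '15' minute;

-- or, if an SCN was captured before the batch run:
-- flashback table t1, t2 to scn :scn_before_run;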

DIY parallelism and commit

Andrey, March 15, 2010 - 4:10 am UTC

It would work if we were able to get exclusive access to the tables, but in reality other processes can insert new data into them. I wonder if it is possible to use several client sessions, start a distributed transaction across them, run the different tasks in different sessions in parallel (using different threads), and then use an ordinary two-phase commit?

Thanks a lot
Tom Kyte
March 15, 2010 - 11:12 am UTC

if you want to make it really really utterly hard - you would have to program this all in XA - go for it.

but it seems like you are using an atom bomb to crack open a walnut, it'll surely crack the walnut open.

DIY parallelism and commit

Andrey, March 17, 2010 - 11:50 am UTC

Dear Tom,
Thank you very much for your advice. We will probably think about how to handle the rare situation when our database is in an inconsistent state, instead of playing with the «atomic bomb».

Hope to see your new presentations in Kiev or San Francisco this year.

Best regards,
Andrey

Force Parallel DML

Grzegorz, June 09, 2010 - 8:53 am UTC

For VLDB processing I need to ensure that my processing runs in parallel. However, PQ servers may not be available (they may be in use by other jobs). How can I force the SQL (e.g. a CTAS) to run in parallel, or fail, if PQ servers are not available for the session right now?

Thanks in advance,

Grzegorz

PARALLEL DEGREE

Shaji Nair, June 14, 2010 - 12:33 am UTC

Tom,
We have a data extraction process which runs every night for an hour. The process has many CREATE TABLE <tablename> NOLOGGING PARALLEL (DEGREE 4) statements. Earlier, without PARALLEL, the process used to take 4-5 hours. Now the DBAs have started complaining about the load on the file system due to the PARALLEL clause. They want us to remove the PARALLEL clause. Is using PARALLEL not a correct approach?
Tom Kyte
June 22, 2010 - 8:13 am UTC

think about this - you and your DBA team are the only ones that can answer this.


If you use parallel 4 - what happens? You will have the ability to do as much as 4 times the IO per unit of time as before. You will have the ability to use 4 times as much CPU per unit of time as before. You will have the ability to use lots more memory per unit of time than before.


Is that good? Yes, No and Maybe are the only answers.


If the system is already 80% or more loaded when you are processing, doing parallel four might be a really bad idea.

If the system isn't being used at all - you are the only process in town, then using parallel 4 might be awesome - in fact, parallel 32 might be the answer (depends on how much of a given resource - IO, CPU, Memory you have available).


If the DBAs are saying you are putting too much load on the system - and they are not making that up - then it is likely that you are in the first case - the system was already fully loaded and there just isn't enough room for you to do that.


You would need to ask them what "too much load" on the IO system means, and why it is relevant, in order to understand why they are asking you to back off the degree of parallelism.
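
One compromise to raise with the DBA team is to drop the table-level attribute and request parallelism only in the statements that need it; a sketch, with hypothetical object names:

-- remove the persistent attribute so ad-hoc queries run serially
alter table extract_stage noparallel;

-- ask for parallelism only in the nightly extract itself
create table extract_copy nologging parallel 4
as
select /*+ full(s) parallel(s, 4) */ *
  from extract_stage s;

That way the nightly batch keeps its degree 4, but the attribute no longer silently parallelizes every other query touching the table.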

Parallel slaves

goiyala3, June 28, 2010 - 2:40 am UTC

Hi tom,

My parallel params are


parallel_adaptive_multi_user boolean TRUE
parallel_automatic_tuning boolean FALSE
parallel_max_servers integer 35
parallel_threads_per_cpu integer 2

1. I run the following statements.
alter session force parallel query parallel 5;
run <query>;
This query takes 10 slaves.
Is it possible to override my session settings?

2. Is there a parameter at the instance level (other than a logon trigger) to control how many slaves a session/query takes, so that all sessions share an equal number of slaves? Because if the first user's query consumes 15 slaves, the next query will have to consume fewer.


Tom Kyte
July 06, 2010 - 11:19 am UTC

1) override what? You are using parallel 5 - which may use many more than 5 execution servers (a full scan process reads and feeds a sort process that sorts and feeds a query coordinator for example).

Typically - many parallel queries would use a little more than 2 times the degree of parallelism - as many many operations are pipelined like that.

2) if you want it to vary by session, you would need to set it in each session - if I'm following you, that is the answer.

If you want us to automatically adjust it for you - do not specify the degree at all and we'll adjust it up and down as the load goes up and down over time.
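
To see how many execution servers a running statement actually received (versus the requested degree), the PX views can be queried from another session while the query runs; for example:

-- one row per parallel execution server; QCSID identifies the coordinator
select qcsid, server_group, server_set,
       degree, req_degree, count(*) as processes
  from v$px_session
 group by qcsid, server_group, server_set, degree, req_degree
 order by qcsid, server_group, server_set;

Two server sets per coordinator (producers feeding consumers) is what makes "parallel 5" show up as roughly 10 slaves.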

16 CPUs and Parallel Degree 4

Shaji Nair, July 02, 2010 - 2:03 am UTC

Tom,
Thanks for the answer you provided on June 22nd. The production server has 16 CPUs, and the DBAs couldn't come up with any performance bottleneck related to this script. They are of the opinion that, since the table has PARALLEL DEGREE 4, each query involving this table will have wait contention and adversely affect performance. My question to you is: how does a query on a table with PARALLEL DEGREE 4 affect performance?
Tom Kyte
July 06, 2010 - 2:33 pm UTC

... They are of the opinion that ...

the thing about opinions - everyone has them.

Either

a) you want to use parallel query

or

b) you do not.


Will parallel 4 cause contention? If 100 people do parallel 4 - and parallel max servers is set to more than 400 - sure, it could.

But then again, if 100 people do serial 1 - you'll be doing 100 things at the same time and have contention as well.


If you have 100 people doing parallel 4 but not simultaneously - and parallel 4 lets them finish say twice as fast, you'll have more capacity on your system than ever before - since a single user can use a non-bankable resource (cpu/memory/disk - non-bankable means you use it or lose it, you cannot save it up) to greatest efficiency - finish faster and get off of the system faster.



So, the only answer, as always, is "it depends"


As the computer on Star Trek liked to say often "insufficient data"

degree = default

Ajeet, September 08, 2010 - 4:30 am UTC

Hello Tom,

There are many tables in one of my database schemas (Oracle version 10.2.0.4) with DEGREE set to DEFAULT. Does it mean that they have a degree > 1? (The database server has 8 CPUs and 2 threads per CPU.) I was reading in Metalink that DEFAULT = parallel_threads_per_cpu * cpu_count.

explain plan for queries on those tables shows parallel processing...

Is this understanding correct?
And if I see a lot of waits related to parallel execution - is it OK to test with degree = 1?
Tom Kyte
September 09, 2010 - 8:33 pm UTC

default is parallel, yes.

can you test with degree 1 - sure, you can always test, that is always true.
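
The dictionary stores DEGREE as a padded string, so checking which tables carry the DEFAULT attribute looks like:

select table_name, trim(degree) as degree
  from user_tables
 where trim(degree) = 'DEFAULT'
 order by table_name;

-- to test a single table serially:
-- alter table t noparallel;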

Parallel execution - limit for rowid range scan

Ravindra, September 18, 2010 - 2:20 am UTC

Hi Tom,

I have two queries (ITAS), both of them mutually exclusive (they insert into separate tables and select from separate tables). When i run them alone they finish in half the time they take when i run both the scripts in parallel.

Both of these script use 32 parallel threads at a time and parallel_max_server =64 on our Oracle 10gR2 running on HPUX with
8 Intel(R) Itanium 2 9100 series processors
16 logical processors (2 per socket)

When I traced the wait events, I saw that most of the time is spent on PX Deq Credit: send blkd and PX Deq: Table Q Normal.

Also, with continuous monitoring, I see that one of the scripts does only a single rowid range scan whereas the other one does 8 in parallel.

Can you please tell me if there is any upper limit on rowid range scans when we run multiple queries in parallel?


Tom Kyte
September 20, 2010 - 1:41 pm UTC

ITAS means nothing to me, nothing.


... when i run them alone they finish in
half the time they take when i run both the script in parallel.
...

without reading further than that - that would make sense if you have sufficient machine resources for one of them to run unimpeded - but not for two of them to run at the same time. Sort of like what would happen on your laptop if you tried to zip five directories at a time in five separate processes - they would be competing for your single CPU and single disk.

If you have only 16 cpus - parallel 32, at most parallel 48, would be about it.

When you run one of these queries using parallel 32 (which could easily have 64 or more processes going just for it) - do you find that you are already nearly 100% cpu - if so, doing two of them will definitely take you over that limit.


If you need to run both at the same time, probably - back off the parallel degree by half. Else, run them one after the other. In 11g - with parallel statement queueing - we would automagically do the latter - we wouldn't run them both at the same time anymore.
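
In 11gR2 the statement queueing Tom mentions is controlled by parallel_degree_policy; a hedged sketch of enabling and observing it (this is an 11gR2-only feature):

-- AUTO enables automatic DOP, statement queueing and in-memory PX
alter system set parallel_degree_policy = 'AUTO';

-- a statement waiting for slaves shows up as STATUS = 'QUEUED'
-- in v$sql_monitor while it waits for parallel servers to free up

With queueing in place, the second big statement simply waits for the first rather than running serially or oversubscribing the machine.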

Parallel query

goiyala3, October 31, 2010 - 8:02 am UTC

Hi tom

Our DWH system is Oracle 10.2.0.4 on Solaris 5.10, with 32 CPUs and 32 GB memory. The parallel settings are

parallel_adaptive_multi_user boolean TRUE
parallel_automatic_tuning boolean TRUE
parallel_execution_message_size integer 4096
parallel_instance_group string
parallel_max_servers integer 200
parallel_min_percent integer 0
parallel_min_servers integer 0

A join query on a table with 180 partitions took around 128 processes. While it runs, if I start another query (which usually takes 128 processes), it goes into serial mode even though it could take up to 72 processes.
Whichever query comes first takes full advantage of parallelism; the other queries go serial.

My question is

1. Why does the 2nd query go into serial mode instead of using the 72 available PQ slaves?
2. How can I give each query a fair, equal chance at parallelism? (How about parallel_min_percent?)
3. In 10g, all sessions are PQ-enabled by default. A query on a table with 180 partitions and degree 1
sometimes runs parallel and sometimes serial, even though PQ slaves are available.

Thanks in advance.

Tom Kyte
November 01, 2010 - 3:06 pm UTC

well, to be honest, with 32 cpus - if you did more than 128 things at a time - you would be doing things *slower*

why do you think you want to do more than 4x the number of CPUs??

1) be thankful (or get to 11g where parallel query queueing comes into play - a new feature)

2) don't run such massively parallel queries concurrently

3) insufficient data to comment

just one more question

A read, November 01, 2010 - 3:44 pm UTC

Hi Tom,

Thanks for the guidance. Suppose a table has 32 partitions with local bitmap indexes on them, partition pruning takes place during the query, and 5 partitions are selected with DOP = 5.

For the optimizer, the possible access paths are:
full scan of a partition by each slave process, or
bitmap index scan of a partition by each slave process.

I am asking because there is some confusion about whether parallel processing uses only a full table scan or partition scan as the access path.

I am doing some research on the RDBMS architecture of different databases and have not been able to find the answer anywhere.
Tom Kyte
November 01, 2010 - 3:52 pm UTC

parallel processing allows for parallel index scans as well.

http://docs.oracle.com/docs/cd/E11882_01/server.112/e16579/px.htm
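
For example, a parallel fast full scan of an index can be requested explicitly (hypothetical table and index names):

select /*+ index_ffs(t t_pk) parallel_index(t, t_pk, 4) */ count(*)
  from big_table t;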


Index creation on Multiplease session concurrently

SHAILESH, December 02, 2010 - 6:25 am UTC

Hi tom,

We have migrated a database from the HP platform to the Linux platform (Oracle 10g) using export and import. Now, while creating indexes on very large tables in parallel from multiple sessions, INDEX CREATION is getting very slow. We have explored increasing pga_aggregate_target and db_file_multiblock_read_count, and the parallel indexing option along with the nologging option. There is no significant change in the performance.

Parameters are set as follows :
fast_start_parallel_rollback string LOW
parallel_adaptive_multi_user boolean TRUE
parallel_automatic_tuning boolean FALSE
parallel_degree_limit string CPU
parallel_degree_policy string MANUAL
parallel_execution_message_size integer 8192
parallel_force_local boolean FALSE
parallel_instance_group string
parallel_io_cap_enabled boolean FALSE
parallel_max_servers integer 8
parallel_min_percent integer 0
parallel_min_servers integer 0
parallel_min_time_threshold string AUTO
parallel_server boolean TRUE
parallel_server_instances integer 3
parallel_servers_target integer 128
parallel_threads_per_cpu integer 2
recovery_parallelism integer 0


If we run the index creation scripts one by one, each takes much less time than when several indexes are created at a time through multiple sessions.

Could you please let us know if we can do something to increase the performance of INDEX CREATION across multiple sessions at a time?
Tom Kyte
December 07, 2010 - 9:08 am UTC

you tell us nothing about the resources you have (the machine characteristics)

you don't tell us how large is large and how slow is slow (maybe what needs to be fixed is your expectations - not the performance - but we don't know)

you don't tell us how many sessions you are trying to do concurrently

you don't tell us the coordination between these sessions (they are not all hitting the same table are they? that'll cause some degree of contention)


... each takes much less time than when several indexes are created at a time through multiple sessions. ...

that sounds reasonable to me, off hand it sounds like you are SLAMMING THE %$@#%#@! out of the machine and it is swamped.

The perfect number of things to be done on the machine at a single time is 2xcpu (maybe up to 4xcpu).

So, if you have a 4cpu machine, you can have ONE SESSION doing parallel 8 and you'll use the machine perfectly. Or you could have 8 sessions doing a serial operation. But that would be about it.


Given your last two paragraphs, it sounds like you just killed the machine, and your expectations need to be reset (you don't have infinite resources; you likely have a rather modestly sized machine, and you have to respect that fact and use only the resources that are available).
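
A common pattern that follows this advice is to build the indexes one at a time, letting each build use the whole machine, then reset the attributes (hypothetical names):

-- one build at a time, using all available resources for that build
create index t_big_idx on t_big(col1) parallel 8 nologging;

-- reset so ordinary queries are not parallelized and logging resumes
alter index t_big_idx noparallel logging;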

ntile

A reader, January 03, 2011 - 8:24 am UTC

Happy New Year!

In one of your earlier responses related to DIY parallelism, you showed that it is possible to do the following:

select min(rowid), max(rowid), count(*), nt
from ( select rowid, ntile(8) over (order by rowid) nt
from some_table
)
group by nt order by nt;


and kick off 8 parallel jobs to delete from some_table (which is a large one) or do any other processing.

This should help us save time as it does the updates/deletes in parallel. Right?

Tom Kyte
January 03, 2011 - 9:02 am UTC

or, if you really wanted to save time, you would just enable parallel operations and issue a plain "delete" - it would get parallelized by us.

do it yourself parallelism (superseded of course in 11g by the dbms_parallel_execute package!) is not so much about "time" as about breaking a really big job into many smaller (independent) jobs.
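
The 11g package mentioned above automates exactly the rowid-chunking pattern; a minimal sketch (task name, table name and chunk size are illustrative):

declare
  l_task varchar2(30) := 'delete_task';   -- hypothetical task name
begin
  dbms_parallel_execute.create_task( l_task );

  -- carve the table into rowid ranges of ~100,000 rows each
  dbms_parallel_execute.create_chunks_by_rowid
  ( task_name   => l_task,
    table_owner => user,
    table_name  => 'SOME_TABLE',
    by_row      => true,
    chunk_size  => 100000 );

  -- run the statement once per chunk, 8 scheduler jobs at a time
  dbms_parallel_execute.run_task
  ( task_name      => l_task,
    sql_stmt       => 'delete from some_table
                        where rowid between :start_id and :end_id',
    language_flag  => dbms_sql.native,
    parallel_level => 8 );

  dbms_parallel_execute.drop_task( l_task );
end;
/

Each chunk is committed independently, and failed chunks can be retried, which is what makes this preferable to the hand-rolled ntile approach.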

about the parameter 'parallel_automatic_tuning'

Henry, March 06, 2011 - 8:33 pm UTC

first time to be here =]

I'm curious about parallel execution in ORACLE, thus lead me to this link.

Looking into the view v$parameter, I found that the parameter 'PARALLEL_AUTOMATIC_TUNING' is marked isdeprecated='TRUE' in 10g. And I learned it might bring ORA-32004 if 'PARALLEL_AUTOMATIC_TUNING' is turned on across a system reboot.

I'm now wondering:
1. Is it a good choice to use 'PARALLEL_AUTOMATIC_TUNING' in 10g?

2. Will the DEGREE value on TABLES as well as INDEXES help without the 'PARALLEL_AUTOMATIC_TUNING' parameter and the /*+ PARALLEL */ hint?

my regards
Tom Kyte
March 07, 2011 - 12:27 pm UTC

1) well, things change over time. If something is deprecated in a specific release and above, it would not be a good idea to use it, no.

2) the explicit degrees of parallelism would be used.

Query cannot run in parallel

Lizhuohua, April 01, 2011 - 8:41 am UTC

Tom,
  We have a DB that cannot run queries in parallel.
SQL> set autot trace
SQL> select /*+ parallel(T)*/
  2   count(*)
  3    from CUSM T;

Elapsed: 00:03:28.52

Execution Plan
----------------------------------------------------------
Plan hash value: 2872246982

------------------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |     1 |  3660   (0)| 00:00:44 |       |       |        |      |            |
|   1 |  SORT AGGREGATE        |          |     1 |            |          |       |       |        |      |            |
|   2 |   PX COORDINATOR       |          |       |            |          |       |       |        |      |            |
|   3 |    PX SEND QC (RANDOM) | :TQ10000 |     1 |            |          |       |       |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE     |          |     1 |            |          |       |       |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR |          |  8541 |  3660   (0)| 00:00:44 |     1 |    36 |  Q1,00 | PCWC |            |
|   6 |       TABLE ACCESS FULL| CUSM     |  8541 |  3660   (0)| 00:00:44 |     1 |    36 |  Q1,00 | PCWP |            |
------------------------------------------------------------------------------------------------------------------------


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
     959221  consistent gets
     959044  physical reads
          0  redo size
        516  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> set autot off
SQL> select name, value
  2    from v$mystat s, v$statname n
  3   where s.statistic# = n.statistic#
  4     and n.name like '%parallel%';

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
DBWR parallel query checkpoint buffers written                            0
queries parallelized                                                      0
DML statements parallelized                                               0
DDL statements parallelized                                               0
DFO trees parallelized                                                    0

Elapsed: 00:00:00.01

--when the first query is running I issued the following query in another session 
SQL> select status, count(*) from v$pq_slave group by status;

STAT   COUNT(*)
---- ----------
IDLE          1

SQL> show parameter parallel

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
fast_start_parallel_rollback         string      LOW
parallel_adaptive_multi_user         boolean     TRUE
parallel_automatic_tuning            boolean     FALSE
parallel_execution_message_size      integer     2152
parallel_instance_group              string      ORAUTF1
parallel_max_servers                 integer     320
parallel_min_percent                 integer     0
parallel_min_servers                 integer     0
parallel_server                      boolean     TRUE
parallel_server_instances            integer     2
parallel_threads_per_cpu             integer     2
recovery_parallelism                 integer     0


SQL> select * from v$pq_sysstat;

STATISTIC                                                         VALUE
------------------------------------------------------------ ----------
Servers Busy                                                          0
Servers Idle                                                          1
Servers Highwater                                                    38
Server Sessions                                                  531452
Servers Started                                                     793
Servers Shutdown                                                    792
Servers Cleaned Up                                                    0
Queries Initiated                                                114146
DML Initiated                                                         0
DDL Initiated                                                         0
DFO Trees                                                        176207
Sessions Active                                                       0
Local Msgs Sent                                                 2309610
Distr Msgs Sent                                                 1417713
Local Msgs Recv'd                                               3409953
Distr Msgs Recv'd                                               2141506


SQL> !vmstat

System configuration: lcpu=16 mem=24576MB

kthr    memory              page              faults        cpu    
----- ----------- ------------------------ ------------ -----------
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 3  1 4991553 1024360   0   0   0 295 10393   0 387 63739 8561 17  5 77  0

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for IBM/AIX RISC System/6000: Version 10.2.0.4.0 - Productio
NLSRTL Version 10.2.0.4.0 - Production

It is strange that v$pq_sysstat shows
  Servers Highwater                                                    38
  Servers Started                                                     793
I think it means that there have been some parallel queries,
but I cannot run any SQL in parallel.
After setting parallel_min_servers=10, I can find some ora_xx_p0x processes,
but the query still does not run in parallel.
Could you give me any suggestions?
Thanks and best regards

Tom Kyte
April 12, 2011 - 11:24 am UTC

explain plan, as used by autotrace, 'lies' at times.

Try displaying the 'real' plan

ops$tkyte%ORA10GR2> set serveroutput off
ops$tkyte%ORA10GR2> select /*+ parallel(t) */ count(*) from big_table.big_table t;

  COUNT(*)
----------
    100000

ops$tkyte%ORA10GR2> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
SQL_ID  7f35pd0r9w6a7, child number 0
-------------------------------------
select /*+ parallel(t) */ count(*) from big_table.big_table t

Plan hash value: 1764098166

------------------------------------------------------------------------------
| Id  | Operation             | Name         | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |              |       |    80 (100)|          |
|   1 |  SORT AGGREGATE       |              |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| BIG_TABLE_PK |   100K|    80   (3)| 00:00:01 |
------------------------------------------------------------------------------


14 rows selected.



You may well need to be much more specific in your hinting, like telling it to full scan the table explicitly.
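
For instance, forcing the full scan explicitly so the parallel hint can take effect might look like:

select /*+ full(t) parallel(t, 8) */ count(*)
  from big_table.big_table t;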

Parallel Update

INDRANIL, April 13, 2011 - 11:49 am UTC

Hi Tom,
We have a table, TEMP_LOCKBOX_REMITTANCE, of 73,294,780
records... Now we are trying to update it -
############
alter session enable parallel dml;
alter table contract_payment parallel (degree 6);

--------TEMP_LOCKBOX_REMITTANCE
declare
          
     cursor c_cur3 is select rowid
       from TEMP_LOCKBOX_REMITTANCE 
       where  payment_method_type_code='MC';
       --and rownum = 1;

     
 Type cur3_type is table of c_cur3%rowtype;
 

 c3 cur3_type;
 
 begin
 -------
  Open c_cur3;
  LOOP
    Fetch c_cur3 bulk collect into c3 LIMIT 1000000;  
     --MC
    Forall I3 in c3.first .. c3.last
      UPDATE TEMP_LOCKBOX_REMITTANCE
      SET PAYMENT_CONTROL_NUMBER='C217DC8C813A77562F9A2979302CBD32E1924E940939D6461169DCE01F6150A7G425',
      last_modified_by = USER,
      changed_by       = USER,
      last_modified    = sysdate,
      changed_date     = sysdate  
      where rowid      = c3(I3).rowid;
   EXIT when c_cur3%NOTFOUND;
   commit;
  END LOOP; 
 Close c_cur3; 
 --------------------------------------------------------------------
EXCEPTION
  WHEN OTHERS THEN
   Dbms_Output.put_line('Error '||sqlerrm);
end;
/
############
I saw that 
SQL> show parameter parallel

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
fast_start_parallel_rollback         string      LOW
parallel_adaptive_multi_user         boolean     TRUE
parallel_automatic_tuning            boolean     FALSE
parallel_execution_message_size      integer     2152
parallel_instance_group              string
parallel_io_cap_enabled              boolean     FALSE
parallel_max_servers                 integer     200
parallel_min_percent                 integer     0
parallel_min_servers                 integer     5
parallel_server                      boolean     FALSE
parallel_server_instances            integer     1
parallel_threads_per_cpu             integer     2
recovery_parallelism                 integer     0
#########
We are having 16 CPU ... 
As I mentioned, the above update is one of several update statements... we are running 6 sessions, each of them doing a similar update on the same table (on mutually exclusive rows)...
$ jobs
[6] +  Running                 nohup ./rel_6.sh >>a6.log 2>&1 &
[5] -  Running                 nohup ./rel_5.sh >>a5.log 2>&1 &
[4]    Running                 nohup ./rel_4.sh >>a4.log 2>&1 &
[3]    Running                 nohup ./rel_3.sh >>a3.log 2>&1 &
[2]    Running                 nohup ./rel_2.sh >>a2.log 2>&1 &
[1]    Running                 nohup ./rel_1.sh >>a1.log 2>&1 &
 Here rel_1 is calling the 1st update, rel_2 is calling the 2nd update, and so on...
Now, unfortunately, the speed of the update is still very, very slow... even though the INDEX / statistics all look fine... My questions are: 1> Is there any way to speed up the update?
2> Can we set the statistics to boost the speed?
Our target is to speed up the update in any way... Thanks in advance.. Regards INDRANIL, IBM

Tom Kyte
April 13, 2011 - 1:44 pm UTC

EXCEPTION
  WHEN OTHERS THEN
   Dbms_Output.put_line('Error '||sqlerrm);
end;
/


I hate your code

http://asktom.oracle.com/pls/asktom/asktom.search?p_string=%22i+hate+your+code%22

WHY do people do that, why? why? why? I'll never ever get it.

All that does, ALL THAT DOES is:

a) make debugging harder, you just LOST the line number the error actually happened on

b) CHANGE (as in remove, as in blow away, as in LOSE) the A in the ACID properties - you are no longer 'atomic'.




I have a question for you - WHY IS THERE ANY CODE AT ALL here??? You seem to want to use parallel query:

alter session enable parallel dml;
alter table contract_payment parallel (degree 6);

but then you take a procedural - row by row (albeit using bulk processing, but still row by row) approach that EXCLUDES parallel query????



Why not just a single UPDATE using PDML ????

If you are updating every row - it is often faster to create a new table as select with the select "doing the update" and losing the old table.





If this is using a single index - something is seriously wrong. Your row by row approach (making it do scattered random IO's all over the place by rowid) is deadly too.

Explain why this isn't just a single update done using PDML for me please?
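
For reference, the single-statement PDML version of the update above would be roughly as follows (degree 6 is taken from the original script; the long literal is elided here, and the hint must reference the table being updated):

alter session enable parallel dml;

update /*+ parallel(t, 6) */ temp_lockbox_remittance t
   set payment_control_number = 'C217...G425',   -- full literal elided
       last_modified_by = user,
       changed_by       = user,
       last_modified    = sysdate,
       changed_date     = sysdate
 where payment_method_type_code = 'MC';

commit;   -- a PDML transaction must commit before the table is touched again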

parallel processing

lizhuohua, May 19, 2011 - 5:55 am UTC

hi,Tom

YL1> select /*+ full(c) parallel(c) */ count(*) from cusm c;

  COUNT(*)
----------
     16169

Elapsed: 00:10:15.54
YL1> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  bf3g673du1c93, child number 0
-------------------------------------
select /*+ full(c) parallel(c) */ count(*) from cusm c

Plan hash value: 2872246982

------------------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |       |  3429 (100)|          |       |       |        |      |            |
|   1 |  SORT AGGREGATE        |          |     1 |            |          |       |       |        |      |            |
|   2 |   PX COORDINATOR       |          |       |            |          |       |       |        |      |            |
|   3 |    PX SEND QC (RANDOM) | :TQ10000 |     1 |            |          |       |       |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE     |          |     1 |            |          |       |       |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR |          | 16169 |  3429   (0)| 00:00:42 |     1 |    36 |  Q1,00 | PCWC |            |
|*  6 |       TABLE ACCESS FULL| CUSM     | 16169 |  3429   (0)| 00:00:42 |     1 |    36 |  Q1,00 | PCWP |            |
------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - access(:Z>=:Z AND :Z<=:Z)


23 rows selected.

Elapsed: 00:00:00.04
YL1> select name, value
  2    from v$mystat s, v$statname n
  3   where s.statistic# = n.statistic#
  4     and n.name like '%parallel%';

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
DBWR parallel query checkpoint buffers written                            0
queries parallelized                                                      0
DML statements parallelized                                               0
DDL statements parallelized                                               0
DFO trees parallelized                                                    0

Elapsed: 00:00:00.09


during the query running,in other session:
YL1> select count(*) from v$pq_slave;

  COUNT(*)
----------
         1
YL1> show parameter parallel_max_server

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
parallel_max_servers                 integer     240

No queries were parallelized.
Could some other parameter have caused this?

I have two other questions about parallel processing:
  1) How can I find the best DOP in a DW environment?
  2) Looking at the following, why is CURRENT_UTILIZATION greater than parallel_max_servers?
SQL> select * from v$resource_limit;

RESOURCE_NAME                  CURRENT_UTILIZATION MAX_UTILIZATION INITIAL_ALLOCATION   LIMIT_VALUE
------------------------------ ------------------- --------------- -------------------- --------------------
....
parallel_max_servers                           354             433        100                 3600

22 rows selected.

Elapsed: 00:00:00.01
SQL> show parameter parallel

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
....
parallel_max_servers                 integer     100
parallel_min_percent                 integer     0
....

Thanks !

Misleading statistics on DBA_HIST

Tony, June 10, 2011 - 9:39 pm UTC

Dear Tom,

Kindly clarify my doubts.

1. When a parallel update is used, two rows for the same update statement appear in v$sql. Why? The only difference I could see in v$sql_shared_cursor is in the DIFF_CALL_DURN column.

2. The elapsed_time in v$sql / DBA_HIST* for a parallel execution is the sum over all slave and parent sessions. That is misleading. How can I find the parent session's elapsed_time alone?

3. For parallel queries, if the DOP is 4, 9 sessions run. Why?






Tom Kyte
June 11, 2011 - 2:50 pm UTC

1) it is expected, the parallel execution server SQL is treated differently

http://docs.oracle.com/docs/cd/E11882_01/server.112/e17110/dynviews_3059.htm#REFRN30254

states as much in the description. It "looks" like the same SQL, but you know it cannot be - the slaves are not all doing the same exact range - it is slightly different SQL from the parent statement, set up to follow the rules of parallel DML.

2) that is just the way it is reported, they are not segregated. I do not know of a way to separate that out.


3) 4 readers, 4 sorters, 1 coordinator - for example.

http://docs.oracle.com/docs/cd/E11882_01/server.112/e16541/parallel002.htm#sthref864
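As a sketch of how to see this yourself: a parallel query's slave sessions register in v$px_session under the coordinator's SID, so (assuming you have access to that view) you can count them while the query runs:

```sql
-- For a running parallel query, count the PX sessions tied to each
-- query coordinator (QC).  With DOP 4 and a two-slave-set plan
-- (e.g. readers + sorters) you would expect 8 slave sessions in
-- addition to the coordinator's own session.
select qcsid, count(*) as sessions
  from v$px_session
 group by qcsid;
```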

a reader, September 05, 2011 - 11:39 am UTC

Hi Tom,

For batch jobs, there are the following parallel execution strategies

1. Manual parallel execution from outside DB by multi-threading Java/C++ code

2. Manual parallel execution from inside DB by something like dbms_parallel_execute

3. Parallel enabled pipelined function

4. regular PL/SQL procedure, parallel DMLs

From my personal experience: if you can do 2, don't do 1 (although 1 makes it easiest to use RAC to increase processing capacity by simply adding nodes); and if you can do 4, don't do 2. Between 4 and 2, the only major difference is that 2 lets you control the chunking mechanism, because you know more about your data than the Oracle optimizer does.

But if it's just transforming a big table like 10 billion records, and both 4 and 2 can do the job, what would be your strategy? How can they easily take advantage of RAC if they run inside DB?

Thanks
Tom Kyte
September 06, 2011 - 8:53 am UTC

if you do #2 using dbms_job (do it yourself parallelism), instance affinity is easy to achieve - easier probably than doing it outside of the database.

I like 4, 2, 3, 1 in order...

PDML can be within an instance, across instances, or both.
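For the dbms_job route, instance affinity can be requested at submit time via the INSTANCE parameter; a minimal sketch (my_proc is a hypothetical procedure):

```sql
declare
  l_job number;
begin
  dbms_job.submit( job      => l_job,
                   what     => 'my_proc(1);',  -- hypothetical procedure call
                   instance => 1 );            -- pin the job to instance 1
  commit;                                      -- commit so the job can run
end;
/
```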

Parallel processing of RAW files

Rajeshwaran, Jeyabal, January 15, 2012 - 12:33 pm UTC

Tom:

I was reading your "Do-It-Yourself Parallelism" approach.

I have some raw files (nearly 8,000 files) to be processed by some complex PL/SQL validation.
The database has 16 dual-core CPUs, and no other process will be using them while we run this complex validation.

Now we thought of applying this DIY parallelism. So we pulled all 8,000 file names into a table and used the query below to break them up into non-overlapping chunks.
create table t(file_key number,file_name varchar2(40));

select file_key,ntile(8) over(order by file_key) as chunk_size
from t;

This query will break it up into 8 non-overlapping chunks. Do I now need to open 8 SQL*Plus sessions and run the piece of code below, where N = 1, 2, 3 ... 8?

begin
 for x in (select * from ( 
   select file_key,ntile(8) over(order by file_key) as chunk_size
   from t) 
   where chunk_size = N )
 loop
  dbms_job.submit(l_job_id,'complex_process(JOB)');
 end loop;
 commit;
end;
/

1) If we go by this, then session 1 will have 1,000 raw files to process, session 2 another 1,000, and so on. Does this cause any delay on CPU waits?
2) Or do I need to run the above PL/SQL block for N=1 and wait until it completes before N=2 is started?

Tom Kyte
January 17, 2012 - 11:48 am UTC

if you are using the job queues, you only need one sqlplus session to spawn all of the jobs.


You probably want to run 16 or 32 concurrent threads on a 16 core machine where you are the only thing running.


You could:

a) make sure job_queue_processes is set to at least 16 or 32

b) just run:

declare
  l_job_id number;
begin
  for x in 1 .. 16   -- or 32
  loop
    dbms_job.submit( l_job_id, 'complex_process('||x||');' );
  end loop;
  commit;
end;
/


that'll schedule 16 jobs and tell them to each do chunk 'X'. To avoid all of them having to run the ntile query, perhaps you would just add a column to the existing table and pre-batch them up.
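The pre-batching idea might look like this - a one-time assignment of each file to a chunk, so each job only needs a simple lookup (the CHUNK column and the chunk count of 16 are assumptions):

```sql
-- Assign each file to one of 16 chunks, once, up front.
alter table t add (chunk number);

merge into t
using ( select file_key,
               ntile(16) over (order by file_key) as chunk
          from t ) x
on ( t.file_key = x.file_key )
when matched then
  update set t.chunk = x.chunk;

commit;

-- Each job then simply processes:  select * from t where chunk = :n
```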


in-memory parallel processing

lh, January 25, 2012 - 4:21 am UTC

Hi

Found an interesting article:
http://blogs.oracle.com/datawarehousing/entry/in-memory_parallel_execution_i

I got curious: can I find out from the explain plan whether the buffer cache or direct data block access is used?

I also wondered: if I have a parallel query that joins (or runs subqueries with nested loops) to other tables, how are those tables accessed when in-memory parallel query is not used? Is the buffer cache used when accessing them?


parallel_force_local

a Reader, January 30, 2012 - 8:59 am UTC

Hi Tom,

We have 6 nodes participating in a RAC. Node 1 and 2 are configured to satisfy connections via a SERVICE.

We were thinking of setting the parameter parallel_force_local to TRUE, so that PX slaves don't get spawned across both nodes, in order to reduce interconnect traffic.

Some questions:

1) Let's suppose we have the capacity to spawn 8 and 12 PX slaves (based on parallel_servers_target) on nodes 1 and 2 respectively.

There is a job, run by a user whose DOP is restricted under a resource consumer group, that needs 16 PX slaves.

1. Will the job be spawned across 16 slaves on both nodes 1 and 2?
2. Will the DOP of that SQL statement be reduced to 12 to fit node 2's unused limit, because the parameter is set to TRUE?



Thanks

Unable to enable parallel feature

Daissy, February 14, 2012 - 12:45 pm UTC

Dear Tom, 

I am trying to enable parallel processing to move massive amounts of data from an old db design to a new db design. I am the only user in the database, so I think parallel processing will be a good fit for us. But it seems I cannot get any query to run in parallel. The steps to move the data are as follows:

1. Enable parallel mode in tables and indexes with degree 4
   alter table Tgt_tab  parallel 4
   alter index Tgt_tab_idx parallel 4
2. Force the session to run in parallel
   alter session FORCE PARALLEL DML;
3. Use the append hint to do the insert 
   INSERT /*+ APPEND */ INTO Tgt_tab (ID, ....) 
   SELECT t1.* FROM src_tab1 t1, src_tab2 t2
   WHERE t1.id=t2.id;

The explain plan shows a full scan on src_tab1 and an index full scan on src_tab2. No parallel execution.

We are running oracle 11gR2. I tried to test with your example and these are my results:

SQL> drop table t;

Table dropped.

SQL> create table t ( x int );

Table created.

SQL> exec dbms_stats.set_table_stats( user, 'T', numrows=>1000000, numblks =>100000 );

PL/SQL procedure successfully completed.

SQL> show parameter parallel;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
fast_start_parallel_rollback         string      LOW
parallel_adaptive_multi_user         boolean     TRUE
parallel_automatic_tuning            boolean     FALSE
parallel_degree_limit                string      CPU
parallel_degree_policy               string      AUTO
parallel_execution_message_size      integer     16384
parallel_force_local                 boolean     FALSE
parallel_instance_group              string
parallel_io_cap_enabled              boolean     FALSE
parallel_max_servers                 integer     80
parallel_min_percent                 integer     0
parallel_min_servers                 integer     8
parallel_min_time_threshold          string      AUTO
parallel_server                      boolean     FALSE
parallel_server_instances            integer     1
parallel_servers_target              integer     32
parallel_threads_per_cpu             integer     2
recovery_parallelism                 integer     0
SQL> explain plan for select /*+ PARALLEL*/ * from t;

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  1000K|    12M| 37566   (1)| 00:08:46 |
|   1 |  TABLE ACCESS FULL| T    |  1000K|    12M| 37566   (1)| 00:08:46 |
--------------------------------------------------------------------------

8 rows selected.

I hope you can shed some light here as I fail to see why the queries are not running in parallel. 

Thank you,
Daissy

Tom Kyte
February 14, 2012 - 2:05 pm UTC

show me the output of


select * from v$version

Unable to enable parallel feature

Daissy, February 14, 2012 - 2:11 pm UTC

Hi Tom, 

This is the result of my query:
SQL> select * from v$version
  2  ;

BANNER
------------------------------------------------------------------
Oracle Database 11g Release 11.2.0.1.0 - Production
PL/SQL Release 11.2.0.1.0 - Production
CORE    11.2.0.1.0      Production
TNS for 32-bit Windows: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production

Tom Kyte
February 14, 2012 - 2:25 pm UTC

It isn't enterprise edition is it....



Unable to enable parallel feature

Daissy, February 14, 2012 - 2:38 pm UTC

No, it is not Enterprise Edition

Unable to enable parallel feature

Daissy, February 14, 2012 - 2:51 pm UTC

Tom,
Please disregard my recent request; I now understand that the parallel feature is only available in EE.
Thank you for the lead, though, as it let me finally understand what was going on.
Thank you again,
Daissy

Parallel DML

Shimmy, March 02, 2012 - 1:23 pm UTC

Hi,

As per
http://docs.oracle.com/cd/B28359_01/server.111/b28313/usingpe.htm#autoId33

"Enabling Parallel DML
A DML statement can be parallelized only if you have explicitly enabled parallel DML in the session with the ENABLE PARALLEL DML clause of the ALTER SESSION statement.
"
If the above statement is true, why does the explain plan (shown below) in the example that follows show the update being done in parallel?


CREATE TABLE SK_TEMP
( PARTITION_KEY                NUMBER(10)    NOT NULL,
  OBJ_NUM                      NUMBER(10)    NOT NULL,
  OBJECT_NAME                  VARCHAR2(300) NOT NULL,
  OBJECT_NAME_DUP              VARCHAR2(300)) 
PARTITION BY HASH (PARTITION_KEY)
PARTITIONS 5;

INSERT INTO SK_TEMP
(PARTITION_KEY,
 OBJ_NUM,
 OBJECT_NAME)
(SELECT MOD(OBJECT_ID, 20) PARITION_KEY,
        OBJECT_ID,
        OBJECT_NAME
 FROM ALL_OBJECTS);
 
COMMIT;

ANALYZE TABLE SK_TEMP COMPUTE STATISTICS;

SQL> SET AUTOTRACE TRACEONLY
SQL> UPDATE /*+PARALLEL(T, 8) */ SK_TEMP T
  2  SET OBJECT_NAME_DUP = OBJECT_NAME
  3  WHERE PARTITION_KEY IN (1, 2, 3, 7,10);

21622 rows updated.


Execution Plan
----------------------------------------------------------                      
Plan hash value: 2704836492                                                     
                                                                                
------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT      |          | 62025 |  2119K|    17   (6)| 00:00:01 |       |       |        |      |            |
|   1 |  UPDATE               | SK_TEMP  |       |       |            |          |       |       |        |      |            |
|   2 |   PX COORDINATOR      |          |       |       |            |          |       |       |        |      |            |
|   3 |    PX SEND QC (RANDOM)| :TQ10000 | 62025 |  2119K|    17   (6)| 00:00:01 |       |       |  Q1,00 | P->S | QC (RAND)  |
|   4 |     PX BLOCK ITERATOR |          | 62025 |  2119K|    17   (6)| 00:00:01 |KEY(I) |KEY(I) |  Q1,00 | PCWC |            |
|*  5 |      TABLE ACCESS FULL| SK_TEMP  | 62025 |  2119K|    17   (6)| 00:00:01 |KEY(I) |KEY(I) |  Q1,00 | PCWP |            |
------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - filter("PARTITION_KEY"=1 OR "PARTITION_KEY"=2 OR "PARTITION_KEY"=3 OR
       "PARTITION_KEY"=7 OR "PARTITION_KEY"=10)

                                                                                

Statistics
----------------------------------------------------------                      
         52  recursive calls                                                    
      77866  db block gets                                                      
      14323  consistent gets                                                    
          0  physical reads                                                     
   13669348  redo size                                                          
        387  bytes sent via SQL*Net to client                                   
        407  bytes received via SQL*Net from client                             
          3  SQL*Net roundtrips to/from client                                  
          1  sorts (memory)                                                     
          0  sorts (disk)                                                       
      21622  rows processed                                                     

Tom Kyte
March 02, 2012 - 1:48 pm UTC

PLEASE - do not use analyze, dbms_stats is the only appropriate way these days to gather statistics.


the scan was done in parallel - but before you got to the update step - there was a " P->S " - parallel to serial.


ops$tkyte%ORA11GR2> CREATE TABLE SK_TEMP
  2  ( PARTITION_KEY                NUMBER(10)    NOT NULL,
  3    OBJ_NUM                      NUMBER(10)    NOT NULL,
  4    OBJECT_NAME                  VARCHAR2(300) NOT NULL,
  5    OBJECT_NAME_DUP              VARCHAR2(300))
  6  PARTITION BY HASH (PARTITION_KEY)
  7  PARTITIONS 5;

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> INSERT INTO SK_TEMP
  2  (PARTITION_KEY,
  3   OBJ_NUM,
  4   OBJECT_NAME)
  5  (SELECT MOD(OBJECT_ID, 20) PARITION_KEY,
  6          OBJECT_ID,
  7          OBJECT_NAME
  8   FROM ALL_OBJECTS);

72286 rows created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> COMMIT;

Commit complete.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'SK_TEMP' );

PL/SQL procedure successfully completed.


ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> set serveroutput off
ops$tkyte%ORA11GR2> UPDATE /*+PARALLEL(T, 8) */ SK_TEMP T
  2   SET OBJECT_NAME_DUP = OBJECT_NAME
  3    WHERE PARTITION_KEY IN (1, 2, 3, 7,10);

18065 rows updated.

ops$tkyte%ORA11GR2> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  3bfvt5f8c4hj7, child number 0
-------------------------------------
UPDATE /*+PARALLEL(T, 8) */ SK_TEMP T  SET OBJECT_NAME_DUP =
OBJECT_NAME   WHERE PARTITION_KEY IN (1, 2, 3, 7,10)

Plan hash value: 145170646

--------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes | Cost (%CPU)| Time
--------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT      |          |       |       |    18 (100)|
|   1 |  UPDATE               | SK_TEMP  |       |       |            |
|   2 |   PX COORDINATOR      |          |       |       |            |
|   3 |    PX SEND QC (RANDOM)| :TQ10000 | 51633 |  1764K|    18   (0)| 00:00:01
|   4 |     PX BLOCK ITERATOR |          | 51633 |  1764K|    18   (0)| 00:00:01
|*  5 |      TABLE ACCESS FULL| SK_TEMP  | 51633 |  1764K|    18   (0)| 00:00:01
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access(:Z>=:Z AND :Z<=:Z)
       filter(("PARTITION_KEY"=1 OR "PARTITION_KEY"=2 OR "PARTITION_KEY"=3 OR "P


24 rows selected.

ops$tkyte%ORA11GR2> select count(*) from sk_temp;

  COUNT(*)
----------
     72286

<b>the fact that I can read the table indicates that the update was *not* parallel.  Part of the plan was - but not the update...</b>


ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> commit;

Commit complete.

ops$tkyte%ORA11GR2> alter session enable parallel dml;

Session altered.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> UPDATE /*+PARALLEL(T, 8) */ SK_TEMP T
  2   SET OBJECT_NAME_DUP = OBJECT_NAME
  3    WHERE PARTITION_KEY IN (1, 2, 3, 7,10);

18065 rows updated.

ops$tkyte%ORA11GR2> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  3bfvt5f8c4hj7, child number 5
-------------------------------------
UPDATE /*+PARALLEL(T, 8) */ SK_TEMP T  SET OBJECT_NAME_DUP =
OBJECT_NAME   WHERE PARTITION_KEY IN (1, 2, 3, 7,10)

Plan hash value: 2318138013

--------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes | Cost (%CPU)| Time
--------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT      |          |       |       |    18 (100)|
|   1 |  PX COORDINATOR       |          |       |       |            |
|   2 |   PX SEND QC (RANDOM) | :TQ10000 | 51633 |  1764K|    18   (0)| 00:00:01
|   3 |    UPDATE             | SK_TEMP  |       |       |            |
|   4 |     PX BLOCK ITERATOR |          | 51633 |  1764K|    18   (0)| 00:00:01
|*  5 |      TABLE ACCESS FULL| SK_TEMP  | 51633 |  1764K|    18   (0)| 00:00:01
--------------------------------------------------------------------------------

<b>see how the update step moved down INTO the parallel part of the plan...</b>


Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access(:Z>=:Z AND :Z<=:Z)
       filter(("PARTITION_KEY"=1 OR "PARTITION_KEY"=2 OR "PARTITION_KEY"=3 OR "P


24 rows selected.

ops$tkyte%ORA11GR2> select count(*) from sk_temp;
select count(*) from sk_temp
                     *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel

<b>and we cannot read the table until we commit - one of the side effects of a parallel DML statement...</b>
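The ORA-12838 above clears as soon as the transaction ends; a minimal sketch of the pattern:

```sql
-- After parallel DML, the modified table is off-limits to the same
-- transaction; commit (or roll back) first, then read it:
commit;
select count(*) from sk_temp;   -- now succeeds
```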


full hint with parallel

A reader, March 26, 2012 - 2:13 am UTC

Hello Sir,

One of our senior developers says that whenever you use the PARALLEL hint, you should always pair it with the FULL hint. Is that so? Even if my query does a full scan without the FULL hint, do I still need to give it? Is there any reason for this?

Thanks

Tom Kyte
March 26, 2012 - 6:55 am UTC

The parallel hint is typically fine all by itself. Parallel operations tend to use full scan operations anyway.
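A minimal illustration (big_table is a placeholder name):

```sql
-- The PARALLEL hint alone typically drives a parallel full scan;
-- adding FULL(t) is usually redundant.
select /*+ parallel(t, 8) */ count(*)
  from big_table t;
```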

UNION and parallel query

Al Ricafort, May 09, 2012 - 2:28 am UTC

Hi Tom,

I would assume that any query using UNION is a good candidate for a parallel query. Because if you think about it the SQLs in the UNION can very well be executed independently of each other.

But why is it that most execution plans for UNIONs will not consider parallel query?

Also, reading your book I got the impression that you are not a fan of parallel query in an OLTP environment, although you did say that 'nothing is ever black and white'.






Tom Kyte
May 10, 2012 - 3:07 am UTC

I would assume that any query using UNION is a good candidate for a parallel
query. Because if you think about it the SQLs in the UNION can very well be
executed independently of each other.


but they currently are not. Check back in a year maybe...


And in OLTP - you are supposed to be executing queries that touch a tiny bit of data - if not - if your queries hit large amounts of data, if they process large amounts of data - you by definition are NOT OLTP anymore.


UNION and parallel query

Al Ricafort, May 10, 2012 - 6:57 am UTC

Hi Tom,

"but they currently are not. Check back in a year maybe..."

Why are they not independent?

Why can't I:
1) Divvy up the 2 SQLs into 2 sessions (or PQ slaves), then
2) Assemble the result?


Tom Kyte
May 10, 2012 - 8:57 am UTC

because the code doesn't exist to do that in the database server.

and don't forget - union => sort/distinct as the last step.
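That final duplicate-eliminating step is one reason UNION ALL is usually preferred when duplicates are acceptable; a sketch (t1 and t2 are placeholders):

```sql
-- UNION must sort/de-duplicate the combined result as a last step.
select col from t1
union
select col from t2;

-- UNION ALL just concatenates the two result sets - no final sort needed.
select col from t1
union all
select col from t2;
```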

Parallel DDL vs QUERY

Zilvinas, June 06, 2012 - 3:13 am UTC

Hi Tom, can you explain how this works?

ALTER SESSION FORCE PARALLEL QUERY PARALLEL 4
ALTER SESSION FORCE PARALLEL QUERY PARALLEL 2

CREATE TABLE b AS select * from a
takes 10 seconds

ALTER SESSION FORCE PARALLEL QUERY PARALLEL 2
ALTER SESSION FORCE PARALLEL QUERY PARALLEL 4

CREATE TABLE b AS select * from a
takes 5 seconds

ALTER SESSION FORCE PARALLEL DDL PARALLEL 4
ALTER SESSION disable PARALLEL QUERY PARALLEL 4
takes same 5 seconds

v$px_process in any case shows 4 processes running.

1. Is there a way to see which process from v$px_process is doing DDL or QUERY? v$session.action always says "SQL Window - Query data of table"?

2. What algorithm does Oracle use to determine the number and type of parallel processes when I try to force them? Why not 2 for QUERY and 4 for DDL, as I tried to force?
I noticed that if the parallel degree for DDL and QUERY differ, Oracle always uses the DDL degree. If I disable parallel DDL, Oracle uses the QUERY degree, but the CREATE TABLE takes the same time as with parallel query disabled.

3. In data warehouse I use following concept:
INSERT /*+ append */
INTO Dest_Table(...)
SELECT *
FROM TABLE(Parallel_Pipelined_ETL_Function(CURSOR(SELECT /*+ parallel(t1) */ * FROM Src_Table1 t1)))

Can I have x parallel processes for select, y for Pipelined functions and z for insert?
My testing shows that I can:
ALTER SESSION FORCE PARALLEL QUERY PARALLEL x
ALTER SESSION FORCE PARALLEL DML PARALLEL z

but the number of slaves for the pipelined function is always the same as for DML.

4. Why are parallel hints not working for me? Only 'alter session ... force ...' works.

I know the preferred way is to leave everything to Oracle. But sometimes the problems are not with Oracle but with people, and you are forced to force Oracle. I can't change any database parameters myself, and it is very, very hard to convince the administrators to change anything. They tell me they want to sleep quietly and don't want to use anything new from the last two releases of Oracle DB, and so on.
So I'm researching how it works and what results I can get myself.



Tom Kyte
June 06, 2012 - 5:44 am UTC

1) what is the difference between DDL and a QUERY given that your DDL is 99.999% Query?

and you are ONLY DOING DDL - there is just DDL taking place here.

If you want to run a query, start it with SELECT (or with)
If you want to run DDL, start it with CREATE

You can see what sql a session is executing in v$session, just look at the sql_address joined to v$sql or v$sqltext_with_newlines


2) parallel sets the degree of parallelism for each step in a plan. Say you have a query like "select * from big_table order by x". And you run with parallel 2. You might have 2 scanners, 2 sorters and 1 coordinator.

http://docs.oracle.com/cd/E11882_01/server.112/e25523/parallel002.htm#sthref862

and timing such a small event is not going to be very reliable. In my opinion 5 and 10 seconds are just about the same, nothing significant about that. Try doing things that take minutes.

3) You would use the parallel hint for each step

4) you have to enable parallel DDL and then the hints will work.
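As a sketch, the session-level enable plus the hint might look like this (table names taken from the question):

```sql
-- Parallel DDL must be enabled in the session for the hint to apply.
alter session enable parallel ddl;

create table b parallel 4 as
  select /*+ parallel(a, 4) */ *
    from a;
```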


DIY - parallelism

Prakash Rai, June 07, 2012 - 11:38 am UTC

Hi Tom -
Great writing on DIY parallelism. A very creative approach.

I use PERCENTILE_DISC to break the data volume from "big_table" into 4 groups, and put them into a loop to process (aggregate, load into a GTT, then insert into the final table) using dbms_job.

Finally, I want to schedule the main procedure to run once a day. The first run goes fine. The second run hits a deadlock between jobs while one is deleting from "rslt_table" and another is inserting into it. Please see the code snippet.

Question 1: why would job 1 and job 2 deadlock?
Question 2: how would I prevent them from deadlocking?

-->> code snippet
/* split big_table by id into 4 groups to process with DIY - parallelism approach*/
BEGIN
FOR x IN (
SELECT
MIN (id) AS t1,
PERCENTILE_DISC (0.25) WITHIN GROUP (ORDER BY id) AS t2
FROM big_table
UNION ALL
SELECT
PERCENTILE_DISC (0.25) WITHIN GROUP (ORDER BY id) +1 AS t1,
PERCENTILE_DISC (0.50) WITHIN GROUP (ORDER BY id) AS t2
FROM big_table
UNION ALL
SELECT
PERCENTILE_DISC (0.50) WITHIN GROUP (ORDER BY id) +1 AS t1,
PERCENTILE_DISC (0.75) WITHIN GROUP (ORDER BY id) AS t2
FROM big_table
UNION ALL
SELECT
PERCENTILE_DISC (0.75) WITHIN GROUP (ORDER BY id) +1 AS t1,
MAX (id) AS t2
FROM big_table)
LOOP

--x_gtt1, x_gtt2, x_gtt3 and x_gtt4 are global temporary table build with ON COMMIT PRESERVE ROWS clause

strSQL :=
'BEGIN
INSERT /*+ APPEND */ INTO x_gtt'|| tbl_suffix ||'
SELECT
/*+ PARALLEL(te,8) */
te.id,
... -- fields
FROM
te,
t
WHERE
te.id_fk = t._id
AND t.id between '
|| x.t1
|| ' AND '
|| x.t2
|| ' GROUP BY
te.id,
t.id;
COMMIT; /* this commit prevents ORA-12838: cannot read/modify an object after modifying it in parallel. "A transaction containing a direct-path INSERT statement cannot be or become distributed." */

DELETE FROM rslt_table WHERE id BETWEEN ' || x.t1 || ' AND ' || x.t2 ||';'||'
COMMIT;

INSERT /*+ APPEND */ INTO rslt_table SELECT * FROM x_gtt'|| tbl_suffix ||';
COMMIT;
END;';

DBMS_JOB.submit(jobno, strSQL);
commit; --commit the job to kick off right now;

-- dbms_output.put_line('strSQL='||strSQL);
-- dbms_output.put_line('job='||jobno);

tbl_suffix := tbl_suffix + 1;
END LOOP;
END;


Thanks as always!
Prakash
Tom Kyte
June 07, 2012 - 6:53 pm UTC

question for you before I even think about analyzing this

since it takes a long time to do the percentile stuff, and longer to put the data into other tables

why not just use my rowid ranges (instantaneous) and work off of the original table (no overlaps, no concurrency issues whatsoever)

I really don't like that you are not using binds for something that would be trivial to bind with.

DIY - parallelism

Prakash Rai, June 07, 2012 - 4:47 pm UTC

Note to my last posting;

Actually there is no deadlock; the jobs were just blocking each other while one session was INSERTING and the other was DELETING. Due to the heavy volume, it looked to me as if they had deadlocked.

To patch the long blocking, I added an intermediate table to load the data from all 4 GTTs, and a last job that runs the TRUNCATE and INSERT into the final table.

The DIY parallelism approach worked OK for me; it requires some setup, but it is worth trying.

Please suggest any other options you see for DELETE/INSERT with DIY parallelism.



Thanks,
Tom Kyte
June 07, 2012 - 6:59 pm UTC

insert /*+ append */ takes a full table lock, no DML on that table until it is done.

why use an intermediate table?
why use percentile???

rowid ranges would be MUCH faster and contention free.
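On 11g and later, dbms_parallel_execute can generate the rowid ranges for you; a sketch (the task name and chunk size are arbitrary, BIG_TABLE is a placeholder):

```sql
begin
  dbms_parallel_execute.create_task( task_name => 'diy_task' );
  dbms_parallel_execute.create_chunks_by_rowid(
      task_name   => 'diy_task',
      table_owner => user,
      table_name  => 'BIG_TABLE',
      by_row      => true,       -- chunk_size counts rows, not blocks
      chunk_size  => 100000 );
end;
/

-- The non-overlapping rowid ranges are now available to the jobs:
select chunk_id, start_rowid, end_rowid
  from user_parallel_execute_chunks
 where task_name = 'diy_task';
```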

DIY - parallelism

Prakash Rai, June 08, 2012 - 11:46 am UTC

Tom -

I played with rowid ranges, and it certainly computes many times faster than the PERCENTILE_DISC approach. Rowid ranges are limited in my world, though, as my process has to run in a different schema and access the tables over synonyms. I certainly see the benefits of rowid ranges.

Following your questioning response, I applied bind variables and took out the /*+ APPEND */ hint and the intermediate tables. With all these changes I am happy with the response time.

I think I have some room to expand my DIY parallelism from 4 to 6 or more if needed.


Thanks,
Prakash





Tom Kyte
June 08, 2012 - 3:54 pm UTC

so what if it requires a different schema and a synonym - you can figure out the base table and get the rowid ranges.



DIY - parallelism

Prakash Rai, June 09, 2012 - 5:19 pm UTC

Hi Tom,

Forgot to mention that the reporting schema that will run the script has no access to dba_objects, dba_extents or any dba objects.

Would there be workaround in such setup?

Thanks,
Prakash
Tom Kyte
June 10, 2012 - 3:35 am UTC

Get access?? Tell the dba you have a way to do something many times faster, using many orders of magnitude less of their resource - and you need their help to get the permissions to do it.

They might even consider working with you

Number of parallel slaves

john, June 11, 2012 - 7:02 pm UTC

Hi Tom,

First of all thanks for your valuable responses.

Can you please let me know how to find the number of parallel slaves available in the database? It's a RAC database and version: 11gr2.

Thank you
Tom Kyte
June 12, 2012 - 4:34 am UTC

ops$tkyte%ORA11GR2> show parameter parallel_max

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
parallel_max_servers                 integer     160



if you have enabled cross instance parallel query, you can look at gv$parameter to get the cumulative value.
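A sketch of that cumulative count across a RAC cluster (assuming select privileges on gv$parameter):

```sql
-- total parallel execution servers configured across all instances
select sum(to_number(value)) as total_pq_servers
  from gv$parameter
 where name = 'parallel_max_servers';
```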

Number of Parallel slaves

John, June 12, 2012 - 1:20 pm UTC

Hi Tom,

Sorry for not being clear about my requirement. I want to determine the number of parallel slaves available in an instance (at a given time), which we would use to decide whether to run a query or not. It is a 5-node RAC database.

Thanks Tom
Tom Kyte
June 13, 2012 - 1:08 am UTC

we do that automatically - http://docs.oracle.com/cd/E11882_01/server.112/e25513/initparams184.htm#REFRN10159

you cannot really do that in your code; the instant you found out there were 20 available PQ execution servers, there might be only 0 or 5 left. You have concurrency issues - you cannot stop 50 other sessions from trying to allocate that resource at the same time.
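You can take a point-in-time reading - for example from v$pq_sysstat - but, as noted above, the answer is stale the moment you read it:

```sql
-- Point-in-time only: this can change before you act on the answer
select statistic, value
  from v$pq_sysstat
 where statistic in ('Servers Busy', 'Servers Idle');
```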

I'd suggest using parallel statement queueing (11g) or the resource manager to limit the number of active sessions and allowing those active sessions to consume as much of the machine as possible. See

http://www.youtube.com/watch?v=hMXsrxyeRro
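A sketch of the 11gR2 settings involved in parallel statement queueing (the target value is a placeholder to be sized for your system):

```sql
-- AUTO degree policy enables parallel statement queueing; statements
-- queue (instead of being downgraded) once this many PQ servers are busy.
alter system set parallel_degree_policy = auto;
alter system set parallel_servers_target = 64;   -- placeholder value
```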

temp usage

A Reader, June 20, 2012 - 7:56 am UTC

Hi Tom,
This is an excerpt from the plan of a bad query which is dying because of a temp space issue ( ORA-01652: unable to extend temp segment by 128 in tablespace TEMP )
we have sufficient temp tablespace space ( around 100gb)

----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------------------------
| | HASH JOIN | | 9 | 4752 | 79467 (1)| 00:15:54 | Q1,04 | PCWP | |
| 10 | BUFFER SORT | | | | | | Q1,04 | PCWC | |
| 11 | PX RECEIVE | | 8204K| 579M| 34250 (2)| 00:06:52 | Q1,04 | PCWP | |
| 12 | PX SEND HASH | :TQ10001 | 8204K| 579M| 34250 (2)| 00:06:52 | | S->P | HASH |
|* 13 | TABLE ACCESS FULL | t1 | 8204K| 579M| 34250 (2)| 00:06:52 | | | |
| 14 | PX RECEIVE | | 3510K| 1520M| 45197 (1)| 00:09:03 | Q1,04 | PCWP | |
| 15 | PX SEND HASH | :TQ10003 | 3510K| 1520M| 45197 (1)| 00:09:03 | Q1,03 | P->P | HASH |
|* 16 | HASH JOIN BUFFERED | | 3510K| 1520M| 45197 (1)| 00:09:03 | Q1,03 | PCWP | |
| 17 | BUFFER SORT | | | | | | Q1,03 | PCWC | |
| 18 | PX RECEIVE | | 4076K| 213M| 43007 (1)| 00:08:37 | Q1,03 | PCWP | |
| 19 | PX SEND HASH | :TQ10000 | 4076K| 213M| 43007 (1)| 00:08:37 | | S->P | HASH |
|* 20 | TABLE ACCESS FULL | t2 | 4076K| 213M| 43007 (1)| 00:08:37 | | | |
| 21 | PX RECEIVE | | 3192K| 1214M| 2177 (1)| 00:00:27 | Q1,03 | PCWP | |
| 22 | PX SEND HASH | :TQ10002 | 3192K| 1214M| 2177 (1)| 00:00:27 | Q1,02 | P->P | HASH |
| 23 | PX BLOCK ITERATOR | | 3192K| 1214M| 2177 (1)| 00:00:27 | Q1,02 | PCWC | |
| 24 | TABLE ACCESS FULL | t3 | 3192K| 1214M| 2177 (1)| 00:00:27 | Q1,02 | PCWP | |

Question/s

1. could the parallel_max_servers and parallel_threads_per_cpu settings have made this query run out of temp space?
2. if I have a huge parallel_max_servers set in the instance, would those parallel servers consume more temp?
3. is the remedy to reduce the parallel servers?

Tom Kyte
June 21, 2012 - 7:31 am UTC

we have sufficient temp tablespace space ( around 100gb)


umm, apparently not!


1) no, your query did
2) they could use more temp than fewer would - it depends.
3) increasing temp pops into my head really fast. 100gb is not very large if you are running parallel query and big queries (and you are only running parallel query if you have big queries)

..contd

A Reader, June 21, 2012 - 8:14 am UTC

yes Tom,
the tables involved here have degree set to more than 1.

so true, if we want to run parallel then we would need to add disks... :)

Also,

q1)
how do we benchmark how much temp I really need? I would:
a) enable autoextend on temp
b) run the big query
c) when the query ends, turn autoextend off
at (c) we would get the maximum temp size we would need.

Q2) How does temp allocation happen in the case of a parallel query? I did some benchmarking and found that the query succeeds when run noparallel, but when I set a parallel degree on the table the query runs out of temp space... why does Oracle not try to keep the temp usage (within the max limit) as it did in the single-threaded run?


Tom Kyte
June 22, 2012 - 7:05 am UTC

that would get you the temp size for running that query all by itself.

if in real life there are other things going on - you'd need to account for their temp space as well

and further, if there were other things, your PGA workareas might not be as large as they would be in isolation - which could make it use more temp at runtime.


But yes, best to benchmark - get a reasonable estimate, add in concurrency related issues (if 10 people are doing this - then you need more than 10 times as much)...
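After such a benchmark run (with autoextend still enabled), the temp files themselves record the high-water mark - a sketch:

```sql
-- the temp files have grown to the peak size the benchmark needed
select tablespace_name, sum(bytes)/1024/1024/1024 as gb
  from dba_temp_files
 group by tablespace_name;
```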

..contd

A Reader, June 22, 2012 - 10:28 am UTC

Tom
we ran the above serially (no parallel).

In the above plan we have a TABLE ACCESS FULL of t2, during which we see around 100GB of temp being used.

during the full table access it has to filter on some of the columns of t2,
but the filter still results in reading 90% of the table.

question

does a full table scan result in huge temp tablespace usage?

regards



Tom Kyte
June 22, 2012 - 4:56 pm UTC

no, full table scans do not necessarily result in huge temp tablespace usage. It depends on the nature of the query.


select count(*) from ten_billion_row_table_that_is_one_terabyte_in_size;

will use hardly any temp space.


select * from that_same_big_table order by 1,2,3;

will probably use terabytes of temp to sort in.

..contd

A reader, June 22, 2012 - 12:48 pm UTC

Tom,
Just to add to it:

a) t1 and t2 are full scanned and hash joined later.

b) the t2 table is just 2 GB and t1 is around 1 GB.

c) in session long ops, while the t2 full scan was showing, temp usage was increasing and touched 90GB.
the filter on t2 is ( col1 = ? and col2 = ? and col3 <> ? ), so during the full scan candidate rows were selected from t2 based on the filter.

d) the filter ( col1 = ? and col2 = ? and col3 <> ? ) results in reading almost 95% of the table data, so indexing on these filter predicates didn't help.

e) once the t2 full scan was over, temp usage remained at 90GB with no further increase.

f) the next long op was the hash join and so on...

g) we have workarea_size_policy AUTO and pga_aggregate_target 9GB, and saw pga usage around 4GB during the above.



question.
1. How come a full scan of a 2GB table resulted in 90GB of temp usage while participating in a hash join with t1?
2. is it pointing to any bug related to temp usage?

many thanks.


Tom Kyte
June 22, 2012 - 5:00 pm UTC

show me the plan, I don't see how a 2gb table would explode out to 90gb of temp.

we don't want any indexes at all.

what tool/query did you use to monitor temp space usage?

Parallel processing - DBMS_STATS ALL

Iván Mínguez Moreno, June 22, 2012 - 1:04 pm UTC

Hi Tom & Team,

Oracle 9207: Having a partitioned table with a partitioned index associated to it, and gathering statistics with DBMS_STATS, I found a pretty interesting thing.

Apparently while gathering table partition statistics, 16 sub-processes were spawned, but while gathering index partition statistics just the session process was in charge of performing that task.

Why is this happening? I specified 8 sub-processes to be used, not 16, and I thought that those 8 processes were going to process every request, not only the table...

Used statement: exec dbms_stats.gather_table_stats(ownname=>'&&OWNER_NAME',tabname=>'&&TABLE_NAME',GRANULARITY=>'ALL',estimate_percent=>0.1,degree=>8,cascade=>true);


I would appreciate any comment on this subject.
Tom Kyte
June 22, 2012 - 5:03 pm UTC

parallel 8 makes it so that each step could have 8 processes: 8 readers and 8 sorters, feeding into 1 query coordinator - 16 sub-processes plus the coordinator. This is quite normal and expected:

http://docs.oracle.com/cd/E11882_01/server.112/e25523/parallel002.htm#sthref862



not every operation is possible to parallelize or should be parallelized
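While the gather runs with degree=>8, the two slave sets can be observed from another session - a sketch:

```sql
-- one row per PQ slave; server_set distinguishes the producer and
-- consumer sets feeding the query coordinator (qcsid)
select qcsid, server_set, count(*) as slaves
  from v$px_session
 where server_set is not null
 group by qcsid, server_set;
```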



who is this 'team' you refer to?

..conts(above) 1652

A Reader, June 25, 2012 - 1:15 pm UTC

Tom,

As discussed   in the thread above 

<quote>......show me the plan, I don't see how a 2gb table would explode out to 90gb of temp. </quote>.
I am furnishing the details,
This is real puzzling.


RDBMS : 11.2.0.2 
OS : Windows 2008


 

Batch job runs the below Statement ( through application)


INSERT      /*+ APPEND */INTO T_LOAD
            (rowid_object, rowid_object_matched, rowid_match_rule,
             automerge_ind)
   SELECT          /*+ USE_HASH (b1, b2) */
          DISTINCT b1.rowid_object, b2.rowid_object, 'SVR1.2Q5X', 1
              FROM t$mle_party b1,
                   (SELECT /*+ USE_HASH (C_MT_PARTY_IDENTIFIER , C_MT_PARTY_ADDRESS) */
                           c_mt_party_identifier.rowid_object,
                           c_mt_party_identifier.party_idnty_value,
                           c_mt_party_identifier.party_idnty_type_cd,
                           c_mt_party_address.party_type,
                           c_mt_party_address.party_name,
                           c_mt_party_address.country,
                           c_mt_party_address.bpa_string
                      FROM c_mt_party_identifier INNER JOIN c_mt_party_address
                           ON c_mt_party_address.rowid_object =
                                            c_mt_party_identifier.rowid_object
                     WHERE c_mt_party_address.party_type = 'ORG') b2
             WHERE b1.rowid_object != b2.rowid_object
               AND COALESCE (b1.party_idnty_value,
                             COALESCE (b2.party_idnty_value, 'S~!#*+')
                            ) =
                      COALESCE (b2.party_idnty_value,
                                COALESCE (b1.party_idnty_value, 'T~!#*+')
                               )
               AND COALESCE (b1.party_idnty_type_cd,
                             COALESCE (b2.party_idnty_type_cd, 'S~!#*+')
                            ) =
                      COALESCE (b2.party_idnty_type_cd,
                                COALESCE (b1.party_idnty_type_cd, 'T~!#*+')
                               )
               AND b1.party_name = b2.party_name
               AND b1.country = b2.country
               AND b1.bpa_string = b2.bpa_string;

It is resulting in

ORA-01652: unable to extend temp segment by 128 in tablespace TEMP 
  


  
explain plan for the above is :      

Elapsed: 00:00:00.06
13:08:22 SQL> select * from  table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------ 

Plan hash value: 1115960413

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name                 | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT                |                      |     1 |   277 |       |   134K  (1)| 00:26:56 |
|   1 |  LOAD AS SELECT                 | T_LOAD         |       |       |       |            |          |
|   2 |   HASH UNIQUE                   |                      |     1 |   277 |       |   134K  (1)| 00:26:56 |
|*  3 |    FILTER                       |                      |       |       |       |            |          |
|   4 |     NESTED LOOPS OUTER          |                      |     1 |   277 |       |   134K  (1)| 00:26:56 |
|*  5 |      HASH JOIN                  |                      |     1 |   236 |    25M|   134K  (1)| 00:26:56 |
|*  6 |       HASH JOIN                 |                      |   136K|    23M|   128M| 77430   (1)| 00:15:30 |
|   7 |        TABLE ACCESS FULL        | T$MLE_PARTY          |  1130K|   115M|       |  3369   (1)| 00:00:41 |
|*  8 |        TABLE ACCESS FULL        | C_PARTY_ADDRESS      |  8204K|   578M|       | 34250   (2)| 00:06:52 |
|*  9 |       TABLE ACCESS FULL         | C_PARTY              |  4076K|   213M|       | 43007   (1)| 00:08:37 |
|* 10 |      TABLE ACCESS BY INDEX ROWID| C_PARTY_IDENTIFIER   |     1 |    41 |       |     6   (0)| 00:00:01 |
|* 11 |       INDEX RANGE SCAN          | SVR1_17K3            |     4 |       |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------------- 

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter(COALESCE("B1"."PARTY_IDNTY_VALUE",COALESCE("C_MT_PARTY_IDENTIFIER"."IDENTIFIER_VALUE",'S~!#*+'))=COA
              LESCE("C_MT_PARTY_IDENTIFIER"."IDENTIFIER_VALUE",COALESCE("B1"."PARTY_IDNTY_VALUE",'T~!#*+')) AND
              COALESCE("B1"."PARTY_IDNTY_TYPE_CD",COALESCE("C_MT_PARTY_IDENTIFIER"."IDENTIFIER_TYPE_CD",'S~!#*+'))=COALESCE("C_M
              T_PARTY_IDENTIFIER"."IDENTIFIER_TYPE_CD",COALESCE("B1"."PARTY_IDNTY_TYPE_CD",'T~!#*+')))
   5 - access("B1"."PARTY_NAME"="PARTY_NAME" AND "BO"."ROWID_OBJECT"="C_MT_PARTY_ADDRESS"."PARTY_ROWID")
       filter("B1"."ROWID_OBJECT"<>"BO"."ROWID_OBJECT")
   6 - access("B1"."COUNTRY"="C_MT_PARTY_ADDRESS"."COUNTRY" AND
              "B1"."BPA_STRING"=NVL2("C_MT_PARTY_ADDRESS"."ADDRESS_LINE_1","C_MT_PARTY_ADDRESS"."ADDRESS_LINE_1"||'
              ','')||NVL2("C_MT_PARTY_ADDRESS"."ADDRESS_LINE_2","C_MT_PARTY_ADDRESS"."ADDRESS_LINE_2"||'
              ','')||NVL2("C_MT_PARTY_ADDRESS"."CITY","C_MT_PARTY_ADDRESS"."CITY"||'
              ','')||NVL2("C_MT_PARTY_ADDRESS"."STATE","C_MT_PARTY_ADDRESS"."STATE"||'
              ','')||NVL2("C_MT_PARTY_ADDRESS"."POSTAL_CODE","C_MT_PARTY_ADDRESS"."POSTAL_CODE"||'
              ','')||NVL2("C_MT_PARTY_ADDRESS"."POSTAL_PLUS_CODE","C_MT_PARTY_ADDRESS"."POSTAL_PLUS_CODE"||' ',''))
   8 - filter("C_MT_PARTY_ADDRESS"."HUB_STATE_IND"=1 AND "C_MT_PARTY_ADDRESS"."CONSOLIDATION_IND"<>9)
   9 - filter("PARTY_TYPE"='ORG' AND "HUB_STATE_IND"=1 AND "CONSOLIDATION_IND"<>9)
  10 - filter("C_MT_PARTY_IDENTIFIER"."IDENTIFIER_TYPE_CD"(+)='DUNS_NUMBER' AND
              "C_MT_PARTY_IDENTIFIER"."HUB_STATE_IND"(+)=1 AND "C_MT_PARTY_IDENTIFIER"."CONSOLIDATION_IND"(+)<>9)
  11 - access("BO"."ROWID_OBJECT"="C_MT_PARTY_IDENTIFIER"."PARTY_ROWID"(+))

40 rows selected.


Size - Objects 
Elapsed: 00:00:00.01 
13:15:54 SQL> select sum(bytes)/(1024*1024) from user_segments where segment_name ='C_PARTY'; 

SUM(BYTES)/(1024*1024) 
---------------------- 
                  1865 

Elapsed: 00:00:00.04 
13:16:06 SQL> select sum(bytes)/(1024*1024) from user_segments where segment_name ='C_PARTY_IDENTIFIER'; 

SUM(BYTES)/(1024*1024) 
---------------------- 
                  2174 

Elapsed: 00:00:00.06 
13:16:24 SQL> select sum(bytes)/(1024*1024) from user_segments where segment_name ='C_PARTY_ADDRESS'; 

SUM(BYTES)/(1024*1024) 
---------------------- 
                  1479 

Elapsed: 00:00:00.04 
13:16:37 SQL> select sum(bytes)/(1024*1024) from user_segments where segment_name ='T$MLE_PARTY'; 

SUM(BYTES)/(1024*1024) 
---------------------- 
                   148 
 

SQL> select  sql_text, DIRECT_WRITES from v$sql where direct_writes<>0 and users_executing <>0

SQL_TEXT                  DIRECT_WRITES
insert /*+ APPEND */ into T_LOAD….            3,611,355


-- db_block_size   8192

-- after few mins.. and it kept growing


SQL> SELECT sum(blocks)*8192 /1024/1024/1024 GB
  FROM v$tempseg_usage
  WHERE tablespace = 'TEMP'

GB
124.5703125
-- we have now 220 GB allocated

--
 
SQL>select sql_text,  direct_writes  from v$sql where sql_id = '3gyv1d9gd6h1c' ORDER BY DIRECT_WRITES DESC

SQL_TEXT               DIRECT_WRITES
insert /*+ APPEND */ into T_LOAD….      16060140



  

SQL> select  segtype, sql_id, blocks*8192/1024/1024/1024 GB  from v$sort_usage order by 3 desc

SEGTYPE  SQL_ID    GB
HASH,    1qqkpxrfhc36w  123.30859375 --- SQL_ID  belongs to a procedure call ( which is a logger proc..) from the same session... !!
LOB_DATA  9wra2qj9c4k5s  0.0009765625
INDEX   9wra2qj9c4k5s  0.0009765625
LOB_DATA  9wra2qj9c4k5s  0.0009765625
DATA   1qqkpxrfhc36w  0.0009765625
LOB_DATA  2kv7w0dfp4117  0.0009765625
DATA   2kv7w0dfp4117  0.0009765625
LOB_DATA  2kv7w0dfp4117  0.0009765625
INDEX   2kv7w0dfp4117  0.0009765625
DATA   9wra2qj9c4k5s  0.0009765625
LOB_DATA  1qqkpxrfhc36w   0.0009765625

Tom Kyte
June 25, 2012 - 9:03 pm UTC

what are your PGA settings

you are hash joining three tables

we'll hash one into TEMP, join the other to that and then hash that result into TEMP, then nested loops join to the fourth table and have to hash that into temp for the distinct.

Lots of copies there - if your pga settings are pretty small, maybe even more temp as we have to page things in and out - making multiple passes and multiple copies.
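One way to watch those copies being made while the statement runs is v$sql_workarea_active - a sketch:

```sql
-- active workareas: memory in use, temp spilled, and number of passes
select operation_type, policy,
       round( actual_mem_used / 1024 / 1024 )    as mem_mb,
       round( nvl(tempseg_size,0) / 1024 / 1024 ) as temp_mb,
       number_passes
  from v$sql_workarea_active
 order by temp_mb desc;
```

A nonzero number_passes means the workarea did not fit in memory and the operation is making multiple passes through temp.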

..contd(above) 1652

A Reader, June 25, 2012 - 3:12 pm UTC

Tom,
further to add:

a) When the same insert statement as above is run directly on the DB (thus excluding the application layer), it never runs out of temp.

b) When the job ( hence the above INSERT statement) is run from the application , we used the query:

SELECT a.username, a.sid, a.serial#, a.osuser,a.LOGON_TIME , b.tablespace, b.blocks, b.blocks*8192/1024/1024 USED_MB , c.sql_text, c.sql_id
FROM v$session a, v$tempseg_usage b, v$sqlarea c
WHERE a.saddr = b.session_addr
AND c.address= a.sql_address
AND c.hash_value = a.sql_hash_value
ORDER BY b.tablespace, b.blocks;

to monitor the temp usage.
Initially, until the FTS below was at 20%:

|* 8 | TABLE ACCESS FULL | C_PARTY_ADDRESS | 8204K| 578M| | 34250

we saw this temp usage query clearly showing the insert statement consuming temp. After that, the query above showed no rows, but temp usage kept increasing.
at the same time..
select segtype, sql_id, blocks*8192/1024/1024/1024 GB from v$sort_usage order by 3 desc

SEGTYPE SQL_ID GB
HASH, 1qqkpxrfhc36w 123.30859375 --- SQL_ID belongs to a procedure call ( which is a
logger proc..) from the same session... !!
..

1qqkpxrfhc36w is the SQL_ID belonging to a procedure call (which is a logger - it does not use any sort/hash/group by etc).

Question is
1. why is the child cursor (1qqkpxrfhc36w) displayed in v$sort_usage under the HASH segtype when it does not use temp? (the FTS was still continuing on C_PARTY_ADDRESS during the time v$sort_usage showed the SQL_ID of the child cursor)

2. What setting at the application level might influence temp usage and cause more temp usage than the direct SQL*Plus insert statement run? Something related to a cursor setting? [we monitored the sessions and found only the logger procedure call being made additionally (as mentioned in point (b) above), but this procedure has no scope for temp usage].





Tom Kyte
June 25, 2012 - 9:08 pm UTC

a) and is it using the same plan?

..contd(above) 1652

A Reader, June 26, 2012 - 12:32 am UTC

thanks Tom,

Here are the answers.

...what are your PGA settings?
pga aggregate target = 4GB
work area size policy = AUTO

in last review
...a) and is it using the same plan?
yes, both (a) and (b) are using the same plan.

comments.
1) When we ran the INSERT directly in SQL*Plus it didn't use even one iota of temp; it looks like everything was lucky enough to find space in memory (PGA). We have 9GB of PGA set and the sum of the sizes of all these tables is not more than 9GB.

2) It looks like the application is setting (we are checking this...) workarea_size_policy to MANUAL and individually setting the hash and sort areas; if these are set too small, temp might explode to hundreds of GB.

your comments

regards



Tom Kyte
June 26, 2012 - 7:56 am UTC

2) that could do it, sure.
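One way to check what the application's session actually set is the session's optimizer environment - a sketch, where :sid is a placeholder for the session of interest:

```sql
-- the workarea parameters as seen by that specific session
select name, value
  from v$ses_optimizer_env
 where sid = :sid
   and name in ( 'workarea_size_policy',
                 'hash_area_size',
                 'sort_area_size' );
```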


Procedural parallelism

Juan, June 28, 2012 - 5:53 am UTC

Hi,

Why this

select sum(result)
from(select /*+ parallel(t1) */ log(power(log(power(id,2)/id,10),3),2)+log(power(log(power(id,2)/id,10),3),2)result from t1)

takes longer than this (complex calculation on the pipelined function)

select sum(id)
from table(parallel_pipelined
(CURSOR(select /*+ parallel(t1) */ id
from t1)))

t1 (2 million rows)

Many thanks

Tom Kyte
June 28, 2012 - 11:53 pm UTC

because your complex calculation in the pipelined function takes less cpu than

log(power(log(power(id,2)/id,10),3),2)+log(power(log(power(id,2)/id,10),3),2) result

did.

try casting the id column to a float or double and see what happens. power/log on numbers takes a ton of cpu.

you don't show us what your "complex" calculation is so I can only guess it was "easier" than log

Procedural parallelism

Juan, June 29, 2012 - 2:38 am UTC

Many thanks for your reply.

The calculation in the parallel_pipelined function is exactly the same.

I tried casting to float and there is no difference.

It seems doing the calculation in the parallel_pipelined function uses less CPU.

I have quite a few queries doing these kinds of calculations on many columns and I wonder if I should use parallel_pipelined to do so.


Tom Kyte
June 29, 2012 - 10:37 am UTC

post the code.

Procedural parallelism

Juan, June 29, 2012 - 5:58 am UTC

From the stats report

===============================================================================================
RunStats report : 29-JUN-2012 11:50:02
===============================================================================================
-----------------------------------------------------------------------------------------------
1. Summary timings
-----------------------------------------------------------------------------------------------
Type Name Run1 Run2 Diff
----- -------------------------------------------------- ------------ ------------ ------------
TIMER cpu time (hsecs) 1 4 3
TIMER elapsed time (hsecs) 7,814 13,876 6,062
Comments:
1) Run1 was 43.7% quicker than Run2
2) Run1 used 43.7% less CPU time than Run2
-----------------------------------------------------------------------------------------------
2. Statistics report
-----------------------------------------------------------------------------------------------
Type Name Run1 Run2 Diff
----- -------------------------------------------------- ------------ ------------ ------------
...
STAT physical read IO requests 2,440 2,440 0
STAT physical read bytes 282,468,352 282,468,352 0
STAT physical read total IO requests 2,440 2,440 0
STAT physical read total bytes 282,468,352 282,468,352 0
STAT physical read total multi block requests 2,427 2,427 0
STAT physical reads 34,481 34,481 0
STAT physical reads cache 0 0 0
STAT physical reads cache prefetch 0 0 0
STAT physical reads direct 34,481 34,481 0
...
LATCH redo writing 82 163 81
LATCH session allocation 321 404 83
LATCH object queue header heap 116 223 107
LATCH active service list 238 363 125
STAT bytes received via SQL*Net from client 611 749 138
LATCH library cache lock allocation 70 232 162
LATCH cache buffer handles 67 238 171
LATCH library cache pin allocation 85 275 190
LATCH object queue header operation 144 359 215
LATCH channel operations parent latch 377 662 285
LATCH process queue reference 16,215 16,527 312
LATCH messages 510 916 406
LATCH JS queue state obj latch 540 1,008 468
LATCH session idle bit 2,127 1,496 -631
LATCH enqueue hash chains 1,440 2,582 1,142
LATCH enqueues 1,402 2,611 1,209
LATCH shared pool simulator 1,132 2,783 1,651
LATCH checkpoint queue latch 2,087 3,829 1,742
LATCH library cache lock 1,591 4,346 2,755
LATCH library cache pin 1,275 4,271 2,996
LATCH shared pool 2,468 6,797 4,329
STAT session uga memory 196,488 203,880 7,392
LATCH library cache 6,462 21,315 14,853
STAT Cached Commit SCN referenced 26,782 571 -26,211
STAT CPU used when call started 58,443 102,861 44,418
STAT recursive cpu usage 58,424 102,855 44,431
STAT CPU used by this session 58,429 102,861 44,432
STAT session uga memory max 196,488 254,272 57,784
LATCH simulator hash latch 70,310 203,058 132,748
LATCH SQL memory manager workarea list latch 444,039 737,954 293,915
STAT session pga memory max 786,432 0 -786,432
STAT session pga memory -589,824 262,144 851,968
LATCH row cache objects 1,388,762 2,434,648 1,045,886
STAT consistent gets 1,573,400 36,422 -1,536,978
STAT consistent gets direct 1,571,459 34,481 -1,536,978
STAT no work - consistent read gets 1,571,443 34,465 -1,536,978
STAT session logical reads 1,573,400 36,422 -1,536,978
STAT table scan blocks gotten 1,571,459 34,481 -1,536,978
STAT recursive calls 1,571,234 845 -1,570,389
LATCH cache buffers chains 2,947,831 7,496,323 4,548,492
STAT table scan rows gotten 83,004,147 1,586,493 -81,417,654
-----------------------------------------------------------------------------------------------
3. Latching report
-----------------------------------------------------------------------------------------------
Type Name Run1 Run2 Diff
----- -------------------------------------------------- ------------ ------------ ------------
LATCH total latches used 4,891,602 10,945,762 6,054,160
Comments:
1) Run1 used 55.3% fewer latches than Run2
-----------------------------------------------------------------------------------------------
4. Time model report
-----------------------------------------------------------------------------------------------
Type Name Run1 Run2 Diff
----- -------------------------------------------------- ------------ ------------ ------------
TIME parse time elapsed 316 337 21
TIME PL/SQL execution elapsed time 8,143 7,374 -769
TIME DB CPU 267,534 144,606 -122,928
TIME sql execute elapsed time 430,322 143,631 -286,691
TIME DB time 431,404 144,606 -286,798

Tom Kyte
June 29, 2012 - 10:53 am UTC

post the code for us to run, all of it. complete example

Procedural parallelism

Juan, July 02, 2012 - 5:29 am UTC

>> Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Prod

drop table t1;

create table t1
as
SELECT ID
FROM(
SELECT ROWNUM id
FROM ( SELECT 1 just_a_column
FROM dual
CONNECT BY LEVEL <= 2000000
))
WHERE ID NOT IN(1,10);

ALTER TABLE t1
add CONSTRAINT t1_PK PRIMARY KEY (id);

exec dbms_stats.gather_table_stats( user, 't1' )

drop type t2_tab_type;
drop type t2_type;

CREATE OR REPLACE TYPE t2_type
AS OBJECT (
id number,
text varchar2(30),
session_id number
);

create or replace type t2_tab_type
as table of t2_type;
/

create or replace
function parallel_pipelined( l_cursor in sys_refcursor )
return t2_tab_type
pipelined
parallel_enable ( partition l_cursor by any )
is
    l_session_id number;
    v_id         t1.id%type;
begin
    /* select sid into l_session_id
         from v$mystat
        where rownum = 1; */
    loop
        fetch l_cursor into v_id;
        exit when l_cursor%notfound;
        -- complex process here
        pipe row( t2_type( log(power(log(power(v_id,2)/v_id,10),3),2)
                         + log(power(log(power(v_id,2)/v_id,10),3),2),
                           'text', l_session_id ) );
    end loop;
    close l_cursor;
    return;
end;
/

select sum(id)
from table(parallel_pipelined
(CURSOR(select /*+ parallel(t1) */ id
from t1)
));

-524599.237324657
01:40 mins

select sum(result)
from(
select /*+ parallel(t1) */ log(power(log(power(id,2)/id,10),3),2)+log(power(log(power(id,2)/id,10),3),2) result
from t1
);

-524599.237324657
02:56 mins

select sum(result)
from(
select /*+ parallel(t1) */ log(power(log(power(cast(id as float),2)/cast(id as float),10),3),2)+log(power(log(power(cast(id as float),2)/cast(id as float),10),3),2) result
from t1
);

-524599.237324657
02:56 mins

Many thanks.
Tom Kyte
July 02, 2012 - 8:54 am UTC

that is interesting, and I can reproduce (but cannot yet explain...)

However, what happens with:

ops$tkyte%ORA10GR2> set numformat 999999999.999999999
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select sum(id)
  2  from table(parallel_pipelined
  3            (CURSOR(select /*+ parallel(t1, 5) */  id
  4                      from  t1)
  5             ));

             SUM(ID)
--------------------
    -58890.523578653

Elapsed: 00:00:27.72
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select sum(result)
  2  from(
  3  select  /*+ parallel(t1, 5) */
  4  log(
  5    power(
  6        log(
  7            power(id,2d)/id,10d
  8               ),
  9            3d),
 10    2d)+
 11  log(
 12     power(
 13        log(
 14               power(id,2d)/id,
 15             10d),
 16            3d),
 17    2d) result
 18  from (select cast(id as binary_double) id from t1)
 19  );

         SUM(RESULT)
--------------------
    -58890.523578653

Elapsed: 00:00:01.17
ops$tkyte%ORA10GR2> set timing off



for you (I'm getting a different number than you because I'm using a slightly different set of rows in my testing....)


Procedural parallelism

Juan, July 02, 2012 - 9:10 am UTC

Many thanks for looking into it.
Hope you can find an explanation as we have lots of SQL queries doing similar calculations. It might help us to improve them. I would like to know why, more than only an empirical confirmation.

Cheers.
Tom Kyte
July 02, 2012 - 1:42 pm UTC

did you see the use of the binary double there - do you see such a difference? It was orders of magnitude faster all of a sudden.

did you try the pure sql approach i just gave you above?

Procedural parallelism

A reader, July 03, 2012 - 3:17 am UTC

Ok, thanks! I wasn't aware of the use of binary double for better performance.

But I am still getting better performance with the parallel pipelined function under the same conditions:

select sum(result)
from(
select /*+ parallel(t1, 5) */
log(
power(
log(
power(id,2d)/id,10d
),
3d),
2d)+
log(
power(
log(
power(id,2d)/id,
10d),
3d),
2d) result
from (select cast(id as binary_double) id from t1)
);

00:25 seconds
-524599.237324659

create or replace
function parallel_pipelined( l_cursor in sys_refcursor )
return t2_tab_type
pipelined
parallel_enable ( partition l_cursor by any )
is
    l_session_id number;
    v_id         t1.id%type;
begin
    /* select sid into l_session_id
         from v$mystat
        where rownum = 1; */
    loop
        fetch l_cursor into v_id;
        exit when l_cursor%notfound;
        -- complex process here
        pipe row( t2_type( log(power(log(power(v_id,2d)/v_id,10d),3d),2d)
                         + log(power(log(power(v_id,2d)/v_id,10d),3d),2d),
                           'text', l_session_id ) );
    end loop;
    close l_cursor;
    return;
end;
/

select sum(id)
from table(parallel_pipelined
(CURSOR(select /*+ parallel(t1,5) */ cast(id as binary_double) id
from t1)
));
00:21 seconds
-524599.237324657


I guess 2d is the same as cast(2 as binary_double) ?

Many thanks.
Tom Kyte
July 03, 2012 - 8:48 am UTC

I don't see 25 and 21 being "different" really, at that point, they are too close to call.

I agree there is something going on - but there is no switch or anything I know of to throw. You might consider filing your test case with support to get a bug started. The SQL guys won't like it that plsql toasts them so bad..


2d is a little better than cast(2 as binary_double) - no function call overhead, potentially for each row. I'd strongly suggest 2d.

Segment allocation during parallel processing

Juj, July 06, 2012 - 4:03 am UTC

Hi Tom,

Correct me if I am wrong, but does parallel processing somehow invoke extra extent allocation, or any fragmentation for that matter?

We have a table in the production environment (Oracle9i EER 9.2.0.8.0) which is about 120 GB in size. There are two indexes created on this table: one local and the other global.

Now we need to migrate this table and its dependencies to a new version (Oracle11g EER 11.2.0.2.0). We have created the indexes with PARALLEL degree 16.
But then, as a cost of the faster parallel process, we now find that the total space occupied by these indexes in the target is much larger than that in the source.

In source, for INDEXES:

SQL> SELECT SUM(BYTES)/1024/1024/1024 from dba_segments@<db_link_source> WHERE OWNER=<owner> AND SEGMENT_TYPE LIKE '%INDEX%';

SUM(BYTES)/1024/1024/1024
-------------------------
               56.4414063


In target, for INDEXES:

SQL> SELECT SUM(BYTES)/1024/1024/1024 from dba_segments WHERE OWNER=<owner> AND SEGMENT_TYPE LIKE '%INDEX%';

SUM(BYTES)/1024/1024/1024
-------------------------
               110.025391


The indexes are valid in both source and target and neither of them is compressed.


Juj, July 06, 2012 - 4:30 am UTC

I missed a part: These are the only two indexes that this owner has.

Thanks

Juj, July 16, 2012 - 3:24 am UTC

Thank you Tom,

It was such an informative article.

Advantages of uniform allocation

Juj, July 17, 2012 - 6:48 am UTC

Tom,

1) With benefits like automatic extent management and "extent trimming", why would people choose uniform space allocation when creating tablespaces?

As you say, the golden rule is to understand the requirement first, but can there be any requirement for uniform allocation at the tablespace level? I guess it would require the user to predict sizes accurately, whereas letting Oracle manage space should be much better.

One more thing:
if I create an LMT with AUTOALLOCATE and then create a table having storage clauses such as INITIAL and NEXT, which one overrides the other? Will it honor the INITIAL/NEXT values at the object level, or will it use the space management clause of the tablespace?

Thanks a lot for your help.
Tom Kyte
July 17, 2012 - 9:16 am UTC

1) I did initially when this feature first came out for the sole reason that "it felt better, I felt more in control"

I lost control sometime during 10gR1 :) The advantages of system allocated extents with extent trimming when appropriate and just the plain fact that you don't have to place objects by their size - but rather by their purpose outweighed that good feeling.



if you use a storage clause in a locally managed tablespace - uniform or not - we take the storage clause, figure out how much space we WOULD HAVE allocated initially in a dictionary managed tablespace, and allocate AT LEAST that much storage initially.

and then we forget about it. we ignore INITIAL, NEXT and PCTINCREASE from then on.

even with segment creation deferred in 11g:

ops$tkyte%ORA11GR2> create table t ( x int ) storage( initial 2m next 1m minextents 5 );

Table created.

ops$tkyte%ORA11GR2> select extent_id, blocks*8/1024 from dba_extents where owner = user and segment_name = 'T';

no rows selected

ops$tkyte%ORA11GR2> insert into t values (  1 );

1 row created.

ops$tkyte%ORA11GR2> select extent_id, blocks*8/1024 from dba_extents where owner = user and segment_name = 'T';

 EXTENT_ID BLOCKS*8/1024
---------- -------------
         0             1
         1             1
         2             1
         3             1
         4             1
         5             1

6 rows selected.


ops$tkyte%ORA11GR2> 



since an initial of 2m, next of 1m, minextents of 5 would create a 6m segment to start with in a dictionary managed tablespace - it created a table with six 1 MB extents.

Parallelism on Joins

Snehasish Das, July 24, 2012 - 8:11 am UTC

Hi Tom,

I have a general question on parallelism in joins. Recently one of my seniors told me that for a parallel join to happen we need to have parallelism on both tables.

Select /*+ FULL(a) full(b) parallel(a,4) parallel(b,4) */ a.* from
emp a,dept b
where a.dept_no = b.dept_no;

But if we do

Select /*+ FULL(a) full(b) parallel(a,4) no_parallel(b) */ a.* from
emp a,dept b
where a.dept_no = b.dept_no;

The query will have large waits on serial to parallel conversion or vice versa.

Say the emp table is huge, about 5 GB, and the dept table is quite small, about 50 MB. So parallelism on the dept table is not necessary (a full scan of dept will take only a few seconds).

1. Is it really required that both tables have a similar degree of parallelism for a parallel join to happen?
2. If one is parallel and the other is serial, will we have a parallel-to-serial wait event hampering the query?

Regards,
Snehasish das.
Tom Kyte
July 30, 2012 - 9:17 am UTC

... The query will have large waits on serial to parallel conversion or vice versa. ...

maybe yes, maybe no, maybe neither yes nor no. It depends.


and see, you came up with a counter case! ask your 'seniors' about it and see what they say, benchmark it, evaluate it.


and in real life, if you had primary/foreign keys in place - then the DEPT table would not even be accessed!!!!!

ops$tkyte%ORA11GR2> alter table dept add constraint dept_pk primary key(deptno);

Table altered.

ops$tkyte%ORA11GR2> alter table emp add constraint emp_fk_dept foreign key (deptno) references dept;

Table altered.

ops$tkyte%ORA11GR2> alter table emp modify deptno NOT NULL;

Table altered.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> Select /*+ FULL(a) full(b) parallel(a,4) no_parallel(b) */ a.*
  2    from emp a,dept b
  3   where a.deptno = b.deptno;

Execution Plan
----------------------------------------------------------
Plan hash value: 2873591275

--------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |    14 |  1218 |     2   (0)| 00:00:01 |        |      |            |
|   1 |  PX COORDINATOR      |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)| :TQ10000 |    14 |  1218 |     2   (0)| 00:00:01 |  Q1,00 | P->S | QC (RAND)  |
|   3 |    PX BLOCK ITERATOR |          |    14 |  1218 |     2   (0)| 00:00:01 |  Q1,00 | PCWC |            |
|   4 |     TABLE ACCESS FULL| EMP      |    14 |  1218 |     2   (0)| 00:00:01 |  Q1,00 | PCWP |            |
--------------------------------------------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)

ops$tkyte%ORA11GR2> set autotrace off

create index parallel degree

beer, August 14, 2012 - 11:50 am UTC

Hi Tom!

I create index like that

create index schema.index_name.....parallel 10;

but I see only four processes, like p000 p001 p003 p004

Can I increase the parallel processes to 10? I have 14 CPUs.

Tom Kyte
August 17, 2012 - 2:40 pm UTC

what are your parallel settings? show parameter parallel

create index parallel degree

beer, August 21, 2012 - 2:32 am UTC

SQL> show parameter parallel

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
fast_start_parallel_rollback         string      LOW
log_parallelism                      integer     1
parallel_adaptive_multi_user         boolean     FALSE
parallel_automatic_tuning            boolean     FALSE
parallel_execution_message_size      integer     2152
parallel_instance_group              string
parallel_max_servers                 integer     5
parallel_min_percent                 integer     0
parallel_min_servers                 integer     0
parallel_server                      boolean     FALSE
parallel_server_instances            integer     1

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
parallel_threads_per_cpu             integer     2
recovery_parallelism                 integer     0

Tom Kyte
August 28, 2012 - 1:05 pm UTC

parallel_max_servers integer 5


read up on that parameter.
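With parallel_max_servers capped at 5, the instance simply cannot start 10 PX slaves. A minimal sketch of raising the cap (assuming an spfile; the value of 20 is illustrative - size it to your CPU and IO capacity):

```sql
-- Raise the instance-wide cap on parallel execution servers.
-- parallel_max_servers is dynamically modifiable, so no restart
-- is needed when changing it with scope=both.
alter system set parallel_max_servers = 20 scope = both;

-- Verify:
show parameter parallel_max_servers
```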

from the docs

Dmitri, October 25, 2012 - 3:27 am UTC

Manually Specifying the Degree of Parallelism

A specific DOP can be requested from Oracle Database. For example, you can set a fixed DOP at a table or index level:

ALTER TABLE sales PARALLEL 8;


In this case, queries accessing just the sales table use a requested DOP of 8.

Does this mean EVERY query I run against the sales table will be run in parallel?






Tom Kyte
October 25, 2012 - 9:11 am UTC

no

ops$tkyte%ORA11GR2> create table t as select * from all_objects;

Table created.

ops$tkyte%ORA11GR2> alter table t add constraint t_pk primary key(object_id);

Table altered.

ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'T' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> alter table t parallel 8;

Table altered.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> select * from t;

Execution Plan
----------------------------------------------------------
Plan hash value: 3050126167

--------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          | 72847 |  6900K|    40   (0)| 00:00:01 |        |      |            |
|   1 |  PX COORDINATOR      |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)| :TQ10000 | 72847 |  6900K|    40   (0)| 00:00:01 |  Q1,00 | P->S | QC (RAND)  |
|   3 |    PX BLOCK ITERATOR |          | 72847 |  6900K|    40   (0)| 00:00:01 |  Q1,00 | PCWC |            |
|   4 |     TABLE ACCESS FULL| T        | 72847 |  6900K|    40   (0)| 00:00:01 |  Q1,00 | PCWP |            |
--------------------------------------------------------------------------------------------------------------

ops$tkyte%ORA11GR2> select * from t where object_id = 42;

Execution Plan
----------------------------------------------------------
Plan hash value: 1303508680

------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |     1 |    97 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T    |     1 |    97 |     2   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | T_PK |     1 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_ID"=42)

ops$tkyte%ORA11GR2> set autotrace off
ops$tkyte%ORA11GR2> 

from the docs

Dmitri, October 29, 2012 - 9:51 am UTC

Interesting. In your example above, do you have PARALLEL_DEGREE_POLICY = AUTO ? Just curious how you got the first query to use PQ and the second not to.

Thanks
Tom Kyte
October 29, 2012 - 8:58 pm UTC

all defaults in 11gr2:

ops$tkyte%ORA11GR2> l
  1  select name || ' = ' || value nm, isdefault from v$parameter
  2* where name like 'parallel%'
ops$tkyte%ORA11GR2> /

NM                                                 ISDEFAULT
-------------------------------------------------- ---------
parallel_server = FALSE                            TRUE
parallel_server_instances = 1                      TRUE
parallel_min_percent = 0                           TRUE
parallel_min_servers = 0                           TRUE
parallel_max_servers = 160                         TRUE
parallel_instance_group =                          TRUE
parallel_execution_message_size = 16384            TRUE
parallel_degree_policy = MANUAL                    TRUE
parallel_adaptive_multi_user = TRUE                TRUE
parallel_threads_per_cpu = 2                       TRUE
parallel_automatic_tuning = FALSE                  TRUE
parallel_io_cap_enabled = FALSE                    TRUE
parallel_min_time_threshold = AUTO                 TRUE
parallel_degree_limit = CPU                        TRUE
parallel_force_local = FALSE                       TRUE
parallel_servers_target = 64                       TRUE

16 rows selected.



11g has a parallel_min_time_threshold... if a query is deemed to run fast enough serially, the parallel plan is not developed.
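The threshold comes into play with automatic degree of parallelism. A sketch of experimenting with it at the session level (assumes 11gR2; AUTO for the threshold corresponds to roughly 10 seconds by default):

```sql
-- Let the optimizer compute the DOP automatically, and only
-- consider parallelism for statements whose estimated serial
-- runtime exceeds 1 second.
alter session set parallel_degree_policy = auto;
alter session set parallel_min_time_threshold = 1;
```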

ORA 12838 on partitioned table

Amit, February 20, 2013 - 12:26 pm UTC

Hi Tom,

Following is one test scenario. I have created 2 tables, SOURCE and DEST, partitioned by range. The SOURCE table is loaded with dummy data in both partitions. There are no indexes/constraints created on these 2 tables.

drop table source;

CREATE TABLE source
(invoice_no NUMBER NOT NULL,
invoice_date DATE NOT NULL
)
PARTITION BY RANGE (invoice_date)
(PARTITION source_q1 VALUES LESS THAN (TO_DATE('01/01/2013', 'MM/DD/YYYY')),
PARTITION source_q2 VALUES LESS THAN (TO_DATE('02/01/2013', 'MM/DD/YYYY'))
);

drop table dest;

CREATE TABLE dest
(invoice_no NUMBER NOT NULL,
invoice_date DATE NOT NULL
)
PARTITION BY RANGE (invoice_date)
(PARTITION dest_q1 VALUES LESS THAN (TO_DATE('01/01/2013', 'MM/DD/YYYY')),
PARTITION dest_q2 VALUES LESS THAN (TO_DATE('02/01/2013', 'MM/DD/YYYY'))
);

INSERT /*+ append */ INTO source
SELECT object_id,
TO_DATE('01/01/2013', 'MM/DD/YYYY') - MOD(rownum, 20) - 1
FROM all_objects;

commit;

INSERT /*+ append */ INTO source
SELECT object_id,
TO_DATE('02/01/2013', 'MM/DD/YYYY') - MOD(rownum, 20) - 1
FROM all_objects;

commit;

select count(1)
from source partition(source_q1);

COUNT(1)
--------
44773

select count(1)
from source partition(source_q2);

COUNT(1)
--------
44773

ALTER SESSION ENABLE PARALLEL DML;

-- After SOURCE table is loaded, one of the partitions is copied to DEST table.

INSERT /*+ append */ INTO dest
select *
from source partition(source_q1);

-- Without issuing COMMIT, when I tried to query second (empty) partition, I got ORA-12838 error.

select *
from dest partition(dest_q2);

Error starting at line 73 in command:
select *
from dest partition(dest_q2)
Error report:
SQL Error: ORA-12838: cannot read/modify an object after modifying it in parallel
12838. 00000 - "cannot read/modify an object after modifying it in parallel"
*Cause: Within the same transaction, an attempt was made to add read or
modification statements on a table after it had been modified in parallel
or with direct load. This is not permitted.
*Action: Rewrite the transaction, or break it up into two transactions
one containing the initial modification and the second containing the
parallel modification operation.

I was expecting that only accessing the partition that was modified in parallel would cause the ORA-12838 error. But after observing this behavior, I do not understand the reason behind it. Can you please explain what causes this? Why can't Oracle allow operations on other partitions?

[My actual scenario was to copy 50% of the data from the source table to the destination. Since both tables are partitioned and contain a huge amount of data (the source table has almost 80 partitions and each partition contains 20-30 million records), I was planning to copy the data partition by partition and was checking the possibility of running different queries from different sessions concurrently.]
Tom Kyte
February 25, 2013 - 8:39 am UTC

why are you not just exchanging the data in? do you really really need to copy it - or are you trying to load it? If you are loading - just exchange - no data will move then.


but short of that, create a new table as select (no locking issue) and then alter that table in as an exchanged partition.
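The CTAS-plus-exchange approach might look like this sketch (table and partition names are placeholders from the question; the staging table must match dest's column structure, and any local indexes need matching handling):

```sql
-- Build the data for one partition with no locking issues
-- (the CTAS runs in its own transaction):
create table dest_q1_stage as
select * from source partition (source_q1);

-- Swap the populated table in as a partition. This is a
-- dictionary operation - no data actually moves:
alter table dest
  exchange partition dest_q1
  with table dest_q1_stage;
```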

ORA 12838 on partitioned table

Amit, February 26, 2013 - 4:34 pm UTC

I am not exchanging partitions because I want to keep only 50% of data from every partition.
Tom Kyte
February 26, 2013 - 5:07 pm UTC

does that mean you are attempting to delete 50% of the data somewhere?

can you post the entire process (the action you are doing will generate the direct path error - cannot read until commit - and so we can try to find a way to avoid that - but we need to understand the flow)

ORA 12838 on partitioned table

Amit, February 27, 2013 - 12:27 pm UTC

Sorry Tom,
I should have been clearer in my scenario.
My requirement is to keep 50% of the data in the SOURCE table. This table is partitioned, and every partition is going to keep 50% of its data. So, instead of deleting data from SOURCE, I am copying the 50% of the data to keep from the SOURCE table to the DEST table. Once the data is verified in the DEST table, SOURCE will be dropped. Both tables are partitioned and have the same structure.

Since I need to keep all the partitions, I am copying 50% of the data from every partition. When I tried to execute the copy statements for different partitions concurrently (to speed up the overall copy process), I got the ORA-12838 error. I can certainly execute these copy operations serially and COMMIT after every partition is copied. But this behavior made me curious, because I was trying to work on 2 different partitions concurrently. So why isn't Oracle allowing operations on other partitions? Is there some table-level property that disallows direct path operations on different partitions of the same table?

drop table source;

CREATE TABLE source
(invoice_no NUMBER NOT NULL,
invoice_date DATE NOT NULL
)
PARTITION BY RANGE (invoice_date)
(PARTITION source_q1 VALUES LESS THAN (TO_DATE('01/01/2013', 'MM/DD/YYYY')),
PARTITION source_q2 VALUES LESS THAN (TO_DATE('02/01/2013', 'MM/DD/YYYY'))
);

drop table dest;

CREATE TABLE dest
(invoice_no NUMBER NOT NULL,
invoice_date DATE NOT NULL
)
PARTITION BY RANGE (invoice_date)
(PARTITION dest_q1 VALUES LESS THAN (TO_DATE('01/01/2013', 'MM/DD/YYYY')),
PARTITION dest_q2 VALUES LESS THAN (TO_DATE('02/01/2013', 'MM/DD/YYYY'))
);

INSERT /*+ append */ INTO source
SELECT object_id,
TO_DATE('01/01/2013', 'MM/DD/YYYY') - MOD(rownum, 20) - 1
FROM all_objects;

commit;

INSERT /*+ append */ INTO source
SELECT object_id,
TO_DATE('02/01/2013', 'MM/DD/YYYY') - MOD(rownum, 20) - 1
FROM all_objects;

commit;

select count(1)
from source partition(source_q1);

COUNT(1)
--------
44773

select count(1)
from source partition(source_q2);

COUNT(1)
--------
44773

ALTER SESSION ENABLE PARALLEL DML;

-- After SOURCE table is loaded, one of the partitions is copied to DEST table.

INSERT /*+ append */ INTO dest
select *
from source partition(source_q1)
where invoice_date between to_date('12/22/2012', 'mm/dd/yyyy') AND to_date('12/31/2012', 'mm/dd/yyyy');

-- Without issuing COMMIT, when I tried to insert data into second partition at the same time, I got ORA-12838 error.

INSERT /*+ append */ INTO dest
select *
from source partition(source_q2)
where invoice_date between to_date('01/22/2013', 'mm/dd/yyyy') AND to_date('01/31/2013', 'mm/dd/yyyy');

Hope this clarifies my question.

Thanks!
Tom Kyte
February 27, 2013 - 1:42 pm UTC

that is because the direct path insert doesn't generate undo and can skip redo.

can't you just do one insert using parallel DML, let us parallelize everything?




but we can do the inserts into different partitions in different sessions - that isn't the issue. the issue was with your select.

if you

insert /*+ append */ into dest partition(pname)select * from source....;

and do that with different pnames in different transactions - it'll be OK.


ops$tkyte%ORA11GR2> CREATE TABLE t
  2  (
  3    dt  date,
  4    x   int,
  5    y   varchar2(30)
  6  )
  7  PARTITION BY RANGE (dt)
  8  (
  9    PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
 10    PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
 11    PARTITION junk VALUES LESS THAN (MAXVALUE)
 12  )
 13  /

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> insert /*+ append */ into t select to_date( '12-mar-2003' ), 1, 'x' from dual;

1 row created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> declare
  2          pragma autonomous_transaction;
  3  begin
  4          insert /*+ append */ into t select to_date( '13-mar-2003' ), 1, 'x' from dual;
  5          commit;
  6  end;
  7  /
declare
*
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource
ORA-06512: at line 4


ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> commit;

Commit complete.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> truncate table t;

Table truncated.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> insert /*+ append */ into t partition(part1) select to_date( '12-mar-2003' ), 1, 'x' from dual;

1 row created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> declare
  2          pragma autonomous_transaction;
  3  begin
  4          insert /*+ append */ into t partition(part2)  select to_date( '13-mar-2003' ), 1, 'x' from dual;
  5          commit;
  6  end;
  7  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> commit;

Commit complete.



in the first case, without using the extended partition name - the second insert was blocked by the first - it deadlocked because I used a single session in this case to demo with..


In the second case - they both can go concurrently because we know they work on different partitions.

however, neither would be able to read the table until they commit.

ORA 12838 on partitioned table

Amit, February 27, 2013 - 12:40 pm UTC

Sorry, my mistake! It does not throw ORA-12838, but the query in the second session waits until the first session COMMITs.
I tried with a conventional INSERT and both INSERTs ran concurrently. I am still confused as to why there is this restriction across different partitions.

Thanks!
Tom Kyte
February 27, 2013 - 1:56 pm UTC

you cannot read from a table that you've done a direct path operation on.

ORA 12838 on partitioned table

Amit, February 27, 2013 - 2:28 pm UTC

Wow, why couldn't I think of it!

Thanks a lot, Tom.

RECOVERY_PARALLELISM

sunny gambhir, May 01, 2013 - 9:26 pm UTC

I am a bit confused regarding the setting of the RECOVERY_PARALLELISM parameter.
I need to change RECOVERY_PARALLELISM to 16.
How should I do that?
Also, will it require a DB restart?



Tom Kyte
May 06, 2013 - 3:47 pm UTC

you don't know how to change an init.ora parameter???

if you are using a parameter file, edit it and add it.

if you are using an spfile (stored parameter file), use the alter system command to set it (with scope=spfile).


Since this affects how the database restarts after an instance crash - it would require a database restart to USE it (since it is only used during database startup, period) - but you don't have to restart the database (since it is already started and you don't have a crash scenario right now!)
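For an spfile-based instance, that boils down to something like this sketch (the value is only read at instance startup, so it takes effect at the next restart):

```sql
-- Recorded in the spfile only; consulted during instance startup
-- (crash/instance recovery), so the running instance is unaffected.
alter system set recovery_parallelism = 16 scope = spfile;
```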

Thanks

sunny gambhir, May 08, 2013 - 1:49 am UTC

Thanks a lot Tom.
I made the changes as above and set FAST_START_PARALLEL_ROLLBACK='HIGH',
but it didn't improve my rollback time for a particular job.
The job updates about 2.5 million rows in a table of 800 million rows in 30 minutes, and rolls back in 2 hours.
The table has indexes as well.
Can you suggest another way to improve the rollback timings?
Tom Kyte
May 08, 2013 - 1:09 pm UTC

that is because this is a "fast start", it is only for instance crash recovery. As I tried to say above:

Since this affects how the database restarts after an instance crash - it would require a database restart to USE it (since it is only used during database startup, period) -


My suggestion would be:

do not update 2.5 million rows
use DDL to select out what you want, drop and rename.

http://www.oracle.com/technetwork/issue-archive/2013/13-may/o33asktom-1917335.html


rolling back is hugely expensive, it is good for backing out an OLTP transaction of a few rows. it always works slow by slow (if you bulk update X rows and roll it back, we'll un-update X rows one by one to put them back the way they were). We are optimized to commit, never to rollback.


To perform such a large operation, I would be looking at DDL.
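A sketch of the DDL approach from the linked article (all names and the CASE expression are placeholders; grants, indexes, and constraints must be recreated on the new table before the swap is complete):

```sql
-- Create a new table containing the rows as you want them to be,
-- applying the "update" in the SELECT itself - no undo for the
-- changed rows, and NOLOGGING can skip most redo:
create table big_table_new
nologging
as
select pk_col,
       case when status = 'OLD' then 'NEW' else status end status
  from big_table;

-- Then drop the original and rename the new table into place:
drop table big_table;
alter table big_table_new rename to big_table;
```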

restrict parallelisation?

Markus, June 21, 2013 - 2:22 pm UTC

A Microsoft Premier Support Field Engineer advised us to turn off parallel query for SQL Server on Windows.

He argued SQL Server would parallelise queries over several CPUs and cores, which could force some parallel threads to exchange data between their CPUs' second-level caches. This might slow down the whole processing.

He advised us to restrict parallel queries to the number of cores per processor on physical machines, or even to turn parallelisation off in virtualised environments (as no-one knows which processor a logical CPU is on).

Is this also true for Oracle? Are there any similar points to consider?
Tom Kyte
July 01, 2013 - 5:07 pm UTC

get a new premier support field engineer - even MS would know that sometimes parallel query is great, sometimes it is horrible.

but to make a blanket statement like this is beyond "not smart".

if you OVER parallelize - try to use more than you have (the three resources: IO bandwidth, CPU, memory) - that'll be bad.

But if you have idle IO bandwidth, and cpus sitting around (you cannot put cpu in the bank and save it up), and memory - and a query that is going to do a ton of physical IO - ask your support engineer which takes longer - physical IO from disk to memory to cpu cache or.... memory to cpu cache.


Yes, you do not want to overwhelm the machine, and certainly one way to not overwhelm the machine is to run your big queries serially - but then you are not using your machine, are you?


yes, the same rules would apply for Oracle - for *any* software in fact. Use resources wisely. turning off parallel willy nilly like that is not "wise".

see:
http://tinyurl.com/RWP-OLTP-CONNECTIONS
http://tinyurl.com/RWP-DW-PART4

parallel hint

PCS, June 26, 2013 - 7:23 am UTC

Hi Tom,

I have a very basic question on parallelism ,you can say kind of naive one :).

While selecting data in parallel (using a parallel hint in select queries), what could be the disadvantages of using parallel hints?

I know one is that it eats up resources. If I have enough resources available and I also use parallel hints with degrees ( parallel(table_name,4) ), are there any other side effects of parallelism I have to keep in mind? If not, isn't it advisable to always run select queries in parallel, restricted by degree (obviously with enough resources)?

Parallel DDL

Rajeshwaran, June 28, 2013 - 12:14 am UTC

Tom:

I was reading about partitioning in the docs at the link below, where there is an example showing parallel DDL along with the session enabled for parallel DDL.

q1) do we need to enable a session for parallel ddl like we do for parallel dml?
I did a test in my local db and found that "we don't need to enable the session for parallel ddl" - please let me know your thoughts.
http://docs.oracle.com/cd/E11882_01/server.112/e25523/part_warehouse.htm#CHDCBDHB
<quote src=example_from_docs>
ALTER SESSION enable parallel ddl;

ALTER TABLE sales
MOVE PARTITION sales_1995
COMPRESS FOR OLTP
PARALLEL NOLOGGING;

</quote>

my test case (ran this on session#1 and monitored v$px_session from session#2 )

rajesh@ORA11G> select userenv('sid') from dual;

USERENV('SID')
--------------
           156

1 row selected.
rajesh@ORA11G> create table big_table2
  2  partition by hash(id)
  3  partitions 4
  4  parallel 4
  5  as
  6  select * from big_table ;

Monitoring from another session, I got this:
rajesh@ORA11G> l
  1  select inst_id,sid,serial#,qcsid,degree,req_degree
  2  from gv$px_session
  3* where qcsid = 156
rajesh@ORA11G> /

   INST_ID        SID    SERIAL#      QCSID     DEGREE REQ_DEGREE
---------- ---------- ---------- ---------- ---------- ----------
         1        145       9541        156          4          4
         1        133        284        156          4          4
         1        135         43        156          4          4
         1        124         13        156          4          4
         1        121      15590        156          4          4
         1        142         23        156          4          4
         1        148        840        156          4          4
         1        125         39        156          4          4
         1        156      12142        156

9 rows selected.


Tom Kyte
July 01, 2013 - 9:09 pm UTC

it depends on the DDL. create table as select doesn't need it
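One way to confirm what a session currently has enabled is the same v$session columns queried later in this thread, generalized to the current session (sys_context('userenv','sid') needs 10g or later):

```sql
-- ENABLED / DISABLED / FORCED status of parallel DML, DDL,
-- and query for the current session:
select pdml_status, pddl_status, pq_status
  from v$session
 where sid = sys_context('userenv', 'sid');
```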

Parallel_DDL

Rajeshwaran, July 02, 2013 - 7:39 am UTC

it depends on the DDL.

I tried using enable parallel ddl at the session level, but still no parallel servers were spawned. (Is it that ALTER TABLE ... MODIFY column cannot be run in parallel? I am using 11.2.0.1 on win7 32bit.)

rajesh@ORA11G> select userenv('sid') from dual;

USERENV('SID')
--------------
            20

1 row selected.
rajesh@ORA11G> alter session enable parallel ddl;

Session altered.
rajesh@ORA11G> alter table big_table modify id not null ;

Table altered.
rajesh@ORA11G> select degree
  2  from user_tables
  3  where table_name='BIG_TABLE';

DEGREE
----------
         4

rajesh@ORA11G>

In session#2 I see this. (while alter table modify is running I did the below query in session#2)
rajesh@ORA11G> select *
  2  from gv$session
  3  where (inst_id,sid,serial#) in 
  4    ( select inst_id,sid,serial#
  5    from gv$px_session
  6    where qcsid = 20 )
  7  order by sid ;

no rows selected
rajesh@ORA11G> select inst_id,event,wait_class,pddl_status from gv$session where sid = 20 ;

   INST_ID EVENT                WAIT_CLASS           PDDL_STA
---------- -------------------- -------------------- --------
         1 db file scattered re User I/O             ENABLED
           ad

1 row selected.

Tom Kyte
July 02, 2013 - 5:04 pm UTC

big_table%ORA11GR2> alter session enable parallel dml;

Session altered.

big_table%ORA11GR2> alter table big_table parallel 8;

Table altered.

big_table%ORA11GR2> alter table big_table modify object_id constraint oid_not_null NOT NULL novalidate;

Table altered.

big_table%ORA11GR2> alter table big_table modify constraint oid_not_null validate;




the constraint validation can run in parallel. querying another session while that was running showed me:

--------------------
BIG_TABLE(6,87) ospid = 14551 command = 3 program =
oracle@localhost.localdomain (P005) dedicated server=14551
Tuesday   12:58  Tuesday   12:59 last et = 54
select /*+ all_rows ordered */ A.rowid, :1, :2, :3 from "BIG_TA
BLE"."BIG_TABLE" A where not ( "OBJECT_ID" IS NOT NULL)
--------------------
BIG_TABLE(97,977) ospid = 14541 command = 3 program =
oracle@localhost.localdomain (P000) dedicated server=14541
Tuesday   12:58  Tuesday   12:59 last et = 54
select /*+ all_rows ordered */ A.rowid, :1, :2, :3 from "BIG_TA
BLE"."BIG_TABLE" A where not ( "OBJECT_ID" IS NOT NULL)
--------------------
BIG_TABLE(35,59) ospid = 14553 command = 3 program =
oracle@localhost.localdomain (P006) dedicated server=14553
Tuesday   12:58  Tuesday   12:59 last et = 54
select /*+ all_rows ordered */ A.rowid, :1, :2, :3 from "BIG_TA
BLE"."BIG_TABLE" A where not ( "OBJECT_ID" IS NOT NULL)
--------------------
BIG_TABLE(66,473) ospid = 14555 command = 3 program =
oracle@localhost.localdomain (P007) dedicated server=14555
Tuesday   12:58  Tuesday   12:59 last et = 54
select /*+ all_rows ordered */ A.rowid, :1, :2, :3 from "BIG_TA
BLE"."BIG_TABLE" A where not ( "OBJECT_ID" IS NOT NULL)
--------------------
BIG_TABLE(223,43) ospid = 14549 command = 3 program =
oracle@localhost.localdomain (P004) dedicated server=14549
Tuesday   12:58  Tuesday   12:59 last et = 54
select /*+ all_rows ordered */ A.rowid, :1, :2, :3 from "BIG_TA
BLE"."BIG_TABLE" A where not ( "OBJECT_ID" IS NOT NULL)
--------------------
BIG_TABLE(130,499) ospid = 14543 command = 3 program =
oracle@localhost.localdomain (P001) dedicated server=14543
Tuesday   12:58  Tuesday   12:59 last et = 54
select /*+ all_rows ordered */ A.rowid, :1, :2, :3 from "BIG_TA
BLE"."BIG_TABLE" A where not ( "OBJECT_ID" IS NOT NULL)
--------------------
BIG_TABLE(160,1205) ospid = 14545 command = 3 program =
oracle@localhost.localdomain (P002) dedicated server=14545
Tuesday   12:58  Tuesday   12:59 last et = 54
select /*+ all_rows ordered */ A.rowid, :1, :2, :3 from "BIG_TA
BLE"."BIG_TABLE" A where not ( "OBJECT_ID" IS NOT NULL)
--------------------
BIG_TABLE(192,403) ospid = 14547 command = 3 program =
oracle@localhost.localdomain (P003) dedicated server=14547
Tuesday   12:58  Tuesday   12:59 last et = 54
select /*+ all_rows ordered */ A.rowid, :1, :2, :3 from "BIG_TA
BLE"."BIG_TABLE" A where not ( "OBJECT_ID" IS NOT NULL)
--------------------
BIG_TABLE(34,1969) ospid = 14475 command = 3 program =
sqlplus@localhost.localdomain (TNS V1-V3) dedicated server=14477
Tuesday   12:55  Tuesday   12:59 last et = 55
select /*+ all_rows ordered */ A.rowid, :1, :2, :3 from "BIG_TA
BLE"."BIG_TABLE" A where not ( "OBJECT_ID" IS NOT NULL)
ops$tkyte%ORA11GR2>

Parallel_DDL

Rajeshwaran, July 02, 2013 - 5:20 pm UTC

Tom:

We need to add a VARCHAR2(16) column to an application table, 65GB in size, partitioned by hash on the primary key. What is the best way to do this? We cannot do a CTAS, since this table is referenced by foreign keys from 36+ other tables. We are on 10.2.0.5.
Tom Kyte
July 02, 2013 - 5:26 pm UTC

just add it - it will happen immediately and not touch any existing data.


unless you are hiding some really important information from me, like "with a default value" or something like that.
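
A quick sketch of the difference (table and column names here are hypothetical): in 10.2, adding a plain nullable column is a dictionary-only change, whereas adding a NOT NULL column with a default must rewrite every row:

```sql
-- dictionary-only change: completes immediately, existing rows untouched
alter table app_table add (new_col varchar2(16));

-- this one must update all existing rows in 10.2, so a 65GB table
-- would take a long time (11g optimizes this NOT NULL + default case)
alter table app_table add (new_col2 varchar2(16) default 'N' not null);
```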

Parallel_DDL

Rajeshwaran, July 02, 2013 - 6:00 pm UTC

Tom:

Thanks for your above examples and explanations, and sorry for bothering you again.

Can you tell me in which scenario one should consider "ALTER SESSION ENABLE PARALLEL DDL"? As mentioned in the above doc link, I did a partition move command without "alter session enable parallel ddl" - but I am able to see parallel operations taking place in v$px_session from another session. (Below is the script I used for my local demo.)

select userenv('sid') from dual;
create table big_table2
partition by hash(id)
( partition p1,
  partition p2,
  partition p3,
  partition p4 )
nologging as
select * from big_table ;

alter table big_table2 parallel 4;

alter table big_table2
move partition p1 tablespace TS_DATA_02_D
parallel 4;
When I monitored from another session using v$px_session, I saw parallel operations taking place.
Tom Kyte
July 16, 2013 - 12:14 pm UTC

because you have a parallel clause in there, yes. some DDLs do not have that clause - they would use the session setting.
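
As a sketch of the interaction (using the demo table from the question): parallel DDL is enabled at the session level by default, and an explicit PARALLEL clause on the statement is honored; disabling it at the session level is what forces such DDL to run serially regardless of the clause:

```sql
-- session-level parallel DDL is on by default; the explicit clause is honored
alter table big_table2 move partition p1 parallel 4;

-- force serial DDL for this session, even when a parallel clause is present
alter session disable parallel ddl;
alter table big_table2 move partition p1 parallel 4;  -- now runs serially
```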

12c and parallel processing

A reader, July 09, 2013 - 10:32 am UTC

Hi Tom

12c seems quite disappointing in terms of parallel processing, as described by some critical experts.

I put this question to you to highlight some key areas improved in 12c in terms of parallel processing.

The DB2 team says Oracle has just caught up with DB2, since they already had the thread and process processing ability in their software.

Also, Larry said Oracle will embed some software in hardware. Please describe that part also. Some of my clients think SPARC is dead.
Tom Kyte
July 16, 2013 - 3:12 pm UTC

12c seems quite disappointing in term of parallel processing as described by
some critical experts.


hah, OK, I say it is awesome. Now what? Now you have some non-critical experts praising it. Are we even?


I don't understand the DB2 quote, as we've been doing threads and multiple processes since 1988, with version 6 on up. And we were doing parallel query in 1994 with version 7.1.6 - well before DB2 even did parallel query.

here are the new features chapters:
http://docs.oracle.com/cd/E16655_01/server.121/e17613/whatsnew.htm#sthref4
http://docs.oracle.com/cd/E16655_01/server.121/e17749/release_changes.htm#sthref4


and Exadata/Exalytics/Exa-whatever are the hardware, and they are x86 and/or SPARC based. It is not just about SPARC (and not just about x86).

Index and Parallelism

Dillip Kumar Sahoo, July 19, 2013 - 11:17 am UTC

Hi Tom,
Thanks for the wonderful explanation of Oracle's parallelism features. You rock :-)

Could you please clarify a doubt (assuming we have enough system resources to run highly parallel queries):

1) Is there any relationship between index type and parallelism?

2) Does a bitmap index help spawn more parallel sessions compared to a b-tree index, or vice versa?
Tom Kyte
July 19, 2013 - 1:08 pm UTC

it is not the index type so much as the partitioning scheme applied to each index type.
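
For illustration (table and index names hypothetical): what the parallel execution servers divide up is the partitioning, not the index type - a local index on a partitioned table parallelizes naturally across partitions whether it is b-tree or bitmap:

```sql
-- both builds can run in parallel; the partition boundaries, not the
-- index type, are what the parallel execution servers split the work on
create index sales_cust_idx on sales(cust_id) local parallel 4;
create bitmap index sales_status_idx on sales(status) local parallel 4;
```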

Jess, September 16, 2013 - 1:13 pm UTC

Hi Tom,

In our OLTP database, we have PARALLEL_AUTOMATIC_TUNING = FALSE.
However, there is a specific bit of SQL executing during a 'slow' period that we'd like to run parallelized. That leaves the question of what degree to specify in the query.

Having read this thread, I would've thought that the thing to do here would be to set the parallel_automatic_tuning parameter to 'true', let Oracle run the query, look at what degree it used, reset the parameter back to 'false', and hint the query accordingly.

What's not clear to me is, if I did that, where/how do I look up the degree of parallelism used by the system during the 'true' run. Could you clarify that? Otherwise, if this is not the way to do it, what is? Is it just leaving it as /*+ parallel(t) */ and letting it fend for itself every time?

(The table is about 20M rows, selecting for insert about 1% of the data, so the query is
insert ... into tableA
select /*+ parallel(t,??) */ ...
from tableB where C;
We don't want other queries against this table to run in parallel, so we would like to leave the table's 'degree' set to 1. Also, we are not using 'insert /*+ append */' as it didn't seem necessary here. Would you agree?)

Tom Kyte
September 24, 2013 - 4:53 pm UTC

this will be a function of your cpu count and how much of the machine you want to set aside for the OLTP things that will still be taking place during this slow period. You do not want to consume 100% of the machine (I'm assuming).

so the parallel degree would be between cpu_count/2 and 2*cpu_count. I'm assuming the parallel query will be doing physical IO - hence a parallel execution server won't always be on a cpu. If you wanted to ensure there was cpu left for the other transactions - you would start at cpu_count/2 and work up from there.

you would want to use append if you wanted a direct path insert. otherwise it would/could run the query in parallel but do a conventional path insert. what do you want it to do?
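
A hedged sketch of what a direct-path, parallel version of the statement from the question might look like (the degree of 2 is just cpu_count/2 for a 4-CPU box, as a starting point):

```sql
alter session enable parallel dml;  -- needed for the insert itself to run in parallel

insert /*+ append parallel(a, 2) */ into tableA a
select /*+ parallel(b, 2) */ ...
from tableB b
where C;

commit;  -- after a direct-path insert, you must commit before querying the table again
```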

Jess, September 16, 2013 - 1:19 pm UTC

Sorry, forgot to say in http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:39946845137685#7359490700346724887

It's a clustered database, 2 nodes, 4 CPUs on each node.

Jess, September 26, 2013 - 6:00 pm UTC

Hi Tom,

Thank you for your response to http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:39946845137685#7359490700346724887 .

The table has 20M rows. A process lifts some data from the table, does a bit of processing, and sets the status of some records to 'closed'. We then want to move those to an archive table.

The volume will be about 300-500K rows (though could go up to 1M or down to 10K). This move is being done like so:

insert ... into tableA
select /*+ parallel(B,??) */ ...
from tableB where C;
delete /*+ parallel(B,??) */ tableB where C;
commit;


This has run with an assortment of parallel degrees in the past, ranging between 5 and 20 (the db box has 4 CPUs), with what appears to be a very slight difference in the time it takes to execute.

From what you're saying, it follows that the magic number is between 2 and 8. Is it beneficial to actually try to pick a number, or is it better to just leave the code as /*+ parallel(t) */?

At the moment, it's just 'insert', not 'insert /*+ append */'. In terms of what we want - I am not sure. There are no selects from this table, so there are no problems with the commit requirement of direct path, or other immediate contraindications. On the other hand, it's a relatively small data set, and I'm not sure if we'll cause problems with reusing the space. This is why we opted for a conventional load. What would you advise?

Parallel DDL

Rajeshwaran Jeyabal, January 19, 2014 - 11:59 am UTC

Tom,

Can you help me understand why parallel DDL is not happening on a range-partitioned table? I am on 10.2.0.5. Is there any workaround for this in 10g? I am planning to go with a Create Table followed by parallel DML.

rajesh@ORA10GR2> drop table t1 purge;
rajesh@ORA10GR2> drop table t2 purge;
rajesh@ORA10GR2>
rajesh@ORA10GR2> create table t1 as select * from all_objects;
rajesh@ORA10GR2> create table t2 as select * from all_objects;
rajesh@ORA10GR2> exec dbms_stats.gather_table_stats(user,'T1');
rajesh@ORA10GR2> exec dbms_stats.gather_table_stats(user,'T2');
rajesh@ORA10GR2>
rajesh@ORA10GR2> delete from plan_table;
rajesh@ORA10GR2> commit;
rajesh@ORA10GR2> explain plan for
  2  create table t3
  3  nologging
  4  parallel 4 as
  5  select t1.*, t2.rowid as id
  6  from t1, t2
  7  where t1.object_id = t2.object_id ;
rajesh@ORA10GR2>
rajesh@ORA10GR2> select * from table( dbms_xplan.display );

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------
Plan hash value: 579899818

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------
|   0 | CREATE TABLE STATEMENT   |          | 58280 |  6317K|   154   (2)| 00:00:02 |        |      |            |
|   1 |  PX COORDINATOR          |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)    | :TQ10002 | 58280 |  6317K|   103   (1)| 00:00:02 |  Q1,02 | P->S | QC (RAND)  |
|   3 |    LOAD AS SELECT        | T3       |       |       |            |          |  Q1,02 | PCWP |            |
|*  4 |     HASH JOIN            |          | 58280 |  6317K|   103   (1)| 00:00:02 |  Q1,02 | PCWP |            |
|   5 |      PX RECEIVE          |          | 58281 |   967K|    51   (0)| 00:00:01 |  Q1,02 | PCWP |            |
|   6 |       PX SEND HASH       | :TQ10000 | 58281 |   967K|    51   (0)| 00:00:01 |  Q1,00 | P->P | HASH       |
|   7 |        PX BLOCK ITERATOR |          | 58281 |   967K|    51   (0)| 00:00:01 |  Q1,00 | PCWC |            |
|   8 |         TABLE ACCESS FULL| T2       | 58281 |   967K|    51   (0)| 00:00:01 |  Q1,00 | PCWP |            |
|   9 |      PX RECEIVE          |          | 58280 |  5349K|    51   (0)| 00:00:01 |  Q1,02 | PCWP |            |
|  10 |       PX SEND HASH       | :TQ10001 | 58280 |  5349K|    51   (0)| 00:00:01 |  Q1,01 | P->P | HASH       |
|  11 |        PX BLOCK ITERATOR |          | 58280 |  5349K|    51   (0)| 00:00:01 |  Q1,01 | PCWC |            |
|  12 |         TABLE ACCESS FULL| T1       | 58280 |  5349K|    51   (0)| 00:00:01 |  Q1,01 | PCWP |            |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")
rajesh@ORA10GR2>
rajesh@ORA10GR2>
rajesh@ORA10GR2> delete from plan_table;
rajesh@ORA10GR2> commit;
rajesh@ORA10GR2> explain plan for
  2  create table t3
  3  partition by range(object_id)
  4  (
  5     partition pmax values less than(maxvalue)
  6  )
  7  nologging
  8  parallel 4 as
  9  select t1.*, t2.rowid as id
 10  from t1, t2
 11  where t1.object_id = t2.object_id ;
rajesh@ORA10GR2>
rajesh@ORA10GR2> select * from table( dbms_xplan.display );

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------
Plan hash value: 3168156750

-------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | CREATE TABLE STATEMENT |      | 58280 |  6317K|   154   (2)| 00:00:02 |
|   1 |  LOAD AS SELECT        | T3   |       |       |            |          |
|*  2 |   HASH JOIN            |      | 58280 |  6317K|   103   (1)| 00:00:02 |
|   3 |    TABLE ACCESS FULL   | T2   | 58281 |   967K|    51   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL   | T1   | 58280 |  5349K|    51   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")
rajesh@ORA10GR2>
rajesh@ORA10GR2>