Skip to Main Content
  • Questions
  • Parallel refresh of materialized view over database link

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, Michael.

Asked: October 30, 2019 - 3:03 pm UTC

Last updated: November 12, 2019 - 11:17 pm UTC

Version: 12c

Viewed 1000+ times

You Asked

Hi Tom,

My goal is to populate local table using data from remote database using PARALLEL hint.
My problem is that the query is not being executed in parallel when it is part of an INSERT ... SELECT block that uses join.

My test case:

1) Connect to remote database and create table:
CREATE TABLE REMOTE_TABLE AS SELECT OWNER,OBJECT_NAME FROM SYS.DBA_OBJECTS;


2) Connect to local database and create database link to remote database:
CREATE DATABASE LINK TO_REMOTE CONNECT TO my_user IDENTIFIED BY my_pass USING 'remote_db';


3) In the local database, create empty table with the same structure as REMOTE_TABLE in remote database:
CREATE TABLE LOCAL_TABLE AS SELECT * FROM REMOTE_TABLE@TO_REMOTE WHERE 1=2;


4) Simple query runs in parallel:
SELECT /*+ PARALLEL(2) */ * FROM REMOTE_TABLE@TO_REMOTE;


Here is the statement I see being executed in remote database; parallel execution is used ( SHARED hint is present):

SELECT /*+ SHARED (2) */
"A1"."OWNER", "A1"."OBJECT_NAME"
FROM "REMOTE_TABLE" "A1"

5) INSERT INTO using query without joins also runs in parallel:
INSERT INTO LOCAL_TABLE SELECT /*+ PARALLEL(2) */ * FROM REMOTE_TABLE@TO_REMOTE;


Here is the statement I see being executed in remote database; parallel execution is used ( SHARED hint is present):

SELECT /*+ OPAQUE_TRANSFORM SHARED (2) */
"OWNER", "OBJECT_NAME"
FROM "REMOTE_TABLE" "REMOTE_TABLE"

6) Query with join also runs in parallel:
SELECT /*+ PARALLEL(2) */ A.* FROM REMOTE_TABLE@TO_REMOTE A, REMOTE_TABLE@TO_REMOTE B
 WHERE A.OBJECT_NAME = B.OBJECT_NAME
   AND A.OWNER       = B.OWNER;


Here is the statement I see being executed in remote database; parallel execution is used ( SHARED hint is present):

SELECT /*+ SHARED (2) */
"A2"."OWNER", "A2"."OBJECT_NAME"
FROM "REMOTE_TABLE" "A2", "REMOTE_TABLE" "A1"
WHERE "A2"."OBJECT_NAME" = "A1"."OBJECT_NAME"
AND "A2"."OWNER" = "A1"."OWNER"

7) But INSERT INTO using query with join runs in a single thread:
INSERT INTO LOCAL_TABLE
SELECT /*+ PARALLEL(2) */ A.* FROM REMOTE_TABLE@TO_REMOTE A, REMOTE_TABLE@TO_REMOTE B
 WHERE A.OBJECT_NAME = B.OBJECT_NAME
   AND A.OWNER       = B.OWNER


Here is the statement I see being executed in remote database; parallel execution is NOT used (no SHARED hint):

SELECT "A2"."OWNER", "A2"."OBJECT_NAME"
FROM "REMOTE_TABLE" "A2", "REMOTE_TABLE" "A1"
WHERE "A2"."OBJECT_NAME" = "A1"."OBJECT_NAME"
AND "A2"."OWNER" = "A1"."OWNER"

What is wrong? How can I force query to run in parallel in the remote database?
The quries above are to provide simple test case. My original query is much more complicated.

and Connor said...

Because we remap the aliases, I think you'll need to explicitly nominate the tables. When I did that, parallel worked in all instances as expected;

Remote node:

SQL> create table t1 as select * from dba_objects;

Table created.

SQL> create table t2 as select * from dba_Objects;

Table created.


Local node

select /*+ parallel */ t1.owner, count(*)
from t1@db19_pdb1
group by owner;

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT REMOTE  |          |    32 |   192 |    40   (3)| 00:00:01 |        |      |            |
|   1 |  PX COORDINATOR          |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)    | :TQ10001 |    32 |   192 |    40   (3)| 00:00:01 |  Q1,01 | P->S | QC (RAND)  |
|   3 |    HASH GROUP BY         |          |    32 |   192 |    40   (3)| 00:00:01 |  Q1,01 | PCWP |            |
|   4 |     PX RECEIVE           |          |    32 |   192 |    40   (3)| 00:00:01 |  Q1,01 | PCWP |            |
|   5 |      PX SEND HASH        | :TQ10000 |    32 |   192 |    40   (3)| 00:00:01 |  Q1,00 | P->P | HASH       |
|   6 |       HASH GROUP BY      |          |    32 |   192 |    40   (3)| 00:00:01 |  Q1,00 | PCWP |            |
|   7 |        PX BLOCK ITERATOR |          | 77667 |   455K|    39   (0)| 00:00:01 |  Q1,00 | PCWC |            |
|   8 |         TABLE ACCESS FULL| T1       | 77667 |   455K|    39   (0)| 00:00:01 |  Q1,00 | PCWP |            |
------------------------------------------------------------------------------------------------------------------

Note
-----
   - fully remote statement
   

select /*+ parallel */ t1.owner, count(*)
from t1@db19_pdb1, t2@db19_pdb1
where t1.object_id = t2.object_id 
group by t1.owner;


------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT REMOTE        |          |    32 |   512 |    79   (3)| 00:00:01 |        |      |            |
|   1 |  PX COORDINATOR                |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)          | :TQ10003 |    32 |   512 |    79   (3)| 00:00:01 |  Q1,03 | P->S | QC (RAND)  |
|   3 |    HASH GROUP BY               |          |    32 |   512 |    79   (3)| 00:00:01 |  Q1,03 | PCWP |            |
|   4 |     PX RECEIVE                 |          |    32 |   512 |    79   (3)| 00:00:01 |  Q1,03 | PCWP |            |
|   5 |      PX SEND HASH              | :TQ10002 |    32 |   512 |    79   (3)| 00:00:01 |  Q1,02 | P->P | HASH       |
|   6 |       HASH GROUP BY            |          |    32 |   512 |    79   (3)| 00:00:01 |  Q1,02 | PCWP |            |
|*  7 |        HASH JOIN               |          | 77663 |  1213K|    78   (2)| 00:00:01 |  Q1,02 | PCWP |            |
|   8 |         PX RECEIVE             |          | 77668 |   379K|    39   (0)| 00:00:01 |  Q1,02 | PCWP |            |
|   9 |          PX SEND HYBRID HASH   | :TQ10000 | 77668 |   379K|    39   (0)| 00:00:01 |  Q1,00 | P->P | HYBRID HASH|
|  10 |           STATISTICS COLLECTOR |          |       |       |            |          |  Q1,00 | PCWC |            |
|  11 |            PX BLOCK ITERATOR   |          | 77668 |   379K|    39   (0)| 00:00:01 |  Q1,00 | PCWC |            |
|  12 |             TABLE ACCESS FULL  | T2       | 77668 |   379K|    39   (0)| 00:00:01 |  Q1,00 | PCWP |            |
|  13 |         PX RECEIVE             |          | 77667 |   834K|    39   (0)| 00:00:01 |  Q1,02 | PCWP |            |
|  14 |          PX SEND HYBRID HASH   | :TQ10001 | 77667 |   834K|    39   (0)| 00:00:01 |  Q1,01 | P->P | HYBRID HASH|
|  15 |           PX BLOCK ITERATOR    |          | 77667 |   834K|    39   (0)| 00:00:01 |  Q1,01 | PCWC |            |
|  16 |            TABLE ACCESS FULL   | T1       | 77667 |   834K|    39   (0)| 00:00:01 |  Q1,01 | PCWP |            |
------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   7 - access("A2"."OBJECT_ID"="A1"."OBJECT_ID")

Note
-----
   - fully remote statement
   
   
SQL> create table results ( o varchar2(200), c int );

Table created.

insert into results
select /*+ parallel(t1) */ t1.owner, count(*)
from t1@db19_pdb1
group by owner;

-------------------------------------------------------------------------
| Id  | Operation                | Name    | Cost (%CPU)| Inst   |IN-OUT|
-------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |         |     0   (0)|        |      |
|   1 |  LOAD TABLE CONVENTIONAL | RESULTS |            |        |      |
|   2 |   REMOTE                 |         |            | DB19_~ | R->S |
-------------------------------------------------------------------------

Remote SQL Information (identified by operation id):
----------------------------------------------------

   2 - EXPLAIN PLAN SET STATEMENT_ID='a1' INTO "PLAN_TABLE"@! FOR
       SELECT /*+ PARALLEL ("A1",DEFAULT) */ "A1"."OWNER",COUNT(*) FROM "T1"
       "A1" GROUP BY "A1"."OWNER" (accessing 'DB19_PDB1' )
       
       

insert into results
select /*+ parallel(t1) parallel(t2) */ t1.owner, count(*)
from t1@db19_pdb1, t2@db19_pdb1
where t1.object_id = t2.object_id 
group by t1.owner;


-------------------------------------------------------------------------
| Id  | Operation                | Name    | Cost (%CPU)| Inst   |IN-OUT|
-------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |         |     0   (0)|        |      |
|   1 |  LOAD TABLE CONVENTIONAL | RESULTS |            |        |      |
|   2 |   REMOTE                 |         |            | DB19_~ | R->S |
-------------------------------------------------------------------------

Remote SQL Information (identified by operation id):
----------------------------------------------------

   2 - EXPLAIN PLAN SET STATEMENT_ID='a1' INTO "PLAN_TABLE"@! FOR
       SELECT /*+ PARALLEL ("A1",DEFAULT) PARALLEL ("A2",DEFAULT) */
       "A2"."OWNER",COUNT(*) FROM "T1" "A2","T2" "A1" WHERE
       "A2"."OBJECT_ID"="A1"."OBJECT_ID" GROUP BY "A2"."OWNER" (accessing
       'DB19_PDB1' )


Also note that you can have one or the other but not both, ie, the transmission across the database link will be done in serial.

https://asktom.oracle.com/pls/asktom/asktom.search?tag=parallel-select-from-remote-site

https://asktom.oracle.com/pls/apex/asktom.search?tag=ctas-dblink-parallel-hints

https://asktom.oracle.com/pls/asktom/asktom.search?tag=parallel-insertdelete-and-database-links

As per the links above, if you need parallel sessions on both local and remote, aka, multiple database links in operation, then you would need to do it yourself eg dbms_parallel_execute.

Rating

  (2 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

SELECT vs INSERT ... SELECT

Michael Petrovsky, November 06, 2019 - 3:04 pm UTC

Thank you, Connor. Your solution was very helpful and my query runs about 4 times faster (after all, now it runs in parallel).

Still a question:
Why INSERT from multi-table SELECT does not run in parallel unless I add table alias ta PARALLEL hint?
Single-table SELECT, multi-table SELECT and INSERT from single-table SELECT run just fine without specifying alias in PARALLEL hint.
What is the difference?

Connor McDonald
November 07, 2019 - 3:17 am UTC

We're not retaining the table aliases in the transformation to push it over the wire to the other node.

(That is possibly a bug)

Michael Petrovsky, November 11, 2019 - 3:41 pm UTC

But why INSERT from single-table SELECT run just fine without specifying alias in PARALLEL hint?
Because it's a single table and optimizer does not get confused?


Connor McDonald
November 12, 2019 - 11:17 pm UTC

yup

More to Explore

Performance

Get all the information about database performance in the Database Performance guide.