Thanks for the question, Diego.

Asked: August 17, 2017 - 9:02 pm UTC

Last updated: August 19, 2017 - 2:04 am UTC

Version: 11.2

Viewed 1000+ times

You Asked

Hello, I'm doing some migration between two databases with user provided subsets.

The problem is that the parallel_level, doesn't seems to work. It always run one chunk at time.

First the procedure thrown an error saying that I need the Create_job grant.

So the dba gave me these grants:

GRANT CREATE JOB TO GEOPUB;
 GRANT SCHEDULER_ADMIN TO GEOPUB;
 GRANT MANAGE SCHEDULER TO GEOPUB;
 GRANT ADM_PARALLEL_EXECUTE_TASK TO GEOPUB;

and now it runs fine with parallel_level > 1.

Here is my code:

Chunks

dbms_utility.comma_to_table
      ( list   => FOLHAS
      , tablen => l_count
      , tab    => v_array
     );
select ... into maxRowNum
v_sqlChunk := 'select min(rownum) start_id, max(rownum) end_id from(
                        select rownum, ntile(' || ROUND(maxRowNum/CHUNK_SIZE + 1) || ') over (order by ID1) nt from ' || TABELA || '@prdgeoedt.world where ';
    for i in 1..l_count
    loop    
        v_sqlChunk := v_sqlChunk || ' CD_FCIM = ''' || v_array(i) || ''' OR';
    end loop;
    v_sqlChunk := SUBSTR(v_sqlChunk, 1, (LENGTH(v_sqlChunk) - 2));                    
    v_sqlChunk := v_sqlChunk || ') group by nt';

dbms_parallel_execute.create_chunks_by_sql
            (
                task_name => c_task_name,
                sql_stmt  => v_sqlChunk,
                by_rowid => false   
            );

Task:

v_sqlInsert := 'insert into ' || TABELA || '( ' || v_columns || ')  
                        select ' || v_columns || ' 
                            from ( select t.*, ROWNUM AS rn 
                                        FROM ( SELECT * FROM ' || TABELA || '@prdgeoedt.world where ';
                                                for i in 1..l_count
                                                loop
                                                    v_sqlInsert := v_sqlInsert || ' CD_FCIM = ''' || v_array(i) || ''' OR';
                                                end loop;
                                                v_sqlInsert := SUBSTR(v_sqlInsert, 1, (LENGTH(v_sqlInsert) - 2));
    v_sqlInsert := v_sqlInsert || ' order by ID1) t ) WHERE rn  >= :start_id AND rownum <= LEAST (' || CHUNK_SIZE || ',(:end_id - :start_id + 1))';

DBMS_PARALLEL_EXECUTE.RUN_TASK(c_task_name, v_sqlInsert, DBMS_SQL.NATIVE,
                                  parallel_level => 4);

Everything works, except the parallelism. It's always one chunk at time:

select count(*) FROM   user_parallel_execute_chunks WHERE STATUS = 'ASSIGNED'

It always give me 1.

Looking the chunks table, I can see they run one by one:

1987 Atualiza_Bloco_T_GM_AREA PROCESSED   51301 51400 TASK$_9352_1 17/08/17 14:20:58,331062000 17/08/17 14:21:24,375098000  
1988 Atualiza_Bloco_T_GM_AREA PROCESSED   52401 52500 TASK$_9352_1 17/08/17 14:21:24,388782000 17/08/17 14:21:39,550440000  
1989 Atualiza_Bloco_T_GM_AREA PROCESSED   53901 54000 TASK$_9352_1 17/08/17 14:21:39,561615000 17/08/17 14:21:54,785039000  
1990 Atualiza_Bloco_T_GM_AREA PROCESSED   54401 54500 TASK$_9352_1 17/08/17 14:21:54,795799000 17/08/17 14:22:08,870095000  
1991 Atualiza_Bloco_T_GM_AREA PROCESSED   55801 55900 TASK$_9352_1 17/08/17 14:22:08,880978000 17/08/17 14:22:21,343356000  
1992 Atualiza_Bloco_T_GM_AREA PROCESSED   56201 56300 TASK$_9352_1 17/08/17 14:22:21,352776000 17/08/17 14:22:34,610168000  
1993 Atualiza_Bloco_T_GM_AREA PROCESSED   56701 56800 TASK$_9352_1 17/08/17 14:22:34,623940000 17/08/17 14:22:46,582006000  
1994 Atualiza_Bloco_T_GM_AREA PROCESSED   57401 57500 TASK$_9352_1 17/08/17 14:22:46,594983000 17/08/17 14:22:58,181511000  
1995 Atualiza_Bloco_T_GM_AREA PROCESSED   58401 58500 TASK$_9352_1 17/08/17 14:22:58,195566000 17/08/17 14:23:09,096338000  
1996 Atualiza_Bloco_T_GM_AREA PROCESSED   58901 59000 TASK$_9352_1 17/08/17 14:23:09,113248000 17/08/17 14:23:22,524784000

First I thought the problem might be the dblink, but I saw this question:

https://asktom.oracle.com/pls/apex/f?p=100:11:111277586747378::NO:::

And he said it worked. I'm not just sure if in parallel or serial.

I tried to search for any documentation regarding parallel_level, but nothing interesting came up.

I also searched for all parallel parameters:

SELECT name, value
    FROM   v$parameter
    WHERE  name LIKE '%parallel%'

And looking the result:

parallel_server FALSE
parallel_server_instances 1
recovery_parallelism 0
fast_start_parallel_rollback LOW
parallel_min_percent 0
parallel_min_servers 0
parallel_max_servers 960
parallel_instance_group 
parallel_execution_message_size 2152
_parallel_broadcast_enabled TRUE
parallel_degree_policy MANUAL
parallel_adaptive_multi_user TRUE
parallel_threads_per_cpu 2
parallel_automatic_tuning FALSE
parallel_io_cap_enabled FALSE
parallel_min_time_threshold AUTO
parallel_degree_limit CPU
parallel_force_local FALSE
parallel_servers_target 384

I could not find anything that might help me.

Just for clarification, even though I'm not sure it is related:

*t_gm_area is the same table being processed.

explain plan for select /*+ PARALLEL (4) */ sum(id1)
 from t_gm_area;

results in:

   - Degree of Parallelism is 40 because of hint

So I'm not sure if this behavior is by design because of dblink or something is misconfigured. Any help?
Thanks!

and Connor said...

A couple of things here

1) dbms_parallel_execute does parallelism by utilizing the scheduler. So you need to ensure that you have your job_queue_processes set accordingly, and ensure that the job class being used by dbms_parallel_execute is not constrained in any way. So the first thing I would try is a very simple routine, ie, make v_sqlChunk = 'select rownum, rownum from dual connect by level <= 20', and make the task something trivial like:

procedure long_running_task(p1 int, p2 int) is
begin
  dbms_lock.sleep(60);
end;

and make sure that these can be done in parallel.

2) To work out your ranges, you scanned the entire table with an expensive sorting (ntile) operation. Then each of your tasks (parallel or otherwise) are also doing a lot of extra work to assign the correct chunks. Doesn't seem like a particularly efficient way to approach this.

You might be better off with rowid ranges. Now because its a remote table, I dont think you can use the native rowid carve up, but you could use something manually like the code here

https://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:10498431232211

to get the rowid ranges from the remote data dictionary, and leverage them to do rowid range scans. Alternatively, call dbms_parallel_execute on the remote node *just* to get the rowid ranging done, then call dbms_parallel_execute on the *local* node which will then use the information stored in the *remote* task to use rowid ranges.

Rating

(2 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Rowid range by custom value

Diego Santos, August 18, 2017 - 7:07 pm UTC

Hello, Connor. Thanks for your reply. I made a simple test case for the run_task and it was also running sequentially. I looked up the value of job_queue_processes and it was set to 1.

I asked the DBA to change this and will test again later.

About the rowid range, I failed to understand how can I use it in the current form.
For what I can see, that procedure and some variations that I found online divides the whole table by n equal sizes.

My migration is to be used only with a user defined subset of the table that is randomly distributed across the tables blocks.

But I totally agree. the continuous ntile operation is slowing things up.

I think I will create a temporary table with the subset, then proceed with the rowid range approach.

At the thread, Tom said this:
1) it can be used on tables of any size, you just want lots of extents so the split routine can split it "good". Beware of small tables with small numbers of extents -- haven't tested thoroughly against them with this query (eg: if the number of extents is less than the number of chunks, I could foresee issues)

So, is limiting my chunk size to extents - 1 a good practice?

Thanks

August 19, 2017 - 2:04 am UTC

Re:

About the rowid range, I failed to understand how can I use it in the current form.
For what I can see, that procedure and some variations that I found online divides the whole table by n equal sizes.

My migration is to be used only with a user defined subset of the table that is randomly distributed across the tables blocks.

but the thing here is that for *arbitrary random* criteria, you'll be probably scanning the entire table anyway. So you break the table into rowid ranges, so your task would end up being

select *
from my_table
where [all of your user specified criteria]
and rowid between :start and :end

So that way it is similar to what a normal parallel query would be doing, ie, scanning a subset of the table for a nominated criteria.

DBMS_PARALLEL_EXECUTE.GET_ROWID_CHUNK problem

Luis Garcia, June 19, 2024 - 1:24 pm UTC

I had same problem, and i find one note with the explanation, this issue is related with the EXTENTS of the table.

https://www.igorkromin.net/index.php/2014/06/13/the-dbms_parallel_executecreate_chunks_by_rowid-not-creating-enough-chunks-bound-by-extents/

I fix my problem with a refedinition of the storage of my table to assure separete the data into diferents blocks:

ALTER TABLE WORKING_TABLE MOVE
STORAGE ( INITIAL 1M
NEXT 1M
MINEXTENTS 20
MAXEXTENTS 1000000
PCTINCREASE 0 )
TABLESPACE L2BPGS_TBS

Regards

Breadcrumb

Question and Answer

You Asked

and Connor said...

Rating

Comments

Rowid range by custom value

DBMS_PARALLEL_EXECUTE.GET_ROWID_CHUNK problem

More to Explore

PL/SQL demos

PL/SQL docs

DBMS_PARALLEL_EXECUTE.RUN_TASK is running sequentialy

Breadcrumb

Question and Answer

You Asked

and Connor said...

Rating

Comments

Rowid range by custom value

DBMS_PARALLEL_EXECUTE.GET_ROWID_CHUNK problem

More to Explore

PL/SQL demos

PL/SQL docs