
Question and Answer

Chris Saxon

Thanks for the question.

Asked: July 05, 2016 - 7:13 pm UTC

Last updated: July 13, 2016 - 1:14 am UTC

Version: Oracle 11g

Viewed 10K+ times

You Asked

Hi Tom,

I am running into an issue while using BULK COLLECT INTO in a stored procedure. I use a cursor to get data (42 million rows, 70 columns) from a remote database (a table in Teradata), then insert those records into a table in Oracle. After I run the procedure, the table in Oracle only has 1000 rows. However, if I use a cursor to get data from a table in the same environment, i.e. Oracle, the BULK COLLECT INTO works, and the final table in Oracle has all the 42 million records. Just wondering if BULK COLLECT INTO only works when data resides in the same environment? Here is what the code looks like:

CREATE OR REPLACE PROCEDURE P_TEST
AS
v_msg_proc_name varchar2(30) := 'P_TEST';
g_err_code VARCHAR2(10);
g_err_msg VARCHAR2(400);
g_err_msg_log VARCHAR2(4000);

-- Create a record set for bulk insert
CURSOR c_test_rec
IS
(
select a.*
from a table in Teradata@dblink a -- a Teradata table with 42 million rows and 70 columns; if I use a table in Oracle, I don't have any issues
);

TYPE c_test_table IS TABLE OF c_test_rec%ROWTYPE;
c_test_table_ins c_test_table := c_test_table();

BEGIN

EXECUTE IMMEDIATE 'TRUNCATE TABLE TEST';

OPEN c_test_rec;
LOOP
FETCH c_test_rec
BULK COLLECT INTO c_test_table_ins LIMIT 1000;
EXIT WHEN c_test_table_ins.COUNT = 0;

FORALL i IN 1 .. c_test_table_ins.COUNT
INSERT /*+ APPEND_VALUES */ INTO test VALUES c_test_table_ins(i);
COMMIT;
END LOOP;

CLOSE c_test_rec;

COMMIT;

EXCEPTION
WHEN OTHERS THEN
g_err_code := SQLCODE;
g_err_msg := substr(SQLERRM, 1, 200);
g_err_msg_log := g_err_msg_log || v_msg_proc_name || ': ' || g_err_code ||' - ' || g_err_msg || chr(13);

IF c_test_rec%ISOPEN
THEN
CLOSE c_test_rec;
END IF;

END P_TEST;

and Connor said...

There is no documented limitation about 'same' environment. I once worked for a client where we did a nearly identical process, but from SQL Server to Oracle, with no issues.

So let's try to narrow it down.

Step 1
======
set serverout on
declare
  CURSOR c_test_rec IS
    select a.*
    from a table in Teradata@dblink a;
  TYPE c_test_table IS TABLE OF c_test_rec%ROWTYPE; 
  c_test_table_ins c_test_table := c_test_table(); 

BEGIN

OPEN c_test_rec;
LOOP
  FETCH c_test_rec
  BULK COLLECT INTO c_test_table_ins LIMIT 1000;
  EXIT WHEN c_test_table_ins.COUNT = 0 or c_test_rec%rowcount > 50000;

  dbms_output.put_line(c_test_rec%rowcount);
END LOOP;
CLOSE c_test_rec;

END;



So we'll see if we at least iterate through the loop (I've capped it at 50,000 rows, i.e. about 50 passes).

Step 2
======
Let's see if there's an issue with the nested table being reset when populated from Teradata, so try:

set serverout on
declare
  CURSOR c_test_rec IS
    select a.*
    from a table in Teradata@dblink a;
  TYPE c_test_table IS TABLE OF c_test_rec%ROWTYPE; 
  c_test_table_ins c_test_table := c_test_table(); 

BEGIN

OPEN c_test_rec;
LOOP
  FETCH c_test_rec
  BULK COLLECT INTO c_test_table_ins LIMIT 1000;
  EXIT WHEN c_test_table_ins.COUNT = 0 or c_test_rec%rowcount > 50000;

  dbms_output.put_line(c_test_rec%rowcount);
  c_test_table_ins := c_test_table();  -- explicitly reset the collection each pass
END LOOP;
CLOSE c_test_rec;

END;




Step 3
======
Let's see if it's related to the nested table type...so try an associative array:

set serverout on
declare
  CURSOR c_test_rec IS
    select a.*
    from a table in Teradata@dblink a;
  TYPE c_test_table IS TABLE OF c_test_rec%ROWTYPE index by pls_integer; 
  c_test_table_ins c_test_table;

BEGIN

OPEN c_test_rec;
LOOP
  FETCH c_test_rec
  BULK COLLECT INTO c_test_table_ins LIMIT 1000;
  EXIT WHEN c_test_table_ins.COUNT = 0 or c_test_rec%rowcount > 50000;

  dbms_output.put_line(c_test_rec%rowcount);
END LOOP;
CLOSE c_test_rec;

END;



and so forth...let's see how we go. You can see I've removed ALL exception handling, because we want to SEE if anything goes wrong.

Keep us posted


Comments

Restrictions?

Ghassan, July 06, 2016 - 7:31 am UTC

Are there restrictions on using this in PL/SQL?
Did you try a one-shot SQL statement?
Or using a GTT to bulk-load all the rows from the remote database, then process the FORALL from the GTT? (A one-shot version is sketched below.)
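
For example, the one-shot version would be roughly this (the remote table name is a placeholder, as in the original post):

insert /*+ APPEND */ into test
select * from some_teradata_table@dblink;  -- placeholder table name
commit;

A single INSERT ... SELECT avoids the PL/SQL fetch loop entirely, though the pull across the gateway is still subject to the same network costs.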

A reader, July 06, 2016 - 6:54 pm UTC

Hi Connor,

Thank you so much for reviewing the question. I tried all three suggested steps, and each of the dbms_output.put_line calls showed 50000. That means the loop is working correctly, right?

Thanks!
Chris Saxon
July 06, 2016 - 11:39 pm UTC

Yes, so we have increased confidence in the fetch/loop construct.

So now reintroduce your FORALL statement, once again along similar lines to what I've done with the fetch examples.

ie, *just* enough code, eg

version 1: forall insert *without* append or append_values or commit
version 2: then add in append_values and the commit

etc.... Keep us posted, or let us know if you get stuck
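
As a sketch, version 1 might look something like this (no hints, no per-batch commit; the same placeholder table reference as before):

set serverout on
declare
  CURSOR c_test_rec IS
    select a.*
    from some_teradata_table@dblink a;  -- placeholder
  TYPE c_test_table IS TABLE OF c_test_rec%ROWTYPE;
  c_test_table_ins c_test_table := c_test_table();
BEGIN
  OPEN c_test_rec;
  LOOP
    FETCH c_test_rec
    BULK COLLECT INTO c_test_table_ins LIMIT 1000;
    EXIT WHEN c_test_table_ins.COUNT = 0;

    FORALL i IN 1 .. c_test_table_ins.COUNT
      INSERT INTO test VALUES c_test_table_ins(i);
  END LOOP;
  CLOSE c_test_rec;
  COMMIT;  -- single commit at the very end
END;
/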

A reader, July 08, 2016 - 3:11 am UTC

Thanks again for the quick review. Initially I introduced the FORALL insert with a COMMIT right after the INSERT, and I got an error message:
ORA-01002: fetch out of sequence
ORA-02063: preceding line from remote database (Teradata)
ORA-06512: at line 15

So, I moved the COMMIT after END LOOP and it worked! All 42 million rows (70 columns) from the Teradata table got inserted into the Oracle table, but it took 2 hours. I also tested pulling the same 42 million rows with just 3 columns, and that took only 6 minutes.
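
Here is roughly the shape that worked (placeholder table reference as before):

OPEN c_test_rec;
LOOP
  FETCH c_test_rec
  BULK COLLECT INTO c_test_table_ins LIMIT 1000;
  EXIT WHEN c_test_table_ins.COUNT = 0;

  FORALL i IN 1 .. c_test_table_ins.COUNT
    INSERT INTO test VALUES c_test_table_ins(i);
  -- no COMMIT here: a commit ends the distributed transaction,
  -- which closes the remote cursor, hence ORA-01002 on the next fetch
END LOOP;
CLOSE c_test_rec;
COMMIT;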

We are sure that we need all the columns, so I'm wondering if there is a way to improve the performance.

Thanks again!
Connor McDonald
July 08, 2016 - 8:37 am UTC

To suggest performance improvements, we need to understand why the process is currently taking so long.

To do this, trace the session to see what it's doing:

exec DBMS_monitor.session_trace_enable ( null, null, true, true );
<your code here>
exec DBMS_monitor.session_trace_disable; 


Then use tkprof to parse the tracefile. For more details see:

https://blogs.oracle.com/sql/entry/how_to_create_an_execution#tkprof
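
A typical tkprof run looks something like this (trace and output file names are placeholders):

tkprof orcl_ora_12345.trc bulk_load.prf sys=no sort=prsela,exeela,fchela

The sort options put the statements with the highest parse/execute/fetch elapsed times at the top, so the expensive step shows up first.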

Once you've got this, post the results here.

Chris

A reader, July 12, 2016 - 6:20 am UTC

Thanks Chris for the review. I forgot to mention in the previous post: when I first added FORALL back to the code, I did not remove APPEND_VALUES. When I ran it, it gave me an error message - ORA-12838: cannot read/modify an object after modifying it in parallel - so I removed it. Might removing APPEND_VALUES slow down the INSERT process?
Connor McDonald
July 12, 2016 - 6:39 am UTC

On the ORA-12838: APPEND_VALUES makes the insert direct-path, and a table modified by direct-path DML cannot be read or modified again in the same transaction until you commit. A per-batch commit would cure that, but as you found, committing mid-fetch breaks the remote cursor. So yes, dropping the hint means conventional-path inserts, which can be slower.

Now, the 2 hours was spent

- fetching
- inserting

First we need to know which one to tackle. So some simple tests:

1) just the fetch bulk collect (no insert)

If that's lengthy, then we may have dramas tuning it, because it's a cross-platform, cross-database process. But we'll cross that bridge if we come to it.

2) bulk collect just 500,000 rows (in one hit), and then compare a plain insert versus an insert with APPEND_VALUES

(I chose 500k as a reasonable compromise between getting enough rows to test the insert costs and not blowing up your PGA...so adjust if necessary).
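
For test (1), something along these lines would do, using dbms_utility.get_time for rough timing (placeholder table reference as before):

set serverout on
declare
  CURSOR c_test_rec IS
    select a.*
    from some_teradata_table@dblink a;  -- placeholder
  TYPE c_test_table IS TABLE OF c_test_rec%ROWTYPE;
  c_test_table_ins c_test_table := c_test_table();
  l_start pls_integer := dbms_utility.get_time;
BEGIN
  OPEN c_test_rec;
  LOOP
    FETCH c_test_rec
    BULK COLLECT INTO c_test_table_ins LIMIT 1000;
    EXIT WHEN c_test_table_ins.COUNT = 0
           or c_test_rec%rowcount >= 500000;
  END LOOP;
  CLOSE c_test_rec;
  -- get_time ticks over in hundredths of a second
  dbms_output.put_line('fetch only: '
    || (dbms_utility.get_time - l_start)/100 || ' secs');
END;
/

For test (2), do a single FETCH ... BULK COLLECT with LIMIT 500000, then time a plain FORALL insert versus one with APPEND_VALUES, committing between the two runs (a direct-path insert leaves the table untouchable until you commit).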


Let us know how you get on

A reader, July 12, 2016 - 4:08 pm UTC

Thanks! I tested each of the following 4 scenarios 3 times and timed each run, except the last scenario, which I only ran once because it gave an error. All scenarios used 500k records.

1. Fetch only no Insert: 2 min 36 secs/ 1 min 54 secs / 1 min 56 secs

2. Fetch + Insert without Append Values or any other hint: 2 min 5 secs/ 1 min 52 secs / 2 mins

3. Fetch + Insert with Parallel hint (something like this INSERT /*+ PARALLEL (A,4,1)*/ INTO…: 2 min 1 secs / 1 min 53 secs / 2 min 5 secs

4. Fetch + Insert with Append Values hint: Error message - ORA-12838: cannot read/modify an object after modifying it in parallel

Thanks!

Chris Saxon
July 13, 2016 - 1:14 am UTC

So it looks to me like it's the fetching of the data that is the issue here. The insert processing looks fine.

So when it comes to fetching, potential factors are:

a- how fast Teradata can give you the data
b- the raw network speed
c- any network protocol overhead (ie, sqlnet etc)
d- the gateway mechanism (ie oracle-to-from-teradata)
e- the fetch size

We already know (e) is under control, because we're doing a bulk collect with a LIMIT. I'll leave you to work out whether (a) and (b) are an issue.

For (c), you might want to explore any benefits possible with SDU/TDU settings. Some info here

https://docs.oracle.com/cd/B28359_01/network.111/b28316/performance.htm

https://oraganism.wordpress.com/2011/09/24/setting-sdu-size-mainly-in-11-2/

but in my experience, I've rarely seen *huge* gains by tinkering with them.
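
If you do want to experiment, the usual shape is something like this (sizes and names are placeholders; both ends negotiate down to the smaller value, so the server side needs it too). In sqlnet.ora:

DEFAULT_SDU_SIZE = 32767

or per connect descriptor in tnsnames.ora:

tdgw =
  (DESCRIPTION =
    (SDU = 32767)
    (ADDRESS = (PROTOCOL = TCP)(HOST = gateway-host)(PORT = 1521))
    (CONNECT_DATA = (SID = tdgw))
  )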

For (d) - what are you using? The gateway? HS (generic heterogeneous services)? Something else?

Hope this helps.

A reader, July 19, 2016 - 5:38 pm UTC

Thanks again for the information.

For (d) - we are using the gateway.
