  • Generating output file using spool is very slow when compared to UTL_FILE


Question and Answer

Connor McDonald

Thanks for the question, Kartheek.

Asked: February 07, 2019 - 7:27 am UTC

Last updated: February 25, 2020 - 10:48 am UTC

Version: 12.1.0.2.0

Viewed 10K+ times

You Asked

Hello Team,

We have to extract a huge volume of data from EBS tables. We have tried two approaches:

Approach1 (Using SQL and Spool command)
---------
We have created a script to generate the output using the SPOOL command, but this is taking 12 hours.

Approach2 (Using Cursor, Table Type, Bulk Collect (Limit 10000) and Utl_file)
----------
We have created an anonymous PL/SQL script which fetches the data from the database using a cursor, bulk collects it into a table type, and then writes the data to a file using UTL_FILE. This approach takes 3 hours.
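For reference, the pattern looks roughly like this (big_table, col1, and the DATA_DIR directory object are placeholders, not the actual EBS objects):

declare
  -- placeholders standing in for the real EBS query
  cursor c is
    select col1 from big_table;

  type t_rows is table of big_table.col1%type;
  l_rows t_rows;

  l_file utl_file.file_type;
begin
  l_file := utl_file.fopen ( 'DATA_DIR', 'extract.dat', 'w', 32767 );

  open c;
  loop
    fetch c bulk collect into l_rows limit 10000;

    for i in 1 .. l_rows.count loop
      utl_file.put_line ( l_file, l_rows(i) );
    end loop;

    exit when c%notfound;
  end loop;
  close c;

  utl_file.fclose ( l_file );
end;
/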

The driving table has 250 million rows and the expected count from the query is 25 million.

My understanding was that the SPOOL command is faster than UTL_FILE, but the results here are different.

I have attached both scripts here (I changed the actual SQL to dummy SQL, but the rest of the structure is the same).

https://livesql.oracle.com/apex/livesql/s/hxt1psaw8n5b15v09lqj9bpq4 (Spool Approach)

https://livesql.oracle.com/apex/livesql/s/hxt1psavb0e5bd5ppd4w5rlsw (UTL_FILE Approach)

Hope to hear back from you soon.

Thanks
Kartheek

and Chris said...

You're likely suffering from setting the arraysize too small in SQL*Plus. This determines how many rows SQL*Plus fetches from the database on each round trip.

For example, if I create a table with 10,000 rows:

create table t as 
  select level c1, lpad ( 'x', 20, 'x' ) c2
  from   dual
  connect by level <= 10000;


Then spool it out twice. First with an arraysize of 10, then with 500:

set termout off
set timing on
spool out.log
set arraysize 10
select * from t;

set arraysize 500
select * from t;
spool off


Getting the timings from the spooled file reveals:

10000 rows selected.

Elapsed: 00:02:37.03

...

10000 rows selected.

Elapsed: 00:00:03.77


So increasing the arraysize from 10 => 500 dropped the query time from over 2.5 minutes to under five seconds!

PS - both your scripts appear to be UTL_FILE based...


Comments

Kartheek Katragadda, February 07, 2019 - 2:20 pm UTC

Thanks for the response.
Setting the arraysize to 5000 improved the performance, but it is still taking more time than the UTL_FILE approach. Below are the parameters that I am setting before running the SQL.

SET SERVEROUTPUT ON FORMAT WRAPPED
SET VERIFY OFF
SET PAGESIZE 0
SET FEEDBACK OFF
SET TERMOUT OFF
SET HEADING OFF
SET TRIM ON
SET WRAP ON
SET TRIMSPOOL ON
SET LINESIZE 32000
SET FLUSH ON
SET TAB OFF
SET ARRAYSIZE 5000

I am generating a delimited file and enclosing the data in double quotes (") with this script. Below is a sample snippet.

select '"'||username
||'"|"'||user_id||'"'
from all_users;

Please let me know if any of these parameters or concatenating with | impacts performance.

Also, I see the valid values for arraysize are 1 to 5000, and I believe bulk collect does not have any such limit.

Does that mean bulk collect with a limit > 5000 is faster than the spool script?

Thanks
Kartheek
Chris Saxon
February 07, 2019 - 3:24 pm UTC

Does it mean that bulk collect is faster if we use limit as > 5000 than spool script?

Not necessarily. You get diminishing returns for bulk collect as you increase the limit:

https://asktom.oracle.com/pls/apex/asktom.search?tag=bulk-collect-usage
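You can measure this for yourself with something like the following sketch (it reuses the table t from the example above; the limit value is the thing to vary):

declare
  cursor c is
    select c1 from t;

  type t_nums is table of t.c1%type;
  l_vals  t_nums;

  l_start pls_integer;
begin
  l_start := dbms_utility.get_time;

  open c;
  loop
    fetch c bulk collect into l_vals limit 100; -- try 100, 1000, 10000, ...
    exit when l_vals.count = 0;
  end loop;
  close c;

  dbms_output.put_line (
    'Elapsed (hundredths of a second): ' ||
    ( dbms_utility.get_time - l_start )
  );
end;
/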

To figure out why one approach is faster than the other, trace the sessions, then review the trace files to see where the time is going.

You can find details of how to do this at:

https://blogs.oracle.com/sql/how-to-create-an-execution-plan#tkprof

If you need help understanding these files, post the formatted tkprof results (wrapped in code tags).
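For example, to trace your current session you could use something like this (the tracefile identifier just makes the trace file easier to find; the file names in the tkprof call are placeholders):

alter session set tracefile_identifier = 'spool_test';
exec dbms_monitor.session_trace_enable ( waits => true, binds => false )

-- run the extract query/script here

exec dbms_monitor.session_trace_disable

Then format the trace file on the database server:

tkprof your_trace_file.trc spool_test_report.txt sys=no sort=fchela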

Kartheek Katragadda, February 13, 2019 - 10:53 am UTC

Here are the trace file details for the spool file approach, where we have an issue. I see the database work completes in 4,022 seconds, but the SQL*Net message from client wait event is very high.

I used an arraysize of 200 to get this trace file. I tried different arraysizes, but the end result is the same.

Is the database idle, waiting for the next command, while SQL*Plus is writing the data to the output file? If so, how can we minimize this time?

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      1.26       1.42        265        274          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch    84493   2755.65    4020.83    8922368  589570466          0    16898358
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    84495   2756.91    4022.25    8922633  589570740          0    16898358


  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                   84493        0.00          0.07
  Disk file operations I/O                      293        0.00          0.03
  db file sequential read                   1635372        0.09       1049.99
  db file parallel read                      423901        0.03        440.80
  db file scattered read                       4764        0.01          2.63
  latch: cache buffers chains                     9        0.00          0.00
  CSS initialization                              1        0.00          0.00
  CSS operation: action                           1        0.00          0.00
  read by other session                         168        0.00          0.32
  latch: redo allocation                          2        0.00          0.00
  latch free                                      3        0.00          0.00
  direct path write temp                       2185        0.00          1.83
  direct path read temp                       52947        0.03         42.99
  asynch descriptor resize                       32        0.00          0.05
  SQL*Net message from client                 84493        1.63      42877.58
  SQL*Net more data to client               2933107        0.00         33.49
********************************************************************************

Chris Saxon
February 13, 2019 - 11:15 am UTC

SQL*Net message from client is the time the database spends waiting for the client to ask for more rows; here that includes the time SQL*Plus spends formatting rows and writing them to the spool file, plus the network round trips. Where are you running the spool script? On the database server or some other machine? If it's a different machine, can you run it from the database server?

Kartheek Katragadda, February 13, 2019 - 5:05 pm UTC

I am running the script from the application server. Now I understand the issue. But I am not allowed to run the script from the database server.

Thanks a lot for your help.
Connor McDonald
February 14, 2019 - 4:27 am UTC

One option you could look at is running the script on the server via the scheduler

https://connor-mcdonald.com/2018/06/20/the-death-of-utl_file/
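As a sketch of the idea, you could run a shell script on the database server via an external scheduler job. The credential name, OS user, and script path below are placeholders you would need to set up for your environment:

begin
  -- one-off setup: the OS account the job will run as
  dbms_credential.create_credential (
    credential_name => 'OS_CREDENTIAL',
    username        => 'oracle',
    password        => '...'
  );

  -- run a shell script that does the sqlplus spool on the server itself
  dbms_scheduler.create_job (
    job_name        => 'RUN_EXTRACT',
    job_type        => 'EXECUTABLE',
    job_action      => '/u01/scripts/extract.sh',
    credential_name => 'OS_CREDENTIAL',
    enabled         => true
  );
end;
/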

Spool output did not appear for long-running SQL commands

Ankit Vohra, February 21, 2020 - 2:49 pm UTC

Hi guys,

I am working on Oracle 12cR2 on AIX 7.1 machine.

I tried executing a long-running INSERT and an ALTER TABLE statement, as shown below:

spool /home/oracle/CHG0046380/Log_run_misc2_Spool2.log

set feedback on

INSERT INTO CLAIMS_NEW
SELECT * FROM CLAIMS where trunc(load_date) = trunc( to_date('16-02-2020','dd-mm-yyyy'));
commit;

ALTER TABLE CLAIMS_NEW ENABLE VALIDATE CONSTRAINT CLAIMS_NEW_FK EXCEPTIONS INTO MY_EXCEPTIONS;

spool off


I was expecting to see something like following in spool file:

n rows created.
Commit complete.
Table altered.



I checked the results after around 5 hours and, interestingly, the spool file was empty, but the records had been inserted into the table, as I confirmed with a count. The constraint was in DISABLED state and the MY_EXCEPTIONS table was empty.

When I ran the same ENABLE VALIDATE CONSTRAINT command again separately, it successfully validated the constraint in about 1 hour.

Also, similar commands on another table finished in 20 minutes, and I did get the expected spool output:
n rows created.
Commit complete.


Can you tell me:

1) Does the time taken by the SQL command(s) affect the writing of output to the spool file?
2) How can I ensure the output is written to the spool file irrespective of the time taken by the SQL command?

Thanks
Connor McDonald
February 25, 2020 - 10:48 am UTC

We buffer the output and flush it from time to time to ensure that spooling does not impact performance.

SET FLUSH ON|OFF

can be used to control this.

If you find that immediate flushing has a large impact, you can selectively flush the output by re-spooling with an append, i.e.

spool myfile.out
command1
command2
command3
command4
spool off


becomes

spool myfile.out
command1
command2
spool myfile.out append
command3
command4
spool off



