Bulk operations: JDBC using batch option vs JDBC calling stored procedure

Question and Answer

Tom Kyte

Thanks for the question, Lior.

Asked: December 25, 2000 - 6:49 am UTC

Last updated: January 18, 2012 - 7:12 am UTC

Version: 8.1.5

Viewed 10K+ times

You Asked

Hi Tom !

I have a question regarding bulk operations (inserts/updates and deletes). I need to insert a large number of records into a table. Performance-wise, what is the preferred way: using JDBC with the batch option, or implementing a JDBC call to a stored procedure using array processing, where I can take advantage of the FORALL option and other Oracle features?
What is the best way? What are the pros and cons of each?

Thanks in advance,
Lior.

and Tom said...

I think (but have not tested; a simple benchmark would bear this out) that a JDBC batch (set the execute batch size to a hundred or so) would outperform PLSQL in this particular case.

The reason is:

we will go from JDBC ->
single network call with 100 records ->
database kernel


versus

JDBC ->
single network call with 100 records ->
PLSQL array ->
database kernel


In this particular case, PLSQL would add a layer to the process versus just sending 100 rows directly there. I would probably opt for JDBC in this case unless I needed to use a stored procedure to "hide" access to the base tables (eg: I did not want to grant INSERT on the table but granting EXECUTE on this one procedure was OK).
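As a sketch of the JDBC side being described here: the standard JDBC pattern is addBatch()/executeBatch() on a PreparedStatement (the Oracle-specific setExecuteBatch() style appears in the benchmarks further down this thread). The helper class below is my own illustration, not from the thread; it only demonstrates the round-trip arithmetic that makes batching pay off.

```java
// Standard JDBC batching pattern (connection details omitted):
//
//   PreparedStatement ps = conn.prepareStatement("insert into t1 values ( ? )");
//   for (int i = 0; i < rows; i++) {
//       ps.setString(1, "hi" + i);
//       ps.addBatch();
//       if ((i + 1) % batchSize == 0) ps.executeBatch(); // flush a full batch
//   }
//   ps.executeBatch(); // flush any remainder
//
// The win is fewer network round trips, which this illustrative helper counts:
public class BatchMath
{
    // number of executeBatch() round trips needed for `rows` rows
    public static int roundTrips( int rows, int batchSize )
    {
        return ( rows + batchSize - 1 ) / batchSize; // ceiling division
    }

    public static void main( String[] args )
    {
        System.out.println( roundTrips( 10000, 1 ) );   // 10000 round trips
        System.out.println( roundTrips( 10000, 100 ) ); // 100 round trips
    }
}
```

With a batch size of 100, 10,000 single-row inserts collapse from 10,000 network calls into 100, which is exactly the "single network call with 100 records" path in the diagram above.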



Rating

  (51 ratings)


Comments

Elaborate in detail

RAVICHANDRAN V, August 29, 2001 - 9:37 am UTC

Not up to my expectations - please give me more information regarding this topic.

DBMS JVM JDBC Java stored procedure vs PLSQL?

Tom Starr, August 03, 2003 - 8:46 pm UTC

Can one similarly avoid granting inserts by granting EXECUTE on a DBMS JVM JDBC Java stored procedure, in this case even if the Java method is Public?

I was expecting one would secure the Java, its PL/SQL call spec, and possibly the related trigger, hoping Oracle privileges would secure Java object execution, despite the necessary (in this case) public static Java method.




Tom Kyte
August 03, 2003 - 9:01 pm UTC

yes, as you call java in the db via a plsql wrapper and you need execute on the wrapper and the wrapper runs with the privs of the definer by default....

Excellent Review

Tom Starr, August 04, 2003 - 4:02 am UTC

Thank-you for clarifying that only the PLSQL wrapper, and neither the Java it calls, nor the trigger which calls it need execute protection.

May I ask whether, in a unix environment, it should work for that execute privilege to be granted to an ops$<oracle_user> mapped to the owner of a setuid executable?

Or could it be preferable to use OS_ROLES?


Tom Kyte
August 04, 2003 - 9:02 am UTC

believe you meant "execute permission" -- they are "protected"


I do not understand your implied relationship to a setuid executable?

Clarification

Tom Starr, August 04, 2003 - 12:22 pm UTC

A Solaris app executes menu options as the invoker. A new menu option needs Oracle 8.1.7 authentication, and unless our DBAs prefer OS_ROLES, our plan is to have the menu pick invoke a setuid program, which we hope will authenticate against an externally identified Oracle user associated with the unix owner of the setuid program; that Oracle user will also have EXECUTE privilege on any required Oracle objects. The main concern is whether Oracle will accept the effective UID associated with the setuid program for purposes of external authentication.

Tom Kyte
August 04, 2003 - 12:41 pm UTC

we seem to be using getuid() not geteuid() for os authentication...

I copied sqlplus to sqlplus_setuid and set uid on it:

[ora920@tkyte-pc-isdn bin]$ cp sqlplus sqlplus_setuid
[ora920@tkyte-pc-isdn bin]$ chmod a+s sqlplus_setuid
[ora920@tkyte-pc-isdn bin]$ ls -l sqlplus_setuid
-rwsr-s--x    1 ora920   ora920     664980 Aug  4 12:34 sqlplus_setuid
[ora920@tkyte-pc-isdn bin]$


sqlplus_setuid /

SQL*Plus: Release 9.2.0.3.0 - Production on Mon Aug 4 12:41:50 2003

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.


Connected to:
Oracle9i Enterprise Edition Release 9.2.0.3.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.3.0 - Production

ops$tkyte@ORA920> select user from dual;

USER
------------------------------
OPS$TKYTE

ops$tkyte@ORA920> !ps -auxww | grep sqlplus_setuid
ora920   26651  0.2  1.0 13468 5456 pts/3    S    12:41   0:00 sqlplus_setuid



so no, i don't believe that will work for you. 

A Bit Surprised

Steve, August 04, 2003 - 4:29 pm UTC

Tom,
Aren't you the guy who said,
"I firmly believe the best java programs are those that have ZERO
"selects/inserts/updates/deletes" in them....."

Are you making an exception here?

Tom Kyte
August 04, 2003 - 4:48 pm UTC

this is a bulk loader...

nothing is ever black and white. for a loader, rules are bent, broken and disregarded sometimes.

some jdbc benchmarks

Menon, August 23, 2003 - 5:31 pm UTC

OK - this is not directly related to the above
benchmark that Tom suggested for the original
question but this evaluates the impact of batch
size when using dml statements.
Tom, others - it would be great if you can just
take a look at if I am making any mistakes
in the benchmark itself or not!:)

The reasons I ran these benchmarks are:
1. I wanted to see how much impact batch sizing has in terms
of response time and resources consumed.
2. I wanted to evaluate the guidelines from the Oracle
JDBC documentation at
http://download-west.oracle.com/docs/cd/B10501_01/java.920/a96654/oraperf.htm#1056502
     in section "Oracle Update Batching
     Characteristics and Limitations" -
     specifically the guideline that says:
     "Batch values between 5 and 30 tend to be the most effective. Setting a very high value might even have a negative effect. It is worth trying different values to verify the effectiveness for your particular application. "

I wanted to see how important it is to evaluate this
guideline (of not going beyond 30) on a case by case
basis.

OK.

I used the following utilities already supplied by Tom:
1. DbmsOutput.java at
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:433029981484,
    Note to Tom: Two minor errors in this class
    in the original posting:
    1. iimport in the first line instead of import
    2. the line "public void disable() throws SQLException"
       is repeated.
2. the runstats utility at
http://asktom.oracle.com/~tkyte/runstats.html
3. My own test program in TestBatchUpdates.java
which is given below at the end of the message.
It essentially compares inserts of 10000 records
(can be changed by editing a constant defined)
in a table t1 defined as follows:

create table t1 ( x varchar2(100) );

It can be invoked with two parameters
the first run's batch size and the second run's
batch size as follows:
java TestBatchUpdates 1 20

would compare inserting 10000 records - Run1 would
be for batch size of 1 and Run2 would be for batch
size of 20.

The results summarized for me are given below
(Time1/Time2 in hundredths of a second):
----------------------------------------------

Batch1  Batch2  Time1  Time2  %age (Time)  %age (Latches)
1       20      1870   140    1336%        1380%
20      30      273    129    212%         131%
30      50      264    88     300%         137%
50      100     236    87     271%         134%
100     200     226    74     305%         142%
200     500     160    194    83%          108%
500     1000    263    69     381%         97%

So the run with batch size 1 took 1336% of the time of the run with batch size 20, and consumed 1380% of its resources.

Conclusions:
1. Batching makes a *huge* difference (see the first run
result).
2. As the batch size increases, the differences between
runs seem to go down in terms of time and resources
consumed.
3. I had a case (with runs inserting 1000 records) where
a batch size of 500 gave results consistently worse than
both 200 and 1000! The point is you have to benchmark
your own operations to estimate a good batch size.
4. So the batch size of 30 specified in the Oracle doc,
at least in my case, does not seem to be the optimal
batch size. But then they do clearly ask you to run
your own benchmarks:)
5. Benchmark everything!

----------------------------------------------
---- TestBatchUpdates.java begins
import java.sql.*;

import oracle.jdbc.driver.*;

public class TestBatchUpdates
{
    public static void main(String[] args) throws SQLException
    {
        DriverManager.registerDriver( new OracleDriver() );
        _checkUsage( args );
        _setBatchValues( args );

        OracleConnection connection = _getOracleConnection();
        _prepareBenchmarkStatements( connection );

        // we are going to benchmark the following statement
        String stmtString = "insert into t1 values ( ? )";

        OraclePreparedStatement stmt = (OraclePreparedStatement)
            connection.prepareStatement( stmtString );

        long updateCount = 0;

        _benchmarkStatementArray[BENCHMARK_START_INDEX].execute();

        // for each batch size being tested
        for( int k = 0; k < UPDATE_BATCH_SIZES_BEING_TESTED.length; k++ )
        {
            stmt.setExecuteBatch( UPDATE_BATCH_SIZES_BEING_TESTED[k] );

            // benchmark starts
            long startTime = System.currentTimeMillis();
            updateCount = 0;
            for( int i = 0; i < NUMBER_OF_RECORDS_INSERTED; i++ )
            {
                stmt.setString( 1, "hi" + i );
                updateCount += stmt.executeUpdate();
            }
            updateCount += stmt.sendBatch(); // remaining data if any
            long endTime = System.currentTimeMillis();

            if( k == 0 ) // set the middle point only in first iteration
                _benchmarkStatementArray[BENCHMARK_MIDDLE_INDEX].execute();

            // benchmark ends
            long timeTaken = (endTime - startTime);
            benchmarkTimingResults[k] = timeTaken;
        }
        _benchmarkStatementArray[BENCHMARK_STOP_INDEX].execute();

        _printBenchmarkResults();

        connection.commit();

        System.out.println( "----Java Timer based results begin\n" );
        for( int i = 0; i < UPDATE_BATCH_SIZES_BEING_TESTED.length; i++ )
        {
            System.out.println( "Time taken for inserting " +
                NUMBER_OF_RECORDS_INSERTED +
                " with a batch size of " +
                UPDATE_BATCH_SIZES_BEING_TESTED[i] +
                " = " + benchmarkTimingResults[i] );
            System.out.flush();
        }
        System.out.println( "----Java Timer based results end\n" );

        stmt.close();
        _closeBenchmarkStatements( connection );
        connection.close();
    }

    private static void _closeBenchmarkStatements(
        OracleConnection connection ) throws SQLException
    {
        for( int i = 0; i < _benchmarkStatementArray.length; i++ )
        {
            _benchmarkStatementArray[i].close();
        }
    }

    private static void _prepareBenchmarkStatements(
        OracleConnection connection ) throws SQLException
    {
        _benchmarkStatementArray[BENCHMARK_START_INDEX] =
            (OracleCallableStatement) connection.prepareCall( BENCHMARK_START );

        _benchmarkStatementArray[BENCHMARK_MIDDLE_INDEX] =
            (OracleCallableStatement) connection.prepareCall( BENCHMARK_MIDDLE );

        _benchmarkStatementArray[BENCHMARK_STOP_INDEX] =
            (OracleCallableStatement) connection.prepareCall( BENCHMARK_STOP );
        _benchmarkStatementArray[BENCHMARK_STOP_INDEX].setInt( 1,
            BENCHMARK_DIFFERENCE_THRESHOLD );

        _dbmsOutput = new DbmsOutput( connection );
        _dbmsOutput.enable( DBMS_OUTPUT_BUFFER_SIZE );
    }

    private static void _checkUsage( String[] args )
    {
        int argc = args.length;
        if( argc != 2 )
        {
            System.out.println( "usage: java TestBatchUpdates <batch1> <batch2>" );
            System.out.flush();
            System.exit( 1 );
        }
    }

    private static OracleConnection _getOracleConnection()
        throws SQLException
    {
        String connectionURL = "jdbc:oracle:thin:@lothar:1521:apps";
        OracleConnection connection = (OracleConnection)
            DriverManager.getConnection( connectionURL, "benchmark", "benchmark" );

        connection.setAutoCommit( false );
        return connection;
    }

    private static void _setBatchValues( String[] args )
    {
        int firstBatchValue  = Integer.parseInt( args[0] );
        int secondBatchValue = Integer.parseInt( args[1] );

        UPDATE_BATCH_SIZES_BEING_TESTED[0] = firstBatchValue;
        UPDATE_BATCH_SIZES_BEING_TESTED[1] = secondBatchValue;

        System.out.println( "First batch value = " +
            UPDATE_BATCH_SIZES_BEING_TESTED[0] );
        System.out.flush();
        System.out.println( "Second batch value = " +
            UPDATE_BATCH_SIZES_BEING_TESTED[1] );
        System.out.flush();
    }

    private static void _printBenchmarkResults()
        throws SQLException
    {
        _dbmsOutput.show();
        _dbmsOutput.close();
    }

    private static int[]  UPDATE_BATCH_SIZES_BEING_TESTED = new int[2];

    private static long[] benchmarkTimingResults =
        new long[UPDATE_BATCH_SIZES_BEING_TESTED.length];

    private static String BENCHMARK_START  = "begin runstats_pkg.rs_start; end;";
    private static String BENCHMARK_MIDDLE = "begin runstats_pkg.rs_middle; end;";
    private static String BENCHMARK_STOP   = "begin runstats_pkg.rs_stop(?); end;";

    private static OracleCallableStatement[] _benchmarkStatementArray =
        new OracleCallableStatement[3];

    private static DbmsOutput _dbmsOutput;
    private static final int DBMS_OUTPUT_BUFFER_SIZE = 1000000;

    private static final int BENCHMARK_START_INDEX  = 0;
    private static final int BENCHMARK_MIDDLE_INDEX = 1;
    private static final int BENCHMARK_STOP_INDEX   = 2;
    private static final int BENCHMARK_DIFFERENCE_THRESHOLD = 200;
    private static final int NUMBER_OF_RECORDS_INSERTED     = 10000;
}
---- TestBatchUpdates.java ends






Tom Kyte
August 23, 2003 - 6:32 pm UTC

all i can say is "bravo"

<quote>
Conclusions:
1. Batching makes a *huge* difference (see the first run
result).
2. As the batch size increases, the differences between
runs seem to go down in terms of time and resources
consumed.
3. I had a case (with runs inserting 1000 records) where
a batch size of 500 gave results consistently worse than
both 200 and 1000! The point is you have to benchmark
your own operations to estimate a good batch size.
4. So the batch size of 30 specified in the Oracle doc,
at least in my case, does not seem to be the optimal
batch size. But then they do clearly ask you to run
your own benchmarks:)
5. Benchmark everything!
</quote>

excellent. I agree 110%.

add #6

6. parameterize so that you can change this array size externally -- so when things change, you can change easily with them
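Tom's point #6 can be sketched as below. This is my own illustration, not code from the thread; the property name batch.size and the class name BatchConfig are made up, and the fallback default of 30 is just the Oracle-doc starting point discussed above.

```java
// Illustrative sketch: resolve the batch size from an external setting
// (a system property here) so it can be changed without recompiling.
public class BatchConfig
{
    public static final int DEFAULT_BATCH_SIZE = 30; // arbitrary fallback

    // Parse an externally supplied batch size, falling back to the default
    // on a missing, non-numeric, or non-positive value.
    public static int resolveBatchSize( String configured )
    {
        if( configured == null ) return DEFAULT_BATCH_SIZE;
        try
        {
            int n = Integer.parseInt( configured.trim() );
            return ( n > 0 ) ? n : DEFAULT_BATCH_SIZE;
        }
        catch( NumberFormatException e )
        {
            return DEFAULT_BATCH_SIZE;
        }
    }

    public static void main( String[] args )
    {
        // e.g. java -Dbatch.size=100 BatchConfig
        int batchSize = resolveBatchSize( System.getProperty( "batch.size" ) );
        System.out.println( "using batch size " + batchSize );
        // ...and then: stmt.setExecuteBatch( batchSize );
    }
}
```

The resolved value would then feed stmt.setExecuteBatch(batchSize) in the benchmark above, so re-tuning after an environment change is a command-line flag rather than a code edit.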

thanx!

A reader, August 23, 2003 - 7:32 pm UTC

A "bravo" from Tom Kyte means a lot to me!:)
Thanx for point # 6. Just in case you missed it - a minor
point I made is:
"Note to Tom: Two minor errors in this class
in the original posting:
1. iimport in the first line instead of import
2. the line "public void disable() throws SQLException"
is repeated."

You may want to correct it - although anyone who
has written even a java program to print "Hello World "
should be able to correct those errors:)

If anyone else can test those results and let me know
if there are any bugs in it - I would be really
grateful!

Thanx!

Tom Kyte
August 23, 2003 - 8:01 pm UTC

corrected

Here is another JDBC benchmark

A reader, August 24, 2003 - 6:31 pm UTC

Not to get carried away or anything - but since I ran
this one and shared it with my team - might as well share
it on this forum.

Tom
Let me know whether this thread was the right one
for this benchmark!

Question:
Is the index version of the ResultSet getters faster than
the name version, i.e. is resultSet.getString( int
columnIndex ) faster than resultSet.getString( String
columnName )?

Answer:
I had heard a lot that the index version would be faster,
hence I wrote a simple benchmark to find out
(ran it on 9.0.1.0 using Oracle JDBC drivers).
Here we don't need to look at latches etc. - we just
need times from the JDBC client.

The program is given below.

The result:

Approach taken                       Time taken (ms)
getString() based on column index    12,773
getString() based on column name     10,503
%age difference                      20%

So it turns out that getString() by name is around 20% faster than getString() by index.

--the test program - TestGetColumnNameOrIndex.java begins--
import java.sql.*;

import oracle.jdbc.driver.*;

public class TestGetColumnNameOrIndex
{
    public static void main(String[] args) throws SQLException
    {
        String columnName = "object_name";
        String testQuery  = "select object_name from t1";
        DriverManager.registerDriver( new OracleDriver() );
        _checkUsage( args );

        OracleConnection connection = _getOracleConnection();
        Statement stmt = connection.createStatement();

        ResultSet rset = stmt.executeQuery( testQuery );

        // benchmark - first case - using index
        long startTime = System.currentTimeMillis();

        int indexOfColumn = rset.findColumn( columnName );

        int i = 0;
        while( rset != null && rset.next() )
        {
            String objectName = rset.getString( indexOfColumn );
            i++;
        }
        long endTime = System.currentTimeMillis();
        //System.out.println( "total rows found = " + i );
        System.out.println( "Time taken using column index " +
            "for selecting " + i + " rows " + ( endTime - startTime ) );

        rset.close();

        rset = stmt.executeQuery( testQuery );

        // benchmark - second case - using column name
        startTime = System.currentTimeMillis();

        i = 0;
        while( rset != null && rset.next() )
        {
            String objectName = rset.getString( columnName );
            i++;
        }
        endTime = System.currentTimeMillis();
        //System.out.println( "total rows found = " + i );

        System.out.println( "Time taken using column name " +
            "for selecting " + i + " rows " + ( endTime - startTime ) );
        rset.close();
        stmt.close();
        connection.close();
    }

    private static void _checkUsage( String[] args )
    {
        int argc = args.length;
        if( argc != 0 )
        {
            System.out.println( "usage: java TestGetColumnNameOrIndex" );
            System.out.flush();
            System.exit( 1 );
        }
    }

    private static OracleConnection _getOracleConnection()
        throws SQLException
    {
        String connectionURL = "jdbc:oracle:thin:@lothar:1521:apps";
        OracleConnection connection = (OracleConnection)
            DriverManager.getConnection( connectionURL, "benchmark", "benchmark" );

        connection.setAutoCommit( false );
        return connection;
    }
}

-- the test program ends

Regards
Menon:)





Tom Kyte
August 24, 2003 - 9:06 pm UTC

try it with a 20 column query -- be more "real world".

see what happens to get 20 columns by index
and then by name.


post that!

thanx Tom!

Menon, August 25, 2003 - 1:02 pm UTC

OK - here you go - the results are different - thanx for
your tip! (Just to make sure people can run the benchmarks
easily, I am posting the two versions of the same Java
program that I ran here. I could have modified one program
to handle all cases, but was too lazy!)

Sooo - first I created a table.

drop table lots_of_columns;
create table lots_of_columns
as select rownum a,
a.object_name b,
a.object_type c,
b.object_name d,
b.object_type e,
a.object_name f,
a.object_type g,
b.object_name h,
b.object_type i,
a.object_name j,
a.object_type k,
b.object_name l,
b.object_type m,
a.object_name n,
b.object_type o,
b.object_name p,
a.object_type q,
b.object_name r,
b.object_type s,
a.object_name t
from all_objects a, all_objects b
where rownum <= 100000;
alter table lots_of_columns add constraint lots_of_columns_pk primary key (a);

Then I ran the benchmark by selecting 10 columns.

Following is the modified Java program
------ TestGetColumnNameOrIndex.java (*10* columns)
import java.sql.*;

import oracle.jdbc.driver.*;

public class TestGetColumnNameOrIndex
{
    public static void main(String[] args) throws SQLException
    {
        String columnName = "object_name";
        String testQuery  = "select a, b, c, d, e, f, g, h, i, j, k, " +
                            "l, m, n, o, p, q, r, s, t " +
                            " from lots_of_columns";
        DriverManager.registerDriver( new OracleDriver() );
        _checkUsage( args );

        OracleConnection connection = _getOracleConnection();
        Statement stmt = connection.createStatement();

        ResultSet rset = stmt.executeQuery( testQuery );

        // benchmark - first case - using index
        long startTime = System.currentTimeMillis();

        int index1  = rset.findColumn( "a" );
        int index2  = rset.findColumn( "b" );
        int index3  = rset.findColumn( "c" );
        int index4  = rset.findColumn( "d" );
        int index5  = rset.findColumn( "e" );
        int index6  = rset.findColumn( "f" );
        int index7  = rset.findColumn( "g" );
        int index8  = rset.findColumn( "h" );
        int index9  = rset.findColumn( "i" );
        int index10 = rset.findColumn( "j" );
        int index11 = rset.findColumn( "k" );
        int index12 = rset.findColumn( "l" );
        int index13 = rset.findColumn( "m" );
        int index14 = rset.findColumn( "n" );
        int index15 = rset.findColumn( "o" );
        int index16 = rset.findColumn( "p" );
        int index17 = rset.findColumn( "q" );
        int index18 = rset.findColumn( "r" );
        int index19 = rset.findColumn( "s" );
        int index20 = rset.findColumn( "t" );

        int x = 0;
        while( rset != null && rset.next() )
        {
            int    a = rset.getInt( index1 );
            String b = rset.getString( index2 );
            String c = rset.getString( index3 );
            String d = rset.getString( index4 );
            String e = rset.getString( index5 );
            String f = rset.getString( index6 );
            String g = rset.getString( index7 );
            String h = rset.getString( index8 );
            String i = rset.getString( index9 );
            String j = rset.getString( index10 );
            if( x == 0 )
            {
                /*
                System.out.println( a + "," + b + "," + c + "," + d + "," +
                    e + "," + f + "," + i + "," + j );
                System.out.flush();
                */
            }
            x++;
        }
        long endTime = System.currentTimeMillis();
        //System.out.println( "total rows found = " + x );
        System.out.println( "Time taken using column index " +
            "for selecting " + x + " rows " + ( endTime - startTime ) );

        rset.close();

        rset = stmt.executeQuery( testQuery );

        // benchmark - second case - using column name
        startTime = System.currentTimeMillis();

        x = 0;
        while( rset != null && rset.next() )
        {
            int    a = rset.getInt( "a" );
            String b = rset.getString( "b" );
            String c = rset.getString( "c" );
            String d = rset.getString( "d" );
            String e = rset.getString( "e" );
            String f = rset.getString( "f" );
            String g = rset.getString( "g" );
            String h = rset.getString( "h" );
            String i = rset.getString( "i" );
            String j = rset.getString( "j" );
            if( x == 0 )
            {
                /*
                System.out.println( a + "," + b + "," + c + "," + d + "," +
                    e + "," + f + "," + i + "," + j );
                System.out.flush();
                */
            }
            x++;
        }
        endTime = System.currentTimeMillis();
        //System.out.println( "total rows found = " + x );

        System.out.println( "Time taken using column name " +
            "for selecting " + x + " rows " + ( endTime - startTime ) );
        rset.close();
        stmt.close();
        connection.close();
    }

    private static void _checkUsage( String[] args )
    {
        int argc = args.length;
        if( argc != 0 )
        {
            System.out.println( "usage: java TestGetColumnNameOrIndex" );
            System.out.flush();
            System.exit( 1 );
        }
    }

    private static OracleConnection _getOracleConnection()
        throws SQLException
    {
        String connectionURL = "jdbc:oracle:thin:@lothar:1521:apps";
        OracleConnection connection = (OracleConnection)
            DriverManager.getConnection( connectionURL, "benchmark", "benchmark" );

        connection.setAutoCommit( false );
        return connection;
    }
}

----- program ends (*10 columns)

I ran this 10 times. (Yes, I noticed that I did a
findColumn for 20 columns above, but I don't think it
matters much since the next loop is the main one, running
100,000 times. I didn't bother to re-test with this minor
correction - it would not have changed the results
materially.)

--- my 10 runs-- (for 10 columns)
Time taken using column index for selecting 100000 rows 31353
Time taken using column name for selecting 100000 rows 32655

Time taken using column index for selecting 100000 rows 31041
Time taken using column name for selecting 100000 rows 32912

Time taken using column index for selecting 100000 rows 30922
Time taken using column name for selecting 100000 rows 32163

Time taken using column index for selecting 100000 rows 31128
Time taken using column name for selecting 100000 rows 32265

Time taken using column index for selecting 100000 rows 30977
Time taken using column name for selecting 100000 rows 32280

Time taken using column index for selecting 100000 rows 30668
Time taken using column name for selecting 100000 rows 32735

Time taken using column index for selecting 100000 rows 30783
Time taken using column name for selecting 100000 rows 31853

Time taken using column index for selecting 100000 rows 30754
Time taken using column name for selecting 100000 rows 32348

Time taken using column index for selecting 100000 rows 31993
Time taken using column name for selecting 100000 rows 32756

Time taken using column index for selecting 100000 rows 30981
Time taken using column name for selecting 100000 rows 32014
---

So the average time taken was pretty much the same - in
fact, selection using getString() by index was slightly (1%)
faster.

Then I used the following program (on the same table)
to run the benchmark for 20 columns.

-----TestGetColumnNameOrIndex.java (*20* columns)
import java.sql.*;

import oracle.jdbc.driver.*;

public class TestGetColumnNameOrIndex
{
    public static void main(String[] args) throws SQLException
    {
        String testQuery = "select a, b, c, d, e, f, g, h, i, j, k, " +
                           "l, m, n, o, p, q, r, s, t " +
                           " from lots_of_columns";
        DriverManager.registerDriver( new OracleDriver() );
        _checkUsage( args );

        OracleConnection connection = _getOracleConnection();
        Statement stmt = connection.createStatement();

        ResultSet rset = stmt.executeQuery( testQuery );

        // benchmark - first case - using index
        long startTime = System.currentTimeMillis();

        int index1  = rset.findColumn( "a" );
        int index2  = rset.findColumn( "b" );
        int index3  = rset.findColumn( "c" );
        int index4  = rset.findColumn( "d" );
        int index5  = rset.findColumn( "e" );
        int index6  = rset.findColumn( "f" );
        int index7  = rset.findColumn( "g" );
        int index8  = rset.findColumn( "h" );
        int index9  = rset.findColumn( "i" );
        int index10 = rset.findColumn( "j" );
        int index11 = rset.findColumn( "k" );
        int index12 = rset.findColumn( "l" );
        int index13 = rset.findColumn( "m" );
        int index14 = rset.findColumn( "n" );
        int index15 = rset.findColumn( "o" );
        int index16 = rset.findColumn( "p" );
        int index17 = rset.findColumn( "q" );
        int index18 = rset.findColumn( "r" );
        int index19 = rset.findColumn( "s" );
        int index20 = rset.findColumn( "t" );

        int x = 0;
        while( rset != null && rset.next() )
        {
            int    a = rset.getInt( index1 );
            String b = rset.getString( index2 );
            String c = rset.getString( index3 );
            String d = rset.getString( index4 );
            String e = rset.getString( index5 );
            String f = rset.getString( index6 );
            String g = rset.getString( index7 );
            String h = rset.getString( index8 );
            String i = rset.getString( index9 );
            String j = rset.getString( index10 );
            String k = rset.getString( index11 );
            String l = rset.getString( index12 );
            String m = rset.getString( index13 );
            String n = rset.getString( index14 );
            String o = rset.getString( index15 );
            String p = rset.getString( index16 );
            String q = rset.getString( index17 );
            String r = rset.getString( index18 );
            String s = rset.getString( index19 );
            String t = rset.getString( index20 );
            if( x == 0 )
            {
                /*
                System.out.println( a + "," + b + "," + c + "," + d + "," +
                    e + "," + f + "," + i + "," + j + "," + k + "," + l + "," +
                    m + "," + n + "," + o + "," + p + "," + q + "," + r + "," +
                    s + "," + t );
                System.out.flush();
                */
            }
            x++;
        }
        long endTime = System.currentTimeMillis();
        //System.out.println( "total rows found = " + x );
        System.out.println( "Time taken using column index " +
            "for selecting " + x + " rows " + ( endTime - startTime ) );

        rset.close();

        rset = stmt.executeQuery( testQuery );

        // benchmark - second case - using column name
        startTime = System.currentTimeMillis();

        x = 0;
        while( rset != null && rset.next() )
        {
            int    a = rset.getInt( "a" );
            String b = rset.getString( "b" );
            String c = rset.getString( "c" );
            String d = rset.getString( "d" );
            String e = rset.getString( "e" );
            String f = rset.getString( "f" );
            String g = rset.getString( "g" );
            String h = rset.getString( "h" );
            String i = rset.getString( "i" );
            String j = rset.getString( "j" );
            String k = rset.getString( "k" );
            String l = rset.getString( "l" );
            String m = rset.getString( "m" );
            String n = rset.getString( "n" );
            String o = rset.getString( "o" );
            String p = rset.getString( "p" );
            String q = rset.getString( "q" );
            String r = rset.getString( "r" );
            String s = rset.getString( "s" );
            String t = rset.getString( "t" );
            if( x == 0 )
            {
                /*
                System.out.println( a + "," + b + "," + c + "," + d + "," +
                    e + "," + f + "," + i + "," + j + "," + k + "," + l + "," +
                    m + "," + n + "," + o + "," + p + "," + q + "," + r + "," +
                    s + "," + t );
                System.out.flush();
                */
            }
            x++;
        }
        endTime = System.currentTimeMillis();
        //System.out.println( "total rows found = " + x );

        System.out.println( "Time taken using column name " +
            "for selecting " + x + " rows " + ( endTime - startTime ) );
        rset.close();
        stmt.close();
        connection.close();
    }

    private static void _checkUsage( String[] args )
    {
        int argc = args.length;
        if( argc != 0 )
        {
            System.out.println( "usage: java TestGetColumnNameOrIndex" );
            System.out.flush();
            System.exit( 1 );
        }
    }

    private static OracleConnection _getOracleConnection()
        throws SQLException
    {
        String connectionURL = "jdbc:oracle:thin:@lothar:1521:apps";
        OracleConnection connection = (OracleConnection)
            DriverManager.getConnection( connectionURL, "benchmark", "benchmark" );

        connection.setAutoCommit( false );
        return connection;
    }
}

--- program ends

the results of 10 runs follows
-- 10 runs (for selecting 20 columns)--
Time taken using column index for selecting 100000 rows 35996
Time taken using column name for selecting 100000 rows 48999

Time taken using column index for selecting 100000 rows 35770
Time taken using column name for selecting 100000 rows 49516

Time taken using column index for selecting 100000 rows 36501
Time taken using column name for selecting 100000 rows 49337

Time taken using column index for selecting 100000 rows 36454
Time taken using column name for selecting 100000 rows 50061

Time taken using column index for selecting 100000 rows 36331
Time taken using column name for selecting 100000 rows 49259

Time taken using column index for selecting 100000 rows 35157
Time taken using column name for selecting 100000 rows 47626

Time taken using column index for selecting 100000 rows 36589
Time taken using column name for selecting 100000 rows 50654
Time taken using column index for selecting 100000 rows 36518
Time taken using column name for selecting 100000 rows 49734
Time taken using column index for selecting 100000 rows 36254
Time taken using column name for selecting 100000 rows 49710
Time taken using column index for selecting 100000 rows 36106
Time taken using column name for selecting 100000 rows 50773
-- 10 runs end

Here we see that on average getString() by index is
37% faster.

So my conclusion is:
If you select 1 column, resultSet.getString() by name is around 20% faster than resultSet.getString() by index.
If you select 10 columns, resultSet.getString() by name is about the same as resultSet.getString() by index.
If you select 20 columns, resultSet.getString() by index is around 37% faster than resultSet.getString() by name.


Overall, if you select many columns, use getString() by index. If you select few columns ( <= 5 to 10 ), use getString() by name.
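One way to keep the readability of name-based access with close to index speed is to resolve each name to its index once, before the fetch loop (as the benchmark's findColumn() calls already do), and then use only the cached int inside the loop. The helper below is my own illustrative sketch, not from the thread; ColumnIndexCache is a made-up name, and the mapping is supplied directly so it runs without a database.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: cache column-name -> column-index lookups once,
// so the per-row fetch loop only ever uses the cheap int index.
public class ColumnIndexCache
{
    private final Map<String, Integer> indexByName = new HashMap<>();

    // In real code this would be populated once per query via
    // rset.findColumn(name); here the mapping is supplied directly.
    public void put( String name, int index )
    {
        indexByName.put( name, index );
    }

    public int indexOf( String name )
    {
        Integer idx = indexByName.get( name );
        if( idx == null )
            throw new IllegalArgumentException( "unknown column: " + name );
        return idx;
    }

    public static void main( String[] args )
    {
        ColumnIndexCache cache = new ColumnIndexCache();
        cache.put( "object_name", 1 ); // e.g. from rset.findColumn("object_name")
        cache.put( "object_type", 2 );
        // inside while( rset.next() ) you would then call
        //   rset.getString( cache.indexOf( "object_name" ) )
        System.out.println( cache.indexOf( "object_type" ) ); // prints 2
    }
}
```

This pays the name-to-index cost once per query instead of once per row, which is where the 20-column-by-name run above loses its time.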

Thanx!
Menon:)
PS: I love your service and your book, and am waiting eagerly
for the next one - I have placed an order for your new book
and Cary Millsap's book (his first chapter, available online, is really interesting).

Tom Kyte
August 25, 2003 - 2:42 pm UTC

yes, Cary is an interesting speaker/writer. You should check him out at the hotsos conference next march if you can. it is well worth it.

re: thanx!

Menon, August 25, 2003 - 3:02 pm UTC

"yes, Cary is an interesting speaker/writer. You should check him out at the
hotsos conference next march if you can. it is well worth it.
"

Yes - I am really interested in attending the conference -
not just for Cary but to listen to you and JL too!
I had sent a request for reimbursement to my boss a while
back - I hope it goes through!

direct insert VS stored procedure

Edgar Chupit, August 26, 2003 - 1:34 pm UTC

Hello Tom,

Using the benchmark that was given above, I have found that direct inserting into the table (with batch size 100) is almost 60 times faster than inserting via a stored procedure, and changing the batch size has almost no effect on the execution performance of the stored procedure (I think it's because you have to switch from the SQL engine to the PL/SQL engine). Are there some approaches to tune procedure calls?

To benchmark stored procedure in TestBatchUpdates.java I have changed line:
OraclePreparedStatement stmt = (OraclePreparedStatement) connection.prepareStatement ( stmtString );
to
OracleCallableStatement stmt = (OracleCallableStatement) connection.prepareCall ( stmtString );
and changed stmtString to:
String stmtString = "begin tt1.doInsert(?,?); end;";

Thank you!

Tom Kyte
August 26, 2003 - 1:41 pm UTC

well, you could save up 100 rows -- send them to a procedure and forall BULK insert them. it would be comparable.

you are right

Edgar Chupit, August 26, 2003 - 3:50 pm UTC

Thank you! It is working faster - not as fast as a direct insert, but far faster than a simple 'one at a time' pl/sql procedure call.

To Edgar

Menon, August 26, 2003 - 4:14 pm UTC

Hi Edgar
Could you post the code here ?

Thanx!
Menon:)

code for array processing

Edgar Chupit, August 27, 2003 - 1:13 am UTC

Hello,

I'm not a real Java programmer, so I guess this code can be optimized, but my point was to change the original benchmark code that was kindly provided by Menon.

Results:

Batch1   Batch2   Time1 (hsec)   Time2 (hsec)   %age (time)   %age (latches)
     1       20           1231             70      1758.57%        1852.53%
    20       30             92             60       153.33%         148.12%
    30       50             67             40       167.50%         155.52%
    50      100             45             25       180.00%         171.01%
   100      200             30             17       176.00%         162.06%
   200      500             24             11       218.00%         148.17%
   500     1000             17             11       155.00%         104.95%
  1000     1500             16             11       145.50%         109.24%
  1500     2000             16             10       160.00%         104.73%

SQL*Plus:

SQL> create table tt (id integer, time timestamp default systimestamp, code integer, val integer);

Table created

SQL> create type integerarray as table of integer;

Type created

SQL> create or replace package tt1 is
  2  
  3     procedure doInsert2( aCode integer, aVal integer );
  4     procedure doInsert( aCode integerArray, aVal integerArray );
  5  
  6  end tt1;
  7  /

Package created

SQL> create or replace package body tt1 is
  2  
  3     procedure doInsert( aCode integerArray, aVal integerArray ) is
  4     begin
  5        forall i in aCode.first .. aCode.last
  6          insert into tt (id,code,val) values (1,aCode(i),aVal(i));
  7     end;
  8  
  9     procedure doInsert2( aCode integer, aVal integer ) is
 10     begin
 11          insert into tt (id,code,val) values (1,aCode,aVal);
 12     end;
 13  
 14  end tt1;
 15  /

Package body created

SQL> show errors;
No errors for PACKAGE BODY TT.TT1

Java:

---- TestBatchUpdates2.java begins
import java.sql.*;
import oracle.sql.ARRAY;
import oracle.sql.ArrayDescriptor;
import oracle.jdbc.driver.*;

public class TestBatchUpdates2
{
  public static void main(String[] args) throws SQLException
  {  
    DriverManager.registerDriver( new OracleDriver());
    _checkUsage( args );
    _setBatchValues ( args );
    OracleConnection connection = _getOracleConnection();
    _prepareBenchmarkStatements ( connection );
   
    // we are going to benchmark the following statement
    String stmtString = "begin tt1.doInsert(?,?); end;";
 
    OracleCallableStatement stmt = (OracleCallableStatement) connection.prepareCall ( stmtString );
    long updateCount = 0;
    _benchmarkStatementArray[BENCHMARK_START_INDEX].execute();

    ArrayDescriptor desc = ArrayDescriptor.createDescriptor( "TT.INTEGERARRAY", connection );
 
    // for each batch size being tested
    for( int k = 0; k < UPDATE_BATCH_SIZES_BEING_TESTED.length; k++)
    {
      // benchmark starts 
      long startTime = System.currentTimeMillis();
      updateCount = 0;
      int[] aCode = new int [ UPDATE_BATCH_SIZES_BEING_TESTED[k] ];
      int[] aVal = new int [ UPDATE_BATCH_SIZES_BEING_TESTED[k] ];
      
      for( int i=0; i < NUMBER_OF_RECORDS_INSERTED; i++)
      {
        if( i%UPDATE_BATCH_SIZES_BEING_TESTED[k] == 0 && i > 0 )
        {
           stmt.setArray(1,new ARRAY (desc, connection, aCode) );
           stmt.setArray(2,new ARRAY (desc, connection, aVal) );
           stmt.execute();

           // if NUMBER_OF_RECORDS_INSERTED%UPDATE_BATCH_SIZES_BEING_TESTED[k] != 0 then
           //   the last portion of data will be smaller than UPDATE_BATCH_SIZES_BEING_TESTED[k]
           if( (NUMBER_OF_RECORDS_INSERTED-i > 0) && (UPDATE_BATCH_SIZES_BEING_TESTED[k] > NUMBER_OF_RECORDS_INSERTED-i) )
           {
              aCode = new int [ NUMBER_OF_RECORDS_INSERTED-i ];
              aVal = new int [ NUMBER_OF_RECORDS_INSERTED-i ];
           }
        }
        aCode[(i%UPDATE_BATCH_SIZES_BEING_TESTED[k])] = i;
        aVal[(i%UPDATE_BATCH_SIZES_BEING_TESTED[k])] = i*i*3;
      }
      stmt.setArray(1,new ARRAY (desc, connection, aCode) );
      stmt.setArray(2,new ARRAY (desc, connection, aVal) );
      stmt.execute();
      connection.commit();

      long endTime = System.currentTimeMillis();
      if( k == 0 ) // set the middle point only in first iteration
        _benchmarkStatementArray[BENCHMARK_MIDDLE_INDEX].execute();
      // benchmark ends
      long timeTaken = (endTime - startTime);
      benchmarkTimingResults[k] = timeTaken;
    }
    _benchmarkStatementArray[BENCHMARK_STOP_INDEX].execute();
    _printBenchmarkResults ();
    connection.commit();
    System.out.println( "----Java Timer based results begin\n" );
    for (int i=0; i < UPDATE_BATCH_SIZES_BEING_TESTED.length; i++)
    {
      System.out.println( "Time taken for inserting " + 
        NUMBER_OF_RECORDS_INSERTED + 
        " with a batch size of " + 
        UPDATE_BATCH_SIZES_BEING_TESTED[i] + 
       " = " + benchmarkTimingResults[i] );
      System.out.flush();
    }
    System.out.println( "----Java Timer based results end\n" );
    stmt.close();
    _closeBenchmarkStatements ( connection );
    connection.close();
  }
  private static void _closeBenchmarkStatements ( 
    OracleConnection connection )  throws SQLException
  {
    for( int i=0; i < _benchmarkStatementArray.length; i++)
    {
      _benchmarkStatementArray[i].close();
    }
  }
  private static void _prepareBenchmarkStatements ( 
    OracleConnection connection ) throws SQLException
  {
    _benchmarkStatementArray[BENCHMARK_START_INDEX]= 
      (OracleCallableStatement) connection.prepareCall( BENCHMARK_START );
    _benchmarkStatementArray[BENCHMARK_MIDDLE_INDEX]= 
      (OracleCallableStatement) connection.prepareCall( BENCHMARK_MIDDLE );
    _benchmarkStatementArray[BENCHMARK_STOP_INDEX]= 
      (OracleCallableStatement) connection.prepareCall( BENCHMARK_STOP );
    _benchmarkStatementArray[BENCHMARK_STOP_INDEX].setInt(1, 
      BENCHMARK_DIFFERENCE_THRESHOLD);
    _dbmsOutput = new DbmsOutput ( connection );
    _dbmsOutput.enable ( DBMS_OUTPUT_BUFFER_SIZE );
  }
  private static void _checkUsage (String[] args)
  {
    int argc = args.length;
    if( argc != 2 )
    {
      System.out.println( "usage: java TestBatchUpdates2 <batch1> <batch2>");
      System.out.flush();
      System.exit(1);
    }
  }
  private static OracleConnection _getOracleConnection () 
    throws SQLException
  {
    String connectionURL = "jdbc:oracle:thin:@127.0.0.1:1521:TEST01";
    OracleConnection connection = (OracleConnection)
      DriverManager.getConnection ( connectionURL, "tt", "tt");
    connection.setAutoCommit ( false) ;
    return connection;
  }
  private static void _setBatchValues (String[] args)
  {
    int firstBatchValue = Integer.parseInt( args[0] );
    int secondBatchValue = Integer.parseInt( args[1] );
    UPDATE_BATCH_SIZES_BEING_TESTED[0] = firstBatchValue;
    UPDATE_BATCH_SIZES_BEING_TESTED[1] = secondBatchValue;
    System.out.println( "First batch value = " + 
      UPDATE_BATCH_SIZES_BEING_TESTED[0]);System.out.flush();
    System.out.println( "Second batch value = " + 
      UPDATE_BATCH_SIZES_BEING_TESTED[1]);System.out.flush();
  }
  private static void _printBenchmarkResults()
    throws SQLException
  {
    _dbmsOutput.show();
    _dbmsOutput.close();
  }

  private static int[] UPDATE_BATCH_SIZES_BEING_TESTED = new int[2];
  private static long[] benchmarkTimingResults = new long[UPDATE_BATCH_SIZES_BEING_TESTED.length];
  private static String BENCHMARK_START = "begin runstats_pkg.rs_start; end;";
  private static String BENCHMARK_MIDDLE = "begin runstats_pkg.rs_middle; end;";
  private static String BENCHMARK_STOP = "begin runstats_pkg.rs_stop(?); end;";
  private static OracleCallableStatement[] _benchmarkStatementArray = new OracleCallableStatement[3]; 
 
  private static DbmsOutput _dbmsOutput;
  private static final int DBMS_OUTPUT_BUFFER_SIZE = 1000000;
  private static final int BENCHMARK_START_INDEX = 0;
  private static final int BENCHMARK_MIDDLE_INDEX = 1;
  private static final int BENCHMARK_STOP_INDEX = 2;
  private static final int BENCHMARK_DIFFERENCE_THRESHOLD = 200;
  private static final int NUMBER_OF_RECORDS_INSERTED = 10000;
}
---- TestBatchUpdates2.java ends
 

thanx Edgar!

Menon, August 27, 2003 - 11:20 am UTC


view/table information to see java stored procedure

A reader, September 25, 2003 - 8:16 am UTC

Tom,

If I want to know all the Java stored procedures that are created in my schema - which view/table gives me that?

Please let me know

Thanks

Tom Kyte
September 25, 2003 - 8:44 am UTC

select * from user_objects where object_type like 'JAVA%'

Thanks a lot

A reader, September 25, 2003 - 9:47 am UTC


JDBC batch caveat?

Thomas Starr, May 04, 2004 - 2:47 am UTC

Re: Implementing Oracle JDBC batch...
Will using an OraclePreparedStatement benefit a single insert statement that inserts 600K records into an empty table, or are database kernel speed benefits only achieved by collecting groups of insert statements?
Thanks!!
And thanks for Seattle Technology Day, you were great!!!!


Tom Kyte
May 04, 2004 - 7:22 am UTC

Not sure I'm following 100% but -- if you are inserting 600k records from a java client -- I would use an array size of around 100 records and "read 100/send 100/read 100/send 100" and so on -- flow the data in.
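The "read 100/send 100" flow can be sketched without any driver at all. In the sketch below the actual send (which in real code would be the batched executeUpdate()/sendBatch() calls shown elsewhere in this thread) is replaced by a stub counter, purely to show how rows map onto round trips; the class and method names are invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;

public class FlowSketch {
    // "read 100 / send 100": accumulate rows, flush every batchSize rows,
    // then flush any final partial batch. Returns the number of "sends"
    // (round trips) this flow would make. The send itself is stubbed out;
    // real code would execute a batched insert here.
    public static int countRoundTrips(int totalRows, int batchSize) {
        List<Integer> buffer = new ArrayList<>();
        int trips = 0;
        for (int i = 0; i < totalRows; i++) {
            buffer.add(i);                      // "read" one row
            if (buffer.size() == batchSize) {   // buffer full: "send" it
                trips++;
                buffer.clear();
            }
        }
        if (!buffer.isEmpty()) trips++;         // send the final partial batch
        return trips;
    }
}
```

For 600K rows and a batch size of 100, this flow makes 6,000 round trips instead of 600,000 - which is where the bulk of the speedup comes from.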

Clarification on implementing Oracle JDBC Batch...

Tom Starr, May 04, 2004 - 5:36 pm UTC

My apology for providing insufficient info.

To clarify, would it be appropriate usage of PreparedStatement if a trigger on table A invokes prep_stmt.executeUpdate() on a single insert statement that copies all 600K records from table B into identical but empty table C?

I ask because varying the batch parameter has no effect.

I noticed a superfluous:
((OraclePreparedStatement)prep_stmt).sendBatch() between the executeUpdate() and the commit() after testing, but I wonder if executeUpdate is intended to be invoked repeatedly, instead of once as I did.

AutoCommit is off.

This is in 8.1.7.4

I can only presently recommend changes, but want to make sure my understanding is clear, and I also wonder whether BULK COLLECT / FORALL might possibly be faster in this case?

Thanks again for sharing your very much appreciated knowledge.

Tom





Tom Kyte
May 04, 2004 - 7:39 pm UTC

sorry -- don't understand the flow here at all. why would a trigger on a table run java?

Oracle JDBC Batch explanation of attempted strategy

Tom Starr, May 05, 2004 - 12:15 am UTC

Thanks for bearing with me.

The strategy I chose uses a trigger on logging/control table "A", to invoke a procedure that calls a Java method in the DBMS internal JVM, which copies ~600K records from inwork table "B" to empty production table "C", using JDBC Batch, and a single prepared insert statement.

I used a trigger on a logging/control table for these reasons:
1. The customer can continue to use MS Access to manage this data, even though it now resides in Oracle, using ODBC, and can easily insert records into the logging/control table similarly.
2. The logging / control data provides operational records.
3. The result is there is no GUI to write and maintain, and minimal training is required.

I used Java for these reasons:
1. Existing staff can easily maintain complex procedural logic.
2. The same application uses Java elsewhere to URL encode an entered value (using your excellent mutating table solution).
3. To achieve database kernel batch performance.

It works great, but performance is less than expected.

Where have I gone wrong?

Tom Kyte
May 05, 2004 - 7:37 am UTC


you should use a single SQL statement in a plsql based trigger. The only way to copy 600k records would be

insert into c select * from b;

or

insert /*+ append */ into c select * from b;


You'll find java working with database data (any database's data) to be much slower than <sp language of database> working with database data.

You'll find procedural code to be much much slower than SQL.

Oracle JDBC Batch conclusion....

Tom Starr, May 05, 2004 - 5:02 pm UTC

Excellent.

I had added the append hint.

And it will be easy to use plsql as you recommend.

In this case there are actually three related tables with 200K records each, and the intent down the road is complex procedural logic to surgically manipulate individual members within multi-table record sets, rather than this simple 100% replacement of all data in all three tables, and speed will be less an issue.

I do have two questions about using the plsql here...

1. Would you agree I should try BULK COLLECT / FORALL?

2. And I'm puzzled why this excerpt to Lior from the beginning of this thread doesn't make the Java faster...

JDBC ->
single network call with 100 records ->
database kernel

versus

JDBC ->
single network call with 100 record ->
PLSQL array ->
database kernel

Thanks again!!

Tom Kyte
May 05, 2004 - 7:52 pm UTC

1) if and only if a single sql statement cannot be used to accomplish the same

2) not having seen the code, i cannot tell -- but sql is better than plsql is better than java in the database for manipulating data.

JDBC batch v PL/SQL vs single SQL statement

Tom Starr, May 06, 2004 - 3:46 am UTC

If I understand correctly, since JDBC batch and BULK COLLECT / FORALL both have overhead, and since both operate at the statement level, SQL will always be much faster unless enough statements are involved; JDBC batch will then only be faster than PL/SQL when its relatively greater overhead is less than that of the PL/SQL array processing?

I just found JDBC batch detailed in a wonderful old WROX book "Oracle 8i Application Programming with Java, PL/SQL and XML", of which you are a co-author, which doesn't seem to be available anymore.

If they can find one, possibly that title might be helpful to your other readers.

Thanks again. Very much.

Tom Kyte
May 06, 2004 - 7:36 am UTC

(me, i don't think jdbc batch will be faster than plsql unless you are doing something in java that is just impractical in plsql -- maybe a fast fourier transform or something)....



jdbc and array processing

Ryan Gaffuri, August 17, 2004 - 9:21 am UTC

In another thread you discussed setting arraysize in sqlplus to determine how many rows you retrieve per LIO. Is it possible to adjust this setting with JDBC? Our software engineers are doing straight SQL through JDBC.

Tom Kyte
August 17, 2004 - 10:00 am UTC

tell them to look up setRowPrefetch

</code> http://docs.oracle.com/docs/cd/B10501_01/java.920/a96654/oraperf.htm#1056107 <code>

Extremely useful comparison

Tom Starr, March 28, 2008 - 7:51 pm UTC

I added three more cases to Menon's/Edgar's program:
- Non-Oracle manual Update Batching
- PLSQL BULK COLLECT Procedure called twice with batch parameter as LIMIT
- PLSQL BULK COLLECT Function called once with two LIMIT parameters
(last option runs runstats within PLSQL & passes back timing array)

These are the results with 10 & 1000 batch sizes:
1) Original Oracle Update Batching per Menon
* Run1 ran in 151 hsecs
* Run2 ran in 42 hsecs
2) Non-Oracle manual Update Batching
* Run1 ran in 924 hsecs
* Run2 ran in 821 hsecs
3) Original PLSQL per Menon and Edgar Chupit
* Run1 ran in 936 hsecs
* Run2 ran in 85 hsecs
4) PLSQL BULK COLLECT Procedure called twice with LIMIT
* Run1 ran in 70 hsecs
* Run2 ran in 42 hsecs
5) PLSQL BULK COLLECT Function called once with two LIMITs
* Run1 ran in 60 hsecs
* Run2 ran in 29 hsecs
*
* SQL inserts the same 10000 rows in 00:00:00.15 (15 hsec)
* begin
* insert into bctab4 (select * from bctab0);
* commit;
* end;

No surprise SQL is fastest followed by native PLSQL.

But Oracle JDBC Batching appears faster than calling PLSQL over JDBC even though Edgar chunked the array and used FORALL as you suggested: "save up 100 rows -- send them to a procedure and forall BULK insert them".

Questions:
1) Would any other options be worthwhile to compare?
2) Am I right to think there is no way to implement BULK/LIMIT on a passed in array?
3) Could passing in too many rows easily overtax server memory?
4) If so, should one pass the JDBC array into a VARRAY?

The VARRAY option seems difficult but is probably easier than I think.

Thanks again for your brilliant work!!

It is an honor to stand in the shadow of your shoes.

Tom Starr
Tom Kyte
March 31, 2008 - 8:12 am UTC

1) i would have stopped with SQL and native PLSQL myself :)

2) not sure what you mean, the client sending the array "sizes it"... it decides how many to put in there.

3) sure

4) no real difference programming-wise between varray and collection except varray is limited in size. On disk a varray and collection would be very different - but in a program, they are conceptually the same.


How to best pass arrays from Java to PLSQL

Tom Starr, April 02, 2008 - 1:30 am UTC

Right, the Java programmers would decide how big an array to pass.

So my question pertains to enforcing an intentional limit at the Oracle end using a VARRAY.

But I suspect preallocating more space than needed could be equally undesirable.

And I also wonder if arrays are passed from Java to PLSQL by reference.

Thanks again for the thousand mile view.

Tom Starr

Tom Kyte
April 02, 2008 - 7:52 am UTC

arrays (data in general) cannot be passed from a CLIENT outside of the database to inside the database - the client is typically on another machine and ALWAYS in a different address space.

It wouldn't matter

Stewart Bryson, April 02, 2008 - 11:07 am UTC

Tom answered the question as presented... but I think the question is flawed.

I would like to see the justification for using either one of these approaches over some straight SQL. I often face clients who believe they need array processing because they don't think an operation can be achieved in straight SQL. I have seen very, very few bulk load operations that can't be done with a "INSERT /*+ APPEND */ into tab1 select x,y,z from tab2 join tab3 on... where... ".

If the basis of the load program were written in straight SQL, then the overhead of executing it either via JDBC or JDBC then PLSQL would be negligible.
Tom Kyte
April 02, 2008 - 12:04 pm UTC

just read 2 reviews up :)

I said that already...

short attention span

Stewart Bryson, April 02, 2008 - 1:12 pm UTC

I stopped reading where all the Java code was pasted in... let me rephrase... the FIRST time all the Java code was pasted in.

I would never be able to "do" AskTom... forgetting for a moment all the wrong answers I would probably give. I don't have the attention span to sort through everything in your average posting.

I'm not counting posts between yourself and DKB. For that one... I read EVERY word.

SQL vs PLSQL vs Java

Tom Starr, April 03, 2008 - 6:59 pm UTC

Agreed, but you know how Java programmers are, they often prefer to do everything in
Java and use the database as a bucket.

So performance of a Java app doing file loading or record set processing could be
dramatically improved just by upgrading to Oracle and using Oracle JDBC Update Batching
instead of regular Java Batching.

Also interesting how that compares to sending arrays to PLSQL from Java running
in the database JVM as the JDBC examples above.

Maybe I will add the "/*+ APPEND */" hint to the batching examples to see how it
affects their performance.

tws
Tom Kyte
April 03, 2008 - 9:22 pm UTC

...
So performance of a Java app doing file loading or record set processing could be dramatically improved just by upgrading to Oracle and using Oracle JDBC Update Batching instead of regular Java Batching.
...

or by replacing the java programmers that we know how they are with IT professionals.

append only works on insert SELECT - never never never (fortunately) on insert values.

Re: replacing the Java programmers

Stew Ashton, April 04, 2008 - 8:23 am UTC


Since there aren't many IT professionals around, why not just replace the Java programmers with PHP developers? That seems to be a popular option these days.

Then, when the application doesn't access the database right, management can just switch to MySQL ;)

I don't think Java programmers are much less "professional" than other IT folks. It's more that they're purists : everything is OO, everything is Java, all processing must be done in the J2EE middle tier, no calls to proprietary Oracle APIs, and so on. I suspect it would be more accurate to say that those who taught them are like that.

In my shop, we're starting to ramp up PHP development. It will be interesting to see if they use Oracle better than their Java counterparts.
Tom Kyte
April 04, 2008 - 10:32 am UTC

The problem is - java programmers are


PROGRAMMERS

not data processing people. And we need a couple of good old fashioned data processing people that understand transactions, concurrency, locking, how to scale concurrent DATA access, data integrity and so on....

Now I follow you

Stew Ashton, April 05, 2008 - 10:19 am UTC


"good old fashioned data processing people"...

Hey, how do you know how old I am :)

Now that I get the point, I couldn't agree more. I would add that the programmers and "data processing people" can't just stay in their own worlds: someone has to know and show how programs and databases can best communicate and cooperate. I think that's what an "architect" should do, help fit the pieces together. This thread helps with that, so thanks.

Caveats with APPEND hint and direct path

Tom Starr, April 08, 2008 - 6:58 pm UTC

Thanks, I needed to know this:
"append only works on insert SELECT - never never never (fortunately) on insert values".

Would the VALUES clause also contribute to the poor performance of Edgar's FORALL
example above, and be another reason for your original suggestion at the beginning
of this thread to use Oracle Update Batching?

And is '(fortunately)' just because many small direct-path inserts would cause swiss cheese blocks?

Thanks again for the thought provoking answers,

tws



Tom Kyte
April 09, 2008 - 2:47 pm UTC

this is a big page, not sure which bit you were referring to...


insert append always writes above the high water mark, never reuses space. so, insert /*+ append */ into t values - if it worked - would always have one row per block and never reuse any blocks ever. it would make the table always grow and be very inefficient

APPEND hint with respect to original question above

Tom Starr, April 10, 2008 - 1:24 am UTC

Thanks, that's clear.

I just seek confirmation FORALL also will not use direct-path if its insert has a VALUES clause, as in this code from above:
begin
   forall i in aCode.first .. aCode.last
      insert into tt (id,code,val) values (1,aCode(i),aVal(i));
end;

Btw that was a Most Useful presentation about new 11g features at Seattle
Oracle Technology Day today.

...
Tom Kyte
April 10, 2008 - 10:39 am UTC

forall will not use direct path on a insert values, correct - it cannot.

forall just batches up the inputs and sends them over in bulk to the "server" (to the sql engine); the insert statement would be executed "N" times using each of the inputs...

Just less round trips

Context switch / direct path details with SQL/PLSQL/JDBC batching

Tom Starr, April 10, 2008 - 12:30 pm UTC

Can you confirm direct-path is also not used by non-Oracle JDBC Batching,
but is used by Oracle JDBC Update Batching?

And that fewer round trips means fewer PLSQL/SQL context switches,
and that JDBC/PLSQL switches cost significantly more?

Thanks again for the details.


Tom Kyte
April 10, 2008 - 1:13 pm UTC

direct path operations only happen with things like

insert /*+ append */
merge /*+ append */
alter table t move
create table as select
sqlldr direct=yes


direct path operations do not happen with any conventional path operations (eg: there is no direct path "update" - ever)

Performance...

Tom Starr, April 11, 2008 - 2:02 pm UTC

Okay understood, direct-path is not available to Oracle JDBC Batching or
to FORALL with INSERT & VALUES.

So what factors do contribute to Batching and FORALL performance?

I would guess fewer context switches and fewer network packets hence
less cpu and reduced bandwidth and less work to packetize & unpacketize.

I also know OraclePreparedStatements are stored in the shared pool
to avoid hard parses.

And I read in Wrox Oracle 9i Java Programming that soft parses can also be avoided
by declaring the OraclePreparedStatement objects at the class level rather than
in methods so they will be cached at the session level as with PLSQL.

And I would also guess FORALL / INSERT would be faster using bind variables.

But are there other important factors to keep in mind, especially underlying the
excellent performance of FORALL?


Tom Kyte
April 11, 2008 - 2:15 pm UTC

...
So what factors do contribute to Batching and FORALL performance?
....

reduced round trips. Better ability to do index maintenance in some cases (especially with bitmapped indexes)

...
I also know OraclePreparedStatements are stored in the shared pool
to avoid hard parses.
....

all sql is actually - whether you use statement, preparedStatement or callableStatement

prepared statements allow you to use bind variables which promotes reuse of sql in the shared pool - it is not really possible to reuse sql in the shared pool if each sql statement is unique (a statement)
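The shared-pool point can be illustrated with a toy model - not real JDBC, just string counting, with a table name borrowed from the examples above. The pool keys on the SQL text, so literal-laden statements each arrive as a brand-new statement (a hard parse each), while one bound statement text is reused no matter how many rows are processed:

```java
import java.util.HashSet;
import java.util.Set;

public class SharedPoolSketch {
    // Toy model of why binds promote cursor reuse: the shared pool keys on
    // the SQL text. Building SQL with literals yields one distinct text per
    // row (one hard parse each); a bound statement yields a single text.
    public static int distinctSqlTexts(int rows, boolean useBinds) {
        Set<String> texts = new HashSet<>();
        for (int i = 0; i < rows; i++) {
            String sql = useBinds
                ? "insert into tt (id) values (?)"          // one reusable text
                : "insert into tt (id) values (" + i + ")"; // unique text per row
            texts.add(sql);
        }
        return texts.size();
    }
}
```

With binds, 100 inserts share one cursor; without them, each of the 100 statements must be parsed separately - which is the latching cost Tom warns will not scale under concurrency.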


...
And I would also guess FORALL / INSERT would be faster using bind variables.
....

show me how to use forall without using binds!!!! it is for binding an array of values


...
But are there other important factors to keep in mind, especially underlying the excellent performance of FORALL?
....

sure, if you are using forall - ask yourself first why you are not just using SQL - do you really NEED procedural code in the first place.

Performance comparisons

Tom Starr, April 14, 2008 - 3:32 pm UTC

Thanks for the reminder and clarification.

When using Runstats to compare successive batch values as both Java programs do above,
should one execute the prepared statement once initially to avoid an initial hard parse
which would skew the performance data towards the second trial?

And would Oracle's use of bind variables behind the scenes explain why Oracle JDBC
Batching runs up to 25 times faster using oracle.jdbc.driver.OraclePreparedStatement than
JDBC Batching using java.sql.PreparedStatement for the same batch size?

I can post a version of the Java/PLSQL above to compare two batch sizes using either:
- Oracle JDBC Batching (per Menon)
- Oracle JDBC Batching (per Menon but declaring the PreparedStatement in the CLASS)
- non-Oracle JDBC batching
- non-Oracle JDBC batching (declaring the PreparedStatement in the CLASS not in MAIN)
- sending batch size array chunks to PL/SQL which uses FORALL (per Edgar)
- calling PL/SQL which uses FORALL and BULK COLLECT once with each batch/LIMIT size
- calling PL/SQL which uses FORALL and BULK COLLECT once with two LIMITs

It also parameterizes the array size.

It confirms native PL/SQL outperforms Oracle JDBC Batching, and Oracle JDBC Batching
outperforms sending batch sized array chunks to PL/SQL from JDBC, and everything
outperforms non-Oracle JDBC Batching.

The effect of declaring the prepared statements in the CLASS to avoid soft parses seems
small but variable.

Tom Starr
Tom Kyte
April 16, 2008 - 2:08 pm UTC

i would use runstats wrapped around the code of interest - the stuff you are going to do over and over and over....

...
The effect of declaring the prepared statements in the CLASS to avoid soft parses seems small but variable.
...

parsing is something that rears its ugly ugly head in a MULTI-USER situation, you won't see the major downside to excessive parsing without a couple of users banging away - single user tests are not going to show it (short of showing the latching - which indicates "this will NOT scale")

Multi-user testing

Tom Starr, April 17, 2008 - 3:33 pm UTC

If I understand correctly, runstats does wrap the code of interest in
the provided Java above.

My version below is also modified to persist multi-session information, in
addition to the facets I listed above.

It uses a modified runstats that accepts a session id and a test name.

The Java provides separate test cases declaring Prepared statements in CLASS vs
method, and it attempts to invoke their SQL once before starting Runstats.

Possibly I will see those effects as I better understand the output.

I have yet to fathom documentation which says declaring a Prepared Statement
(not initializing it) at the Java CLASS level would avoid an Oracle "Soft Parse".

But I'm in this to learn.

Tom Starr

I hope this is useful to others.

I will paste the modified runstats and bulk_collect packages into a separate post
to stay under the size limit:

/*
* Purpose: Compare Oracle & non-Oracle JDBC Batching to PLSQL array passing with
* FORALL to BULK COLLECT / FORALL
*
* Author: twstarr - Based on original work noted below
* (Not optimized - Not a real Java programmer)
*
* Based on the following:
* http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1210436024540#18176936452010
* Subject: "Bulk operations. jdbc using batch option VS jdbc calling stored proced..."
* Original Java TestBatchUpdates program kindly provided by Menon
* Modified Java TestBatchUpdates based on Menon's program kindly provided by Edgar Chupit
*
* Depends on the following to display output from PL/SQL:
 * Tom Kyte's DbmsOutput.java must exist in the CLASSPATH (same directory in this case):
* http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:45027262935845
*
* Depends on the following to compare performance between successive tests:
* Tom Kyte's PL/SQL PACKAGE RUNSTATS Package must be available in the current schema:
* http://asktom.oracle.com/tkyte/runstats
*
* Depends on the DDL and PL/SQL in the subsequent post below.
*
* To compile in UNIX (for DOS replace $ORACLE_HOME with %ORACLE_HOME% and / with \ and : with ;):
* $ORACLE_HOME/jdk/bin/javac CompareBatchBulkUpdating.java -classpath .:$ORACLE_HOME/jdbc/lib/classes12.zip
*
* To run in UNIX (for DOS replace $ORACLE_HOME with %ORACLE_HOME% and / with \ and : with ;):
* $ORACLE_HOME/jdk/bin/java -classpath .:$ORACLE_HOME/jdbc/lib/classes12.zip CompareBatchBulkUpdating 100000 1 10 1000 2
*
 * Usage:
 * First value is mode
 * Second value is number of records to insert
 * Third and Fourth values are Batch or LIMIT values
 * Fifth value is session identifier
 * Allowable modes are 1-9 as follows:
* Mode 1 is Oracle JDBC Batching (per Menon)
* Mode 2 is Oracle JDBC Batching (per Menon but declaring the PreparedStatement in the CLASS)
* Mode 3 is non-Oracle JDBC batching
* Mode 4 is non-Oracle JDBC batching (declaring the PreparedStatement in the CLASS not in MAIN)
* Mode 5 is sending batch size array chunks to PL/SQL using FORALL (per Edgar)(very slow)
* Mode 6 is calling PL/SQL which uses FORALL and BULK COLLECT from a TABLE with a passed in LIMIT twice
* Mode 7 is calling PL/SQL which uses FORALL and BULK COLLECT from an MVIEW with a passed in LIMIT twice
* Mode 8 is calling PL/SQL which uses FORALL and BULK COLLECT once from a TABLE with two LIMITs
* Mode 9 is calling PL/SQL which uses FORALL and BULK COLLECT once from an MVIEW with two LIMITs
*
*/

import java.sql.*;
import oracle.sql.DATE;
import oracle.sql.ARRAY;
import oracle.sql.NUMBER;
import oracle.sql.ArrayDescriptor;
import oracle.jdbc.driver.*;

public class CompareBatchBulkUpdating
{
public static void main(String[] args) throws SQLException
{
System.out.println( "In main method of CompareBatchBulkUpdating after preload updates");
DriverManager.registerDriver( new OracleDriver());
_checkUsage( args );
_setBatchValues ( args );
OracleConnection connection = _getOracleConnection();
connection.setDefaultRowPrefetch(100); // already an OracleConnection; no cast needed

_dbmsOutput = new DbmsOutput ( connection );
_dbmsOutput.enable ( DBMS_OUTPUT_BUFFER_SIZE );

long updateCount = 0;

// execute the requested mode
switch(OPMODE)
{
case(1):
System.out.print("Case 1: Original Oracle JDBC Update Batching per Menon with OraclePreparedStatement declared in instance (main()) \nComparing batch sizes: ");
System.out.print( UPDATE_BATCH_SIZES_BEING_TESTED[0]);
System.out.print(" and ");
System.out.println( UPDATE_BATCH_SIZES_BEING_TESTED[1]);
System.out.print(" for record count: ");
System.out.println( NUMBER_OF_RECORDS_INSERTED );

_prepareBenchmarkStatements ( connection,
"Case 1: Original Oracle JDBC Update Batching per Menon with OraclePreparedStatement declared in instance (main())");

statementString = "insert into bctab1 values ( ?,?,?,? )";
OraclePreparedStatement opslocal1 = (OraclePreparedStatement) connection.prepareStatement ( statementString );

strDate = "2008-03-31 06:00:00";
someOracleDATE = new oracle.sql.DATE( strDate );

//Get this SQL into the shared pool
opslocal1.setInt(1, 1 );
opslocal1.setDATE(2,someOracleDATE );
opslocal1.setInt(3, 1 );
opslocal1.setInt(4, 1 );
updateCount += opslocal1.executeUpdate();
updateCount += opslocal1.sendBatch();

_benchmarkStatementArray[BENCHMARK_START_INDEX].execute();

// for each batch size being tested
for( int k = 0; k < UPDATE_BATCH_SIZES_BEING_TESTED.length; k++)
{
opslocal1.setExecuteBatch ( UPDATE_BATCH_SIZES_BEING_TESTED[k] );
// benchmark starts
long startTime = System.currentTimeMillis();
updateCount = 0;
for( int i=0; i < NUMBER_OF_RECORDS_INSERTED; i++)
{
opslocal1.setInt(1, i );
opslocal1.setDATE(2,someOracleDATE );
opslocal1.setInt(3, i );
opslocal1.setInt(4, i*i*3 );
updateCount += opslocal1.executeUpdate();
}
updateCount += opslocal1.sendBatch(); // flush any rows still queued (the commit would also flush them)
connection.commit();

long endTime = System.currentTimeMillis();

if( k == 0 ) // set the middle point only in first iteration
_benchmarkStatementArray[BENCHMARK_MIDDLE_INDEX].execute();
// benchmark ends
long timeTaken = (endTime - startTime);
benchmarkTimingResults[k] = timeTaken;
}
_benchmarkStatementArray[BENCHMARK_STOP_INDEX].execute();
opslocal1.close();
break;

case(2):
System.out.print("Case 2: Original Oracle JDBC Update Batching per Menon but with OraclePreparedStatement declared in CLASS \nComparing batch sizes: ");
System.out.print( UPDATE_BATCH_SIZES_BEING_TESTED[0]);
System.out.print(" and ");
System.out.println( UPDATE_BATCH_SIZES_BEING_TESTED[1]);
System.out.print(" for record count: ");
System.out.println( NUMBER_OF_RECORDS_INSERTED );

_prepareBenchmarkStatements ( connection,
"Case 2: Original Oracle JDBC Update Batching per Menon but with OraclePreparedStatement declared in CLASS");

statementString = "insert into bctab2 values ( ?,?,?,? )";
ops = (OraclePreparedStatement) connection.prepareStatement ( statementString );

strDate = "2008-03-31 06:00:00";
someOracleDATE = new oracle.sql.DATE( strDate );

//Get this SQL into the shared pool
ops.setInt(1, 1 );
ops.setDATE(2,someOracleDATE );
ops.setInt(3, 1 );
ops.setInt(4, 1 );
updateCount += ops.executeUpdate();
updateCount += ops.sendBatch();

_benchmarkStatementArray[BENCHMARK_START_INDEX].execute();

// for each batch size being tested
for( int k = 0; k < UPDATE_BATCH_SIZES_BEING_TESTED.length; k++)
{
ops.setExecuteBatch ( UPDATE_BATCH_SIZES_BEING_TESTED[k] );
// benchmark starts
long startTime = System.currentTimeMillis();
updateCount = 0;
for( int i=0; i < NUMBER_OF_RECORDS_INSERTED; i++)
{
ops.setInt(1, i );
ops.setDATE(2,someOracleDATE );
ops.setInt(3, i );
ops.setInt(4, i*i*3 );
updateCount += ops.executeUpdate();
}
updateCount += ops.sendBatch(); // flush any rows still queued (the commit would also flush them)
connection.commit();

long endTime = System.currentTimeMillis();

if( k == 0 ) // set the middle point only in first iteration
_benchmarkStatementArray[BENCHMARK_MIDDLE_INDEX].execute();
// benchmark ends
long timeTaken = (endTime - startTime);
benchmarkTimingResults[k] = timeTaken;
}
_benchmarkStatementArray[BENCHMARK_STOP_INDEX].execute();
ops.close();
break;

case(3):
System.out.print("Case 3: Non-Oracle manual JDBC Update Batching \nComparing batch sizes: ");
System.out.print( UPDATE_BATCH_SIZES_BEING_TESTED[0]);
System.out.print(" and ");
System.out.println( UPDATE_BATCH_SIZES_BEING_TESTED[1]);
System.out.print(" for record count: ");
System.out.println( NUMBER_OF_RECORDS_INSERTED );

_prepareBenchmarkStatements ( connection,
"Case 3: Non-Oracle manual JDBC Update Batching");

statementString = "insert into bctab3 values ( ?,?,?,? )";
ps = (PreparedStatement) connection.prepareStatement ( statementString );

someJavaDate = java.sql.Date.valueOf( "2008-03-31" );

//Get this SQL into the shared pool
ps.setInt(1, 1);
ps.setDate(2, someJavaDate);
ps.setInt(3, 1);
ps.setInt(4, 1);
updateCount += ps.executeUpdate();

_benchmarkStatementArray[BENCHMARK_START_INDEX].execute();

// for each batch size being tested
for( int k = 0; k < UPDATE_BATCH_SIZES_BEING_TESTED.length; k++)
{
// benchmark starts
long startTime = System.currentTimeMillis();
updateCount = 0;
for( int i=0; i < NUMBER_OF_RECORDS_INSERTED; i++)
{
ps.setInt(1, i );
ps.setDate(2, someJavaDate);
ps.setInt(3, i );
ps.setInt(4, i*i*3 );
ps.addBatch(); // queue the row; standard JDBC sends nothing to the database until executeBatch()
if( i > 0 && i%UPDATE_BATCH_SIZES_BEING_TESTED[k] == 0 )
{
updateCount += ps.executeBatch().length;
}
}
updateCount += ps.executeBatch().length; // remaining data if any
connection.commit();

long endTime = System.currentTimeMillis();
if( k == 0 ) // set the middle point only in first iteration
_benchmarkStatementArray[BENCHMARK_MIDDLE_INDEX].execute();
// benchmark ends
long timeTaken = (endTime - startTime);
benchmarkTimingResults[k] = timeTaken;
}
_benchmarkStatementArray[BENCHMARK_STOP_INDEX].execute();
ps.close();
break;

case(4):
System.out.print("Case 4: Non-Oracle manual JDBC Update Batching with PreparedStatement declared in CLASS \nComparing batch sizes: ");
System.out.print( UPDATE_BATCH_SIZES_BEING_TESTED[0]);
System.out.print(" and ");
System.out.println( UPDATE_BATCH_SIZES_BEING_TESTED[1]);
System.out.print(" for record count: ");
System.out.println( NUMBER_OF_RECORDS_INSERTED );

_prepareBenchmarkStatements ( connection,
"Case 4: Non-Oracle manual JDBC Update Batching with PreparedStatement declared in CLASS");

statementString = "insert into bctab4 values ( ?,?,?,? )";
PreparedStatement pslocal = (PreparedStatement) connection.prepareStatement ( statementString );

java.sql.Date someJavaDate = java.sql.Date.valueOf( "2008-03-31" );

//Get this SQL into the shared pool
pslocal.setInt(1, 1);
pslocal.setDate(2, someJavaDate);
pslocal.setInt(3, 1);
pslocal.setInt(4, 1);
updateCount += pslocal.executeUpdate();

_benchmarkStatementArray[BENCHMARK_START_INDEX].execute();

// for each batch size being tested
for( int k = 0; k < UPDATE_BATCH_SIZES_BEING_TESTED.length; k++)
{
// benchmark starts
long startTime = System.currentTimeMillis();
updateCount = 0;
for( int i=0; i < NUMBER_OF_RECORDS_INSERTED; i++)
{
pslocal.setInt(1, i );
pslocal.setDate(2, someJavaDate);
pslocal.setInt(3, i );
pslocal.setInt(4, i*i*3 );
pslocal.addBatch(); // queue the row; standard JDBC sends nothing to the database until executeBatch()
if( i > 0 && i%UPDATE_BATCH_SIZES_BEING_TESTED[k] == 0 )
{
updateCount += pslocal.executeBatch().length;
}
}
updateCount += pslocal.executeBatch().length; // remaining data if any
connection.commit();

long endTime = System.currentTimeMillis();
if( k == 0 ) // set the middle point only in first iteration
_benchmarkStatementArray[BENCHMARK_MIDDLE_INDEX].execute();
// benchmark ends
long timeTaken = (endTime - startTime);
benchmarkTimingResults[k] = timeTaken;
}
_benchmarkStatementArray[BENCHMARK_STOP_INDEX].execute();
pslocal.close();
break;

case(5):
System.out.print("Case 5: Calling PL/SQL using FORALL passing batch size ARRAY chunks (per Edgar) \nComparing sizes: ");
System.out.print( UPDATE_BATCH_SIZES_BEING_TESTED[0]);
System.out.print(" and ");
System.out.println( UPDATE_BATCH_SIZES_BEING_TESTED[1]);
System.out.print(" for record count: ");
System.out.println( NUMBER_OF_RECORDS_INSERTED );

_prepareBenchmarkStatements ( connection,
"Case 5: Calling PL/SQL using FORALL passing batch size ARRAY chunks (per Edgar)");

//Addition: Attempt to preload this SQL into the shared pool
statementString = "INSERT INTO bctab5 (id,code,val) VALUES (?,?,?)";
ops = (OraclePreparedStatement) connection.prepareStatement ( statementString );
ops.setInt(1, 1);
ops.setInt(2, 1);
ops.setInt(3, 1);
updateCount += ops.executeUpdate();
updateCount += ops.sendBatch();

statementString = "begin bulkcollpkg.doInsert(?,?); end;";
ocs = (OracleCallableStatement) connection.prepareCall ( statementString );
desc = ArrayDescriptor.createDescriptor( "TOMS.INTEGERARRAY", connection );

_benchmarkStatementArray[BENCHMARK_START_INDEX].execute();

// for each batch size being tested
for( int k = 0; k < UPDATE_BATCH_SIZES_BEING_TESTED.length; k++)
{
// benchmark starts
long startTime = System.currentTimeMillis();
updateCount = 0;
int[] aCode = new int [ UPDATE_BATCH_SIZES_BEING_TESTED[k] ];
int[] aVal = new int [ UPDATE_BATCH_SIZES_BEING_TESTED[k] ];

for( int i=0; i < NUMBER_OF_RECORDS_INSERTED; i++)
{
if( i%UPDATE_BATCH_SIZES_BEING_TESTED[k] == 0 && i > 0 )
{
ocs.setArray(1,new ARRAY (desc, connection, aCode) );
ocs.setArray(2,new ARRAY (desc, connection, aVal) );
ocs.execute();
// if NUMBER_OF_RECORDS_INSERTED%UPDATE_BATCH_SIZES_BEING_TESTED[k] != 0 then the
// last portion of data will be smaller than UPDATE_BATCH_SIZES_BEING_TESTED[k]
if( (NUMBER_OF_RECORDS_INSERTED-i > 0) && (UPDATE_BATCH_SIZES_BEING_TESTED[k] >
NUMBER_OF_RECORDS_INSERTED-i) )
{
aCode = new int [ NUMBER_OF_RECORDS_INSERTED-i ];
aVal = new int [ NUMBER_OF_RECORDS_INSERTED-i ];
}
}
aCode[(i%UPDATE_BATCH_SIZES_BEING_TESTED[k])] = i;
aVal[(i%UPDATE_BATCH_SIZES_BEING_TESTED[k])] = i*i*3;
}
//setArray(int, java.sql.Array) is declared on java.sql.PreparedStatement and is
//inherited by oracle.jdbc.OraclePreparedStatement and OracleCallableStatement
//(INTERFACE oracle.jdbc.OracleCallableStatement extends both
//oracle.jdbc.OraclePreparedStatement and java.sql.CallableStatement).
//The parameter type is the INTERFACE java.sql.Array, not the CLASS oracle.sql.ARRAY;
//the driver converts the java.sql.Array to a SQL ARRAY value when sending to the database.
ocs.setArray(1,new ARRAY (desc, connection, aCode) );
ocs.setArray(2,new ARRAY (desc, connection, aVal) );
ocs.execute();
connection.commit();

long endTime = System.currentTimeMillis();
if( k == 0 ) // set the middle point only in first iteration
_benchmarkStatementArray[BENCHMARK_MIDDLE_INDEX].execute();
// benchmark ends
long timeTaken = (endTime - startTime);
benchmarkTimingResults[k] = timeTaken;
}
_benchmarkStatementArray[BENCHMARK_STOP_INDEX].execute();
ocs.close();
break;

case(6):
System.out.print("Case 6: Calling PL/SQL which uses FORALL and BULK COLLECT with a passed in batch size LIMIT \nComparing batch sizes: ");
System.out.print( UPDATE_BATCH_SIZES_BEING_TESTED[0]);
System.out.print(" and ");
System.out.println( UPDATE_BATCH_SIZES_BEING_TESTED[1]);
System.out.print(" for record count: ");
System.out.println( NUMBER_OF_RECORDS_INSERTED );

_prepareBenchmarkStatements ( connection,
"Case 6: Calling PL/SQL which uses FORALL and BULK COLLECT with a passed in batch size LIMIT");

statementString = "begin bulkcollpkg.doBulkInsert1(?); end;";
ocs = (OracleCallableStatement) connection.prepareCall ( statementString );

_benchmarkStatementArray[BENCHMARK_START_INDEX].execute();

// for each batch size being tested
for( int k = 0; k < UPDATE_BATCH_SIZES_BEING_TESTED.length; k++)
{
// benchmark starts
long startTime = System.currentTimeMillis();
int aBat = UPDATE_BATCH_SIZES_BEING_TESTED[k];

//oracle.sql.NUMBER is a class; java.sql defines no setNumber(),
//so the Oracle-specific setNUMBER() extension is used here
ocs.setNUMBER(1,new NUMBER(aBat));
ocs.execute();
connection.commit();

long endTime = System.currentTimeMillis();
if( k == 0 ) // set the middle point only in first iteration
_benchmarkStatementArray[BENCHMARK_MIDDLE_INDEX].execute();
// benchmark ends
long timeTaken = (endTime - startTime);
benchmarkTimingResults[k] = timeTaken;
}
_benchmarkStatementArray[BENCHMARK_STOP_INDEX].execute();
ocs.close();
break;
case(7):
System.out.print("Case 7: Calling PL/SQL which uses FORALL and BULK COLLECT from MVIEW with a passed in batch size LIMIT \nComparing batch sizes: ");
System.out.print( UPDATE_BATCH_SIZES_BEING_TESTED[0]);
System.out.print(" and ");
System.out.println( UPDATE_BATCH_SIZES_BEING_TESTED[1]);
System.out.print(" for record count: ");
System.out.println( NUMBER_OF_RECORDS_INSERTED );

_prepareBenchmarkStatements ( connection,
"Case 7: Calling PL/SQL which uses FORALL and BULK COLLECT from MVIEW with a passed in batch size LIMIT");

statementString = "begin bulkcollpkg.doBulkInsert2(?); end;";
ocs = (OracleCallableStatement) connection.prepareCall ( statementString );

_benchmarkStatementArray[BENCHMARK_START_INDEX].execute();

// for each batch size being tested
for( int k = 0; k < UPDATE_BATCH_SIZES_BEING_TESTED.length; k++)
{
// benchmark starts
long startTime = System.currentTimeMillis();
int aBat = UPDATE_BATCH_SIZES_BEING_TESTED[k];
//oracle.sql.NUMBER is a class; java.sql defines no setNumber(),
//so the Oracle-specific setNUMBER() extension is used here
ocs.setNUMBER(1,new NUMBER(aBat));
ocs.execute();
connection.commit();

long endTime = System.currentTimeMillis();
if( k == 0 ) // set the middle point only in first iteration
_benchmarkStatementArray[BENCHMARK_MIDDLE_INDEX].execute();
// benchmark ends
long timeTaken = (endTime - startTime);
benchmarkTimingResults[k] = timeTaken;
}
_benchmarkStatementArray[BENCHMARK_STOP_INDEX].execute();
ocs.close();
break;

case(8):
System.out.print("Case 8: Calling PL/SQL which uses FORALL and BULK COLLECT once with two passed-in LIMITs \nComparing sizes: ");
System.out.print( UPDATE_BATCH_SIZES_BEING_TESTED[0]);
System.out.print(" and ");
System.out.println( UPDATE_BATCH_SIZES_BEING_TESTED[1]);
System.out.print(" for record count: ");
System.out.println( NUMBER_OF_RECORDS_INSERTED );
System.out.println("Note: In this case RUNSTATS is run within the PLSQL rather than invoked via JDBC.");
System.out.println(" For some reason the millisecond values obtained from DBMS_UTILITY.GET_TIME are too long.");

_prepareBenchmarkStatements ( connection,
"Case 8: Calling PL/SQL which uses FORALL and BULK COLLECT once with two passed-in LIMITs");

statementString = "{? = call bulkcollpkg.doBulkInsert3(?,?) }";
ocs = (OracleCallableStatement) connection.prepareCall ( statementString );

ocs.registerOutParameter(1,OracleTypes.ARRAY,"TOMS.INTEGERTIMINGARRAY");
ocs.setNUMBER(2,new NUMBER(UPDATE_BATCH_SIZES_BEING_TESTED[0]));
ocs.setNUMBER(3,new NUMBER(UPDATE_BATCH_SIZES_BEING_TESTED[1]));
ocs.execute();

results = ocs.getARRAY(1);
resultset = results.getResultSet();

if (resultset.next())
benchmarkTimingResults[0] = resultset.getInt(2);
if (resultset.next())
benchmarkTimingResults[1] = resultset.getInt(2);

// benchmark run within PLSQL
ocs.close();
break;

case(9):
System.out.print("Case 9: Calling PL/SQL which uses FORALL and BULK COLLECT from MVIEW once with two passed-in LIMITs \nComparing sizes: ");
System.out.print( UPDATE_BATCH_SIZES_BEING_TESTED[0]);
System.out.print(" and ");
System.out.println( UPDATE_BATCH_SIZES_BEING_TESTED[1]);
System.out.print(" for record count: ");
System.out.println( NUMBER_OF_RECORDS_INSERTED );
System.out.println("Note: In this case RUNSTATS is run within the PLSQL rather than invoked via JDBC.");
System.out.println(" For some reason the millisecond values obtained from DBMS_UTILITY.GET_TIME are too long.");

_prepareBenchmarkStatements ( connection,
"Case 9: Calling PL/SQL which uses FORALL and BULK COLLECT from MVIEW once with two passed-in LIMITs");

statementString = "{? = call bulkcollpkg.doBulkInsert4(?,?) }";
ocs = (OracleCallableStatement) connection.prepareCall ( statementString );

ocs.registerOutParameter(1,OracleTypes.ARRAY,"TOMS.INTEGERTIMINGARRAY");
ocs.setNUMBER(2,new NUMBER(UPDATE_BATCH_SIZES_BEING_TESTED[0]));
ocs.setNUMBER(3,new NUMBER(UPDATE_BATCH_SIZES_BEING_TESTED[1]));
ocs.execute();

results = ocs.getARRAY(1);
resultset = results.getResultSet();

if (resultset.next())
benchmarkTimingResults[0] = resultset.getInt(2);
if (resultset.next())
benchmarkTimingResults[1] = resultset.getInt(2);

// benchmark run within PLSQL
ocs.close();
break;

}

_printBenchmarkResults ();

System.out.println( "----Java Timer based results begin\n" );
for (int i=0; i < UPDATE_BATCH_SIZES_BEING_TESTED.length; i++)
{
System.out.println( "Milliseconds taken for inserting " + NUMBER_OF_RECORDS_INSERTED +
" with a batch size of " + UPDATE_BATCH_SIZES_BEING_TESTED[i] +
" = " + benchmarkTimingResults[i] );
System.out.flush();
}
System.out.println( "----Java Timer based results end\n" );
_closeBenchmarkStatements ( connection );
connection.close();
}

private static void _closeBenchmarkStatements ( OracleConnection connection ) throws SQLException
{
for( int i=0; i < _benchmarkStatementArray.length; i++)
{
_benchmarkStatementArray[i].close();
}
}
private static void _prepareBenchmarkStatements ( OracleConnection connection, String thisTest ) throws SQLException
{
_benchmarkStatementArray[BENCHMARK_START_INDEX]= (OracleCallableStatement) connection.prepareCall( BENCHMARK_START );
_benchmarkStatementArray[BENCHMARK_START_INDEX].setInt(1, sessionId);
_benchmarkStatementArray[BENCHMARK_START_INDEX].setString(2, thisTest);
_benchmarkStatementArray[BENCHMARK_MIDDLE_INDEX]= (OracleCallableStatement) connection.prepareCall( BENCHMARK_MIDDLE );
_benchmarkStatementArray[BENCHMARK_MIDDLE_INDEX].setInt(1, sessionId);
_benchmarkStatementArray[BENCHMARK_MIDDLE_INDEX].setString(2, thisTest);
_benchmarkStatementArray[BENCHMARK_STOP_INDEX]= (OracleCallableStatement) connection.prepareCall( BENCHMARK_STOP );
_benchmarkStatementArray[BENCHMARK_STOP_INDEX].setInt(1, sessionId);
_benchmarkStatementArray[BENCHMARK_STOP_INDEX].setString(2, thisTest);
_benchmarkStatementArray[BENCHMARK_STOP_INDEX].setInt(3, BENCHMARK_DIFFERENCE_THRESHOLD);
}
private static void _checkUsage (String[] ar
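The manual-batching pattern that Cases 3 and 4 are intended to exercise can be illustrated without a database. The StubStatement class below is hypothetical, a stand-in for a PreparedStatement's batch queue, not a real JDBC class; it just shows that rows must be queued with addBatch() before executeBatch() has anything to send:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a PreparedStatement's batch queue, illustrating
// standard (non-Oracle) JDBC batching: addBatch() queues locally, and only
// executeBatch() performs the (simulated) round trip for all queued rows.
public class StubStatement {
    private final List<int[]> queued = new ArrayList<>();
    int sent = 0; // total rows "sent to the database" so far

    void addBatch(int[] row) {
        queued.add(row); // queue the row; no round trip yet
    }

    int executeBatch() {
        int n = queued.size(); // one round trip flushes every queued row
        sent += n;
        queued.clear();
        return n;
    }

    public static void main(String[] args) {
        StubStatement ps = new StubStatement();
        int batchSize = 10, rows = 25, flushed = 0;
        for (int i = 0; i < rows; i++) {
            ps.addBatch(new int[]{ i, i * i * 3 });
            if ((i + 1) % batchSize == 0)
                flushed += ps.executeBatch(); // flush each full batch
        }
        flushed += ps.executeBatch(); // short final batch (5 rows here)
        System.out.println("flushed=" + flushed); // prints flushed=25
    }
}
```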

DDL and PLSQL packages for above Java

Tom Starr, April 17, 2008 - 3:34 pm UTC

-- THE FOLLOWING ORACLE OBJECTS AND PLSQL PACKAGE MUST EXIST IN THE SCHEMA USED FOR THE CONNECTION:
--
-- Modified version of Oracle package runstats_pkg by Tom Kyte
-- http://asktom.oracle.com/tkyte/runstats
--
-- Requires:
-- select on SYS.V_$STATNAME
-- select on SYS.V_$MYSTAT
-- select on SYS.V_$LATCH
--
-- Modifications:
-- Accepts a session id and test name and adds those parameters to inserts and output
-- Replaces the global temporary table to provide visibility from other sessions
--
-- create table run_stats_cum
-- ( session_id int,
-- run_time varchar2(20),
-- test_name varchar2(256),
-- runid varchar2(15),
-- name varchar2(80),
-- value int );
--
-- create or replace view stats
-- as select 'STAT...' || a.name name, b.value
-- from v$statname a, v$mystat b
-- where a.statistic# = b.statistic#
-- union all
-- select 'LATCH.' || name, gets
-- from v$latch;
--

create or replace package runstats_pkg_cum
as
procedure rs_start(session_id pls_integer, this_test varchar2);
procedure rs_middle(session_id pls_integer, this_test varchar2);
procedure rs_stop(session_id pls_integer, this_test varchar2, p_difference_threshold in number default 0 );
end runstats_pkg_cum;
/

create or replace package body runstats_pkg_cum
as

g_start number;
g_run1 number;
g_run2 number;

procedure rs_start(session_id pls_integer, this_test varchar2)
is
begin
insert into run_stats_cum
(session_id, run_time, test_name, runid, name, value)
(select session_id, TO_CHAR(SYSDATE, 'YYYY:MM:DD:HH:MI:SS'), this_test, 'before', stats.* from stats);
g_start := dbms_utility.get_time;
end rs_start;

procedure rs_middle(session_id pls_integer, this_test varchar2)
is
begin
g_run1 := (dbms_utility.get_time-g_start);
dbms_output.put_line('Parameter this_test in rs_middle: '|| this_test);
insert into run_stats_cum
(session_id, run_time, test_name, runid, name, value)
(select session_id, TO_CHAR(SYSDATE, 'YYYY:MM:DD:HH:MI:SS'), this_test, 'after 1', stats.* from stats);
g_start := dbms_utility.get_time;
end rs_middle;

procedure rs_stop(session_id pls_integer, this_test varchar2, p_difference_threshold in number default 0)
is
begin
g_run2 := (dbms_utility.get_time-g_start);

dbms_output.put_line('Parameter this_test: '|| this_test);
dbms_output.put_line('Parameter p_difference_threshold: '|| p_difference_threshold);
dbms_output.put_line
( 'Test: ' || this_test || ' Run1 ran in ' || g_run1 || ' hsecs' );
dbms_output.put_line
( 'Test: ' || this_test || ' Run2 ran in ' || g_run2 || ' hsecs' );
dbms_output.put_line
( 'Test: ' || this_test || ' run 1 ran in ' || round(g_run1/g_run2*100,2) ||
'% of the time' );
dbms_output.put_line( chr(9) );

insert into run_stats_cum
(session_id, run_time, test_name, runid, name, value)
(select session_id, TO_CHAR(SYSDATE, 'YYYY:MM:DD:HH:MI:SS'), this_test, 'after 2', stats.* from stats);

dbms_output.put_line
( rpad( 'Name', 30 ) || lpad( 'Run1', 10 ) ||
lpad( 'Run2', 10 ) || lpad( 'Diff', 10 ) );

for x in
( select rpad( a.name, 30 ) ||
to_char( b.value-a.value, '9,999,999' ) ||
to_char( c.value-b.value, '9,999,999' ) ||
to_char( ( (c.value-b.value)-(b.value-a.value)), '9,999,999' ) data
from run_stats_cum a, run_stats_cum b, run_stats_cum c
where a.name = b.name
and b.name = c.name
and a.runid = 'before'
and b.runid = 'after 1'
and c.runid = 'after 2'
and (c.value-a.value) > 0
and abs( (c.value-b.value) - (b.value-a.value) )
> p_difference_threshold
and a.test_name = this_test
and b.test_name = this_test
and c.test_name = this_test
order by abs( (c.value-b.value)-(b.value-a.value))
) loop
dbms_output.put_line( x.data );
end loop;

dbms_output.put_line( chr(9) );
dbms_output.put_line
( 'Test: ' || this_test || ' Run1 latches total versus runs -- difference and pct' );
dbms_output.put_line
( lpad( 'Run1', 10 ) || lpad( 'Run2', 10 ) ||
lpad( 'Diff', 10 ) || lpad( 'Pct', 8 ) );

for x in
( select to_char( run1, '9,999,999' ) ||
to_char( run2, '9,999,999' ) ||
to_char( diff, '9,999,999' ) ||
to_char( round( run1/run2*100,2 ), '999.99' ) || '%' data
from ( select sum(b.value-a.value) run1, sum(c.value-b.value) run2,
sum( (c.value-b.value)-(b.value-a.value)) diff
from run_stats_cum a, run_stats_cum b, run_stats_cum c
where a.name = b.name
and b.name = c.name
and a.runid = 'before'
and b.runid = 'after 1'
and c.runid = 'after 2'
and a.name like 'LATCH%'
and a.test_name = this_test
and b.test_name = this_test
and c.test_name = this_test

)
) loop
dbms_output.put_line( x.data );
end loop;
end rs_stop;

end runstats_pkg_cum;
/

--The following Oracle objects are required:
CREATE TABLE bctab0 AS SELECT object_id id, to_timestamp(created) time, object_id code, data_object_id val from all_objects WHERE ROWNUM < 10001;
CREATE MATERIALIZED VIEW bcmview0 AS SELECT object_id id, to_timestamp(created) time, object_id code, data_object_id val from all_objects WHERE ROWNUM < 10001;
CREATE TABLE bctab1 (id INTEGER, time TIMESTAMP DEFAULT SYSTIMESTAMP, code INTEGER, val INTEGER);
CREATE TABLE bctab2 (id INTEGER, time TIMESTAMP DEFAULT SYSTIMESTAMP, code INTEGER, val INTEGER);
CREATE TABLE bctab3 (id INTEGER, time TIMESTAMP DEFAULT SYSTIMESTAMP, code INTEGER, val INTEGER);
CREATE TABLE bctab4 (id INTEGER, time TIMESTAMP DEFAULT SYSTIMESTAMP, code INTEGER, val INTEGER);
CREATE TABLE bctab5 (id INTEGER, time TIMESTAMP DEFAULT SYSTIMESTAMP, code INTEGER, val INTEGER);
CREATE TABLE bctab6 (id INTEGER, time TIMESTAMP DEFAULT SYSTIMESTAMP, code INTEGER, val INTEGER);
CREATE TABLE bctab7 (id INTEGER, time TIMESTAMP DEFAULT SYSTIMESTAMP, code INTEGER, val INTEGER);
CREATE TABLE bctab8 (id INTEGER, time TIMESTAMP DEFAULT SYSTIMESTAMP, code INTEGER, val INTEGER);
CREATE TABLE bctab9 (id INTEGER, time TIMESTAMP DEFAULT SYSTIMESTAMP, code INTEGER, val INTEGER);
CREATE TABLE bctabA (id INTEGER, time TIMESTAMP DEFAULT SYSTIMESTAMP, code INTEGER, val INTEGER);
CREATE OR REPLACE TYPE integertimingarray IS TABLE OF INTEGER;
/
CREATE OR REPLACE TYPE integerarray IS TABLE OF INTEGER;
/

CREATE OR REPLACE PACKAGE bulkcollpkg IS
PROCEDURE doInsert( aCode IN INTEGERARRAY, aVal IN INTEGERARRAY );
PROCEDURE doBulkInsert1( batchSize IN INTEGER );
PROCEDURE doBulkInsert2( batchSize IN INTEGER );
FUNCTION doBulkInsert3( sessionId IN INTEGER, batchSize1 IN INTEGER, batchSize2 IN INTEGER ) RETURN integertimingarray ;
FUNCTION doBulkInsert4( sessionId IN INTEGER, batchSize1 IN INTEGER, batchSize2 IN INTEGER ) RETURN integertimingarray ;
END bulkcollpkg;
/
CREATE OR REPLACE PACKAGE BODY bulkcollpkg IS
--Insert Two Arrays
PROCEDURE doInsert( aCode IN INTEGERARRAY, aVal IN INTEGERARRAY ) IS
BEGIN
FORALL i IN aCode.FIRST .. aCode.LAST
INSERT INTO bctab5 (id,code,val) VALUES (1,aCode(i),aVal(i));
COMMIT;
END doInsert;
--Insert using BULK COLLECT and FORALL with CURSOR
PROCEDURE doBulkInsert1( batchSize IN INTEGER ) IS
--Note: Caller sets batchSize
CURSOR bc_cur IS SELECT id, time, val, code FROM bctab0;
TYPE source_rows_array IS TABLE OF bctab0%ROWTYPE;
bc_cur_array source_rows_array;
BEGIN
OPEN bc_cur;
LOOP
FETCH bc_cur BULK COLLECT INTO bc_cur_array LIMIT batchSize;
FORALL i IN 1..bc_cur_array.COUNT
INSERT INTO bctab6 VALUES bc_cur_array(i);
EXIT WHEN bc_cur%NOTFOUND;
END LOOP;
CLOSE bc_cur;
COMMIT;
END doBulkInsert1;
--Insert using BULK COLLECT and FORALL with CURSOR
PROCEDURE doBulkInsert2( batchSize IN INTEGER ) IS
--Note: Caller sets batchSize
CURSOR bc_cur IS SELECT id, time, val, code FROM bcmview0;
TYPE source_rows_array IS TABLE OF bcmview0%ROWTYPE;
bc_cur_array source_rows_array;
BEGIN
OPEN bc_cur;
LOOP
FETCH bc_cur BULK COLLECT INTO bc_cur_array LIMIT batchSize;
FORALL i IN 1..bc_cur_array.COUNT
INSERT INTO bctab7 VALUES bc_cur_array(i);
EXIT WHEN bc_cur%NOTFOUND;
END LOOP;
CLOSE bc_cur;
COMMIT;
END doBulkInsert2;
--Insert using BULK COLLECT and FORALL with CURSOR called once with RUNSTATS in PLSQL
FUNCTION doBulkInsert3( sessionId IN INTEGER, batchSize1 IN INTEGER, batchSize2 IN INTEGER) RETURN integertimingarray IS
--Note: Caller sets batchSize1 and batchSize2
CURSOR bc_cur IS SELECT id, time, val, code FROM bctab0;
TYPE source_rows_array IS TABLE OF bctab0%ROWTYPE;
bc_cur_array source_rows_array;
time_start NUMBER;
time_middle NUMBER;
time_end NUMBER;
elapsed_run1 NUMBER;
elapsed_run2 NUMBER;
timing_array integertimingarray;
BEGIN
time_start := dbms_utility.get_time;
runStats_pkg_cum.rs_start(sessionId, 'Insert using BULK COLLECT and FORALL with CURSOR called once with RUNSTATS in PLSQL');
OPEN bc_cur;
LOOP
FETCH bc_cur BULK COLLECT INTO bc_cur_array LIMIT batchSize1;
FORALL i IN 1..bc_cur_array.COUNT
INSERT INTO bctab8 VALUES bc_cur_array(i);
EXIT WHEN bc_cur%NOTFOUND;
END LOOP;
CLOSE bc_cur;
time_middle := dbms_utility.get_time;
runStats_pkg_cum.rs_middle(sessionId, 'Insert using BULK COLLECT and FORALL with CURSOR called once with RUNSTATS in PLSQL');
OPEN bc_cur;
LOOP
FETCH bc_cur BULK COLLECT INTO bc_cur_array LIMIT batchSize2;
FORALL i IN 1..bc_cur_array.COUNT
INSERT INTO bctab8 VALUES bc_cur_array(i);
EXIT WHEN bc_cur%NOTFOUND;
END LOOP;
CLOSE bc_cur;
runStats_pkg_cum.rs_stop(sessionId, 'Insert using BULK COLLECT and FORALL with CURSOR called once with RUNSTATS in PLSQL',200);
COMMIT;
time_end := dbms_utility.get_time;
elapsed_run1 := 10 * (time_middle - time_start);
elapsed_run2 := 10 * (time_end - time_middle);
timing_array := integertimingarray(elapsed_run1, elapsed_run2);
RETURN timing_array;
END doBulkInsert3;
--Insert using BULK COLLECT and FORALL with CURSOR called once with RUNSTATS in PLSQL
FUNCTION doBulkInsert4( sessionId IN INTEGER, batchSize1 IN INTEGER, batchSize2 IN INTEGER) RETURN integertimingarray IS
--Note: Caller sets batchSize1 and batchSize2
CURSOR bc_cur IS SELECT id, time, val, code FROM bcmview0;
TYPE source_rows_array IS TABLE OF bcmview0%ROWTYPE;
bc_cur_array source_rows_array;
time_start NUMBER;
time_middle NUMBER;
time_end NUMBER;
elapsed_run1 NUMBER;
elapsed_run2 NUMBER;
timing_array integertimingarray;
BEGIN
time_start := dbms_utility.get_time;
runStats_pkg_cum.rs_start(sessionId, 'Insert using BULK COLLECT and FORALL with CURSOR called once with RUNSTATS in PLSQL');
OPEN bc_cur;
LOOP
FETCH bc_cur BULK COLLECT INTO bc_cur_array LIMIT batchSize1;
FORALL i IN 1..bc_cur_array.COUNT
INSERT INTO bctab9 VALUES bc_cur_array(i);
EXIT WHEN bc_cur%NOTFOUND;
END LOOP;
CLOSE bc_cur;
time_middle := dbms_utility.get_time;
runStats_pkg_cum.rs_middle(sessionId, 'Insert using BULK COLLECT and FORALL with CURSOR called once with RUNSTATS in PLSQL');
OPEN bc_cur;
LOOP
FETCH bc_cur BULK COLLECT INTO bc_cur_array LIMIT batchSize2;
FORALL i IN 1..bc_cur_array.COUNT
INSERT INTO bctab9 VALUES bc_cur_array(i);
EXIT WHEN bc_cur%NOTFOUND;
END LOOP;
CLOSE bc_cur;
runStats_pkg_cum.rs_stop(sessionId, 'Insert using BULK COLLECT and FORALL with CURSOR called once with RUNSTATS in PLSQL',200);
COMMIT;
time_end := dbms_utility.get_time;
elapsed_run1 := 10 * (time_middle - time_start);
elapsed_run2 := 10 * (time_end - time_middle);
timing_array := integertimingarray(elapsed_run1, elapsed_run2);
RETURN timing_array;
END doBulkInsert4;
END bulkcollpkg;
/


Errata in modified runstats above

Tom Starr, April 18, 2008 - 10:35 am UTC

I need to fix the multi-session version of runstats above to also accept a session
identifier as well as the test identifier, and alter the Java and PL/SQL calls
accordingly to pass it in.

As it is now, the results in run_stats_cum are indistinguishable between instances
of the same test.

Feedback gratefully appreciated.

tws

Never mind previous errata

Tom Starr, April 18, 2008 - 11:36 am UTC

Apologies, the code above does seem correct.

Questions:
- Whether the attempted shared pool preload should work (or be necessary).
- Whether comparing BULK COLLECT from a Materialized View with populating Prepared
Statements may significantly affect results.
- Why the timings between successive identical tests on a dedicated 1GHz PIII with
1.5G RAM running Oracle Linux v4 vary so much (Oracle background processes?).
- Any other mistaken assumptions or errors.

Assumptions

Tom Starr, April 21, 2008 - 12:41 pm UTC

I should be clear that the 9 tests above are not intended to be functionally equivalent,
only to compare the performance of a few PL/SQL strategies and non-Oracle batching using
the same record count.

It would be more analogous to pass the whole array with a LIMIT, but I'm concerned
about the effect on the SGA of passing Arrays of unconstrained size from external
Java, or even from the internal JVM.
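The slice arithmetic the array-chunking test (Case 5) relies on -- full batch-size chunks plus one short tail -- can be checked standalone, independent of any database. ChunkMath is a hypothetical helper name, a sketch only:

```java
// Sketch of the chunking arithmetic: for n records sent in slices of size b,
// how many slices are needed and how large is the last one?
public class ChunkMath {
    // returns { numberOfChunks, sizeOfLastChunk }
    static int[] chunks(int n, int b) {
        int full = n / b;  // number of complete batch-size slices
        int tail = n % b;  // leftover rows in the short final slice
        return tail == 0 ? new int[]{ full, b } : new int[]{ full + 1, tail };
    }

    public static void main(String[] args) {
        int[] r = chunks(100000, 1000);
        System.out.println(r[0] + " chunks, last chunk " + r[1]); // 100 chunks, last chunk 1000
        r = chunks(100000, 999);
        System.out.println(r[0] + " chunks, last chunk " + r[1]); // 101 chunks, last chunk 100
    }
}
```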


Rob, May 28, 2008 - 10:27 pm UTC

Tom Starr,

Your Java code is truncated. Would you mind posting the rest of it?

Looking for the same discussion taking concurrency into consideration

Russ Chamberlain, August 29, 2008 - 12:42 am UTC

Hi Tom,

This thread is extremely informative, but I have an unanswered question I believe is very closely related: Do the results above show the same trends under concurrent load?

I fear I may be changing the topic too much by removing the BATCH-ness of it, but I have a customer who is demanding massive, _concurrent_ inserts through our product. Note that the Oracle system, including tables and columns, is _completely_ under the customer's control.

We have performed very simple performance testing on a table with a single integer column, as follows:

1000 batches of 100 rows, 1 client
Type Time (s)
----------------------- -------
direct insert statement 8
stored procedure 43
varray stored proc 11


1000 batches of 100 rows, 10 clients
Type Min, Max, Avg (s)
----------------------- ---- ---- ------
direct insert statement 104, 140, 131
stored procedure 452, 541, 511
varray stored proc 114, 150, 141

NOTES
- The customer needs stored procedures because they sometimes (rarely) want to do something other than an insert, but most of the time, and for most of the operations we process, they want to do inserts -- ie. they want to apply non-trivial business logic.
- The stored procedure technique is using wide operations to a stored procedure that is effectively invoked once per row
- The varray technique is using FORALL (ie. procedure + varray + FORALL)

As expected, concurrency adds contention that reduces throughput somewhat, but our simple results clearly show that using varray with FORALL in a stored procedure is significantly closer to direct insert performance than the simple stored-procedure approach. This is encouraging.

Our customer is asking us to do _significant_ work to change our code to do this, but I need to be sure it is worth the effort. As noted in this discussion thread, it is terribly important to do specific, representative testing on the system in question. Unfortunately, the customer doesn't have the resources to do this testing in advance, so they're asking us to make the big change and send them something to play with.

To me it isn't very surprising that we see good performance with the varray + FORALL technique, because it takes a more efficient path through Oracle. However, I'm not an Oracle expert nor a DBA. Our product uses OCI to execute _customer_ scripts concurrently, so they're the Oracle experts in our deployments. Therefore I am compelled to ask the following question:

Can we always expect procedure + varray + FORALL to perform significantly faster than procedure + wide calls under concurrent load, or are there any gotchas?

I understand that testing and tuning are required, of course, but I need to know the approach is fundamentally sound for concurrent use.

Thanks,

- Russ
Tom Kyte
August 29, 2008 - 11:22 pm UTC

well, the stored procedure could do array processing itself.


...
- The stored procedure technique is using wide operations to a stored procedure
that is effectively invoked once per row
.....

this does not compute, that was a decision made at design time, not a feature of stored procedures.


why could the stored procedures not do the processing itself, save up arrays and bulk insert/update/whatever them.


direct pathing and concurrent are oxymoronic. One session at a time can direct path into a given segment.

running a batch file through pl/sql procedure

krishna, September 25, 2008 - 3:01 am UTC

Hi tom,

I want to run a batch file through pl/sql procedure.
please help me on this.
Tom Kyte
September 25, 2008 - 3:15 pm UTC

and what is your own definition of a "batch file"? I'm not aware of the concept.

calling a .bat file from pl/sql procedure

krishna, September 26, 2008 - 12:25 am UTC

Tom,

I want to call a .bat file from a pl/sql procedure.
please tell me the command to call a .bat file.
Tom Kyte
September 26, 2008 - 1:13 pm UTC

there is no "command", but depending on release - there are ways to do it.

9i and before, java stored procedure.
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:952229840241#2717585194628

10g and later, java stored procedure OR dbms_scheduler



Is direct path loading available from Java?

Oleksandr Alesinskyy, February 26, 2009 - 5:40 am UTC

Hi Tom,

it is a new question but it is a tightly related to the contents of this thread, so I hope it is Ok to ask it here.

Oracle contains direct path loading engine accessible from C/C++ code via OCI (with its own set of handle types, functions etc.).

Is there a way to access the same engine from Java without starting external process or resorting to JNI (Java Native Interface)?

Thanks in advance,
Oleksandr
Tom Kyte
March 03, 2009 - 7:37 am UTC

using collections it is always possible:


insert /*+ append */ into t select * from table( cast( ? as yourtype ) );



Oleksandr Alesinskyy, March 03, 2009 - 10:46 am UTC

I don't completely understand how your answer may be applied to the loading of a file (or programmatically generated data) from a client. To cast "?" as MyType, I first have to transfer this "?" to the server.
Tom Kyte
March 03, 2009 - 9:00 pm UTC

? is the typical bind placeholder in a jdbc client - isn't it?


bind a collection.

A reader, March 04, 2009 - 6:39 am UTC

Oh, how stupid of me.

Single insert

Slavko Brkic, April 09, 2009 - 9:07 am UTC

Have a question regarding single inserts from Java.
We are doing a transaction containing a set of operations: 3 inserts and 3 updates (one transaction is fast). All of them are single-row operations. Each of them is a direct SQL call to the database from the Java client.
This needs to be done more than a million times a day.
We are trying to improve performance for this and tried to put it all in a PL/SQL procedure, with all the necessary data as IN parameters, and then calling this procedure from Java.
We ended up with slower performance (about 2 times slower). I think it's because we are calling direct SQL and don't need the PL/SQL engine. Is this correct (something you would expect as well) or do you think we are doing something wrong?

BR
Tom Kyte
April 13, 2009 - 3:38 pm UTC

I think your test was erroneous somehow, plsql would be faster - sending all of the data into the stored procedure and having it do the six operations as a single call would be more efficient than six round trips from the client.

give us a test framework, you can write a java client to do six modifications, and another to call a stored procedure to do the same. Provide the code so we can see how you interacted with the database and so we can reproduce.

I think your test was flawed in some fashion, we just need to see your code to tell you where.

and remember, this example is TINY, the code you supply should be TINY. Do not post your production application with 10,000 lines of code, this should fit on the back of a cocktail napkin (it is small)
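For what it's worth, a scaffold for the kind of tiny comparison Tom is asking for might look like the sketch below. The two Runnables are placeholders where the six single-row JDBC statements and the single stored-procedure call would go; connection handling and SQL are omitted, so everything beyond the timing loop is an assumption:

```java
public class TinyBench {
    // Run 'work' for 'iterations' rounds and return elapsed wall-clock milliseconds.
    public static long timeMillis(int iterations, Runnable work) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            work.run();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // In the real test each Runnable would hit the same database:
        // one issuing the six single-row statements (six round trips),
        // the other making one CallableStatement invocation of the procedure.
        long sixCalls = timeMillis(1_000, () -> { /* six JDBC round trips */ });
        long oneCall  = timeMillis(1_000, () -> { /* one procedure call  */ });
        System.out.println(sixCalls + " ms vs " + oneCall + " ms");
    }
}
```

Run both variants against the same connection, several times each, and discard the first run so parsing and cache warm-up don't skew the numbers.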

sudhakarReddy, February 09, 2010 - 11:12 am UTC


we are using JDBC Batchupdate for a batch file. So is there any limit of this file storing the data into the database? One batch file - how much data is storing?
Tom Kyte
February 15, 2010 - 2:21 pm UTC

"this file storing" - what is "file storing"

what is a 'batch file'?


JDBC batching allows the client to send N rows, where N is greater than or equal to one (more than one typically), at a time to the server, it has nothing to do with "files". Not sure what you mean.

You would be limited by memory on the client and common sense.

It would not make sense to batch up 1,000,000 rows - that would not be a smart thing to do. It would make sense to batch up some reasonable number of rows, say between 100 and 1,000, at a time.
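The arithmetic behind that advice is just round trips. The sketch below shows the standard JDBC addBatch/executeBatch pattern with a flush every batchSize rows; the table and column names are hypothetical, and the roundTrips helper is mine, not part of JDBC:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class BatchInsertSketch {
    // Network round trips needed to send 'rows' rows at a given batch size
    // (ceiling division): 1,000,000 rows batched 100 at a time means
    // 10,000 round trips instead of 1,000,000.
    public static int roundTrips(int rows, int batchSize) {
        return (rows + batchSize - 1) / batchSize;
    }

    // Sketch of the batching pattern; 't' and 'x' are hypothetical names.
    public static void insertBatched(Connection conn, List<Integer> values, int batchSize)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement("insert into t (x) values (?)")) {
            int pending = 0;
            for (int v : values) {
                ps.setInt(1, v);
                ps.addBatch();
                if (++pending == batchSize) {   // flush every 'batchSize' rows
                    ps.executeBatch();
                    pending = 0;
                }
            }
            if (pending > 0) {
                ps.executeBatch();              // flush the final partial batch
            }
        }
    }
}
```

Raising batchSize past a few hundred mostly stops helping: the round-trip count barely drops further, while client memory per batch keeps growing.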

setExecuteBatch in Java ArrayBindCount

George Joseph, January 16, 2012 - 10:06 pm UTC

Hi Tom

With regards to the setExecuteBatch command in JDBC, is this command the same as the LIMIT value that is used in PL/SQL BULK COLLECT?
And is the logical equivalent in .NET, ArrayBindCount, the same as the LIMIT clause in PL/SQL?

Also in your previous review you mentioned that the value of batch size N is limited by the memory on the client. Can u explain a bit further on client memory (do u mean the memory associated at the application server side by the process that performs the batch update?)
Tom Kyte
January 17, 2012 - 3:23 pm UTC

it is similar to the LIMIT clause in plsql - but LIMIT works with SELECT/FETCH whereas the execute batch is typically insert/update/delete - so it is more like "forall i in 1 ..N" really.


I don't do .nyet - but it sounds like it is.


Can u explain

second time in two questions this U person has been asked for. I can assure you I am the only one that logs in to answer here - sorry, U is no one I know....


Also in your previous review you mentioned that the value of batch size N is limited by the memory on the client.


if you try to array fetch 1,000,000 rows and you haven't memory sufficient to hold 1,000,000 rows - you will "crash".

If you are using an application server, the client is the application server, the client of the database.

RE:

George Joseph, January 17, 2012 - 9:36 pm UTC

Thanks Tom for the response.
Regarding the ArrayBindCount in ODP.NET, I couldn't find information about the optimal value to be used.
Even the Oracle documentation doesn't explicitly say what the usual values are.

I found a link
http://dotnetslackers.com/articles/ado_net/BulkOperationsUsingOracleDataProviderForNETODPNET.aspx
where they say...
"To use Array Binding, you simply set OracleCommand.ArrayBindCount to the number of records to be inserted, and pass arrays of values as parameters instead of single values"
It didn't look right to me that they didn't mention the memory usage at the client side, and I wanted to confirm my understanding.

my sincere apologies about the u :)

Tom Kyte
January 18, 2012 - 7:12 am UTC

if there were an optimal value for any setting, there would be no way to make the setting different as all you would be doing is setting it to a non-optimal value... (why would we let you change it if it was "optimal" :) )

typically, 100 is a nice value for array operations - typically between 100 and 500 (with larger values, greater than 500, only really valid for bulk binds of an insert in a data warehouse type of load)


Not everything will mention everything. Some things sometimes have to be intuited.


In your case, I cannot imagine a human being updating more than 100 records - or dealing with a screen that has 100 records really (seems you would display maybe 25 at a time and let them page up and down wouldn't you...) so it should not be an issue at all - you won't have that many records ever to update since you are relying on a human being to give you the changes....
