Ron, March 21, 2002 - 7:56 am UTC
Tom
I have your book. I'm working with a Java application that
dynamically generates SQL against views using Statements
rather than PreparedStatements, as in the example below.
I've used the cursor_sharing=force parameter to try and get some cursor sharing benefits.
SELECT
glnk_link_id_n,
glnk_description_c,
gloc_from_location_id_n,
gloc_from_location_c,
gloc_to_location_id_n,
gloc_to_location_c,
glin_line_full_name_c
FROM
vlnk_link_vw
WHERE
gloc_from_location_id_n = :"SYS_B_0" AND
gloc_to_location_id_n = :"SYS_B_1"
ORDER BY glin_line_full_name_c
Looking at the parse-to-execute ratio for the above,
it is 120 to 120, or 1:1. No benefit from cursor_sharing=force? Is this a feature of views, or of Java Statements being dynamically generated? Am I wasting my time with cursor_sharing=force in this type of environment?
March 21, 2002 - 4:10 pm UTC
That is because you are parsing the statement 120 times (I point this out in my book as well -- I DON'T LIKE cursor sharing. It only gets you PART WAY there and is NOT the end solution, just a temporary bandaid until you write your code properly -- using bind variables).
At least 119 of the 120 are soft parses (not as bad as a hard parse but not NEARLY as good as NO parse). Use tkprof and look at the misses in library cache or use statspack to see the soft parse ratio (covered in chapter 10)
Are you wasting your time? Yes -- if you could be using this time to fix your code to use a prepared statement that you parse ONCE per session and reuse over and over again.
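In JDBC terms, the "parse once per session, reuse" advice might look like the following sketch (a hypothetical class, trimmed to two of the poster's columns; the pattern, not the exact code, is the point):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Sketch: prepare the statement once, keep it for the life of the
// session, and only re-bind between executions. The same SQL text every
// time means one hard parse ever, and at most soft parses afterwards.
public class LinkQuery {
    public static final String SQL =
        "SELECT glnk_link_id_n, glnk_description_c " +
        "FROM vlnk_link_vw " +
        "WHERE gloc_from_location_id_n = ? AND gloc_to_location_id_n = ? " +
        "ORDER BY glin_line_full_name_c";

    private final PreparedStatement ps;  // held open for the session

    public LinkQuery(Connection con) throws Exception {
        ps = con.prepareStatement(SQL);  // the one and only parse call
    }

    public ResultSet fetch(int fromId, int toId) throws Exception {
        ps.setInt(1, fromId);            // only the binds change
        ps.setInt(2, toId);
        return ps.executeQuery();        // execute without re-parsing
    }
}
```

With this shape, tkprof for 120 lookups should show 1 parse and 120 executes instead of 120 and 120.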
Parsing+Prepared Statement
Ashok, August 01, 2003 - 11:19 am UTC
I am using a prepared statement to insert multiple rows (39,731 rows). The way I am doing it is like this:
Parse
loop
bind
execute
end loop
But when I looked into TKPROF I got 4598 parses and 4598 executes. I was expecting one parse and multiple executes. I am assuming the majority of the parses are soft parses. I assumed that if I keep my cursor open (i.e., prepare the statement only once) Oracle will execute without going for a soft parse, or with a softer soft parse.
TKPROF Output:
INSERT INTO tr_ap_ishare_curr
values
(:1,:2,:3,:4)
call count cpu elapsed rows
---- ------ -------- ---------- ---------
Parse 4598 1.29 1.25 0
Execute 4598 2.25 5.33 39731
Fetch 0 0.00 0.00 0
------- ------ -------- ---------- ----------
total 9196 3.54 6.58 39731
Misses in library cache during parse: 1
Optimizer goal: RULE
Parsing user id: 472
Thx
Ashok
August 01, 2003 - 11:37 am UTC
language? code looks like what?
also might try array interface
Jim, August 01, 2003 - 12:45 pm UTC
A lot of the apis to interface with Oracle (oci, oo4o, etc.) allow you to insert multiple rows at one time. Usually this is under the documentation concerning the array interface. I have seen excellent throughput when using the array interface for retrieving or inserting multiple rows at a time.
Parsing+Prepared statement
Ashok, August 01, 2003 - 1:53 pm UTC
Code in Java:
while(lrs_dest.next())
{
if ( lstmt_ins_source == null )
{
lstmt_ins_source = lcon_source.prepareStatement(ls_insinv);
}
lstmt_ins_source.setInt(1,lrs_dest.getInt(1));
lstmt_ins_source.setInt(2,lrs_dest.getInt(2));
lstmt_ins_source.setFloat(3,lrs_dest.getFloat(3));
lstmt_ins_source.setFloat(4,lrs_dest.getFloat(4));
/*DO it in batch so we can avoid context switching*/
lstmt_ins_source.addBatch();
}
lstmt_ins_source.executeBatch();
lstmt_ins_source.clearBatch();
Parsing+prepared statement
Ashok, August 01, 2003 - 3:01 pm UTC
>>I don't like that implementation -- just use a normal >>prepared statment and this:
What is wrong with the code? What do you mean by a normal prepared statement? Can you please explain.
August 01, 2003 - 3:08 pm UTC
it is the implementation of the addBatch -- each and every statement added by an addBatch could be different -- therefore, they parse it each time when you batch.
the code is acting as it should (which is unfortunate). using the other method will achieve better results.
Parsing+PreparedStatement ;-Great Tom
Ashok, August 01, 2003 - 3:25 pm UTC
Great, that worked as expected: one parse and 3695 executes.
I didn't think that addBatch was the culprit there.
Tom, can you please explain why addBatch increases the parse count?
August 01, 2003 - 4:28 pm UTC
because you can:
declare a prepared statement
parse stmt1 with it
add it to the batch
parse a different stmt2 with it
add it to the batch
and so on
EVERY statement batched could be a totally different statement
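A toy illustration of that point (plain Java, no driver involved): a Statement batch is a list of full SQL texts, potentially all different, so there is one parse per entry; a PreparedStatement batch is one SQL text plus many rows of binds, so one parse can cover them all.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

// Not driver code -- just models what each batch type hands to the server.
public class BatchShapes {
    // What a Statement batch accumulates: a distinct SQL text per row,
    // because the literal is baked into each string.
    public static List<String> statementBatch(int rows) {
        List<String> batch = new ArrayList<>();
        for (int i = 0; i < rows; i++)
            batch.add("insert into t values (" + i + ")");
        return batch;
    }

    // What a PreparedStatement batch accumulates: one SQL text, reused.
    public static List<String> preparedBatch(int rows) {
        List<String> batch = new ArrayList<>();
        for (int i = 0; i < rows; i++)
            batch.add("insert into t values (?)"); // binds travel separately
        return batch;
    }

    // Each distinct text is a potential parse.
    public static int distinctTexts(List<String> batch) {
        return new HashSet<>(batch).size();
    }
}
```

Five rows batched through a Statement yield five distinct texts (five parses); through a PreparedStatement, one.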
Prepared Statement+Parsing
Ashok, August 01, 2003 - 3:29 pm UTC
If I use an Oracle prepared statement + batch updates, will the parse count be one (i.e., one parse and many executes)?
I typed the previous reviews without reading your feedback on my first review. Thanks for the addBatch response.
August 01, 2003 - 4:29 pm UTC
it is all about tkprof.... use it and you'll know ;)
Using generic PreparedStatement is ok
Peter Tran, October 03, 2003 - 4:55 pm UTC
Tom,
Correct me if I'm wrong, but using the generic PreparedStatement should be ok and result in only one parse (hard if it's a new SQL).
E.g. This is one from something you helped me with earlier. My logic here is:
get connection
create PS
while ()
{
ps.setXXX();
ps.addBatch()
}
call ps.executeBatch();
ps.close();
close connection;
--------------
UPDATE
fe_exm_observation_values SET UnconstrainedValue=:1
WHERE
DepthPos=:2 AND OBS_ID=(SELECT OBS_ID FROM fe_exm_observations
WHERE LeafKey=:3 AND RowPos=:4 AND ColPos=:5 )
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.49 0.66 17 2753 412 400
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 0.49 0.66 17 2753 412 400
Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 70
Rows Row Source Operation
------- ---------------------------------------------------
400 UPDATE (cr=1153 r=5 w=0 time=73985 us)
400 PARTITION HASH SINGLE PARTITION: KEY KEY (cr=1151 r=3 w=0 time=43456 us)
400 INDEX UNIQUE SCAN FE_EXM_OBSERVATION_VALUES_PK PARTITION: KEY KEY (cr=1151 r=3 w=0 time=42621 us)(object id 150243)
Note, I only do one soft parse, but I update 400 rows, because I call addBatch() 400 times with different bind values. This is using the 9.2.0.3 Oracle driver.
BTW, can you check that link above? It takes me to the documentation for all documents.
Thanks,
-Peter
October 04, 2003 - 9:45 am UTC
clear indication that things change over time. apparently the newest drivers have been optimized somewhat.
the "this" I was referring to, before they moved the documentation yet again on me, was setting the "batch" size on a prepared statement -- you can say "you know, only really execute every N times or upon commit; queue up N executes before sending them over to the server"
It is in the java developers guide.
CallableStatement, returning clause, executeBatch
Srikanth Adiga, January 16, 2004 - 4:53 am UTC
I have the following issues with JDBC:
1. Using the RETURNING INTO clause: is CallableStatement the only way to achieve this through JDBC? Unfortunately, CallableStatement only appears to support batch updates, whereas it actually does not.
2. Return value of executeBatch with CallableStatement: I am facing a peculiar issue with executeBatch. It returns 2 when the batch had 3 inserts, even when all 3 were inserted successfully. I am running my app on a 9i box and inserting into a remote 8i db.
By the way, I face this on one test machine and it runs fine on another.
Any ideas?
January 16, 2004 - 11:09 am UTC
1) to get outputs -- you use callable statements
2) nope, cause I'm
a) not seeing a simple, small, yet 100% complete concise test case
b) not a java programmer
Nasir, April 23, 2004 - 3:58 pm UTC
We use bind variables in the SQL statements of our in-house apps written in Java. Does it mean that the SQL statement is loaded into the shared pool once and then reused from memory, i.e., the SGA (soft parse), when it is called again? Is that per session, or is it shared between sessions?
Our web-based apps are used by many concurrent users.
April 23, 2004 - 7:29 pm UTC
it is hard parsed once.
it is soft parsed and reused (shared) as many times as you parse it after that.
To Shrikant Adiga's query
Menon:), April 23, 2004 - 9:21 pm UTC
"1. Using the RETURNING INTO clause: Is CallableStatement the only way to achieve this
through JDBC? Unfortunately, CallableStatement only appears to support batch
updates, whereas it actually does not.
2. Return value of executeBatch with CallableStatement: I am facing a peculiar
issue with executeBatch. It returns 2 when the batch had 3 inserts, even when
all 3 were inserted successfully. I am running my app on a 9i box and inserted
into a remote 8i db.
"
Q1 - I am not sure I understand your question completely. However, from the Oracle JDBC Developer's Guide and Reference for 9i:
"Note: The Oracle implementation of standard update batching does not implement true batching for generic statements and callable statements. Although Oracle JDBC supports the use of standard batching syntax for Statement and CallableStatement objects, you will see performance
improvement for only PreparedStatement objects."
This means that if you want to take advantage of update batching, you have to use PreparedStatement. CallableStatement does not really support it (it assumes a batch size of 1).
Q2 - Given that CallableStatement does not support batching, its return value is not really relevant any more. Further clarification can be given if you
provide a small example.
Prepared Statement for generic query
Bijay, May 06, 2004 - 7:49 pm UTC
How can I use a prepared statement for a generic query where the user is given the option to choose the query condition? For
example:
Following is the basic query.
SELECT EMPNO, ENAME, DEPTNO, JOB, SAL FROM EMP
I would like to give users the option to define their own operators and values. For example, one user might want a report based on JOB = 'MANAGER', whereas another might want to select SAL > 1000, and another ENAME LIKE 'AL%'.
I am thinking of providing users an HTML interface where users will have following rows in the UI
FIELD OPERATOR VALUE
===========================================
1. EMPNO OPERATOR [TEXTFIELD FOR VALUE]
2. DEPTNO OPERATOR [TEXTFIELD FOR VALUE]
3. JOB OPERATOR [TEXTFIELD FOR VALUE]
4. SAL OPERATOR [TEXTFIELD FOR VALUE]
[SUBMIT BUTTON]
Where operator can be any one of <, <=, !=, LIKE, >, ...
User can decide to give one or more condition for the query.
In this case, how can I write a prepared statement, given that a prepared statement expects a value for each '?'?
Thank you always for helping us all.
Bijay
May 07, 2004 - 7:10 am UTC
same way you would build a 'static query with literals'
your "run this query" subroutine (ok, 'method') will get:
string -> 'select ..... from emp'
array of fields
array of operators
array of values
bad code would perform this:
query_string = input_string + " where 1=1 "
for( i = 0; i < number in array; i++ )
{
query_string = query_string + " and " +
fields[i] + operators[i] + "'" + values[i] +"'"
}
good code would:
query_string = input_string + " where 1=1 "
for( i = 0; i < number in array; i++ )
{
query_string = query_string + " and " +
fields[i] + operators[i] + "?"
}
prepare the statement
for( i = 0; i < number in array; i++ )
{
bind the i'th column using values[i];
}
even better code would pass a 4th array that indicated whether you should use to_date or to_number on the string, eg:
query_string = input_string + " where 1=1 "
for( i = 0; i < number in array; i++ )
{
query_string = query_string + " and " +
fields[i] + operators[i];
if ( datatype[i] = 'number' )
query_string = query_string + "to_number(?)"
elseif ( datatype[i] = 'date')
query_string = query_string + "to_date(?, 'dd-mon-yyyy')"
else
query_string = query_string + "?"
}
prepare the statement
for( i = 0; i < number in array; i++ )
{
bind the i'th column using values[i];
}
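In Java, the "even better code" above might be sketched like this (the 'number'/'date' type tags are the same assumption as in the pseudocode; fields and operators must come from a trusted whitelist, never raw user input, since identifiers cannot be bound):

```java
// Builds "select ... where 1=1 and f1 op1 to_number(?) and ..." so every
// user-supplied VALUE is a bind; only whitelisted identifiers are inlined.
public class DynamicQuery {
    public static String build(String base, String[] fields,
                               String[] ops, String[] types) {
        StringBuilder sql = new StringBuilder(base).append(" where 1=1");
        for (int i = 0; i < fields.length; i++) {
            sql.append(" and ").append(fields[i]).append(" ")
               .append(ops[i]).append(" ");
            if ("number".equals(types[i]))
                sql.append("to_number(?)");
            else if ("date".equals(types[i]))
                sql.append("to_date(?, 'dd-mon-yyyy')");
            else
                sql.append("?");
        }
        return sql.toString();
    }
}
```

After preparing the returned string, bind values[i] at position i+1 with setString; every user who picks the same fields and operators shares the same SQL text, however different their values.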
PreparedStatement vs. createStatement
Bijay Tuladhar, May 07, 2004 - 5:19 pm UTC
Thanks for the follow up Tom.
The main idea of having PreparedStatement is that we have precompiled SQL available. Now, if different users select different attributes and operators, then for each new combination the SQL still has to be compiled. I see the advantage in the long run if lots of users query on the same condition, but for a system with a moderate number of queries, wouldn't it be more efficient just to use createStatement instead of PreparedStatement, as it is simpler?
Thank you always.
Bijay
May 10, 2004 - 6:28 am UTC
It is not any simpler.
It is
o more buggy (dealing with quotes and any other special characters)
o less secure (google "sql injection")
o about the same amount of code
hmmm.
Seriously, all we are adding is:
for( i = 0; i < number in array; i++ )
{
bind the i'th column using values[i];
}
so tell me, how *is a statement simpler* given that you have to look for quotes and be prepared to deal with them? that you have to validate inputs -- to avoid sql injection issues.
Let me change my mind actually -- it would be LOTS MORE code with statements, if you do it correctly. If you don't, you'll have sporadic failures and be open to a security risk.
I wish jdbc didn't have statements, period.
i would like to ask another
Gabriel H. Manuel, June 01, 2004 - 5:28 am UTC
say i have these methods:
public object getXXX(int id){
return (object)getXXX("id="+id).get(0);
}
public ArrayList getXXX(String where){
/*uses a prepared statement but actually functions more like an ordinary statement*/
sql.append("SELECT blah blah ");
sql.append("WHERE " + where);
PreparedStatement stmt = Conn.prepareStatement(sql);
return stmt.executeQuery();
}
say the first method is called multiple times, would PreparedStatement still work faster than the ordinary statement?
for performance issues, would it be better to have another PreparedStatement in the first method and avoid passing things around?
June 01, 2004 - 8:38 am UTC
I would throw this code away and send the programmer back to class.
no bind variables :(
but short of that, I've no idea what I'm looking at here. don't know what that first subroutine is intended to do and how that could be compared to a prepared statement.
Interesting Thread
Matt, July 28, 2004 - 7:39 am UTC
We have hit an interesting problem with the use of prepareStatement. We are storing SQL in the database, as follows:
SELECT <stuff>
FROM dept_emp_intersect
WHERE dept_id = :dept_id
AND emp_id = :emp_id
Our idea is to then generate the Java code for preparedStatement on the fly:
SELECT <stuff>
FROM dept_emp_intersect
WHERE dept_id = ?
AND emp_id = ?
However at this stage we can't guarantee the position of the binds using:
stmt.setString(1, deptID );
stmt.setString(2, empID );
Does this call always have to be positional for binds? Is there any way we can bind using the original bind names?
July 28, 2004 - 8:29 am UTC
use plsql to return a result from a stored procedure -- is what I would do.
sorry -- just not a big java programmer myself.
To Matt
Menon, July 28, 2004 - 12:40 pm UTC
In 10g you can use Oracle extension methods to bind
by name for both prepared statements and callable statements.
Pre-10g, you cannot bind by name. Beware that
in some cases it seems to work (for both CallableStatement
and PreparedStatement), but the Oracle doc says this
is not a feature they support, and hence strongly
discourages using it.
Why don't you use callable statements instead? Your
real SQL still resides in the database and you just
invoke the package procedures from Java...
July 28, 2004 - 1:31 pm UTC
or use an application context..... no need to bind by position there.
To Matt - a few other things
A reader, July 28, 2004 - 12:50 pm UTC
If you take the approach of storing SQL in the
db (or in Java for use with a prepared stmt),
then a corollary of my note is that you should (read: must)
use "?" instead of ":meaningfulname" and use
positional index binds. You can always increase
the readability of the code by using meaningful
constant names for the positional indexes
themselves, as in:
stmt.setString(DEPT_ID_INDEX, deptID );
stmt.setString(EMP_ID_INDEX, empID );
save the extra round trip.
bob, July 28, 2004 - 1:17 pm UTC
i can't imagine the point of returning sql text to the java app when you can open a ref cursor in pl/sql using the stored sql and return the open ref cursor as a resultset to the java.
Otherwise you make one trip to the db to get the sql, and then another trip to execute the statement and get a result set.
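That approach might be sketched like so; emp_pkg.get_by_dept is a hypothetical procedure with an OUT SYS_REFCURSOR parameter, and Types.REF_CURSOR assumes a JDBC 4.2-capable driver (older Oracle drivers used OracleTypes.CURSOR for the same purpose):

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Types;

// One round trip: the stored SQL stays in the database, PL/SQL opens a
// ref cursor over it, and Java consumes the result set directly.
public class RefCursorFetch {
    public static final String CALL = "{ call emp_pkg.get_by_dept(?, ?) }";

    public static ResultSet fetch(Connection con, int deptId) throws Exception {
        CallableStatement cs = con.prepareCall(CALL);
        cs.setInt(1, deptId);                         // IN bind
        cs.registerOutParameter(2, Types.REF_CURSOR); // OUT SYS_REFCURSOR
        cs.execute();
        return (ResultSet) cs.getObject(2);           // iterate as usual
    }
}
```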
excellent point bob!
Menon, July 28, 2004 - 1:23 pm UTC
I agree - best solution is to use callable statement
or prepared statement..
to tom
Menon, July 28, 2004 - 2:12 pm UTC
hi tom
"or use an application context..... no need to bind by position there."
how would you avoid positional binds in JDBC if you use
application contexts?
Thanx!
thanx!
Menon, July 29, 2004 - 10:43 am UTC
Thanx Tom
"instead of "where x = ?"
"where x = sys_context('my_ctx','x')"
and you plop x into the context -- don't even "bind" procedurally. "
However, to "plop x into the context", won't you have
to ultimately bind if you are working from a client
such as JDBC/Pro*C?
July 29, 2004 - 1:01 pm UTC
I'm thinking it would all be done nicely in plsql with java just getting the result set back personally.
I see..
Menon, July 29, 2004 - 1:07 pm UTC
"I'm thinking it would all be done nicely in plsql with java just getting the
result set back personally. "
I guess that may not work in all cases. For example,
if I want to bind for a select that gets me details
of employee number 7777 from JDBC, this won't work...
Ultimately, it looks like, if you have all the info in the database
to get the data, then within PL/SQL calls you don't need
to use parameters - you can use sys_context for
that...
thanx!
some questions regarding statement
A reader, August 02, 2004 - 9:55 pm UTC
hi tom
do you the what we mean when we refer to a statement's state, data and metadata?
thanx!
August 03, 2004 - 7:36 am UTC
to what are you referring exactly?
"do you the what we mean"?!?
the only hits on metadata on this page are your followup, so not sure what you are referring to or asking.
sorry...
A reader, August 03, 2004 - 9:53 am UTC
what I meant to ask was:
do you know what we mean by a JDBC statement's state,
its data and its metadata? If so can you give some
examples?
Thanx!
August 03, 2004 - 10:07 am UTC
when you prepare a statement -- you are setting up "an environment" of sorts.
you have a cursor perhaps, bind variable values -- a "state", a set of variables that means something.
Alex Yakovlev, September 19, 2004 - 11:45 am UTC
Oracle JDBC setExecuteBatch does not work for me :(
conn.setAutoCommit( false );
pstm = conn.prepareStatement(
"insert into test_log(LOG_MSG)values(?)" );
((OraclePreparedStatement)pstm).setExecuteBatch(30);
for( int i=0; i<5; i++ ) {
for( int j=0; j<5; j++ ) {
pstm.setString(1,"test "+i+"/"+j);
pstm.executeUpdate();
}
((OraclePreparedStatement)pstm).sendBatch();
conn.commit();
}
pstm.close();
conn.setAutoCommit( true );
I check parse_calls before and after this with:
SELECT sql_text,executions,parse_calls
FROM v$sqlarea
WHERE sql_text LIKE 'insert into test_log%'
And parse_calls increases by 5, not by 1 as expected.
Without setExecuteBatch it increases by 1 :-(
September 19, 2004 - 1:11 pm UTC
gotta love *full* examples, like I give you guys -- sure would make it *easier*
anyway, 9204, linux, test.java:
import java.sql.*;
import oracle.jdbc.driver.*;
import oracle.sql.*;
class test
{
public static void main(String args[])throws Exception
{
DriverManager.registerDriver
(new oracle.jdbc.driver.OracleDriver());
Connection conn=DriverManager.getConnection
("jdbc:oracle:thin:@localhost.localdomain:1521:ora9ir2"
,"ops$tkyte","foobar");
conn.setAutoCommit( false );
Statement stmt = conn.createStatement ();<b>
stmt.execute( "alter session set sql_trace=true" );
</b>
PreparedStatement pstm = conn.prepareStatement(
"insert into test_log(LOG_MSG)values(?)" );
((OraclePreparedStatement)pstm).setExecuteBatch(30);
for( int i=0; i<5; i++ ) {
for( int j=0; j<5; j++ ) {
pstm.setString(1,"test "+i+"/"+j);
pstm.executeUpdate();
}
((OraclePreparedStatement)pstm).sendBatch();
conn.commit();
}
pstm.close();
}
}
ops$tkyte@ORA9IR2> create table test_log( log_msg varchar2(50) );
Table created.
[tkyte@tkyte-pc-isdn j]$ java test
[tkyte@tkyte-pc-isdn j]$
<b>tkprof for me shows:</b>
insert into test_log(LOG_MSG)
values
(:1)
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 5 0.01 0.00 0 1 12 25
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 6 0.01 0.00 0 1 12 25
<b>1 parse, 5 executes, 25 inserted rows</b>
ops$tkyte@ORA9IR2> l
1 SELECT sql_text,executions,parse_calls
2 FROM v$sqlarea
3* WHERE sql_text LIKE 'insert into test_log%'
ops$tkyte@ORA9IR2> /
SQL_TEXT
-------------------------------------------------------------------------------
EXECUTIONS PARSE_CALLS
---------- -----------
insert into test_log(LOG_MSG)values(:1)
10 2
ops$tkyte@ORA9IR2> !java test
ops$tkyte@ORA9IR2> /
SQL_TEXT
-------------------------------------------------------------------------------
EXECUTIONS PARSE_CALLS
---------- -----------
insert into test_log(LOG_MSG)values(:1)
15 3
<b>Perhaps you are experiencing "olden days" behavior. As a matter of fact, using the wayback machine to test with:</b>
ops$tkyte@ORA817DEV> SELECT sql_text,executions,parse_calls
FROM v$sqlarea
WHERE sql_text LIKE 'insert into test_log%';
2 3
SQL_TEXT
-------------------------------------------------------------------------------
EXECUTIONS PARSE_CALLS
---------- -----------
insert into test_log(LOG_MSG)values(:1)
10 10
ops$tkyte@ORA817DEV> !java test
ops$tkyte@ORA817DEV> /
SQL_TEXT
-------------------------------------------------------------------------------
EXECUTIONS PARSE_CALLS
---------- -----------
insert into test_log(LOG_MSG)values(:1)
15 15
<b>so the olden behavior of the jdbc drivers way back in the last century, was to do that. Suggest you give the current copies a try if this is an identified performance issue</b>
Thanks, Tom, upgrading JDBC drivers really helped!!
Alex Yakovlev, September 21, 2004 - 12:51 am UTC
"use prepared statements EVERYWHERE"... simply not true.
Sam S., April 12, 2006 - 5:36 pm UTC
On page 444 of the "Java Programming with Oracle JDBC" book, they provide test results for running INSERTs using Statements vs. PreparedStatements. In a nutshell, PreparedStatements are less efficient until you reach 40+ inserts.
Results:
1 Insert using STMT: 10 milliseconds
1 Insert using PS: 117 milliseconds
1,000 Inserts using STMT: 2,583 milliseconds
1,000 Inserts using PS: 367 milliseconds
So instead of using "SQL Injections" as your defense (which no good programmer leaves himself exposed to anyway), do you have any concrete facts or numbers to support your "use PS EVERYWHERE" claim?
April 12, 2006 - 7:49 pm UTC
that book is so very very very flawed. it is outrageously flawed. horrifically flawed. really really *bad*. I've seen that book; it makes me cringe every time someone mentions this particular chapter. ugh.
This site - asktom.oracle.com - is a virtual TRIBUTE to the prepared statement and bind variables. I think there are really only about 5 questions to be asked about Oracle (and the others are all variations on a theme).
3 - maybe 4 - of the answers to these five questions is: USE BIND VARIABLES.
Here is an excerpt from one of my books. Benchmarking is good - it is good to publish benchmarks so people can see them (the code) and see what might be wrong with them. That is what happened here - a flawed test.
(as an aside - what really amused me about your posting here was the laughable comment:
So instead of using "SQL Injections" as your defense (which no good programmer
leaves himself exposed to anyway)
hehehe, fell out of my chair when I read that. Must mean by definition "there are no good programmers" since SQL injection is rather insidious and in my polling of audience after audience - maybe 10% (MAYBE 10%) of the audience feels that they could define sql injection. Hard to not expose yourself to something you don't know about!!!!! But thanks much for the laugh, it was a good one)
<quote src=Effective Oracle by Design, chapter 5>
Use Bind Variables with Java and VB
Programmers who use Java with JDBC may wonder if the bind variables recommendation applies to them. Specifically, whether to use PreparedStatement, which allows you to use bind variables, or use Statement, which requires the string concatenation approach. This discussion applies equally to VB programmers using VB with ODBC, because the concept of Statement and PreparedStatement exists in ODBC in the same fashion.
Ask Tom: you can search around for this - you are NOT the first person to ask this question...
Please briefly skim this link [link omitted], which gives an excerpt on JDBC performance. It says to always use Statement (no bind variables allowed) instead of PreparedStatement because it performs better, without discussing the impact on the database - only single-application metrics. Is this accurate, or is this information just extremely shortsighted with regard to overall database impact?
Look at the proof from my Runstats test case: using PreparedStatement with bind variables is absolutely the only way to do it. Later on, I got a follow-up.
For a moment, set aside the shared pool, hard parse, and soft parse, and talk about PreparedStatement and Statement as the only ways to execute statements from Java. I wrote benchmark code that shows a Statement performs better than a PreparedStatement, unless you execute the same statement a whole lot of times. So, I reproduced the findings of the above link and proved that Statement is better than PreparedStatement.
Well, if we ignore the shared pool, hard parse, and soft parse, we totally miss the boat on this topic, because they are the only things to consider. The facts are: hard parsing incurs many latches, latches are serialization devices, and serialization is not a scalable thing. Therefore, as you add users, the system that uses Statement instead of PreparedStatement with bind variables will fail. But, even though I quite simply could not honor this request to put aside the shared pool and hard/soft parse, I had to explore it further.
So, starting with their benchmark code, which simply inserted into a database table, I made it a multiuser benchmark to demonstrate the fact that if you expand this simple, single-user benchmark out to a real-world example with multiple users, you will see clearly what the issue is and why you need to avoid Statement. But, an interesting thing happened. I could not reproduce their findings! Running their code, I found that a single statement executed using either Statement or PreparedStatement took the same amount of time initially, and after executing the SQL over and over, PreparedStatement was always much faster. This conflicted with their observations totally. The problem was a flaw in their test. It failed to do an initial load of the PreparedStatement class.
End Ask Tom
An Initial Java Test
To demonstrate the use of Statement versus PreparedStatement, as well as how a seemingly simple, convincing test can be the most misleading thing in the world, we'll walk through a simple benchmark. We'll use the single table TESTXXXPERF, which was created using the script perftest.sql, as follows:
scott@ORA920> drop table testxxxperf;
Table dropped.
scott@ORA920> create table testxxxperf
2 ( id number,
3 code varchar2(25),
4 descr varchar2(25),
5 insert_user varchar2(30),
6 insert_date date );
Table created.
scott@ORA920> exit
The main Java code consists of basically three subroutines: a main routine that connects to the database and then calls a routine to insert into that table using Statement, and then calls a routine to do the same with PreparedStatement.
Here is the main routine:
import java.sql.*;
import oracle.jdbc.OracleDriver;
import java.util.Date;
public class perftest
{
public static void main (String arr[]) throws Exception
{
Connection con = null;
DriverManager.registerDriver(new oracle.jdbc.OracleDriver());
con = DriverManager.getConnection
("jdbc:oracle:thin:@aria-dev:1521:ora920", "scott", "tiger");
con.setAutoCommit(false);
Integer iters = new Integer(arr[0]);
doStatement (con, iters.intValue() );
doPreparedStatement(con, iters.intValue() );
con.commit();
con.close();
}
This routine simply connects to my Oracle9i Release 2 instance as SCOTT/TIGER and disables the autocommit JDBC uses by default. Next, it invokes the subroutine to execute a Statement N times and then a PreparedStatement N times. I set it up to allow us to pass N into the Java routine so we can run multiple simulations.
Next, look at the doStatement routine:
static void doStatement(Connection con, int count)
throws Exception
{
long start = new Date().getTime();
Statement st = con.createStatement();
for (int i = 0; i < count; i++)
{
st.executeUpdate
("insert into testxxxperf " +
"(id, code, descr, insert_user, insert_date)" +
" values (" + i + ", 'ST - code" + i + "'" +
", 'St - descr" + i + "'" + ", user, sysdate ) ");
}
long end = new Date().getTime();
st.close();
con.commit();
System.out.println
("statement " + count + " times in " +
(end - start) + " milli seconds");
}
This routine simply creates a statement object and then loops count times, building a unique (never before seen) INSERT statement and executing it. It is somewhat scaled back from reality, in that it is not checking for quotes in strings and fixing them, but we'll let that go for now. Also note that it retrieves the time before and after executing the Statement and prints the results.
The third routine is doPreparedStatement:
static void doPreparedStatement (Connection con, int count)
throws Exception
{
long start = new Date().getTime();
PreparedStatement ps =
con.prepareStatement
("insert into testxxxperf " +
"(id, code, descr, insert_user, insert_date)"
+ " values (?,?,?, user, sysdate)");
for (int i = 0; i < count; i++)
{
ps.setInt(1,i);
ps.setString(2,"PS - code" + i);
ps.setString(3,"PS - desc" + i);
ps.executeUpdate();
}
long end = new Date().getTime();
con.commit();
System.out.println
("pstatement " + count + " times in " +
(end - start) + " milli seconds");
}
}
This is basically the same as the doStatement routine, but it uses PreparedStatement to insert count rows instead of Statement.
Lastly, we use a shell script to execute this code:
#!/bin/csh -f
sqlplus scott/tiger @perftest
java perftest $1
A CMD file for Windows might look like:
sqlplus scott/tiger @perftest
java perftest %1
Now, we run this with inputs of 1 (do one Statement/PreparedStatement), 10, 100, and 1,000. We get the following results:
Rows to Insert Statement PreparedStatement
1 0.05 second 0.92 second
10 0.34 second 1.03 seconds
100 2.69 seconds 2.35 seconds
1,000 26.68 seconds 15.74 seconds
From these results, we might conclude that if we are not going to execute the same statement over and over (about 100 times), we would be best served by using Statement. The problem is, there is a flaw in our test! Let's find out why.
A Multiuser Test
To test the code for a multiuser setup, we need to do a bit of rewriting. In a multiuser test, using System.out.println is not a very scalable testing tool, because it is hard to collect and analyze the results. So, let's set up a database table to hold the timing results. The slightly modified Java code has an extra subroutine named saveTimes to save the timing information into the database.
static PreparedStatement saveTimesPs;
static void saveTimes( Connection con,
String which,
long elap ) throws Exception
{
if ( saveTimesPs == null )
saveTimesPs = con.prepareStatement
("insert into timings " +
"( which, elap ) values "+
"( ?, ? )" );
saveTimesPs.setString(1,which);
saveTimesPs.setLong(2,elap);
saveTimesPs.executeUpdate();
}
Then we modify the doStatement routine like this:
static void doStatement (Connection con,
int count) throws Exception
{
long start = new Date().getTime();
Statement st = con.createStatement();
for (int i = 0; i < count; i++)
{
st.executeUpdate
("insert into testxxxperf " +
"(id, code, descr, insert_user, insert_date)" +
" values (" + i +
", 'ST - code" + i + "'" +
", 'St - descr" + i + "'" +
", user, sysdate ) ");
}
st.close();
con.commit();
long end = new Date().getTime();
//System.out.println( "STMT" + " (" + (end-start) + ")" );
saveTimes( con, "STMT", end-start );
}
We modify the doPreparedStatement routine in the same way. This would simply save the times in a database table.
create table timings ( which varchar2(10), elap number );
This way, we can run a query to get average, minimum, and maximum times for multiple users.
After just commenting out the System.out.println lines and adding a routine to record the time, when we run this test in single-user mode, we get the following results:
Rows to Insert Statement PreparedStatement
1 0.05 second 0.05 second
10 0.30 second 0.18 second
100 2.69 seconds 1.44 seconds
1,000 28.25 seconds 15.25 seconds
Here, we see that not only is there no penalty for using a PreparedStatement, it quickly benefits us, even in single-user mode. That's very different from what the first test showed. What could be the cause of this surprising difference?
What Changed?
In this example, the code being timed was no different; not a single byte of code was changed. Sure, we commented out System.out.println and added a call to saveTimes, but that code was never timed before. So, what did change?
It turns out the saveTimes routine is the culprit here. That code uses a PreparedStatement, which warms up the PreparedStatement class. It paid a one-time penalty to load that class (Java dynamically loads classes as you use them). The simple act of connecting did that for the Statement class (it is used during the connection to Oracle). Once the timing of the initial load of the PreparedStatement class was factored out, the tests show that a PreparedStatement is no more expensive to execute than a Statement is in JDBC.
The entire premise of a Statement being lighter weight and more efficient for small numbers of statements is wrong. If you use a single PreparedStatement anywhere in your code, you have paid this load penalty for all PreparedStatement uses.
The basic test itself was flawed because we were timing an unrelated thing. Since most nontrivial Java JDBC programs are going to use a PreparedStatement somewhere, they all pay this load penalty. Not only that, but this load penalty isn't a penalty at all; it's simply the price of admission to building a scalable application on Oracle. If you don't use PreparedStatement, if you insist on using Statement and gluing in the values, you are not only opening yourself up to the SQL injection security risk and buggy code, but you will also find that your application will not scale as you add users. There is no maybe here; there is no might not scale. Your application will not scale, period.
</quote>
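The one-time class-load cost described in the excerpt is easy to observe standalone, with no database involved. A minimal sketch follows; the timings printed will vary by machine, and only the cold/warm asymmetry is the point:

```java
// Demonstrates the effect from the excerpt: the first reference to a class
// pays a one-time load cost; later references are effectively free.
public class ClassLoadCost {
    public static void main(String[] args) throws Exception {
        long t0 = System.nanoTime();
        Class.forName("java.sql.PreparedStatement");   // first touch: loaded here
        long cold = System.nanoTime() - t0;

        t0 = System.nanoTime();
        Class.forName("java.sql.PreparedStatement");   // already loaded: a lookup
        long warm = System.nanoTime() - t0;

        System.out.println("first  reference: " + cold + " ns");
        System.out.println("second reference: " + warm + " ns");
        // A benchmark that times the first reference is really timing the
        // class loader -- exactly the flaw in the first test above.
    }
}
```

Running this shows a large first-reference time and a near-zero second one; a fair benchmark must pay that cost outside the timed section.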
always be aware of absolute statements
Phil Miesle, April 13, 2006 - 5:45 am UTC
Tom, you're of course right to keep 'carrying the torch' for bind variables. The question is really why do absolute statements like these:
o your app will not scale
o your app will run slow
o your app will fail eventually
not take root in people's brains? The answer is that the vast majority of developers are not writing applications for hundreds or thousands of concurrent users, so they don't see the ramifications of this problem.
That and a developer doesn't have to think (God forbid) as much when coding a Statement-based application...on the surface of it, Statements are easier to code. (I think that PreparedStatements are ultimately easier, once you understand them!)
If an app passes 'unit test' and 'system test' it is probably deemed 'fast enough' for a single user. If you only have at most 5 concurrent users, built-in scalability limits will probably never be seen.
Nor, for that matter, is multi-user performance likely to be a problem...a 2 CPU database server with a couple of disks and a gigabyte or two of memory would probably be vast over-kill for such an application with 5 users.
In other words, the developers who use Statement likely never see the implications of their poor decision because there frequently *aren't* any obvious ones. Sure, the app runs a little slower than it could...but nobody realizes it because the app is 'fast enough'.
******
This means that absolute statements (as above) are often not seen as being valid in 'the real world'. ('I used Statement, the app works just fine.')
The people who have been burned by Statement (like you and me) know better, but we're in a small minority.
******
In the absence of education about using Statement versus PreparedStatement, a developer will 99 times out of 100 take the route that appears to be 'easiest' and use Statement.
The reason that Statement appears easier is because that it is quite typically taught first. In school, we were eventually taught that multiplying 5 by 3 is easier than adding 5 + 5 + 5. We were taught addition first because it is fundamental to how multiplication works, but ultimately were taught to see multiplication as a 'faster' way of doing the operation.
Somehow the analogy here seems to carry forward into JDBC education: Statement is taught first (so is considered 'fundamental'), PreparedStatement is taught second. Since PreparedStatement is 'harder' to do and Statement works, developers stick with Statement.
(Never mind the fact that as far as the Database is concerned, PreparedStatement is really the fundamental!)
I blame both lazy authors and lazy readers for the problem. BUT I think we can fix this whole situation with technology. Three options:
1. Java should deprecate Statement from the JDBC spec. I would think these things are bad for all the main-stream databases.
2. Oracle JDBC drivers should implement Statement using PreparedStatement.
3. Oracle JDBC should put a 'sleep' in Statement to make it artificially slower.
If we have to keep Statement around, we should make it slow so that lazy authors end up advocating PreparedStatement, and explicitly state that 'Statement should only be used for education purposes, because it is slow.'
April 13, 2006 - 7:58 am UTC
you don't need hundreds of thousands of users.
You need ONE to start seeing this.
You need TWO to really start seeing this.
If this weren't a problem on systems of any scale - I'd shut up. It would be the exception that this is a problem - not an almost universal RULE.
...
That and a developer doesn't have to think (God forbid) as much when coding a
Statement-based application...
...
absolutely 100% false - the opposite is true. They have to think about SQL Injection (but rarely even understand that they have to), they have to think about special characters (quotes) and string lengths.
The problem is THEY DON'T actually do what they have to - hence sql injection is the fact, not the exception - quotes break programs - and people ask me "why cannot I cut and paste War and Peace into this sql statement - lobs should store 4 gig right".
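The quote problem alone is easy to demonstrate without a database. A quick sketch (the table name is invented for illustration):

```java
// Why quotes break concatenated SQL: an innocent value like O'Brien produces
// unbalanced quotes, and the manual fix (doubling quotes) must be remembered
// at every single call site. The table name here is made up.
public class QuoteBreak {
    // The Statement style: the value is glued into the SQL text.
    static String glued(String v) {
        return "insert into t ( descr ) values ('" + v + "')";
    }
    // The hand-rolled workaround: double every single quote.
    static String escaped(String v) {
        return glued(v.replace("'", "''"));
    }

    public static void main(String[] args) {
        System.out.println(glued("O'Brien"));
        // insert into t ( descr ) values ('O'Brien')  -- unbalanced: parse error
        System.out.println(escaped("O'Brien"));
        // insert into t ( descr ) values ('O''Brien') -- valid, if you remembered
        // A bind variable needs no escaping at all: the text stays values (?)
    }
}
```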
I loved your #3 at the end though - that was perfect. I've often said we should ship the database with "fast=false" in the init.ora and have little "for i in 1 .. 100000" loops in the code. When fast=true, we skip them.
Phil M's suggestions
Phil Holland, April 13, 2006 - 1:34 pm UTC
#1 Probably won't happen; the Java Community Process teams seem reluctant to deprecate anything these days. The recent JDBC 4.0 specification still doesn't contain any performance consideration comments for Statement vs. PreparedStatement.
#2 This would at least fix the microbenchmarks, but you could have the same effect by having the JDBC driver load both the Statement and PreparedStatement classes at the first use of the driver. This would eliminate the class-load time associated with the first use of PreparedStatement, and at least there would be an apples-to-apples comparison from the start.
#3 LOL, I can already foresee the posting in a Java forum by a developer who only uses Statements saying "I switched from Oracle to DB2/MySQL/Derby/... and saw a 1000% increase in performance."
My preference is for a nice big warning message to be logged every time a Statement is used; this could optionally be switched off with a new driver method setStatementWarningMessages(false), which itself logs a nice disclaimer when called.
As to why we're still preaching the "use binds" message? My take on it is that the modern developer seems reluctant to read documentation; they plow ahead in their all-encompassing IDEs with tunnel vision. Autocompletion gives them the library methods they need without any of the associated context for correct use. If it compiles and passes unit tests, then it is often seen as "good enough." That's why logging a warning message might work, as it will exist inside their IDE world. (If we log it to stderr it might even be in red!)
yeah, but!
Phil Miesle, April 14, 2006 - 6:38 am UTC
When I say "the developer doesn't have to..." I mean this in the sense of "it isn't required to make the program functionally do what it is trying to do." Of course, with Statement the program can do a lot more than it was ever designed to do (SQL injection!), but the program would work (-ish) as advertised, at least so far as a unit test is concerned.
A single user *can't* be used to show a scalability problem, and I'd say one would be somewhat hard-pressed to demonstrate that Statement causes a scalability problem at fewer than 10 concurrent users *WHEN* the test application is paced like a typical real-world app. My copy of your book hasn't made it out of storage from my move, and since you didn't put your multi-user test in this thread I can't really comment on it.
[It is hard to comprehensively prove a negative though...]
Certainly hundreds or thousands of concurrent users (hundreds of thousands of concurrent users would be something!) will show Statement to be an absolute non-starter with regard to scalability.
A single user can most definitely show a performance problem, as you illustrated:
===
Rows to Insert Statement PreparedStatement
1 0.05 second 0.05 second
10 0.30 second 0.18 second
100 2.69 seconds 1.44 seconds
1,000 28.25 seconds 15.25 seconds
===
I will add to this discussion in the Tom Kyte way: give an example. One of my pet peeves is not using bulk operations...in other words, you should not only use PreparedStatement, you should also use bulk operations (update batching, in JDBC lingo).
Make the following changes to the Java code:
Add a new method to drop/create testxxxperf each time:
static PreparedStatement resetTablePs;
static String sql;
static void resetTable(Connection con)
throws Exception {
sql = "drop table testxxxperf";
resetTablePs = con.prepareStatement(sql);
resetTablePs.executeUpdate();
sql = "create table testxxxperf";
sql += " ( id number,";
sql += " code varchar2(25),";
sql += " descr varchar2(25),";
sql += " insert_user varchar2(30),";
sql += " insert_date date )";
resetTablePs = con.prepareStatement(sql);
resetTablePs.executeUpdate();
}
(See! DDL with PreparedStatement...easy!)
Now, add a new method that will use bulk processing techniques:
static void doPreparedStatementBulk (Connection con, int count)
throws Exception
{
long start = new Date().getTime();
PreparedStatement ps =
con.prepareStatement
("insert /* bulk */ into testxxxperf " +
"(id, code, descr, insert_user, insert_date)"
+ " values (?,?,?, user, sysdate)");
((OraclePreparedStatement)ps).setExecuteBatch(100);
for (int i = 0; i < count; i++)
{
ps.setInt(1,i);
ps.setString(2,"PS - code" + i);
ps.setString(3,"PS - desc" + i);
ps.executeUpdate();
}
((OraclePreparedStatement)ps).sendBatch();
long end = new Date().getTime();
con.commit();
saveTimes(con, "PSTMTBULK",end-start);
}
Note that I added a "/* bulk */" comment there to force a hard parse. Note also the casting of the PreparedStatement object to an OraclePreparedStatement, and the fact that I set the batch size to 100 on the ps object (you could do this on the connection object, but it would ruin my comparison).
Also, the sendBatch() call is optional; the batch would be sent with the commit() call... but if I relied on that implicit send, the batch wouldn't ever be sent within the timer and I'd get near-zero times for small batches.
I need to import a few other things:
import oracle.sql.*;
import oracle.jdbc.*;
And I have a new main() method:
public static void main (String arr[]) throws Exception
{
Connection con = null;
DriverManager.registerDriver(new oracle.jdbc.OracleDriver());
con = DriverManager.getConnection("jdbc:oracle:thin:@mspperf02:1522:AIPSAND", "scott", "tiger");
con.setAutoCommit(false);
Integer iters = new Integer(arr[0]);
resetTable(con);
doStatement (con, iters.intValue() );
resetTable(con);
doPreparedStatement(con, iters.intValue() );
resetTable(con);
doPreparedStatementBulk(con, iters.intValue() );
con.commit();
con.close();
}
I also have ended up using a slightly different .sh script:
#!/bin/ksh
sqlplus scott/tiger <<-EOF
drop table timings;
create table timings ( which varchar2(10), elap number );
EOF
for i in 1 2 3 4 5 6 7 8 9 0; do
java perftest $1
done
sqlplus scott/tiger <<-EOF
select which, avg(elap) from timings group by which;
EOF
So, what does that do for the result:
(Note I'm running on a different machine than Tom: IBM p690/Power4/1.3 GHz (4-CPU LPAR, AIX 5.3), with RDBMS 10.2.0.1 and Java 1.4.2 using the 10gR2 JDBC drivers.)
Rows Statement PreparedStatement PSBulk
1 0.02 s 0.02 s 0.02 s
10 0.43 s 0.04 s 0.02 s
100 1.42 s 0.17 s 0.16 s
1000 11.11 s 1.22 s 0.08 s
Now, that last cell isn't a typo...it's why I ended up doing a 10-test average in my .sh script!
Some VERY interesting things here. First, in Tom's example the 1,000-row sample was about 2x slower using Statement than PreparedStatement. In my test, the difference was 10x. Is this a difference in the database, in the JDBC drivers, or a combination of the two? Dunno. BUT there's basically a 10x difference across the board between Statement and PreparedStatement. Maybe the server guys went the other route from my #3 and just made PreparedStatement damn fast!!!
Second, using 'JDBC update batching' is never slower than not using it, and is generally faster. It is basically a transparent mechanism, except in rollback situations, and you need to think a little more about operation ordering (you can't insert in one statement and then select the value in the next, since the value may not be in the database yet).
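For reference, the same batching effect can be had with the standard java.sql API (addBatch/executeBatch) rather than Oracle's proprietary setExecuteBatch. A hedged sketch, reusing the table and column names from this thread, and not run against a real database here:

```java
// Portable JDBC update batching: queue rows client-side with addBatch,
// then ship each batch in one round trip with executeBatch.
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class BulkSketch {
    static void doStandardBatch(Connection con, int count, int batchSize)
            throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(
                "insert /* bulk */ into testxxxperf " +
                "(id, code, descr, insert_user, insert_date)" +
                " values (?,?,?, user, sysdate)")) {
            for (int i = 0; i < count; i++) {
                ps.setInt(1, i);
                ps.setString(2, "PS - code" + i);
                ps.setString(3, "PS - desc" + i);
                ps.addBatch();                      // queue the row client-side
                if ((i + 1) % batchSize == 0)
                    ps.executeBatch();              // one round trip per batch
            }
            if (count % batchSize != 0)
                ps.executeBatch();                  // flush the partial last batch
        }
    }

    public static void main(String[] args) {
        // Without a database to hand, check only the round-trip arithmetic:
        // 1,000 rows at a batch size of 100 means 10 executeBatch calls.
        int count = 1000, batchSize = 100;
        int roundTrips = count / batchSize + (count % batchSize == 0 ? 0 : 1);
        System.out.println(roundTrips);   // prints 10
    }
}
```

The trade-off versus setExecuteBatch is that the flush points are explicit in your code rather than handled by the driver, but the statement is parsed once and each batch costs a single network round trip either way.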
Third, and the one I still can't get my head around...it is *faster* to insert 1000 rows in bulk than to insert 100 rows in bulk. It beggars belief. The batch size seems to have some bearing:
Batch Size PSBulk-1000
50 0.12 s
100 0.08 s
1000 0.21 s
But it isn't really correlated either. I'm on the cusp of learning something new here, and it is frustrating because I actually have to get some work done before getting on a plane to a client in Australia...
--
Use of Statement *is* indefensible, I'm simply suggesting that by its very existence and seeming simplicity it is used when it should NEVER be used.
Your campaign for bind variables is very persuasive; strong use of test cases strengthens your argument. But without a technical fix that forces the issue, people will still write books like "Java Programming with Oracle JDBC" that get published, read, and (worse) believed.
The fix to this problem *must* be technical in order to get 100% compliance within the development community. And just maybe the server guys have figured out a way?!?!
April 14, 2006 - 12:36 pm UTC
since latches are locks...
and locks are serialization devices...
and serialization devices inhibit scalability...
I would argue that yes, you can in fact demonstrate that one approach will scale "less well" than another approach by measuring the latching required by each approach.
couldn't let it go
Phil Miesle, April 14, 2006 - 10:25 am UTC
Just couldn't do it...the 'finding' flies in the face of logic.
I've stripped down the Java program to do *only* the bulk operations and here is the result (again a 10-test average):
Rows PSBulk
100 0.05 s
1,000 0.61 s
10,000 0.86 s
100,000 2.15 s
1,000,000 11.38 s
Which is a bit more reasonable, if not completely consistent. Will need to find a quiet corner (and quiet hardware, which could well be the cause of the previous 'spurious' finding?) to have at this one, but at least the world is now a bit more right.
Using bulk operations with PreparedStatement, we can handle 1,000,000 records in roughly the same time Statement takes to handle 1,000.
excepting that...
Phil Miesle, April 14, 2006 - 4:38 pm UTC
---
since latches are locks...
and locks are serialization devices...
and serialization devices inhibit scalabiliy...
I would argue that yes, you can in fact demonstrate that one approach will scale "less well" than another approach by measuring the latching required by each approach.
---
Absolutely, measuring latching will show that the *potential* for scalability problems exists (and in fact demonstrates a built-in scalability problem here). But that is quite different from demonstrating an *actual* problem.
Slow Application = Actual Problem
Fast Enough Application = Finished Application
Potential Problem = Spend More Money
Potential Problem in Fast Enough Application = Finished Application
(It is a very myopic view, but one commonly shared by a large percentage of managers.)
Do it Right the First Time = No Potential Problem
No Potential Problem = Fast Enough Application
Fast Enough Application = Finished Application
There's no question that PreparedStatement will scale (much) better than Statement. An overwhelming body of evidence says so.
But if I have one (or two or three or a small number of) concurrent users I will probably never "see" this scalability problem in any meaningful way. By 'meaningful' I mean to say "users won't complain about a slow system."
The problem doesn't 'appear' to exist, therefore it is not really a problem (until, of course, the app is expanded and more and more users are concurrent!)
Which is why developers ultimately get away with using Statement...it works in enough cases to make it seem viable.
Bind Variable are Mandatory to Fight SQL Injection
Security DBA, August 21, 2011 - 11:54 pm UTC
Tom repeatedly states in this article (over a number of years) that Bind Variables (Prepared Statements) are the way to go. As a former DBA who has been dealing with Information Security for nearly a decade, I can't emphasize enough how right he is, and how critical this thread remains today (nearly a decade after it was started). Bind Variables stop SQL Injection security attacks, period. Input validation is error-prone and frequently ineffective (for example, did you block every possible representation of a single quote?). Prepared Statements also need to be used in database code (stored procedures, user-defined functions, packages) to prevent SQL Injection vulnerabilities. You don't want to be the one who didn't use Prepared Statements and whose code was compromised by a SQL Injection attack that led to the compromise of your employer's or customer's database. The CEO, Board of Directors, etc. won't understand and won't care why your code was insecure when they are faced with a public breach of their system.
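The point is easy to demonstrate without touching a database at all. A sketch (the users table and NAME column are invented for illustration):

```java
// How concatenation turns user input into SQL text, while a bind placeholder
// keeps it as data. The table and column names here are made up.
public class InjectionSketch {
    // The Statement style: input is glued into the statement text.
    static String concatenated(String input) {
        return "select * from users where name = '" + input + "'";
    }
    // The PreparedStatement style: the text is constant, the input is bound.
    static final String BOUND_SQL = "select * from users where name = ?";

    public static void main(String[] args) {
        System.out.println(concatenated("SCOTT"));
        // select * from users where name = 'SCOTT'

        System.out.println(concatenated("x' or '1'='1"));
        // select * from users where name = 'x' or '1'='1'
        // The attacker has rewritten the WHERE clause: every row now matches.

        System.out.println(BOUND_SQL);
        // With a bind variable the same attack string is simply compared,
        // as plain data, against the NAME column; the text never changes.
    }
}
```

No amount of after-the-fact input filtering is as robust as never letting the input become SQL text in the first place.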
August 30, 2011 - 2:03 pm UTC
I've given up trying to convince developers to use bind variables anymore.
I just talk to their managers and explain how their code is subject to SQL injection if they do not - and sql injection is the #1 way sites are compromised these days.
Fear works wonders.
Gayatri, November 11, 2024 - 5:46 pm UTC
Analyze the performance impact of using Statement versus PreparedStatement in a
large-scale JDBC application. Write a program that compares their execution times.
November 12, 2024 - 11:32 am UTC
We're not a code writing service - if you want help you'll need to show us what you've tried.
On Gayatri's request
Stew Ashton, November 13, 2024 - 9:18 am UTC
When comparing the performance of Statement versus PreparedStatement in JDBC, two things come to mind:
1) The very first time a PreparedStatement object is created, a fair bit of code has to be loaded, so comparing just one access of each kind will "prove" that PreparedStatements are much slower.
2) PreparedStatement performance depends on what software manages the connection pool and the PreparedStatement caches and what the proper configurations are. These configurations are often done by admins who are not experts in data access.
PreparedStatements give a performance boost when the PreparedStatement is in the cache already. Each connection has its own cache. If a connection is closed, its cache disappears.
If there is a firewall between the JDBC client and the database, the firewall may not allow a connection to be reused if it has been idle for a while. Again, firewall configurations and connection pool configurations may not be well coordinated, because the admins are not the same and may not understand data access requirements.
As a result, any meaningful performance tests would need to be done using the specific full stack of Gayatri's shop, ideally with a realistic workload. The result would probably be the discovery that various configurations could be much improved.
Best regards,
Stew
November 14, 2024 - 1:49 pm UTC
Good points Stew.