A reader, May 30, 2003 - 9:31 am UTC
It pays to cache PreparedStatements
Peter Tran, May 30, 2003 - 10:36 am UTC
We have the same issue. We wrote a test program to show the performance impact of having to do hard parse vs soft parse vs parse once in JAVA.
Here's the result.
# records | 64   | 128  | 256  | 512  | 1024 | 2048 | 4096  | 8192  | 16384
---------------------------------------------------------------------------
Cached PS | 0.30 | 0.21 | 0.38 | 0.58 | 1.09 | 2.08 | 4.20  | 8.51  | 16.69
PS        | 0.40 | 0.65 | 1.16 | 2.03 | 3.54 | 6.62 | 13.39 | 26.65 | 58.16
Statement | 0.40 | 0.67 | 1.29 | 2.44 | 4.51 | 9.04 | 18.19 | 35.70 | 74.47
---------------------------------------------------------------------------
(I would print more columns, but they didn't fit very well. If you graph the numbers above, you'll see that the Statement (hard parse) shows almost exponential runtime growth compared to the cached PreparedStatement (parse once) as more records are processed. Using a PreparedStatement and closing it each time (soft parse) isn't as bad as using a Statement, but it's still pretty expensive.)
The test itself was pretty simple.
* Create a test table inserting 100,000 rows of data.
* Retrieve N number of records for each iteration
and call the 3 test methods below.
* Create three test methods
* Statement using literal (hard parse)
* PreparedStatement using bind variable (soft parse)
* Cached PreparedStatement (parse once)
The first two test methods:
* Gets a connection
* Creates a Statement or PreparedStatement
* Executes a query against test table
* Closes the PS
* Closes the connection (puts it back into the pool)
The third test method:
* Gets a cached connection
* Gets the cached PreparedStatement associated with
the above connection
* Executes a query against the test table
* Put PS back into cache
* Put connection back into cache (not into pool)
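The essential difference between the three methods is the SQL text the driver sends. A minimal sketch of that difference (the table and column names are made up for illustration): a Statement embeds each literal value, so every record produces a distinct SQL text the database must hard parse, while a PreparedStatement sends one unchanging text with a bind placeholder.

```java
// Hypothetical illustration of the SQL text each test method generates.
class SqlTextDemo {
    // Statement with a literal: a new SQL text for every value (hard parse each time).
    static String literalSql(int id) {
        return "SELECT * FROM test_table WHERE id = " + id;
    }

    // PreparedStatement with a bind variable: the text never changes
    // (parsed once, executed many times with different bind values).
    static final String BIND_SQL = "SELECT * FROM test_table WHERE id = ?";

    public static void main(String[] args) {
        System.out.println(literalSql(1)); // distinct text
        System.out.println(literalSql(2)); // another distinct text
        System.out.println(BIND_SQL);      // one shared text
    }
}
```

The cached-PS variant sends the same bind-variable text but also skips re-preparing it on every call, which is where the "parse once" column in the table comes from.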
The PreparedStatement using bind variable is currently how we code it in our application, because we too use connection pooling. With connection pooling, JAVA developers are taught to get the connection whenever they need it (cheap), create the PreparedStatement, use it, and then close all resources.
In our applications, we have CORBA servers that do a lot of batch processing, but they get their units of work in spurts. Within each unit of work, the server batches up and reuses the PS as much as possible, but then it closes all open resources before returning to the client. So, if a client makes 1,000,000 requests to the server, sending 1,000 records to process each time, the server is still doing 1,000,000 soft parses.
What we didn't see is that doing a lot of soft parsing can really add up quickly and impact performance.
As Tom says, with connection pooling you have to "roll" your own caching scheme.
We did implement our own caching scheme, but here's some issues you should be aware of if you're going to do the same thing.
1) A cached PreparedStatement (PS) is tied to a connection, so you have to also cache the connection. That means you need a way to get the same connection every time from the connection pool. If your application is using Global Temporary Tables (GTT), then you cannot share the connections.
2) Caching the connection and PreparedStatement reduces the number of available cursors. The limit is set by the open_cursors parameter. If you exceed this value, then you will get "ORA-01000: maximum open cursors exceeded." Thus, you may need to limit what gets cached to SQL that is executed a lot and is really expensive to parse. (You'll need to profile your code and study Statspack to find these SQLs.)
3) Your caching scheme needs to handle cleaning up all open resources (PS, connection). A couple of options are: have the client invoke a cleanup task at the end of the processing, or have a daemon thread run in the background and auto-close the PS and connections (e.g. GC thread in the JVM).
4) If your caching scheme requires you to write wrapper classes that implement the java.sql.* interfaces, then you'll have issues trying to support both JDK 1.3 and JDK 1.4 at the same time, because the interfaces changed in JDK 1.4 to support JDBC 3.0.
5) If your application is multi-threaded, then you need to handle the situation where two threads may ask for the same connection and the same PreparedStatement.
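Points 2 and 3 above can be sketched as a capacity-bounded LRU cache that physically closes whatever it evicts. This is a hypothetical illustration, not Peter's actual code; the generic parameter stands in for java.sql.PreparedStatement so the logic can be shown without a live database:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of a per-connection statement cache with a cursor budget.
// The cap keeps the session under open_cursors; the LRU entry is
// physically closed when the cap is exceeded.
class StatementCache<S extends AutoCloseable> {
    private final LinkedHashMap<String, S> cache;

    StatementCache(final int maxOpen) {
        // accessOrder = true makes iteration order least-recently-used first.
        this.cache = new LinkedHashMap<String, S>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, S> eldest) {
                if (size() > maxOpen) {
                    try { eldest.getValue().close(); } catch (Exception ignored) {}
                    return true; // evict to stay under the cursor budget
                }
                return false;
            }
        };
    }

    S get(String sql) { return cache.get(sql); }
    void put(String sql, S stmt) { cache.put(sql, stmt); }
    int size() { return cache.size(); }

    // Cleanup task (issue 3): close everything when the client exits.
    void closeAll() {
        for (S s : cache.values()) {
            try { s.close(); } catch (Exception ignored) {}
        }
        cache.clear();
    }
}
```

A real implementation would key the cache per connection and would still need the profiling step from point 2 to decide which SQL is worth a slot.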
May 30, 2003 - 10:51 am UTC
to make that test more interesting (i have a very very similar benchmark I use and will be publishing) make it multiuser!
not only are statements slower in single user, they will KILL YOU -- seriously dead -- in multi-user environment.
thanks for the independent outside confirmation.
multi-thread the test
Peter Tran, May 30, 2003 - 1:05 pm UTC
I need to validate that the wrapper classes can handle multiple threads.
Using multiple threads isn't a true multi-user environment (from the CORBA server perspective), but it's pretty close to simulating one. To the database, the multiple threads will look like multiple users.
I'll publish updated results when I get them.
caching prepared callable statements in java
R Menon, May 30, 2003 - 4:40 pm UTC
Thanks for the responses, Tom and Peter Tran - appreciate it!
Peter, I would be curious to see the updated results of
the multithreaded version of tests.
Tom, I had a quick related question. Till now I associated cursors with ResultSets. Now I see that a cursor is just a state - it could be associated with a ResultSet or with a statement itself (the statement's state would consist of anything that can be set using the "statement.set" methods). Do you agree, or do you want to comment on this?
On subsequent investigation I found that with JDBC 3.0 you can now cache statements using JDBC features. If you use implicit caching, then regardless of whether you do connection pooling or not, your statements get cached by the JDBC driver (apparently there is a cache per physical connection). Not sure about the implications for the max-open-cursors problem.
Here is a segment from Oracle's JDBC book.
"Statement caching improves performance by caching executable statements that are
used repeatedly, such as in a loop or in a method that is called repeatedly. JDBC 3.0
defines a statement-caching interface.
Statement caching can:
- Prevent the overhead of repeated cursor creation
- Prevent repeated statement parsing and creation"
More interesting details are in the jdbc doc at
http://download-west.oracle.com/docs/cd/B10501_01/java.920/a96654/toc.htm
In particular see chapter 14
May 31, 2003 - 11:40 am UTC
a cursor is just a pointer to some sql statement -- be it an insert/update/delete/merge/plsql block whatever.
we have many tools and techniques to allow developers to be lazy (session cached cursors for example, cursor sharing, etc) however, none of them can ever approach the performance you would get by doing it right in the first place.
You want fast, do it right.
It pays to cache PreparedStatements
J. Smucr, June 04, 2003 - 3:43 am UTC
Excellent thread.
Peter Tran from Houston, TX USA, wrote:
<quote>
1) A cached PreparedStatement (PS) ... If your application is using Global Temporary Tables (GTT), then you cannot share the connections.
</quote>
Could you please tell us more about this?
Many thanks.
June 04, 2003 - 8:01 am UTC
if they are "ON COMMIT DELETE ROWS" i don't see the relevance of that statement either.
Sharing a connection and GTT
Peter Tran, June 04, 2003 - 9:21 am UTC
Tom,
Correct me if I'm wrong, but if two threads share the same connection and get the same cached PreparedStatement, then won't you run into problems because the two threads will start seeing each other's data?
I thought GTT requires each thread to have its own session.
-Peter
June 04, 2003 - 10:03 am UTC
when using connection pooling, it is imperative that you commit or rollback before giving the connection back (end the transaction) -- else the next guy is in the middle of your mess.
So, you always commit/rollback.
If you use "ON COMMIT DELETE ROWS", the gtt's empty themselves transactionally using the equivalent of "truncate".
You'll not see anyone's data.
Caching connection
Peter Tran, June 04, 2003 - 10:31 am UTC
Tom,
My comments were in regards to my caching scheme.
A cached PreparedStatement is associated with a connection, which requires the connection to also be cached. In my caching implementation, I cache both the connection and all cached PreparedStatements associated with that connection. Thus, when I get a connection from the pool, I don't give it back to the pool until the client completely exits.
The cached connection is stored in a java.util.Map. To get the connection, you have to provide a key name. The value of the map is a java.util.Stack of cached connections. I did it this way to support the ability for multiple threads requesting the same key name to get its own connection.
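The map-of-stacks scheme described above might look roughly like this (class and method names are invented for illustration; the connection type is left generic so the structure can be shown without a database). Each key maps to a stack of connections, so two threads asking for the same key pop different connections and never share a cached PreparedStatement:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Sketch of a keyed connection cache: key name -> stack of connections.
class ConnectionCache<C> {
    private final Map<String, Deque<C>> cache = new HashMap<>();

    // Pop a connection for this key, or null if none is cached.
    synchronized C checkOut(String key) {
        Deque<C> stack = cache.get(key);
        return (stack == null || stack.isEmpty()) ? null : stack.pop();
    }

    // Return the connection to the stack for this key (not to the pool).
    synchronized void checkIn(String key, C conn) {
        cache.computeIfAbsent(key, k -> new ArrayDeque<>()).push(conn);
    }
}
```

On a null checkOut, the caller would fetch a fresh connection from the real pool and later check it in here instead of returning it.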
-Peter
caching prepared callable statements in java
A reader, June 05, 2003 - 10:00 am UTC
This question is for Peter and Tom.
Why would you still need your own caching scheme, since JDBC (at least JDBC 2.0 and above) seems to provide its own per-connection statement caching scheme (as per my note earlier)? Isn't that the best way to solve this problem?
Repeating the link below for your convenience.
http://download-west.oracle.com/docs/cd/B10501_01/java.920/a96654/toc.htm
In particular see chapter 14
June 05, 2003 - 1:16 pm UTC
as long as tkprof says "it was parsed once", i would be 100% for it.
Why we don't use ORACLE built-in cache
Peter Tran, June 06, 2003 - 9:31 am UTC
R Menon,
Here's the reason why we implemented our own caching scheme.
1) We can use any JDBC driver with our own caching scheme. We're not dependent on ORACLE. Not all JDBC implementations have a built-in caching scheme.
2) We didn't want to cast our classes to an ORACLE implementation. We wanted our JDBC access to be database agnostic. (Great for portability, sucks for performance.)
3) A cached PreparedStatement is associated with a connection. With connection pooling, you're not guaranteed to get the same connection. Therefore, even if you went with ORACLE's built-in cache, you'll need some way to hold the same connection.
[quote from your link]
Because each physical connection has its own cache, multiple caches can exist if you enable statement caching for multiple physical connections. When you enable statement caching on a pooled connection, all the logical connections will use the same cache. If you try to enable statement caching on a logical connection of a pooled connection, this will throw an exception.
[/quote]
From the above quote [Tom - correct me if I'm wrong], this will break using GTT if two threads get the same logical connection and access the same cached Statement. You can't control this if you use all the ORACLE built-in caching features.
4) There are other things we can do with the wrapper classes that are outside the cache. E.g., when an exception is thrown while using a PreparedStatement, typically you don't get any information about the actual SQL string. We can trap this and log more information, etc.
5) I haven't tested the ORACLE 9.2.0.3 driver implementation Statement caching, but the 9.0.1 implementation had some serious bugs.
6) Not sure how the ORACLE implementation is going to handle the issue when all available cursors have been exhausted because there are so many cached Statements.
-Peter
June 06, 2003 - 9:57 am UTC
1) that is not a benefit to a database guy (any database guy). guess what happens if you take the data away? you've got a bunch of bits that do nothing called code. guess what happens if you take the code away? you have all of your data -- you can start over if you like.
2) pay attention to that last part.
3) no, you use gtt's with ON COMMIT DELETE ROWS. you cannot -- repeat -- cannot have transactions shared by more than one person, so a connection grab will grab a connection -- do a transaction and either commit or rollback. you would not use SESSION based gtts
Comment on Tom's Followup:
Peter Tran, June 06, 2003 - 10:54 am UTC
1) If you take the data away, then it's a moot point whether I use Oracle's implementation or my own caching scheme - I wouldn't need to access the data because there is none. The wrappers are written so they implement the java.sql.* APIs, so technically my application code using the wrapper classes doesn't have to change. I would only have to disable my own caching scheme and use the native driver version.
We have to evaluate the entire system's architecture when we make a decision to use a tool or go with a certain design and weigh portability, maintainability, performance, etc. - there are trade-offs. In this case, the trade off was to do a little more work on our side to maintain the portability benefit.
R Menon - It is only the JDBC 3.0 specification that mentions anything about Statement caching, and the implementation is still left up to the driver provider. You'll find that not all JDBC drivers have implemented the JDBC 3.0 specification, so if you decide to use a non-ORACLE JDBC driver, you may have to provide your own caching implementation.
2) Trust me, I know. Generic implementations are great (sometimes), but more often than not, performance takes a hit.
3) I understand what you're saying. In other words, if the application programmer decides to create multiple threads and share the same logical connection between these threads, then the programmer is doing something stupid and therefore deserves all the headache that comes along with doing something stupid.
Thanks,
-Peter
Comment on Peter's comments titled "Why we don't use ORACLE built-in cache"
R Menon, June 08, 2003 - 6:47 pm UTC
In my case, 1 and 2 are not a concern. Also, I am a little skeptical of sacrificing performance for portability, since if you are switching databases then maybe you have even more important things to worry about - but mostly you don't change your database often. What I am saying is that many of us end up sacrificing performance for nothing (I am not saying that is the case with you, Peter).
Regarding the third point, I did not quite understand it - maybe I did not understand the quote itself :)
Are you saying that if an application is using GTT, then the connection would have to be saved between calls? I thought the whole point is to not worry about saving connections, since each connection would have its own cache (which has to be the case in any implementation, since you cannot share statements across connections anyway).
I am also curious about the point 6 you raise, about how Oracle deals with the fact that it may have lots of cursors open due to the caching scheme. I suspect it deals with it via typical cache management, where the cache size itself is lower than the max cursors limit and LRU is used to recycle cache entries?
Reply to R Menon
Peter Tran, June 10, 2003 - 11:15 pm UTC
You don't have to switch databases to have problems. Just switching driver implementations will bring its own set of headaches. E.g., if you use the ORACLE implementation for caching, then either your variable handles will be Oracle classes, or you will need to cast to an Oracle implementation inside your classes. Now imagine changing from the Oracle driver to, say, the Merant driver (another JDBC implementation for ORACLE). How many different places in your code will you have to refactor?
With my wrapper, I only have to do it in a couple of classes.
As to the third point, the problem I brought up shouldn't be an issue if the GTT was created with ON COMMIT DELETE ROWS as Tom points out. A connection (even if it's cached) shouldn't be shared across different threads, so each thread should commit before releasing the connection back to the cache.
Oracle's implementation allows the client to control the maximum number of PreparedStatements that are cached, but it doesn't do anything if the client sets the limit to a high value and exhausts all available cursors.
Unless you want to do a lot of bookkeeping, there's very little you can do if the client goes berserk and starts to cache everything.
reply to peter Tran
R. Menon, June 11, 2003 - 3:27 pm UTC
Thanks, Peter.
You are right - the driver change could itself be a problem. I have worked only with Oracle's driver! Anyway, in my case I don't expect even the driver to change.
Btw, does a driver change mean that
Oracle specific extensions provided by Oracle's
JDBC driver (that could improve performance considerably)
will go away? I hope not!
The cache-size problem you mention may not really be an issue - it is up to the client to set the cache size properly, and managing that in itself should not be a big deal.
Reply to R Menon
Peter Tran, June 11, 2003 - 9:12 pm UTC
[quote]
Anyways, in my case I don't expect even the driver to change.
[/quote]
I didn't either. You never know what's going to happen, but I have to admit Oracle's JDBC implementations have gotten better. The earlier implementations (e.g. 9.0.1) were notoriously buggy.
[quote]
Btw, does a driver change mean that Oracle specific extensions provided by Oracle's JDBC driver (that could improve performance considerably) will go away? I hope not!
[/quote]
Yep. There's no guarantee that the other driver implementation will support the same extensions. Why would they have to?
[quote]
The problem of the cache size that you mention may not be really an issue - it is upto the client to set the cache size properly and managing that in itself should not be a big deal.
[/quote]
Imagine a really large client/server application where the servers are CORBA services written by different developers. Each developer is going to try to get the best performance out of his/her component, so each will probably cache as many PreparedStatements as possible. Now you have N CORBA servers x M CORBA services x O PreparedStatements, and you can exhaust the database resources really quickly.
reply to Peter Tran
R Menon, June 19, 2003 - 11:18 am UTC
Hi Peter
"Yep. There's no guarantee that the other driver implementation will support the same extensions. Why would they have to?"
They don't - it is just that since, in my case, I use many of the performance-enhancing extensions that are Oracle specific, I was thinking that would be a disadvantage. But again, the assumption (in my case) is that the driver won't change.
"Imagine a really large client/server application where the servers are CORBA services written by different developers. Each developer is going to try to get
the best performance out of his/her component, so will probably cache as many PreparedStatements as possible. Now you have a N CORBA servers x M CORBA services x O PreparedStatements, and you can exhaust the database resources really quick."
I get that - I guess there needs to be central control over such cases. It would be more of a guideline, unfortunately - something that says each developer cannot use more than "x" entries in the prepared statement cache. Not very elegant, though.
Thanx!
Native dynamic query under PL/SQL and avoiding soft parse
Piotr Jarmuz, November 04, 2003 - 1:31 pm UTC
Hello Tom,
First, sorry for not being 100% thread compatible.
In the system that I am building I need to use dynamic queries with predicates glued together at runtime with object views. The queries are built from object views and so-called selectors. The selectors can be nicely combined with OR and AND, and are automagically inherited: if they are defined and attached to some base class, then any derived class sees them right away. Selectors are associated with indexes if indexing makes sense for them. There could potentially be hundreds of thousands of these queries, so it is not practical to code them all statically, but a specific application would use only a small fraction of that potential - fewer than 1000, definitely.

This gluing is part of the business logic and polymorphism, and is placed in PL/SQL packaged procedures. That, however, causes the query to be soft parsed each time it is executed. I can't use the dbms_sql package because I need to return ref cursors to clients, and as far as I know there is currently no way to do that in PL/SQL.

So I plan to use a little trick: the stored procedure would not return the ref cursor, but rather the query string with embedded bind variables, plus information about how many parameters it needs. The client library knows the parameters for the binds. Then, at the client side, buried in library code, I would use a custom statement cache in JDBC (or the built-in ones in OCI and Perl DBI) to avoid the soft parse altogether.

I see no performance issues at all in the current implementation, but I'd like to squeeze every possible cycle out of the CPU. Another reason is Perl (our primary rapid-development language of choice) and the DBD driver's problem with ref cursor performance: it does not prefetch on such cursors, so they are an order of magnitude slower than a straight select, and this trick would also fix that. I'd still control all possible SQL in one centralized place, so no problem hunting down the culprit clients for bad SQL.
Could you please comment on that? Is it worth the candle?
It would be nice if native dynamic execute could also use a statement handle cache per DB session.
Thank you in advance.
Regards,
Piotr
November 04, 2003 - 5:23 pm UTC
just use session_cached_cursors=100 and let us do a softer soft parse.
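session_cached_cursors is an ordinary init parameter; a sketch of how it is typically set (100 is just the value Tom suggests, tune it to your workload):

```sql
-- per session:
ALTER SESSION SET session_cached_cursors = 100;

-- or instance-wide, in the init.ora / spfile:
-- session_cached_cursors = 100
```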
CData Datatype in Oracle
Prasanna, September 14, 2005 - 5:42 am UTC
Hi Tom,
I'm Prasanna. I need to create a table with query_string as one of the columns; the datatype I want to give it is VARCHAR2(4000).
In Java (the front end) I need to convert it to CDATA. I didn't find any CDATA datatype in Oracle. Do you know of a datatype similar to CDATA in Oracle?
Is it possible to convert VARCHAR2(4000) to CDATA in Java? If yes, can you explain in detail?
September 14, 2005 - 8:38 am UTC
CDATA is an "xml thing"? I don't get the tie-in to a java datatype?
practical way to cache prepared statements?
Ryan, October 18, 2005 - 3:43 pm UTC
Has anyone you know come up with a practical way to cache prepared statements? We have a background process that runs with as many as 10 threads that all essentially use the same prepared statements over and over again. They process 1 record at a time based on rules.
When one record is done, the thread goes back to the queue to grab the next record. When the queue is empty the thread ends.
Every 10 minutes or so, the main queue process wakes up and grabs some more records, and then the 10 threads are started all over again. We have decision points, so some records may require certain queries, etc. However, overall they are processing a series of about 200 queries total.
All basically the same queries. We get 1 execution per query now.
October 18, 2005 - 5:26 pm UTC
Is session_cached_cursors effective for Pro*C ?
Jan, November 06, 2005 - 11:59 am UTC
Hi Tom,
just to be sure if I understood session_cached_cursors parameter correctly:
1) Is session_cached_cursors effective for both PL/SQL and Pro*C, if "reusable" SQL statement is used (bind variables)?
2) Assuming 1) is true, how do session_cached_cursors and HOLD_CURSOR+RELEASE_CURSOR work together?
If Pro*C program is holding its cursors open (cached on the client side), there should be no need to cache the cursors on the server side?
Thank you.
Jan
November 06, 2005 - 12:37 pm UTC
1) session_cached_cursors works as documented for *any client*.
In plsql it has the added function of controlling the size of the cursor cache used by plsql in 9205 and above.
it works for all clients.
it does an extra thing for plsql in 9205 and above.
2) see #1 - pro*c has its own controls.
Stored procedures cached
Andrew, November 06, 2005 - 4:17 pm UTC
Hi Tom
I presume that there is not much point in caching stored procedure calls, since the SQL generated in them would automatically get cached, since it's server-based?
Thanks
November 06, 2005 - 5:55 pm UTC
there is only one good kind of parse - NO PARSE...
a client application that calls a stored procedure parses a sql (well, a plsql) statement, and that should be done as little as possible by the client (once is best)
How can i prevent Statements being closed automatically...
Amit Desai, January 19, 2006 - 1:39 pm UTC
Well, I tried to cache CallableStatements.
It works perfectly for the first call, but for subsequent calls it fails, because when it finishes the earlier call it closes the connection automatically, even though I am not calling cstmt.close().
Can someone please explain whether there is any way of achieving this?
Thanks in advance.
Thanks in advance.
January 19, 2006 - 1:57 pm UTC
without simple example, not a chance.
To Amit
Menon, January 19, 2006 - 4:30 pm UTC
A simple example showcasing the problem would be great, as Tom so elegantly puts it :)... I am assuming you are using implicit caching. By the very definition of (implicit) caching, you need to issue cstmt.close() for it to get cached (because that is when, internally, the JDBC driver caches the statement). Physical closing of the statement happens only when:
* the connection is closed, or
* the cache reaches its limit and the least recently used statement object needs to be closed to make way for the current statement, or
* you close a statement on which statement caching is disabled.
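A toy model of that lifecycle may make it concrete. This is emphatically not the Oracle driver, just a sketch of the three physical-close conditions listed above: close() returns the statement to a cache, and a physical close happens only on LRU eviction, on disabled caching, or when the whole connection closes.

```java
import java.util.LinkedHashMap;

// Toy model of implicit statement caching (illustration only).
class ImplicitCacheModel {
    static class Stmt {
        final String sql;
        boolean physicallyClosed = false;
        Stmt(String sql) { this.sql = sql; }
    }

    private final int cacheSize;
    private boolean cachingEnabled = true;
    // accessOrder = true: iteration order is least-recently-used first.
    private final LinkedHashMap<String, Stmt> cache = new LinkedHashMap<>(16, 0.75f, true);

    ImplicitCacheModel(int cacheSize) { this.cacheSize = cacheSize; }

    // "close" from the application's point of view: cache instead of closing.
    void close(Stmt s) {
        if (!cachingEnabled) { s.physicallyClosed = true; return; }
        cache.put(s.sql, s);                // cached, still physically open
        if (cache.size() > cacheSize) {     // LRU entry makes way
            Stmt lru = cache.remove(cache.keySet().iterator().next());
            lru.physicallyClosed = true;
        }
    }

    // Preparing the same SQL again reuses the cached, still-open statement.
    Stmt prepare(String sql) {
        Stmt cached = cache.remove(sql);
        return cached != null ? cached : new Stmt(sql);
    }

    // Closing the connection physically closes everything in the cache.
    void closeConnection() {
        for (Stmt s : cache.values()) s.physicallyClosed = true;
        cache.clear();
    }
}
```

This also shows why Amit's symptom points elsewhere: a cached close() never closes the connection, so something else in his code path must be doing that.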
implicit caching Question
pasko, January 20, 2006 - 4:09 am UTC
Hi Menon,
i had posted this somewhere else but it was not seen by you, i guess :)
I have been trying to use Implicit Statement Caching with the Oracle JDBC drivers, but it seems the feature is not working properly. For example:

import oracle.jdbc.OracleConnection;
import oracle.jdbc.pool.OracleDataSource;

OracleDataSource oracleDataSource = new OracleDataSource();
oracleDataSource.setUser( dbUser );
oracleDataSource.setPassword( dbPwd );
oracleDataSource.setURL( dbUrl );
OracleConnection conn = (OracleConnection) oracleDataSource.getConnection();

// Enable implicit statement cache
conn.setImplicitCachingEnabled( true );
System.out.println("Implicit Caching = " + conn.getImplicitCachingEnabled());

Here conn.getImplicitCachingEnabled() always returns false. Why is that? I expected it to return true.
January 20, 2006 - 10:23 am UTC
it was seen, and someone pointed out to you that you sort of need to close the statement in order to cache the statement.
In fact I responded to you there as well saying "umm, sort of need an example please"
To Pasko
Menon, January 20, 2006 - 10:16 am UTC
Hi Pasko,
You need to also set the statement cache size to an appropriate positive number ..
e.g.
conn.setStatementCacheSize(10);
----
import java.sql.SQLException;
import oracle.jdbc.OracleConnection;
import oracle.jdbc.OracleTypes;
import book.util.JDBCUtil;

class TestImplicitCaching
{
    public static void main(String args[]) throws SQLException
    {
        OracleConnection conn = null;
        try
        {
            // get connection
            conn = (OracleConnection) JDBCUtil.getConnection("benchmark", "benchmark", "ora92");
            System.out.println("implicit caching enabled (Default): " + conn.getImplicitCachingEnabled());
            System.out.println("cache size (default): " + conn.getStatementCacheSize());

            conn.setImplicitCachingEnabled( true );
            System.out.println("implicit caching enabled after setting enable flag: " +
                conn.getImplicitCachingEnabled());

            conn.setStatementCacheSize( 10 );
            System.out.println("implicit caching enabled after setting cache size: " +
                conn.getImplicitCachingEnabled());
            System.out.println("cache size: " +
                conn.getStatementCacheSize());
        }
        finally
        {
            // release resources associated with JDBC in the finally clause.
            JDBCUtil.close( conn );
        }
    }
}
----
When you run the above (JDBCUtil is a generic utility class used to get and close connection), you get
T:\jdbc>java TestImplicitCaching
URL:jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(PORT=1522)(HOST=rmeno
n-lap))(CONNECT_DATA=(SID=ora92)))
implicit caching enabled (Default): false
cache size (default): 0
implicit caching enabled after setting enable flag: false
implicit caching enabled after setting cache size: true
cache size: 10
wrong person?
Menon, January 20, 2006 - 10:35 am UTC
"In fact I responded to you there as well saying "umm, sort of need an example please"
Did you mean those comments to "Amit" not "Pasko" perhaps who posed that particular question ?
January 20, 2006 - 10:48 am UTC
could be, but if someone simply says
... i had posted this somewhere else but it was not seen by you, i guess :) ...
well, I don't know what to say (hard to tie it all together :)