Skip to Main Content
  • Questions
  • Cursor sharing, bind variables, histograms and data warehousing

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Andre.

Asked: December 29, 2000 - 11:00 pm UTC

Last updated: December 16, 2004 - 11:45 am UTC

Version: 8.1.6

Viewed 1000+ times

You Asked

I was reading about cursor sharing in Oracle8i, and Oracle recommends that -- in a Data Warehousing environment -- we avoid using bind variables, because the CBO may make a wrong assumption when deciding on the selectivity of a column.

Questions:

1) When the statement is issued and it finds an a already parsed version in the shared pool -- may the excution plan be re-generated depending on the values of the bind variables and the literals in a where clause (for example, when based on a histogram an index search occurs instead of a FTS) ? Or would the statement have to be entirely reparsed to generate a new execution plan ? Would you please explain what happens in fact ?

2) What is the difference between a "reparse", a "soft parse" and a "hard parse" in this context ?

3) In what step are the bind variables replaced with their assigned values ? Wouldn't a statement with assigned values to its bind variables have the same effect as literals, since in both cases there are values on which the CBO could be based to used a histogram to make the most appropriate execution plan ? Or does using bind variables makes histograms not be used by definition ?

4) Are histograms generated by all ANALYZE options, or only by the FOR COLUMNS clause ?

5) Does DBMS_STATS have an option for genarating histograms ?

6) Does DBMS_STATS use the same data dictionary tables and columns to store the results as the ANALYZE command ?


Thanks !


and Tom said...

I would like to clarify the "generalization" that you should not use bind variables in a data warehouse. I'll try to be very exact. I wish they hadn't printed that "rule of thumb".

You should avoid a bind variable in the very unique situation where you have

o a column that you have generated histograms for via the analyze command AND

o it is known by you that the values in the column are very skewed AND

o people frequently query on that column AND

o you desire different plans to be generate for different values.

For example, the column in question contains the numbers 1 .. 100. 0.01% of the rows have a value of 1, 0.01% have a value of 2 (and so up upto 99). 99.01% of the rows have a value of 100 in this column. In this very special case, you would like the optimizer to use an index if the predicate contains "where that_column = some_number" and some_number is between 1 and 99. You DO NOT want the optimizer to use an index if some_number is 100 as scanning the index would be painfully slow in this case. If you just submit a query:

select * from that_table where some_column = :x

the optimizer has to guess and would either generate a plan that used an index or not -- making this plan optimial only for queries with x between 1 and 99 OR x = 100 but not both. In this VERY special case, you would attempt to avoid the use of bind variables.

Short of this very very special case, you want to use bind variables. This very very special case is not limited to a data warehouse -- it can happen in any database but it might be more common to a data warehouse. Also -- if your data warehouse runs tons of queries -- you might consider using bind variables even in this case for the queries in the range x between 1 and 99 and having the special case "100" be different to avoid chewing up space in your shared pool.


Now, to answer the questions...

1) no, the whole concept of the shared pool is to avoid the regeneration of query plans -- we "shared" them, many people are concurrently executing them. If we changed the plan based on the bind variables -- we could not be sharing a single image of the plan, we would each need our own.

You need an entirely new statement (or different environment with the same statement, like if you submit the query from a session with optimizer goal set to all_rows and someone else submits it with the optimizer goal set to first_rows, you might get 2 copies of the statement in the shared pool with different plans).

2) a hard parse is a parse of a query that was not found in the shared pool. We have to parse, qualify (figure out the emp means scott.emp), authorize (can you select from scott.emp), and finally optimize.

a soft parse is when we find the query in the shared pool. We hash the sql statement, search by the hash, verify the query applies to us and reuse it. We skip many of the steps above as they have already been done for us.

a reparse I guess is when you parse the same statement over and over. It is wasteful in a single program and could be avoided. For example, instead of coding:

for i in 1 .. 10 loop
execute immediate 'insert into t values ( :x)' using i;
end loop;

you could use dbms_sql to:

open_cursor
parse the insert
for i in 1 .. 10 loop
bind i
execute the statement
end loop
close the cursor

the first loop would have upto 10 soft parses. The second has 1.

3) at the bind step. when we goto execute. It happens AFTER optimization. Histograms can be used but their value is limited with bind variables.

4) only when you ask for it with the for columns.

5) see the method_opt parameter to the GATHER_* routines. You can specify this yes.

6) dbms_stats can be used to populate the data dictionary tables used by the CBO or it can populate tables outside of the data dictionary (to be transfered into the data dictionary later)

Rating

  (5 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

one question on histograms

A reader, July 02, 2003 - 8:57 pm UTC

I understand your lucid explanation for the case
against using bind variables. Could you please explain
what you meant by the following statement?
"Histograms can be used but their value is limited with bind variables."

Thanx!



Tom Kyte
July 03, 2003 - 8:54 am UTC

when using bind variables:

select * from t where x = :y;


the optimizer will use bind variable "peeking" in 9i the first time the query is hard parsed (it'll peek at the binds and optimize the query as if the literal value was there). The plan you get depends on the bind variable first used -- the histograms won't be used afterwards. they are of "limited use".

in 8i, it'll use "heuristics" to make assumptions about how much data would be returned. the histograms are of very limited use here -- since the optimizer makes no assumption as to the value of :y


if you use

select * from t where x = 1;
....where x = 2;
and so on...

then histograms come into play all of the time.

On the Topic of Bind Variables

Brian, July 03, 2003 - 11:52 am UTC

On the topic of when to use them:

I have read through your book, and I understand that I should use bind variables. But I'm not clear about two things:

1) What the syntax is exactly? I realize that in sql*plus
you can type in variable x number, and it's a bind
variable. Then, the x shows up as a :x. But what if
you're writing a procedure? How do you bind the
parameters that will later be used in a where
statement? Or are they already bound? And that leads
me to:

2) When do bind variables already exist implicitly (without
declaration or special syntax)?

I find the Oracle documentation around bind variables to not be very clear on the above.

Thanks


Tom Kyte
July 03, 2003 - 7:59 pm UTC

1) that is the pure and sheer beauty of plsql -- all variables in SQL in PLSQL are in fact binds:

create procedure foo( p_input in varchar2 )
is
begin
for x in ( select * from t where column = p_input )
loop
....


p_input *is* a bind...

2) correct -- PLSQL is so tightly coupled with SQL that the distinction blurs.

How do I make this work

sonali, January 20, 2004 - 3:22 pm UTC

As I want to see the values returned by the select statement I am doing into but as because there are multiple rows returned by this query I get this error.. How do i write this query then ?

Server Output ON
SQLWKS> DECLARE
2> Category NUMBER := 1669;
3> Legal_type NUMBER := 3;
4> Legal_locale varchar2(10):= 'en-us';
5> p_Legal_text varchar2 (25);
6> BEGIN
7>
8>
9>
10> EXECUTE IMMEDIATE 'select legal_text from mweblegal where legal_category = :bind_category
11> And legal_type = :bind_type and legal_locale = :bind_locale' into p_Legal_text
12> using category, legal_type, legal_locale ;
13> dbms_output.put_line('legal_text '|| p_Legal_text);
14>
15> END;
16>
17>
ORA-01422: exact fetch returns more than requested number of rows
ORA-06512: at line 10


Tom Kyte
January 20, 2004 - 3:45 pm UTC

declare
....
type rc is ref cursor;
l_cursor is rc;
begin
open l_cursor for 'select ....' using bind1, bind2;

loop
fetch l_cursor into .......;
exit when l_cursor%notfound;
....
end loop;
close l_cursor;
end;
/

which are the advantages of dind variables in a DWH environnement?

Maurice, April 23, 2004 - 8:46 am UTC

Hi Tom,

In a DWH environnement the probability that a same query is exectued twice within a short amount of time is really really small. Also most DWH are not perfectly optimized and it often happen that a query takes from several minutes to several hours. So even if the optimzer has to reparse twice a same statment, the lost time, related to the total time of the query, is insignificant. So I just don't see, in a DWH environnement, any advantages of using bind variables.

However I noticed 2 disadvantages (on 9iR2):
- with bind variable the CBO dosent make a as precise estimation as when using litterales
- the optimizer will not do a query rewrite if a MV contains litterals which are replaced by bind variables in the query.

Conclusion:
In a DWH environnement I see no advantages of using bind variables but I noticed 2 disadvantages. So why should I use bind variables?

Regards
Maurice

Tom Kyte
April 23, 2004 - 1:30 pm UTC

that is exactly what I've written before -- in fact, in a DW, certain optimizations (star transformation for example) are not available if you bind.

so, ad-hoc queries in DW's -- don't necessarily bind them.

Binds w/ CONTAINS

Andy, December 16, 2004 - 11:07 am UTC

Hi Tom,

Assuming that my data warehouse doesn't fall into the category you described in your original answer (i.e. I DO want to use binds with my CONTAINS queries), how can I do that in such a way as not to "hide" what operators have been used?

For example:

mires@WS2TEST> var x varchar2(20)
mires@WS2TEST>
mires@WS2TEST> select count(*) from fulltext where contains(akteinh, 'NEAR((test, entry))', 1) > 0

COUNT(*)
----------
2

mires@WS2TEST> begin :x := 'NEAR((test, entry))'; end;
2 /

PL/SQL-Prozedur wurde erfolgreich abgeschlossen.

mires@WS2TEST> select count(*) from fulltext where contains(akteinh, :x, 1) > 0;

COUNT(*)
----------
2

More than just the variable is bound - the NEAR operator is as well. Is there any way to bind just the variable, and not other information that I may want in my trace file?

Tom Kyte
December 16, 2004 - 11:45 am UTC

no, because the query to text is itself just a string.

suppose you could this tho:


ops$tkyte@ORA9IR2> create table t ( x varchar2(50) );
 
Table created.
 
ops$tkyte@ORA9IR2> insert into t values ( 'how now brown cow' );
 
1 row created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create index t_idx on t(x) indextype is ctxsys.context;
 
Index created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> variable x1 varchar2(20)
ops$tkyte@ORA9IR2> variable x2 varchar2(20)
ops$tkyte@ORA9IR2> exec :x1 := 'how'; :x2 := 'now'
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select *
  2   from t
  3   where contains( x, 'near((' || :x1 || ',' || :x2 || '))', 1 ) > 0;
 
X
--------------------------------------------------
how now brown cow
 

More to Explore

Performance

Get all the information about database performance in the Database Performance guide.