Database, SQL and PL/SQL

On Sharing, Splitting, and Deleting

 

Our technologist shares cursors, splits large tables, and deletes securely.

By Tom Kyte

January/February 2006

 

I am a bit puzzled by something you have said: "If you use CURSOR_SHARING=SIMILAR, you might reduce the number of plans generated—then again, you might have exactly the same number of plans."

What are the factors that influence the number of plans generated? I thought that if CURSOR_SHARING was set to SIMILAR, the optimizer would replace all literals with :SYS_B_?—this is what we see in our database.

CURSOR_SHARING is a parameter Oracle Database uses to control whether it will "auto-bind" a SQL statement. Oracle Database can take a query of the form SELECT * FROM TABLE WHERE COL = 'literal' and replace the 'literal' with a bind value—so the predicate will become WHERE COL = :"SYS_B_0" . This permits the reuse of the generated query plan, perhaps leading to better utilization of the shared pool and a reduction of hard parses performed by the system. The CURSOR_SHARING parameter can have one of three values:

  • EXACT : This is the default setting. With this value in place, the query is not rewritten to use bind variables.

  • FORCE : This setting rewrites the query, replacing all literals with bind values and setting up a one-size-fits-all plan—a single plan for the rewritten query. I'll demonstrate what that implies in a moment.

  • SIMILAR : This setting also rewrites the query, replacing the literals with bind variables, but can set up different plans for different bind variable combinations. This last point is why CURSOR_SHARING=SIMILAR might reduce the number of plans generated. Because multiple plans may be generated, the setting of SIMILAR may or may not reduce the number of actual plans you observe in the shared pool.

Let's first look at what might happen if I execute the same set of queries with these three settings. The queries will simply be SELECT * FROM DUAL WHERE DUMMY = <something> , and I'll use 'A' and 'B' for <something> . Then I'll look in the shared pool via the V$SQL dynamic performance view and see how many cursors were set up for each query. Listing 1 sets the three CURSOR_SHARING values, runs SELECT queries, and looks at the content of the SQL_TEXT column in V$SQL to see the actual SQL used in the queries.

Code Listing 1: EXACT, FORCE, and SIMILAR for CURSOR_SHARING

SQL> alter session set cursor_sharing=exact;
Session altered.
SQL> select * from dual CS_EXACT where dummy = 'A';
no rows selected
SQL> select * from dual CS_EXACT where dummy = 'B';
no rows selected
SQL> alter session set cursor_sharing=force;
Session altered.
SQL> select * from dual CS_FORCE where dummy = 'A';
no rows selected
SQL> select * from dual CS_FORCE where dummy = 'B';
no rows selected
SQL> alter session set cursor_sharing=similar;
Session altered.
SQL> select * from dual CS_SIMILAR where dummy = 'A';
no rows selected
SQL> select * from dual CS_SIMILAR where dummy = 'B';
no rows selected
SQL> select sql_text
  2   from v$sql
  3   where sql_text like 'select * from dual CS% where dummy = %'
  4   order by sql_text;
SQL_TEXT
---------------------------------------------------------
select * from dual CS_EXACT where dummy = 'A'
select * from dual CS_EXACT where dummy = 'B'
select * from dual CS_FORCE where dummy = :"SYS_B_0"
select * from dual CS_SIMILAR where dummy = :"SYS_B_0"

As you can see in Listing 1, with CURSOR_SHARING=EXACT (the default), every unique SQL statement I submit will create a new entry in V$SQL , it will be hard-parsed, and an execution plan will be created just for it. There can be hundreds or thousands of very similar queries in the shared pool that differ only in the literals used in the SQL statement itself. This implies that the application itself is not using bind variables, and that implies that the database is forced to hard-parse virtually every query, which, in turn, not only consumes a lot of CPU cycles but also leads to decreased scalability. The database just cannot hard-parse hundreds or thousands of SQL statements concurrently—the application ends up waiting for the shared pool to become available. One of the major scalability inhibitors in the database is not using bind variables . That was the motivation behind adding CURSOR_SHARING=FORCE in Oracle8i Release 2 (8.1.6)—to help alleviate this performance and scalability inhibitor.

With CURSOR_SHARING=FORCE in place in Listing 1, the database generated only one shareable query in the shared pool—it replaced 'A' and 'B' with :"SYS_B_0" and made the cursor shareable by as many sessions as would need it. In general, just one query plan would be reused by all sessions. This would turn the hard parse into a soft parse, which would consume fewer resources and simultaneously increase the scalability of the system, by allowing for more concurrent work, because a soft parse needs to "latch" (use a certain type of lock on) the shared pool less than a hard parse.

However, looking at the example in Listing 1 might lead you to assume that the settings of FORCE and SIMILAR are the same—the results certainly seem that way right now, because both resulted in a single plan. So what is the difference between these two settings? I'll need another example to show that, but I can describe it first. When CURSOR_SHARING is set to SIMILAR , Oracle Database will replace all literals with bind variables, just as FORCE would, but the SIMILAR value does one other thing—it looks at each literal it replaces and asks, "Could different values for this bind variable lead, in turn, to different plans?" For example, if the predicate WHERE X=6 implies that I would want to use a full scan but the predicate WHERE X=5 implies that I would want to use an index range scan, the database would recognize that and set up different plans for me. In the case of different plans, you mark the bind variable as unsafe and add its value to the signature of the query, so to reuse this cursor, you must not only have the same SQL statement but also the same value for that particular bind variable.

That is why SIMILAR might reduce the number of plans you see in the shared pool but, then again, might not. In order to let you observe this and really see what is happening, I'll set up a table with some very skewed data—so skewed that when I query WHERE ID=1 , Oracle Database will want to use an index on ID , and when I query WHERE ID=99 , Oracle Database will not want to use an index. Listing 2 creates the skewed data and index and returns the execution plans for the skewed data.

Code Listing 2: Creating table, index, and plans for skewed data

SQL> create table t
  2  as
  3  select decode(rownum,1,1,99) ID,
  4         all_objects.*
  5    from all_objects
  6  /
Table created.
SQL> create index t_idx on t (id);
Index created.
SQL> begin
  2          dbms_stats.gather_table_stats
  3          ( ownname    => USER,
  4            tabname     => 'T',
  5            method_opt => 'for all indexed columns size 254',
  6            cascade      => TRUE
  7          );
  8  end;
  9  /
PL/SQL procedure successfully completed.
SQL> set autotrace traceonly explain
SQL> select * from t where id=1;
Execution Plan
----------------------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2 Card=1 Bytes=96)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'T' (TABLE) (Cost=2 Card=1 Bytes=96)
   2    1     INDEX (RANGE SCAN) OF 'T_IDX' (INDEX) (Cost=1 Card=1)
SQL> select * from t where id=99;
Execution Plan
---------------------------------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=197 Card=48028 Bytes=4610688)
   1    0   TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=197 Card=48028 Bytes=4610688)

Table T in Listing 2 contains a column ID , which is very much skewed—most of the values are 99, with one record containing a value of 1. After I index and gather statistics on the table (generating histograms on that indexed column, so the optimizer knows that the data is skewed), I can see that the optimizer prefers an index range scan over a full scan when ID=1 is used and vice versa for ID=99 .

Now, let's query that table with different ID= predicates, using CURSOR_SHARING=FORCE and then SIMILAR , as shown in Listing 3 (I already know what to expect with CURSOR_SHARING=EXACT —a single plan for each unique ID value supplied).

Code Listing 3: FORCE, SIMILAR, and skewed data

SQL> alter session set cursor_sharing=force;
Session altered.
SQL> select * from t CS_FORCE where id = 1;
1 row selected.
SQL> select * from t CS_FORCE where id = 50;
no rows selected
SQL> select * from t CS_FORCE where id = 99;
48029 rows selected.
SQL> select * from t CS_FORCE where id = 1;
1 row selected.
SQL> select * from t CS_FORCE where id = 50;
no rows selected
SQL> select * from t CS_FORCE where id = 99;
48029 rows selected.
SQL> alter session set cursor_sharing=similar;
Session altered.
SQL> select * from t CS_SIMILAR where id = 1;
1 row selected.
SQL> select * from t CS_SIMILAR where id = 50;
no rows selected
SQL> select * from t CS_SIMILAR where id = 99;
48029 rows selected.
SQL> select * from t CS_SIMILAR where id = 1;
1 row selected.
SQL> select * from t CS_SIMILAR where id = 50;
no rows selected
SQL> select * from t CS_SIMILAR where id = 99;
48029 rows selected.
SQL> select sql_text
  2   from v$sql
  3   where sql_text like 'select * from t CS% where id = %'
  4   order by sql_text;
SQL_TEXT
------------------------------------------------
select * from t CS_FORCE where id = :"SYS_B_0"
select * from t CS_SIMILAR where id = :"SYS_B_0"
select * from t CS_SIMILAR where id = :"SYS_B_0"
select * from t CS_SIMILAR where id = :"SYS_B_0"

As you can see in Listing 3, when CURSOR_SHARING=FORCE was true, one—and only one—plan was generated. It was, in fact, the one-size-fits-all plan, and in this case, the plan would use the index range scan (because the first query I parsed used ID=1 and the optimizer would use that bind variable value to generate the plan).

However, when CURSOR_SHARING=SIMILAR was true in Listing 3, three plans were generated, because the optimizer detected that a different value used when searching against the ID column could lead to a different plan (the statistics generated in Listing 2 gave it that information). Hence the actual bind variable value was added to the signature of that query plan and only a query with exactly the same signature could reuse the plan. That was the purpose of running each of the queries twice: to show that cursor reuse is possible. There were not six queries in V$SQL , just four. With CURSOR_SHARING=SIMILAR cursor reuse is not guaranteed, by design .

So, does that mean that for any unique set of literals, CURSOR_SHARING=SIMILAR will generate a new plan? No, I already saw that demonstrated with the DUAL table in Listing 1 when using WHERE DUMMY='A' and WHERE DUMMY='B' . It is only when the bind variable substitution is deemed unsafe that CURSOR_SHARING=SIMILAR will generate a new plan. Using the example in Listing 2, the only unsafe binding is against the ID column—if I query against that column and some other column but keep the ID column constant, I'll see cursor reuse, as shown in Listing 4.

Code Listing 4: CURSOR_SHARING=SIMILAR

SQL> alter session set cursor_sharing=similar;
Session altered.
SQL> select * from t CS_SIMILAR where id=1 and object_id=1;
no rows selected
SQL> select * from t CS_SIMILAR where id=1 and object_id=2;
no rows selected
SQL> select * from t CS_SIMILAR where id=1 and object_id=3;
no rows selected
SQL> select sql_text
  2    from v$sql
  3   where sql_text like 'select * from t CS% where id = % and object_id=%'
  4   order by sql_text;
SQL_TEXT
-------------------------------------------------------------------------
select * from t CS_SIMILAR where id = :"SYS_B_0" and object_id=:"SYS_B_1"

I varied the literals used to search against the OBJECT_ID column—but not the ID column—in Listing 4. The optimizer recognized that OBJECT_ID is safe; it would not generate different plans based on different inputs for that column, so it did not add OBJECT_ID to the signature of the cursor. Only when a different value is used against ID will a new plan be generated.

So, that shows that CURSOR_SHARING=SIMILAR might reduce the number of entries you see in the shared pool. If this application were to vary the literals used against the ID column and use hundreds or thousands of unique values, CURSOR_SHARING=SIMILAR would not have a significant impact on the shared pool utilization. On the other hand, if the application used only two values against the ID column, CURSOR_SHARING=SIMILAR could have a dramatic, positive effect on shared pool utilization.

Splitting Up a Large Table

I would like to partition a range of values into balanced sets. Initially I figured that one of the analytics functions might be useful for this and decided to look into these and learn more about them. The question I had in mind was, "For an ordered list of values, how can we 'chop' them into ranges and then list the first and last value for each range?" For example,

Range        Start        End
------       ------       -----
0            1            1000
1            1001         2000
2            2001         3000

This is something I've done frequently to implement what I call "do-it-yourself parallelism." The concept is that you have a large table to process and would like to break it up into some number of nonoverlapping ranges so you can start up some number of concurrent processes against it in parallel.

With the built-in NTILE() function, this is quite easy. If you wanted to break the dictionary view ALL_OBJECTS into eight nonoverlapping ranges with about the same number of rows in each, you could use the following code:

SQL> select min(object_id) min,
  2         max(object_id) max,
  3         count(*) cnt,
  4         nt
  5    from
  6  (
  7  select object_id,
  8         ntile(8) over
  9        (order by object_id) nt
 10    from all_objects
 11  )
 12  group by nt;
     MIN       MAX      CNT     NT
   -----     -----    -----    ----
       2      6811     6005      1
    6812     13117     6005      2
   13118     19122     6005      3
   19123     25127     6005      4
   25128     31132     6005      5
   31133     37142     6004      6
   37143     44620     6004      7
   44621     98225     6004      8
8 rows selected.

As I've said many times on asktom.oracle.com: Analytics rock, analytics roll. They are the best thing to happen to SQL since the introduction of the SELECT keyword.

Cascading-Delete Issue

Let's say I have two tables, T1 (parent) and T2 (child), in a REFERENCES T1(X,Y,Z...) ON DELETE CASCADE relationship. Rows can be deleted from T1, in which case child rows will also be deleted from T2. Similarly, rows can be directly deleted from T2 but without corresponding parent row(s) being deleted from T1.

I want to prevent the second situation from happening. That is, I want to ensure that only deletes issued against the parent table get executed (consequently deleting the child records, too) and that standalone deletes against the child table do not. The trouble is that deletes on the child records are always executed first, followed by deletes on the parent. So, if I try to achieve this by using a trigger on the child table, that won't help, because during a delete operation on the child table, there's no way of knowing whether it'll be followed by a delete on the parent. I would prefer a simple solution (there must be lots), because solutions involving exotic, optional (nonstandard) features may not be implementable at our location.

 

To me, this is quite simply a security thing. You don't want deletes against the child—therefore, do not grant DELETE on the child. It can be as easy as

SQL> create table p
  2  ( x int primary key );
Table created.
SQL> create table c
  2  ( x references p
  3      on delete cascade );
Table created.
SQL> insert into p
  2  values ( 1 );
1 row created.
SQL> insert into c
  2  values ( 1 );
1 row created.
SQL> grant select, delete
  2  on p to scott;
Grant succeeded.
SQL> grant select
  2  on c to scott;
Grant succeeded.
SQL> connect scott/tiger
Connected.
SQL> delete from ops$tkyte.c;
delete from ops$tkyte.c
                      *
ERROR at line 1:
ORA-01031: insufficient privileges
SQL> delete from ops$tkyte.p;
1 row deleted.
SQL> select * from ops$tkyte.c;
no rows selected

Now, you might say, "Yeah, but a DBA could still log in and just delete. . . ." I'll just say, "Yeah, but a DBA can get around anything you put in place—anything, so what?" The owner of the schema can, too; so you just use security and make it so that no one (applicationwise) has the ability to delete.

Are there ways to do this via triggers? Yes, but they are messy and tricky to write (and can still be circumvented).
Security does this cleanly.

In fact, if you wrote your transactions in PL/SQL and never granted INSERT/UPDATE/DELETE on the tables at all, think how "secure" you could make this!

Next Steps

ASK Tom
Oracle Vice President Tom Kyte answers your most difficult technology questions. Highlights from that forum appear in this column.
 asktom.oracle.com

READ
more about cursor sharing
Oracle Database Performance Tuning Guide
more about analytics
Oracle Database SQL Reference

more Tom
Expert Oracle: 9i and 10g Programming Techniques and Solutions



 

DISCLAIMER: We've captured these popular historical magazine articles for your reference. Links etc contained within these article possibly won't work. These articles are provided as-is with no guarantee that the information within them is the best advice for current versions of the Oracle Database.