Thanks for the question, Andrew.

Asked: December 13, 2007 - 2:45 pm UTC

Last updated: September 20, 2011 - 6:45 pm UTC

Version: 10.2.0.3

Viewed 10K+ times! This question is

You Asked

Tom,
Do you have any details about what the optimization techniques are for the plsql_optimize_level parameter?
I just read an article that says "this optimisation turns cursor-for-loops into implicit bulk fetches". Very nice to know that.
I'd like to know what else.

Checked the reference manual and for level 2 it says, "Applies a wide range of modern optimization techniques beyond those of level 1 including changes which may move source code relatively far from its original location."

Thanks
Andrew

and Tom said...

the entire suite of code enhancements is not published anywhere but includes things like dead code elimination, code moving eg:

for i in 1 .. 100000
loop
   d:=42;
   ... other code, nothing that modifies d tho....
end loop;

the d:=42 will be moved outside of the loop. Avoidance of repeated implicit conversions between datatypes during assignments and so on.

Just like using different optimizer levels with any compiler - a whole host of techniques are employed at the various levels - sometimes resulting in slower compiles - at the benefit of improved runtime performance.

11g has added many other things like inlining of functions (instead of calling the function, the code for the function is inlined)
http://docs.oracle.com/cd/B28359_01/server.111/b28279/chapter1.htm#FEATURENO05587

Here is a quick example showing some of the other optimizations:

ops$tkyte%ORA10GR2> create or replace procedure P
  2  as
  3    a positiven := 1 /* binary_integer subtypes */;
  4    b positiven := 1;
  5    c positiven := 1;
  6    d positiven:=1;
  7    t0 integer;
  8  begin
  9    t0 := Dbms_Utility.GET_CPU_TIME();
 10    for j in 1..1000000
 11    loop
 12      b := j /* pls_integer to binary_integer conversion */;
 13      d := 42 /* constant assignment within a loop */;
 14      c := d + b /* can be combined... */;
 15      a := b + c /* ...c not used except here */;
 16    end loop;
 17    Dbms_Output.Put_Line(Dbms_Utility.GET_CPU_TIME()-t0);
 18  end P;
 19  /

Procedure created.

ops$tkyte%ORA10GR2> alter session set plsql_optimize_level =0;

Session altered.

ops$tkyte%ORA10GR2> alter procedure p compile;

Procedure altered.

ops$tkyte%ORA10GR2> exec for i in 1 .. 5 loop p; end loop;
36
36
36
37
36

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> alter session set plsql_optimize_level =2;

Session altered.

ops$tkyte%ORA10GR2> alter procedure p compile;

Procedure altered.

ops$tkyte%ORA10GR2> exec for i in 1 .. 5 loop p; end loop;
31
31
31
30
31

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2>

Rating

(10 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Good performance gains

Andrew Markiewicz, December 14, 2007 - 2:44 pm UTC

Tom,
Very nice optimization features.

The 11g link is interesting and brings up a question.

1.1.11.11 PL/SQL Inlining Optimization

Modern PL/SQL software includes many procedures frequently called in various programs. The power of the PL/SQL optimizing compiler is increased by incorporating the inlining optimization (replacement of a procedure call with a copy of the procedure body). The revised code executes more quickly because call overhead has been eliminated and because further optimizations are possible when the compiler knows the actual values with which the inlined procedure is called.

Performance gains can be substantial.

Does the code replacement apply to package procedures as well? If that's true, I am just wondering what that would be like to have essentially duplications of the code compiled all over. One of the benefits of a package was the reuse of the compiled code that all other programs ran.
Now there would be multiple copies of the code.

Also, how would invalidations work when the package body is recompiled? Would that invalidate a procedure that uses the package which would not have been invalidated without the inlining optimization?
(Assuming again this applies to packages.)

Secondly, in the spirit of your example, I tried to test the performance gain on cursor for loops that are converted to bulk binds. Huge improvement with CPU time from ~460 to ~35.

sql->
sql->drop table tmp_table1;

Table dropped.

Elapsed: 00:00:00.41
sql->
sql->create table tmp_table1 as
  2  select object_id, object_type
  3    from dba_objects
  4  /

Table created.

Elapsed: 00:00:01.24
sql->
sql->exec dbms_stats.gather_table_stats(user, 'TMP_TABLE1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.75
sql->
sql->select count(*) from tmp_table1
  2  /

   COUNT(*)
-----------
      47067

Elapsed: 00:00:00.02
sql->
sql->create or replace procedure gettmptable1
  2  is
  3      cursor c1 is
  4      select *
  5        from tmp_table1
  6      ;
  7      t0 integer;
  8  begin
  9      t0 := Dbms_Utility.GET_CPU_TIME();
 10      for c1_rec in c1
 11      loop
 12          null;
 13      end loop;
 14      dbms_Output.Put_Line(Dbms_Utility.GET_CPU_TIME()-t0);
 15  end;
 16  /

Procedure created.

Elapsed: 00:00:00.22
sql->show err
No errors.
sql->
sql->alter session set plsql_optimize_level=2;

Session altered.

Elapsed: 00:00:00.00
sql->alter procedure gettmptable1 compile;

Procedure altered.

Elapsed: 00:00:00.10
sql->show err
No errors.
sql->
sql->begin
  2      for i in 1..10
  3      loop
  4          gettmptable1;
  5      end loop;
  6  end;
  7  /
35
34
35
34
34
35
34
36
36
33

PL/SQL procedure successfully completed.

Elapsed: 00:00:03.47
sql->
sql->
sql->alter session set plsql_optimize_level=0;

Session altered.

Elapsed: 00:00:00.01
sql->alter procedure gettmptable1 compile;

Procedure altered.

Elapsed: 00:00:00.09
sql->show err
No errors.
sql->
sql->begin
  2      for i in 1..10
  3      loop
  4          gettmptable1;
  5      end loop;
  6  end;
  7  /
456
459
465
455
462
475
460
456
459
457

PL/SQL procedure successfully completed.

Elapsed: 00:00:46.04
sql->

Wow. Good job Oracle.
Thanks for your time.
Andrew

December 14, 2007 - 3:31 pm UTC

... One of the benefits of a package was the reuse of the compiled code that all other programs ran. ...

I don't agree with that. Whether the procedure/function was

a) standalone
b) in a package

it was reusable by all other programs.

... Now there would be multiple copies of the code. ...
that is exactly the point of inlining - yes.

ops$tkyte%ORA11GR1> create or replace package foo
  2  as
  3          procedure bar( x in int, y in int );
  4  end;
  5  /
Package created.

ops$tkyte%ORA11GR1> create or replace package body foo
  2  as
  3
  4  procedure hidden
  5  is
  6  begin
  7          null;
  8  end;
  9
 10
 11  procedure bar( x in int, y in int )
 12  is
 13  begin
 14          hidden;
 15  end;
 16
 17  end;
 18  /

Package body created.

ops$tkyte%ORA11GR1> create or replace procedure y
  2  as
  3  begin
  4          null;
  5  end;
  6  /

Procedure created.

ops$tkyte%ORA11GR1> create or replace procedure p
  2  as
  3          t0 number;
  4          procedure x
  5          is
  6          begin
  7                  null;
  8          end;
  9  begin
 10          t0 := Dbms_Utility.GET_CPU_TIME();
 11          for i in 1 .. 100
 12          loop
 13                  foo.bar(2,2);
 14                  x;
 15                  y;
 16          end loop;
 17          dbms_Output.Put_Line(Dbms_Utility.GET_CPU_TIME()-t0);
 18  end;
 19  /

Procedure created.

ops$tkyte%ORA11GR1> alter package foo compile body
  2    PLSQL_Warnings = 'enable:all'
  3    PLSQL_Optimize_Level = 3
  4    PLSQL_CCFlags = 'Simple:false'
  5    PLSQL_Code_Type = native
  6    reuse settings
  7  /

SP2-0811: Package Body altered with compilation warnings

ops$tkyte%ORA11GR1> show errors
Errors for PACKAGE BODY FOO:

LINE/COL ERROR
-------- -----------------------------------------------------------------
4/1      PLW-06006: uncalled procedure "HIDDEN" is removed.
4/11     PLW-06010: keyword "HIDDEN" used as a defined name
14/2     PLW-06005: inlining of call of procedure 'HIDDEN' was done
ops$tkyte%ORA11GR1> alter procedure p compile
  2    PLSQL_Warnings = 'enable:all'
  3    PLSQL_Optimize_Level = 3
  4    PLSQL_CCFlags = 'Simple:false'
  5    PLSQL_Code_Type = native
  6    reuse settings
  7  /

SP2-0805: Procedure altered with compilation warnings

ops$tkyte%ORA11GR1> show errors
Errors for PROCEDURE P:

LINE/COL ERROR
-------- -----------------------------------------------------------------
4/2      PLW-06006: uncalled procedure "X" is removed.
14/3     PLW-06005: inlining of call of procedure 'X' was done
ops$tkyte%ORA11GR1>

the inlining happens within the body or block of code (procedure)

Andrew Markiewicz, December 14, 2007 - 3:51 pm UTC

True, a stored procedure is just as reusable as a package.
But a package would prevent the invalidations if only the body was compiled.

And from your examples it looks like the inlining will only happen within a package or subprograms of a procedure.
That is, in your example, foo.bar() will never be inlined in procedure p or any other stored code that may use foo.bar.

Thanks
Andrew

December 14, 2007 - 4:16 pm UTC

... But a package would prevent the invalidations if only the body was compiled. ...

and in 11g - so will a procedure/function :)

ops$tkyte%ORA11GR1> create or replace procedure p1
  2  as
  3  begin
  4          dbms_output.put_line( 'hey' );
  5  end;
  6  /

Procedure created.

ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> create or replace procedure p2
  2  as
  3  begin
  4          p1;
  5  end;
  6  /

Procedure created.

ops$tkyte%ORA11GR1> select object_name,status from user_objects where object_name in ('P1', 'P2' ) order by 1;

OBJECT_NAME                    STATUS
------------------------------ -------
P1                             VALID
P2                             VALID

ops$tkyte%ORA11GR1> create or replace procedure p1
  2  as
  3  begin
  4          dbms_output.put_line( 'hello world' );
  5  end;
  6  /

Procedure created.

ops$tkyte%ORA11GR1> select object_name,status from user_objects where object_name in ('P1', 'P2' ) order by 1;

OBJECT_NAME                    STATUS
------------------------------ -------
P1                             VALID
P2                             VALID

ops$tkyte%ORA11GR1> create or replace procedure p1( x in number default 0 )
  2  as
  3  begin
  4          dbms_output.put_line( 'hello world' );
  5  end;
  6  /

Procedure created.

ops$tkyte%ORA11GR1> select object_name,status from user_objects where object_name in ('P1', 'P2' ) order by 1;

OBJECT_NAME                    STATUS
------------------------------ -------
P1                             VALID
P2                             INVALID

as long as you don't change the signature of the procedure/function - you can now recompile it without cascading the invalidation...

11g.....

Andrew Markiewicz, December 14, 2007 - 4:39 pm UTC

Nice. 11g looks like it has much to offer. We just upgraded to 10g so it will likely be a while before we get to 11g. But all the optimization features are enticing. Throw in function result caching (which I think is a great feature) and stored plsql code really gets a performance boost.

Thanks much Tom.
Andrew

Inlining Results

Martin Hrivnak, October 26, 2008 - 3:25 pm UTC

Tom,

Is it possible to see the code resulting from the optimizer optimization steps? I guess it makes sense only for interpreted code.

Kind regards
Martin

October 26, 2008 - 7:12 pm UTC

no - because the code is not rewritten and then compiled, the code is "compiled and optimized" simultaneously.

recompiling

Ben, October 30, 2008 - 9:38 am UTC

being a fan of native compile in 9i & 10g, I'm excited to try out 11g's version. Question : When you use the inline optimization & have a procedureA compiled into procedureB, procedureC, and procedureD. What happens when you update & recompile procedureA? Does Oracle know to recompile proceduresB-D, or do we have to manually recompile? Automatic would be preferable, but I could see situations where this could cause chains of plsql procedures to be recompiled by just altering one.

October 30, 2008 - 9:57 am UTC

in 10g and above, if you do not change the signature of procedureA (the inputs and outputs) we no longer cascade the invalidation.

In 9i and before, if you modified procedureA in any way - all procedures that invoked it would invalidate and so on and so on.

but in either case, if your goal was to have happen:

"since procedureA was compiled with compiler flags for native compilation - proceduresB, C and D should be since they call A"

that never happened - you would have to explicitly compile anything you wanted to have be compiled natively.

So, if you are a big fan of native compilation - what sort of return on investment numbers have you seen? What is the net decrease in run time.

ROI for native

Ben, October 30, 2008 - 10:15 am UTC

in 10g I've seen an average of 30% increase in speed and a variable decrease in database CPU usage (depends on application)

I'm working currently on increasing the performance of an application that has been broken down into hundreds of tiny packages "for flexiblity" An extended sql trace run through Oracle Trace Analyzer shows lots of procedure calls, which I expect & with each call, there's a (small) wait. Adding up thousands of calls & well...

I think I was not exact enough in my question as I'm confused by your response. From my understanding, Inline compile takes procedureA during pre-compile of proceduresB-D & inserts it's code into procedureB-D then compiles them all. The inputs of procedureA do not change, but the algorithm does. By signature, do you mean the input, or does Oracle do a hash of the procedure code & watch for code changes & when procedureA's signature hash changes, all procedures that have procedureA inline compiled are marked as invalid & get recompiled?

November 02, 2008 - 2:36 pm UTC

in 10g I've seen an average of 30% increase in speed and a variable decrease in
database CPU usage (depends on application) ...

I'd be interested (truly) in the numbers if you have them - I've not seen anywhere near that, single digit percentages maybe. Can you describe more what you do here? This is "not standard"

is that from the inline or the native compilation? It seems you indicate "native", but truly - I've not see anything near that in the field.

how does extended sql trace show function calls? sql trace is - sql... you'd need a dbms_trace for the function call information?

the pragam inline (which I guess is what you mean - to inline the procedure), is done inside of B-D. It "inlines" - puts the plsql code in there - then compiles. If you change the code - the optimization I talked of would not be possible anymore.

but I would really love to back up a bit. Can you confirm - absolutely - 30%. That seems really really high.

timing

A reader, November 03, 2008 - 9:02 am UTC

Native compile turning PL/SQL into dll files on host. I've got limited space, so I've purged all of my older reports that had 30% improvement. The only ones I can find show 15%. I'm working on a test of 11g & should have new numbers of this same test after that. The change from 30% to 15% is partly due to the initial runs were using exclusivly explicit cursors & I've been able to get some of the key ones changed to implict (going native on explicit is _almost_ as good, but still time wasted on open/fetch/close)

http://www.dbspecialists.com/files/presentations/native_plsql.ppt slide 48. During their testings of Null loops, they actually get a 67% savings. Much of the app I've been working on tuning is looping through cursors, but we don't get this kind of improvment.

I use extended sql trace and Oracle Trace Analyzer. It shows the hash values of distinct sql & pl/sql statements. Comparing pl/sql execute times per hash (so I know I'm comparing the same SQL), I can see the isolated time increase for the execute/insert/update/etc. In the example numbers I've got handy, the actual performance increase is only about a 15%. This is still significant if you consider it is a bonus 15% for no code change.

(data not formated per OTA report, too much info for a post here)
Interpreted
Type: pl/sql. Execs: 464. Elapsed time: 61.938. CPU Time: 61.938
Type: sql. Execs 17621. Elapsed time: 28.370. CPU Time:27.109

Native
Type: pl/sql. Execs: 464. Elapsed time: 53.554. CPU Time: 52.609
Type: sql. Execs: 17621. Elapsed time: 31.595. CPU Time: 31.047

This subsection of an OTA report shows the native execution of the pl/sql runs in 84.9% of the time of the interpreted. The SQL shows a performance increase, but this is the exception, due I believe to other users on system accessing data right before the test run (my physical reads dropped from 20 to 6) During most of my testing, when possible I have all other users out of database & restart the DB (at the insistance of others)

Your mantra of "do it in a single SQL statement, if at all possible." indicates your pl/sql will actually be dominatly sql statements, so I'm not surprised by a lack of performance difference.

November 10, 2008 - 3:07 pm UTC

... The change from 30% to 15% is partly due to the
initial runs were using exclusivly explicit cursors & I've been able to get
some of the key ones changed to implict (going native on explicit is _almost_
as good, but still time wasted on open/fetch/close)
....

probably - you would see no difference between native and non-native.

When you were doing explicit, you were doing row by row (slow by slow)
When you went implicit, you got an implicit array fetch of 100 rows at a time!

You are comparing apples to flying toaster ovens, you changed more than one thing, you cannot point therefore to anyone thing and say "that did it"

ops$tkyte%ORA10GR2> alter session set events '10046 trace name context forever, level 12';

Session altered.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> declare
  2          cursor c is select * from stage slow_by_slow;
  3          l_rec c%rowtype;
  4  begin
  5          for x in ( select * from stage implicit ) loop null; end loop;
  6          open c;
  7          loop fetch c into l_rec; exit when c%notfound; end loop;
  8          close c;
  9  end;
 10  /

PL/SQL procedure successfully completed.

SELECT * FROM STAGE IMPLICIT

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.02          0          1          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch      501      0.38       0.39          0       1185          0       50005
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      503      0.40       0.42          0       1186          0       50005

Rows     Row Source Operation
-------  ---------------------------------------------------
  50005  TABLE ACCESS FULL STAGE (cr=1185 pr=0 pw=0 time=250272 us)
********************************************************************************
SELECT * FROM STAGE SLOW_BY_SLOW

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          1          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch    50006      0.93       0.74          0      50010          0       50005
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    50008      0.94       0.75          0      50011          0       50005

Rows     Row Source Operation
-------  ---------------------------------------------------
  50005  TABLE ACCESS FULL STAGE (cr=50010 pr=0 pw=0 time=401436 us)

so, unless you test the same code both ways - I still doubt your numbers are due to ncomp'ing

a test that test's 100,000,000 null loops doesn't impress.

I can show you a procedure that gets a 5x decrease in runtime. It is horrific code, not anything you would actually ever really code. It would be a bad example of native compilation speed ups.

I'd be interested in a real compare of the same base code...

Martin Rose, September 17, 2011 - 8:36 am UTC

What's the array size used when auto-converting to bulk collects?

I guess the coder has no control over this?

September 17, 2011 - 11:24 am UTC

what is auto-converting to bulk collects?

for x in ( select ... ) ...

that'll array fetch 100 rows at a time (or 1 if you turn off plsql optimization).

open c
fetch c into l_rec;

will fetch row by row - one row at a time.

open c;
fetch c BULK COLLECT into l_records LIMIT N;

will fetch N rows at a time, you tell us N

select * bulk collect into l_records;

will fetch all records in one call - number of records varies based on query.

Martin Rose, September 17, 2011 - 12:13 pm UTC

I meant when PLSQL_OPTIMIZE_LEVEL >= 2, what value of LIMIT is being implicitly set by the compiler to render a row-by-row cursor FOR LOOP into a BULK COLLECT one ?

September 19, 2011 - 5:37 pm UTC

did you test it? Just turn on sql_trace=true and review the results in a tkprof? It takes a lot less time than asking a question on a forum :)

It is either 100 or 1 - nothing else. If you disable optimization, it is 1. If you let optimization be on - it is 100.

ops$tkyte%ORA11GR2> create table t as select * from all_objects;

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> @trace
ops$tkyte%ORA11GR2> alter session set events '10046 trace name context forever, level 12';

Session altered.

ops$tkyte%ORA11GR2> alter session set PLSQL_OPTIMIZE_LEVEL=3;

Session altered.

ops$tkyte%ORA11GR2> begin for x in (select * from t level3) loop null; end loop; end;
  2  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> alter session set PLSQL_OPTIMIZE_LEVEL=2;

Session altered.

ops$tkyte%ORA11GR2> begin for x in (select * from t level2) loop null; end loop; end;
  2  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> alter session set PLSQL_OPTIMIZE_LEVEL=1;

Session altered.

ops$tkyte%ORA11GR2> begin for x in (select * from t level1) loop null; end loop; end;
  2  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> pause


<b>tkprof says....</b>

SELECT * FROM T LEVEL3

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          1          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch      723      0.19       0.25        771       1749          0       72224
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      725      0.19       0.25        771       1750          0       72224
********************************************************************************
SELECT * FROM T LEVEL2

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          1          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch      723      0.12       0.16          0       1749          0       72224
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      725      0.12       0.16          0       1750          0       72224
********************************************************************************
SELECT * FROM T LEVEL1

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          1          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch    72225      0.22       0.33          0      72229          0       72224
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    72227      0.22       0.33          0      72230          0       72224

so, level 3, 2 = 100, level 1 = 1

A reader, September 20, 2011 - 7:59 am UTC

> did you test it? Just turn on sql_trace=true and review the results in a tkprof? It takes a lot less time than asking a question on a forum :)

Ahh, well with all due respect, just because the example you gave uses 100, doesn't mean that this is simply hard-coded as such.

There could have been a complex formula being used that increases the batch size according to the number of rows being processed, or one that takes into account the current free space in the PGA.

But thanks for your answer anyway. 100 it is.

September 20, 2011 - 6:45 pm UTC

ok, true that is.

it is either 1 or 100 for the implicit cursors.

100 is pretty much a sweet spot for array fetching.

Breadcrumb

Question and Answer

You Asked

and Tom said...

Rating

Comments

Good performance gains

11g.....

Inlining Results

recompiling

ROI for native

timing

More to Explore

PL/SQL demos

PL/SQL docs

What does plsql_optimize_level parameter entail?

Breadcrumb

Question and Answer

You Asked

and Tom said...

Rating

Comments

Good performance gains

11g.....

Inlining Results

recompiling

ROI for native

timing

More to Explore

PL/SQL demos

PL/SQL docs