
Question and Answer

Tom Kyte

Thanks for the question, Aman.

Asked: June 25, 2008 - 1:26 am UTC

Last updated: March 04, 2013 - 4:11 pm UTC

Version: 10.2.0

Viewed 10K+ times

You Asked

Hi sir,
How can I use TKPROF to find the problematic queries? The only point that I know is that when the elapsed time is much higher, that means the query spent more time waiting on something. But still I am not sure how to use TKPROF for query tuning. I have read your book too, but somehow I couldn't formulate the idea.
Can you please help in using tkprof for understanding the problem of a query?
Regards
Aman....

and Tom said...

you use tkprof to identify queries in your application that consume "lots of resources" - cpu time, elapsed time.

say you do this:

ops$tkyte%ORA10GR2> exec dbms_monitor.session_trace_enable( waits=>true );

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> select count(subobject_name) from big_table.big_table;

COUNT(SUBOBJECT_NAME)
---------------------
               688256



Tkprof will give a report that looks much like this:


exec dbms_monitor.session_trace_enable( waits=>true );
select count(subobject_name) from big_table.big_table;



select count(subobject_name)
from
 big_table.big_table


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.01          0          3          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2     99.36     262.10    1840758    1840797          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4     99.36     262.11    1840758    1840800          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 344

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=1840797 pr=1840758 pw=0 time=262104893 us)
128000000   TABLE ACCESS FULL BIG_TABLE (cr=1840797 pr=1840758 pw=0 time=384004887 us)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  db file scattered read                      14425        0.22        195.87
  db file sequential read                        13        0.01          0.06
  SQL*Net message from client                     2        0.00          0.00




Now, we have a bunch of FACTS

fact: we did a lot of physical IO
fact: we did a lot of logical IO
fact: we used about 100 cpu seconds
fact: it took about 262 seconds elapsed time, therefore we waited lots of seconds for something
fact: we waited for db file scattered read for a long time, this is the read of a full scan
fact: we full scanned a huge table
fact: we can see our query
fact: we have the plan that was used
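The "we waited" fact is just arithmetic on the totals line; a quick sketch (figures copied from the tkprof report above, Python used only to show the arithmetic):

```python
# Back-of-envelope math behind "we waited lots of seconds for something",
# using the cpu and elapsed figures from the tkprof totals line above.
cpu_seconds = 99.36
elapsed_seconds = 262.11

# time not spent on CPU was spent waiting (plus measurement slop)
wait_seconds = elapsed_seconds - cpu_seconds
print(f"roughly {wait_seconds:.2f} seconds spent waiting")  # roughly 162.75
```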

Now, what can we do with these facts? We can use our knowledge of

a) the data
b) how oracle works

to "tune". What are some obvious things to think about there?

Well, we see the query result was about 700,000 rows out of 128,000,000
that gives us another fact or two

fact: we really needed a very small subset of the rows from this table to answer this query
fact: the table's high water mark is probably OK - meaning, the table is very well packed. If there had been only 1,000,000 rows in the table, I'd be worried, since the table is so large. So, shrinking or reorging the table isn't going to do anything

(we can use facts to rule out things - that is as useful as using a fact to rule "in" something)


Ok, so what are some possible options?

a) make full scan faster - maybe compress the table
b) we need about 0.54% of the table's rows to answer our query (count the non-null occurrences of subobject_name). Hey, maybe an index will help.
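The 0.54% in option (b) comes straight from the two counts above; a quick sketch of the arithmetic:

```python
# Selectivity estimate for option (b): rows with a non-null
# subobject_name versus total rows (both figures from the output above).
matching_rows = 688_256
total_rows = 128_000_000

selectivity = matching_rows / total_rows
print(f"{selectivity:.2%} of the rows are needed")  # about 0.54%
```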

Let's try B first, least impact.



big_table%ORA10GR2> create index big_table_so_idx on big_table(subobject_name);

Index created.



and now again:

ops$tkyte%ORA10GR2> exec dbms_monitor.session_trace_enable( waits=>true );

PL/SQL procedure successfully completed.


ops$tkyte%ORA10GR2> select count(subobject_name) from big_table.big_table;

COUNT(SUBOBJECT_NAME)
---------------------
               688256


and tkprof shows:

select count(subobject_name)
from
 big_table.big_table


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.04          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.28       0.52       3342       3355          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.28       0.56       3342       3355          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 344

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=3355 pr=3342 pw=0 time=520528 us)
 688256   INDEX FAST FULL SCAN BIG_TABLE_SO_IDX (cr=3355 pr=3342 pw=0 time=730570 us)(object id 161278)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  db file scattered read                         42        0.01          0.27
  SQL*Net message from client                     2        0.00          0.00



Now in real life it'll be more complicated than this, of course - you'll have multi-table queries to deal with, complex relationships. To tune that query you either:


a) submit it to a tool, let it process it (tuning/performance pack of Enterprise manager). It will apply the rules and tell you what indexes, materialized views, query rewrites, (and in 11g partitioning schemes) would be useful for that query.

b) use the FACTS you can derive from tkprof and other sources and apply YOUR KNOWLEDGE of 1) the data and its patterns and 2) Oracle - the indexes it has to offer, the schema structures it has to offer (clusters, IOTs, heap tables, partitioning, etc), the SQL it supports (you can rewrite a query more efficiently in many cases)


So, to use tkprof for query tuning, you are getting half of what you need - the FACTS. The other half is all about you - and your knowledge of the database. If you don't have or don't want to have that other half, you will be using tools such as the performance pack in enterprise manager.


Comments

A reader, June 25, 2008 - 11:38 am UTC

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.01          0          3          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2     99.36     262.10    1840758    1840797          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4     99.36     262.11    1840758    1840800          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 344

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=1840797 pr=1840758 pw=0 time=262104893 us)
128000000   TABLE ACCESS FULL BIG_TABLE (cr=1840797 pr=1840758 pw=0 time=384004887 us)


If the total time is 262 seconds, then how come it took 384004887 microseconds (384 seconds) just for the full table scan? Are these 'time' values in each step cumulative? Even then it does not make sense

BA
Tom Kyte
June 25, 2008 - 12:03 pm UTC

the timings are done at different levels.

start_time := now;
loop alot
start_time2 = now
do something
elapsed2 = elapsed2 + (now-start_time2)
end loop
elapsed = (now-start_time)

elapsed2 would be different from elapsed - because they count at different levels, and the more times you iterate the larger the potential for errors to creep in.
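Tom's pseudocode can be made runnable; a minimal Python sketch (the loop body and iteration count are arbitrary stand-ins for "do something"):

```python
import time

# Runnable version of the pseudocode above: the outer timer measures the
# whole loop once; the inner timer is started and stopped every iteration,
# so per-call clock overhead and granularity accumulate in the gap.
def nested_timing(iterations=100_000):
    elapsed2 = 0.0
    start_time = time.perf_counter()
    for _ in range(iterations):
        start_time2 = time.perf_counter()
        _ = sum(range(10))                      # "do something"
        elapsed2 += time.perf_counter() - start_time2
    elapsed = time.perf_counter() - start_time
    return elapsed, elapsed2

elapsed, elapsed2 = nested_timing()
# elapsed includes the cost of the inner timing calls themselves, so the
# two numbers disagree - and the more iterations, the bigger the drift.
print(elapsed, elapsed2)
```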



A reader, June 25, 2008 - 2:04 pm UTC

Are these 'time' at each step supposed to be cumulative ?

BA
Tom Kyte
June 25, 2008 - 3:38 pm UTC

yeah, they rollup - but they rollup 'better' typically because of the timings at the different levels (measuring longer and longer events)

Suman Roy, June 26, 2008 - 2:57 am UTC

Sir, which command did you use to get this tkprof output? I have used 'tkprof <trace file> a.txt sys=no sort=prsela exeela fchela' but am not getting the 'Row Source Operation' portion. Please advise.

Regards
SR
Tom Kyte
June 26, 2008 - 4:05 pm UTC

cursors need to be closed to get that, in 10gR2 and before at least


if you:

a) enable trace
b) run stuff
c) disable trace
d) THEN CLOSE CURSORS

the stat records will not be output (the row source bits)

it is not tkprof hiding this; tkprof would always show it - the data is simply not in the trace file

make sure cursors are closed - best bet is to exit the application, NOT disable trace - to ensure all traced cursors have their stat records written out.

STATISTICS_LEVEL setting affects roll-ups

Karen Morton, June 28, 2008 - 12:22 am UTC

I believe the root cause of the difference between the total elapsed time and the time for the full table scan step in the plan is the setting of STATISTICS_LEVEL. It appears the parameter is set to TYPICAL (the default). When set to TYPICAL, the rowsource/plan execution statistics are disabled.

I can confirm the setting of STATISTICS_LEVEL as follows:
SQL> select value from v$parameter where upper(name) = 'STATISTICS_LEVEL' ;

VALUE
-----------------------------------------------------------------------------
TYPICAL

1 row selected.

Then, I can check to see what information is enabled at this setting with the following query:
SQL> SELECT session_status,
  2         system_status,
  3         activation_level,
  4         session_settable
  5  FROM   v$statistics_level
  6  WHERE  statistics_name = 'Plan Execution Statistics'
  7  /

Session    System     Activation Session
Status     Status     Level      Settable
---------- ---------- ---------- ----------
DISABLED   DISABLED   ALL        YES

1 row selected.


So, I can see that currently the plan execution statistics are DISABLED. I next ran the test after creating the index (FYI - my big_table is much smaller - only 3.1 million rows) and I get this plan output (from my tkprof report):
Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=120 pr=0 pw=0 time=11988 us)
  28736   INDEX FAST FULL SCAN BIG_TAB_SO_IDX (cr=120 pr=0 pw=0 time=287550 us)(object id 52645)

Note that the time=11988 on the top row of the plan. That's correct and approximately matches the elapsed time shown by the tkprof output. However, the 2nd row of the plan shows a time=287550. Since child steps roll up into their parents, there is no way that these numbers are correct as the child time (step 2) is much greater than the parent time (step 1). Something is amiss.

Now, if I alter this parameter to ALL (alter system set statistics_level = ALL) and run the test again, here's what I get (I'm showing the whole tkprof output for completeness):
select count(subobject_name) 
from
 big_tab


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.14       0.14        113        120          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.14       0.15        113        120          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 54  

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=120 pr=113 pw=0 time=148169 us)
  28736   INDEX FAST FULL SCAN BIG_TAB_SO_IDX (cr=120 pr=113 pw=0 time=93170 us)(object id 52645)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  db file scattered read                         16        0.00          0.01
  SQL*Net message from client                     2        0.00          0.00


Now, everything is correct. The child step (#2) has time=93170 and the parent step (#1) has time=148169. That makes sense. The parent time includes the child so it must be greater than or equal to the child...and it is.

So, regardless of whatever measurement errors are caused by where Oracle places the timing calls, if you have statistics_level set to ALL, the rollups are correct. It just doesn't make any sense that a child could legitimately be greater than its parent *if* it's a hard and fast rule that children must rollup their values into their parents. And since the values seem to always correctly roll up if ALL is the setting, then you either need to use that level to get meaningful child values or realize that those values are not correct and only refer to the top level parent step values when you review the plan statistics.

So, in the end, the nagging question for me is what are those child numbers (i.e. where do they come from) when the level is set to TYPICAL? They appear to just be junk and I'd think if they were not being used for rollups, that Oracle would just list them as null. I've wondered about this for a while but just added it to my list of things I know how to interpret and how to read what I see (i.e. ignore the child steps execution stats), but I don't know why. But, I'd really like to know what's really going on under there. :)




So how do you...

Stephan, July 02, 2008 - 9:51 am UTC

What would you look at here?
Update FIN$_TRAN_ACTION Set RESORT = '000100' 
WHERE
 RESORT LIKE '030000' and rownum <= 2000


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse      708      0.28       1.38          0          0          0           0
Execute    708    148.90     260.96    1228682   13123465      72331     1413086
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     1416    149.18     262.35    1228682   13123465      72331     1413086

Misses in library cache during parse: 1
Optimizer mode: FIRST_ROWS
Parsing user id: 50  (OPERA)   (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  UPDATE  FIN$_TRAN_ACTION (cr=47 pr=49 pw=0 time=46250 us)
   2000   COUNT STOPKEY (cr=47 pr=49 pw=0 time=29740 us)
   2000    TABLE ACCESS FULL FIN$_TRAN_ACTION (cr=47 pr=49 pw=0 time=25770 us)


Rows     Execution Plan
-------  ---------------------------------------------------
      0  UPDATE STATEMENT   MODE: FIRST_ROWS
      0   UPDATE OF 'FIN$_TRAN_ACTION'
   2000    COUNT (STOPKEY)
   2000     TABLE ACCESS   MODE: ANALYZED (FULL) OF 'FIN$_TRAN_ACTION' 
                (TABLE)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file sequential read                      1210        0.04          0.80
  db file scattered read                      82656        0.41        118.64
  latch: cache buffers chains                    16        0.00          0.02
  latch: cache buffers lru chain                  1        0.00          0.00
  log file switch completion                      2        0.04          0.05
  latch free                                      1        0.00          0.00
********************************************************************************


The table is 2M records, the resort column is not indexed. I ran the same in a similar - not identical by any stretch of the imagination - environment, and had far lower times. What do I look at next?

Thanks, Tom
Tom Kyte
July 07, 2008 - 7:10 am UTC

think about this

you have 2 million rows.

you ask "find the first 2k where condition, update them to not be that condition"

...
Update FIN$_TRAN_ACTION Set RESORT = '000100'
WHERE
RESORT LIKE '030000' and rownum <= 2000
........

Now, just to make the example interesting - let us say that 2% of the rows match your criteria before you start. So, 2,000 out of every 100,000 rows match.

The first time you execute it, we have to full scan 100,000 rows.
The second time you execute it, we have to full scan 100,000 + 100,000 rows.
....
The Nth time you execute it, we have to full scan N*100,000 rows!!!!

You start over at the "top" of the table each and every time (this is one of the big reasons I hate this concept of "we have to update in small chunks in order to appease the 'undo gods' and not generate undo in large chunks).


Every time you update, you full scan the data you already updated - looking for data to update - of which you will find none because you just updated it. sigh....
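The growth Tom describes is triangular; a sketch of the arithmetic (the 100,000-rows-per-pass figure is the assumed 2% example above, not measured data, and this ignores that a real scan tops out at the table's 2M rows):

```python
# How the repeated "start from the top" full scans add up: under the
# assumed 2% hit rate, each 2,000-row batch scans 100,000 rows, and the
# Nth execution must re-scan everything already updated first.
rows_per_pass = 100_000
executions = 708          # Parse/Execute count from the tkprof report

# total rows visited = 100k + 200k + ... + 708*100k (triangular growth)
total_rows_scanned = sum(n * rows_per_pass for n in range(1, executions + 1))
print(f"{total_rows_scanned:,} rows visited across all executions")  # 25,098,600,000
```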


either

a) use parallel dml, update the table in parallel in many parallel execution server sessions, each of which can use its own undo segment.

b) use "diy parallel", break the table into rowid ranges and update using:

Update FIN$_TRAN_ACTION Set RESORT = '000100'
WHERE
RESORT LIKE '030000' and rowid between :X and :Y;

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:10498431232211



I'd ask why you're doing the UPDATE that way...

Karen Morton, July 03, 2008 - 6:56 pm UTC

If you look at the plan for a single UPDATE it looks OK. It's only accessing 49 blocks from the table to update 2000 rows and the table has over 2M rows. That's what ROWNUM is doing for you...it "short circuits" the full table scan. So, even though the operation is a full scan, it really isn't full scanning, it's only scanning enough blocks to find 2000 rows that need to be updated. But, the complete stats show that this UPDATE was executed 708 times. Uh....why?

And since the report shows a recursive depth of 1, that should indicate that the UPDATE is a child of some other piece of code. Is this UPDATE in a trigger perhaps?

Again, my question is why the code would be trying to update 2000 rows at a time 708 times? Why not do this in one pass? The final result was that 1,413,086 rows were updated. That's around 70% of the total rows in the table. It just seems like a single UPDATE statement without ROWNUM would do what you want more efficiently. Since you don't even have an index on the RESORT column, it must not be very selective (i.e. there are only a few distinct values of RESORT), so once again, my question would be why split up the update into so many small chunks when one update would do the job?

Another thing you can tell is that a big chunk of the time is spent doing physical reads (db file scattered read time is 118.64).
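Karen's "around 70%" figure can be checked in two lines (a sketch; values copied from the tkprof totals and the stated table size):

```python
# Fraction of the table updated: rows from the tkprof 'rows' column
# versus the roughly 2M rows stated for the table.
rows_updated = 1_413_086
table_rows = 2_000_000

print(f"{rows_updated / table_rows:.1%} of the table updated")  # 70.7%
```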

A reader, September 05, 2008 - 2:02 pm UTC

In your answer to the original question, you stated:

"fact: the tables high water mark it probably OK - meaning, the table is very well packed. If there had only been 1,000,000 rows in the table, I'd be worried since the table is so large. So, shrinking or reorging the table isn't going to do anything"

How did you draw the conclusion that the table is very well packed?

Tom Kyte
September 05, 2008 - 4:43 pm UTC

.. 128000000 TABLE ACCESS FULL BIG_TABLE (cr=1840797 ...

Math.

full scan, 128,000,000 rows and only 1,840,797 IOs, about 69-70 rows per block, seems right for the row width - not a lot, if any, 'extra' space in there.

Now, if there were 1,000,000 rows - then there would be less than 1 row per block, and that would indicate the table was mostly empty...
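Tom's "Math." can be spelled out in a couple of lines (figures copied from the row source line; for a full scan, the cr= consistent reads approximate the number of blocks in the table):

```python
# Row density check: rows returned by the full scan divided by the
# blocks read gives rows per block (cr= figure from the plan above).
rows = 128_000_000
blocks = 1_840_797

print(f"about {rows / blocks:.0f} rows per block")   # about 70 - well packed
# With only 1,000,000 rows in the same number of blocks, density would
# be under 1 row per block: a mostly empty table, worth a shrink/reorg.
print(f"{1_000_000 / blocks:.2f} rows per block")
```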

confused on recursive calls

Mike, January 30, 2009 - 4:53 pm UTC

We have package prod_wer1, a custom Oracle Apps package. We ran it with tracing of binds and waits.

BEGIN prod_wer1(:errbuf,:rc,:A0,:A1,:A2,:A3); END;

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1   5415.21    5639.98      13293    6838772    1399904           1
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2   5415.21    5639.98      13293    6838772    1399904           1

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        6      0.00       0.00          0          0          0           0
Execute      7   5415.33    5640.13      13307    6839011    1399929           7
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       13   5415.33    5640.13      13307    6839011    1399929           7

Misses in library cache during parse: 1
Misses in library cache during execute: 1

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file sequential read                         2        0.00          0.01
  log file sync                                   5        0.00          0.01
  SQL*Net message to client                       9        0.00          0.00
  SQL*Net message from client                     9        0.00          0.00
  SQL*Net message to dblink                   22060        0.00          0.05
  SQL*Net message from dblink                 22060        0.02         11.33
  SQL*Net more data from dblink                  15        0.00          0.00
  cursor: mutex S                                 1        0.00          0.00
  library cache pin


OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call      count       cpu    elapsed       disk      query    current        rows
------- -------  -------- ---------- ---------- ---------- ----------  ----------
Parse     23079      4.82       4.85         40        393          5           0
Execute 1223471   1150.78    1266.89      10687     207816    1399886      137059
Fetch    964662    363.10     552.86       2881    6648345         23     1996898
------- -------  -------- ---------- ---------- ---------- ----------  ----------
total   2211212   1518.70    1824.61      13608    6856554    1399914     2133957

The non-recursive calls give the stats for the package as a whole, and the recursive calls should be the stats of the queries and Oracle internals relevant to the package as it executed. The elapsed time of the non-recursive calls was 5640.13 vs. 1824.61 for the recursive calls. Why so different? The recursive calls should be close to the non-recursive calls for this particular case, correct?

Tom Kyte
January 30, 2009 - 8:55 pm UTC

no, they are mutually exclusive.


ops$tkyte%ORA10GR2> create or replace procedure p( p_do_query in boolean )
  2  as
  3          l_x number;
  4  begin
  5          for i in 1 .. 250000
  6          loop
  7                  l_x := ln(i);
  8          end loop;
  9
 10          if ( p_do_query )
 11          then
 12                  for x in (select count(*) from all_objects)
 13                  loop
 14                          null;
 15                  end loop;
 16          end if;
 17  end;
 18  /

Procedure created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> @trace
ops$tkyte%ORA10GR2> alter session set events '10046 trace name context forever, level 12';

Session altered.

ops$tkyte%ORA10GR2> exec p( FALSE )


tkprof shows:

BEGIN p( FALSE ); END;


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1     12.73      12.43          0          0          0           1
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2     12.73      12.43          0          0          0           1


so, without sql, about 12/13 cpu seconds...

Now

ops$tkyte%ORA10GR2> alter session set events '10046 trace name context forever, level 12';

Session altered.

ops$tkyte%ORA10GR2> exec p( TRUE )



tkprof shows:

BEGIN p( TRUE ); END;


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1     12.73      12.45          0          0          0           1
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2     12.74      12.45          0          0          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 493

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  SQL*Net message from client                     2        0.00          0.00
********************************************************************************

SELECT COUNT(*)
FROM
 ALL_OBJECTS


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      2.99       2.92          0     204456          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      2.99       2.92          0     204456          0           1



so, still 12/13 seconds for "procedure" but 3 seconds for "query"

and the non-recursive totals match the procedure-by-itself (no sql) call



OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        5      0.00       0.00          0          0          0           0
Execute      5     12.74      12.45          0          0          0           4
Fetch        2      0.00       0.00          0          0          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       12     12.75      12.46          0          0          0           5

Misses in library cache during parse: 1

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       7        0.00          0.00
  SQL*Net message from client                     7        0.00          0.00


OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse     1254      0.03       0.03          0          0          0           0
Execute   2551      0.52       0.50          0          0          0           0
Fetch     2551      4.18       4.09          0     221009          0        2551
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     6356      4.74       4.63          0     221009          0        2551





thanks for the clarification on recursive vs. non-recursive calls

Mike, January 30, 2009 - 10:51 pm UTC

Tom,
can we say non-recursive calls are for those non-sql codes and recursive calls for sql statements?
Tom Kyte
January 31, 2009 - 8:33 am UTC

... can we say non-recursive calls are for those non-sql codes and recursive calls
for sql statements? ...


non-recursive statements are statements issued by the client to the server.

recursive statements are statements executed by that non-recursive call.


non-recursive statements can be sql, or plsql - they are just "a statement submitted by client to server"

recursive statements can be plsql, or sql - they are just "a statement executed by the statement submitted by the client to the server"




Mike, January 31, 2009 - 9:15 am UTC

Tom,
OK, thanks so much for helping me understand non-recursive vs. recursive calls.

I issued
sql>exec prod_wer1
tkprof shows:

BEGIN prod_wer1(:errbuf,:rc,:A0,:A1,:A2,:A3); END;

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1   5415.21    5639.98      13293    6838772    1399904           1
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2   5415.21    5639.98      13293    6838772    1399904           1

If I understood correctly, all of the sql statements within the package should also show separate stats in the output of tkprof. Thus the plsql of the package itself used 5415.21 CPU seconds, NOT including any of the sqls in the package or the Oracle internals invoked as recursive calls by this package.
Am I wrong about this?
Thanks!
Tom Kyte
January 31, 2009 - 9:53 am UTC

in current releases - yes (10g and up)

in earlier releases - it was reported differently, they tended to "roll up" the numbers.

Run this procedure with tracing on:

create or replace procedure p
as
begin
    for x in (select * from all_objects)
    loop
        null;
    end loop;
end;
/



and if the time for P is near 0, and the time for the SQL is not near zero, it is not being rolled up. If the time for P is near the time for the SQL, then it is being rolled up.

Mike, February 01, 2009 - 9:06 am UTC

Tom,
I have run both of your procedures on 10.2.0.1 and confirmed that the stats of non-recursive and recursive calls are mutually exclusive.
The package is custom (3rd party) and it also calls a few Oracle APIs. Unfortunately, tkprof only shows the stats of our package as non-recursive calls, summing up all the non-sql in the main package plus the non-sql in the Oracle APIs. It does not show the Oracle APIs' non-sql stats separately.
I think this is because tkprof only shows the stats of "SQL" statements, such as
begin main_pk; end;
In our case, tkprof shows the non-recursive calls for the non-sql, and recursive calls for all relevant SQL statements, including those from the Oracle APIs.

How could we obtain the stats for the non-sql parts in the APIs called? We cannot read the main package since it is wrapped and we do not know what those APIs are. Obtaining the info from the 3rd party will take very long and they may not cooperate.
Tom Kyte
February 02, 2009 - 10:49 am UTC

tkprof shows statements executed. If you compile a procedure p1 that calls p2 that calls p3 - the statement executed is "p1" - p2 and p3 are just part of the code.

However, if p1 used "execute immediate 'begin p2; end;';" it would be a statement submitted to the server to execute, it wouldn't be just a procedure call, it would be p1 running a STATEMENT.


Now, to answer your question - you can use dbms_profiler, search this site for that.

TKPROF report for packages

Parag G, February 19, 2009 - 11:10 am UTC

I generated a TKPROF report for a package in Oracle 10g, but it did not show any individual query executions from the package. The package was executed in an anonymous pl/sql block. Can you please let me know if I missed something?

I had set trace for event 10046 at level 8.


Tom Kyte
February 19, 2009 - 1:43 pm UTC

in older releases like 10g and before - the cursor would only appear in the tkprof report when it was closed

and plsql nicely caches our sql.

so probably you

a) enabled trace
b) ran package
c) disabled trace
d) exit session

well, the cursors did not close until you exited the session - but you were not tracing anymore.

do this:

a) enable trace
b) run package
c) exit

then tkprof

Why row source is bigger than the total rows in table?

Tony, May 27, 2010 - 7:45 am UTC

Hi Tom,

I have a (part of) tkprof as below:

435120 TABLE ACCESS BY INDEX ROWID EDS_T (cr=881804 pr=2 pw=0 time=5139443 us)
435120 INDEX UNIQUE SCAN EDS_IDX1 (cr=446684 pr=0 pw=0 time=2793555 us)(object id 29894)

The row source shows more rows than the actual number of rows in the table. Why? Please explain.


Tom Kyte
May 27, 2010 - 7:50 am UTC

because it was inside of a nested loop join or something?

without "context", it is really hard "to guess"



ops$tkyte%ORA11GR2> create table t1
  2  as
  3  select 1 x, level y
  4    from dual
  5  connect by level <= 10;

Table created.

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> create table t2 (x int primary key, z int );

Table created.

ops$tkyte%ORA11GR2> insert into t2 values ( 1, 0 );

1 row created.

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> exec dbms_stats.set_table_stats( user, 'T1', numrows => 1000000, numblks => 10000 );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> exec dbms_stats.set_table_stats( user, 'T2', numrows => 1000000, numblks => 10000 );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> @trace
ops$tkyte%ORA11GR2> alter session set events '10046 trace name context forever, level 12';

Session altered.

ops$tkyte%ORA11GR2> select /*+ first_rows(10) */ *
  2    from t1, t2
  3   where t1.x = t2.x
  4  /

         X          Y          X          Z
---------- ---------- ---------- ----------
         1          1          1          0
         1          2          1          0
         1          3          1          0
         1          4          1          0
         1          5          1          0
         1          6          1          0
         1          7          1          0
         1          8          1          0
         1          9          1          0
         1         10          1          0

10 rows selected.



t2 has one row, but we'll have hit it 10 times in the nested loop:

select /*+ first_rows(10) */ *
  from t1, t2
 where t1.x = t2.x

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          1         18          0          10
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.00       0.00          1         18          0          10

Misses in library cache during parse: 1
Optimizer mode: FIRST_ROWS
Parsing user id: 316

Rows     Row Source Operation
-------  ---------------------------------------------------
     10  NESTED LOOPS  (cr=18 pr=1 pw=0 time=0 us)
     10   NESTED LOOPS  (cr=8 pr=1 pw=0 time=2799 us cost=12 size=520 card=10)
     10    TABLE ACCESS FULL T1 (cr=4 pr=1 pw=0 time=684 us cost=2 size=26000000 card=1000000)
     10    INDEX UNIQUE SCAN SYS_C0022732 (cr=4 pr=0 pw=0 time=0 us cost=0 size=0 card=1)(object id 101452)
     10   TABLE ACCESS BY INDEX ROWID T2 (cr=10 pr=0 pw=0 time=0 us cost=1 size=26 card=1)




so, did I guess right?
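The counting behind that example can be sketched in a few lines of Python (a toy model of a nested-loop join, not Oracle's actual join code - the point is only that the inner row source is visited once per outer row):

```python
# Toy nested-loop join: for each outer row we probe the inner table again,
# so the inner row source's "Rows" count is (outer rows) x (matches per probe),
# which can far exceed the number of rows actually stored in the inner table.
t1 = [(1, y) for y in range(1, 11)]   # 10 outer rows, all with x = 1
t2 = {1: 0}                           # inner table: a single row (x=1, z=0)

inner_rows_returned = 0
result = []
for x, y in t1:                       # outer scan: 10 rows
    if x in t2:                       # "index unique scan" on t2's key
        inner_rows_returned += 1      # t2 is hit once per outer row
        result.append((x, y, x, t2[x]))

print(len(result))            # 10 rows out of the join
print(inner_rows_returned)    # 10 - even though t2 holds only 1 row
```

Same shape as the tkprof above: a 1-row table reporting 10 rows in its row source line.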

You are correct Tom. Thanks

Tony, May 27, 2010 - 8:24 pm UTC


Why the CPU time is more than total elapsed time?

Vincent, May 31, 2010 - 2:49 am UTC

Hi Tom,

The tkprof below shows that CPU time is more than the elapsed time. How is that possible?

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.00          0          0          0           0
Execute      1   5075.55    4974.84       1641   43414120     266745           1
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2   5075.56    4974.84       1641   43414120     266745           1


Tom Kyte
May 31, 2010 - 8:32 am UTC

because things are counted at different levels - with sometimes different levels of precision.

We might be getting cpu in 1/100s of a second and elapsed time in 1/1000s of a second.

Or more likely, we are getting the results at slightly different times - eg: what if you:

start_cpu = get_cpu_time
start_ela = get_ela_time
do something
total_ela = get_ela_time - start_ela
total_cpu = get_cpu_time - start_cpu


Now, suppose you did that and you ran with all resources (no waiting on cpu), you would expect to have the same number for cpu and ela - but of course you cannot. Your cpu time included the time to get and compute the ela times. Your ela times only capture a fraction of that.

No big deal so far. But if you do that 43,414,120 times, a small error can easily add up and up and up....

And remember, the time slices are digital, not analog. If you are timing something with a precision of 1/1000 of a second - then each observation could be off by 9/10000 of a second.


ops$tkyte%ORA11GR2> select 43414120 * 9/10000 from dual;

43414120*9/10000
----------------
       39072.708



so, because the times are computed at slightly different times (have to be), and because the times might not have the same precision (depending on OS implementation), and because each time itself is only accurate to some precision - whenever you do something millions of times - the times will be "off".

they are better than good enough - they are very good indeed - but they are never going to be 'perfect'
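That back-of-the-envelope calculation can be sketched outside the database too (Python here rather than SQL, purely to illustrate; the 9/10000 s per-observation error bound is the one used in the discussion above):

```python
# Each timed observation can be off by up to 0.0009 s when the clock only
# ticks in 1/1000 s units; over millions of observations the worst-case
# accumulated error dwarfs the time of any single call.
observations  = 43_414_120    # the "current" call count from the tkprof above
per_obs_error = 9 / 10_000    # worst-case error per observation, in seconds

worst_case_drift = observations * per_obs_error
print(round(worst_case_drift, 3))   # ~39072.708 seconds of possible drift
```

So even a tiny per-observation quantization error can, in the worst case, account for a gap of hours between two counters that "should" agree.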

Can we ignore if CPU time is more than elapsed time?

Vincent, May 31, 2010 - 10:27 pm UTC

Tom,

My understanding of your reply is that we can ignore it (as it doesn't imply anything) if CPU time is more than the elapsed time, since it is due to a calculation error. Am I right?

Is it right to assume that if CPU time is less than the elapsed time, then there are some wait events causing the increased elapsed time? Thanks.



Tom Kyte
June 08, 2010 - 7:59 am UTC

not a calculation error, it is a quantization error....


... Is it right to assume that if CPU time is less than the elapsed time, then
there are some wait events causing the increased elapsed time? Thanks. ...

think about that critically for a moment. If CPU could be more than Elapsed due to quantization errors - could not the reverse be true?


It is probable, highly probable, that if

a) there are millions of events observed and elapsed is significantly higher than CPU

or

b) there are very few events observed and elapsed is higher than CPU

that you had a wait event for something, be it CPU, IO, lock, whatever.

TKProf

Tracy, June 01, 2011 - 5:13 am UTC

How can the overall totals elapsed times be less than the sum of the Event waited on times when the elapsed times are supposed to include waiting on events. See this snippet from a tkprof report:

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        0      0.00       0.00          0          0          0           0
Execute 353158    106.00     324.60      16575     735549    8181425      340591
Fetch   253375     11.08     170.39      13253     970839          0      365791
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   606533    117.08     495.00      29828    1706388    8181425      706382

Misses in library cache during parse: 0

Elapsed times include waiting on following events:
Event waited on                             Times   Max. Wait  Total Waited
----------------------------------------   Waited  ----------  ------------
SQL*Net message from client                429954        0.58        571.63
SQL*Net message to client                  429954        0.00          0.17
db file sequential read                     29828        0.63        381.78
latch: cache buffers lru chain                  8        0.00          0.00
SQL*Net more data to client                  6183        0.00          0.13
log file switch completion                      4        0.04          0.09
latch: cache buffers chains                     1        0.00          0.00


OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call      count       cpu    elapsed       disk      query    current        rows
------- -------  -------- ---------- ---------- ---------- ----------  ----------
Parse        64      0.00       0.00          0          0          0           0
Execute  751040     10.28      10.33          0        919     421709      205157
Fetch    546021      1.43       1.35         11        582          0      546737
------- -------  -------- ---------- ---------- ---------- ----------  ----------
total   1297125     11.71      11.69         11       1501     421709      751894

Misses in library cache during parse: 7
Misses in library cache during execute: 7

Elapsed times include waiting on following events:
Event waited on                             Times   Max. Wait  Total Waited
----------------------------------------   Waited  ----------  ------------
db file sequential read                        11        0.00          0.04
buffer busy waits                               2        0.00          0.00
latch: shared pool                              1        0.00          0.00

16 user SQL statements in session.
205302 internal SQL statements in session.
205318 SQL statements in session.



The total elapsed time for non-recursive is 495 seconds yet this is supposed to include waits of 571.63 and 381.78.

Can you explain?


Tom Kyte
June 01, 2011 - 2:32 pm UTC

because the counting of sqlnet message from client is rather "odd". It is a strange event. Your SQL didn't really "wait" on it, it is time we spent waiting to be told to do something.

It isn't part of the elapsed time of your SQL statement, it is a wait by the database on the client application - we were not executing anything during that time really.
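A rough sanity check bears this out (a sketch in Python using the non-recursive totals above; it simplifies by assuming elapsed ≈ CPU + in-database wait time, with "SQL*Net message from client" being idle time outside the statements):

```python
# Totals from the non-recursive tkprof summary above (seconds)
cpu_time      = 117.08
elapsed_time  = 495.00
db_file_reads = 381.78   # db file sequential read - genuinely inside the SQL
client_waits  = 571.63   # SQL*Net message from client - idle, outside the SQL

# Waits incurred *while executing* the SQL roughly account for elapsed time:
print(round(cpu_time + db_file_reads, 2))   # 498.86, close to the 495.00 elapsed

# Whereas folding in the idle client wait would overshoot wildly:
print(round(cpu_time + db_file_reads + client_waits, 2))   # 1070.49
```

The CPU plus disk-read waits land within a few seconds of the reported elapsed time; only by excluding the client-idle event do the numbers reconcile.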

tkprof misleading

Tracy, June 09, 2011 - 3:54 am UTC

Yes, that makes sense. It means that the tkprof output is slightly misleading with regard to "SQL*Net message from client" waits when it says:
"Elapsed times include waiting on following events"

Presumably in my example the "db file sequential read" waits are genuinely part of the sql elapsed time while the "SQL*Net message from client" waits aren't.

I think that it would be better if tkprof made a distinction between waits that are included in the elapsed time and those that aren't, to avoid confusion.

tkprof and AWR

A reader, October 31, 2011 - 2:34 am UTC

Hi Tom,

I am working on tuning a vendor app with high CPU utilization. The app just schedules jobs calling wrapped procedures, so I can't get many details on the SQL from v$session. On tracing, I found that the SQLs are not using bind variables and most of them are hard parsed.
Now my problem is that the total CPU consumption shown in tkprof is very low compared to what is present in the AWR report (for the session/job traced).

Below are the snippets from tkprof and AWR. AWR shows me the real CPU consumption (which is quite high) but tkprof doesn't. I am guessing excessive hard parsing is the culprit (see the library cache misses) but the problem is I can't see high parse CPU time in tkprof.

TKPROF Footer

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 0 0.00 0.00 0 0 0 0
Execute 0 0.00 0.00 0 0 0 0
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 0 0.00 0.00 0 0 0 0

Misses in library cache during parse: 0

OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 2153 8.02 7.80 0 0 0 0
Execute 6395 3.04 2.88 0 0 0 0
Fetch 6771 4.64 4.51 35 195171 0 7591
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 15319 15.70 15.21 35 195171 0 7591

Misses in library cache during parse: 719
Misses in library cache during execute: 21

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
db file sequential read 35 0.00 0.04
latch: shared pool 1 0.00 0.00

1437 user SQL statements in session.
1082 internal SQL statements in session.
2519 SQL statements in session.
********************************************************************************
Trace file: ././xxxx_j009_19673.trc
Trace file compatibility: 10.01.00
Sort options: default

1 session in tracefile.
1437 user SQL statements in trace file. (740 of these are the library cache misses mentioned above)
1082 internal SQL statements in trace file.
2519 SQL statements in trace file.
736 unique SQL statements in trace file.
110564 lines in trace file.
5608 elapsed seconds in trace file. (above we can see that 15.21 s is the total SQL elapsed time and 15.70 s the total CPU time)


AWR report
Below is the job which I traced.

Elapsed CPU Elap per % Total
Time (s) Time (s) Executions Exec (s) DB Time SQL Id
---------- ---------- ------------ ---------- ------- -------------
10,885 10,920 2 5442.6 5.4 44ybz0jhbpwcy
begin xxxxxx.xxxxxxx(:v1,:v2,:v3,:v4,:v5, sysdate,:v6,:v7,:v8,:v9,:v10,
:v11,:v12); end;


Tom Kyte
October 31, 2011 - 11:24 am UTC

wrapping code does NOTHING to hide the sql from you - not a thing. the sql is 100% visible - always - no way around it - in v$sql. So, I don't know what you mean in your first paragraph at all.

AWR is system wide, a tkprof is a single session


I cannot read any of your stuff since you didn't use the code button and it is all proportional fonts. Please use the code button so things LINE UP and are readable.

and tell us how you took this tkprof - how did you get tracing on for it


and - there were at least two executions of this during this window. And if there were others still executing - they could be contributing to these values as well.


TKPROF and AWR

A DBA, November 01, 2011 - 12:38 am UTC

Hi Tom,

Sorry for improper post. Here are all the details with code tags.

1) Tom, all I see in v$session is the wrapped procedure being executed. I can't view the code, and I can't see the individual SQLs fired from the wrapped body waiting on events. V$sql may have all the SQL, but I don't know which SQL was fired from which wrapped body, or in what order.

2) Yes AWR is system wide but the detail I pasted is of a particular job CPU consumption.

  Elapsed      CPU                  Elap per  % Total
  Time (s)   Time (s)  Executions   Exec (s)  DB Time    SQL Id
---------- ---------- ------------ ---------- ------- -------------
    10,885     10,920            2     5442.6     5.4 44ybz0jhbpwcy
begin xxxx.YYYYYY(:v1,:v2,:v3,:v4,:v5, sysdate,:v6,:v7,:v8,:v9,:v10,
:v11,:v12); end;


3) Yes, there are two executions in AWR, but the CPU is at least 5k sec per execution.
I used dba_jobs_running to find the sid and used dbms_monitor.session_trace_enable to enable tracing in that session (with waits and binds) to find the SQLs and the order in which they were fired, but there (in tkprof) the overall CPU time consumed is a mere 15 sec. The trace covered 5k wall-clock seconds.
OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS 

call  count    cpu  elapsed    disk    query  current    rows 
------- ------ -------- ---------- ---------- ---------- ---------- ---------- 
Parse    0    0.00    0.00      0      0      0      0 
Execute    0    0.00    0.00      0      0      0      0 
Fetch    0    0.00    0.00      0      0      0      0 
------- ------ -------- ---------- ---------- ---------- ---------- ---------- 
total    0    0.00    0.00      0      0      0      0 

Misses in library cache during parse: 0 

OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS 

call  count    cpu  elapsed    disk    query  current    rows 
------- ------ -------- ---------- ---------- ---------- ---------- ---------- 
Parse  2153    8.02    7.80      0      0      0      0 
Execute  6395    3.04    2.88      0      0      0      0 
Fetch  6771    4.64    4.51      35  195171      0    7591 
------- ------ -------- ---------- ---------- ---------- ---------- ---------- 
total  15319  15.70    15.21      35  195171      0    7591 

Misses in library cache during parse: 719 
Misses in library cache during execute: 21 

Elapsed times include waiting on following events: 
Event waited on                  Times  Max. Wait Total Waited 
----------------------------------------  Waited ---------- ------------ 
db file sequential read                35    0.00      0.04 
latch: shared pool                    1    0.00      0.00 

1437 user SQL statements in session. 
1082 internal SQL statements in session. 
2519 SQL statements in session. 
******************************************************************************** 
Trace file: ././xxxx_j009_19673.trc 
Trace file compatibility: 10.01.00 
Sort options: default 

    1 session in tracefile. 
  1437 user SQL statements in trace file. (740 of these are the library cache misses mentioned above)
  1082 internal SQL statements in trace file. 
  2519 SQL statements in trace file. 
  736 unique SQL statements in trace file. 
110564 lines in trace file. 
  5608 elapsed seconds in trace file. (above we can see that 15.21 s is the total SQL elapsed time and 15.70 s the total CPU time)



4) So in tkprof I am seeing only 15 sec DB time and 15 sec CPU time over 5k wall-clock seconds, but AWR is showing 5k sec CPU and elapsed time per execution.
Tom Kyte
November 01, 2011 - 3:07 am UTC

1) Tom all I see in v$session is the wrapped procedure being executed. I can't view the code. I can't see individual sql's fired from the wrapped body waiting for events. V$sql may have all the sql but don't know which sql fired from which wrapped body and in what order.

you wouldn't know that if the code was not wrapped either would you. You need a sql_trace for that *regardless* of the wrapped-ness of the code.



2) Yes AWR is system wide but the detail I pasted is of a particular job CPU consumption.

*no*, it is not. It is system wide, you cannot "awr" just a single job.


3) Yes there are two executions in AWR but cpu is 5k/execution atleast.

No, no it is not. It could easily be 10k for one and 1 second for the other. It doesn't have to be "the same", it is not 'fair' that way.



3) Yes there are two executions in AWR but cpu is 5k/execution atleast.
I used dba_jobs_running to find sid and used dbms_monitor.session_trace_enable to enable tracing in that session (with wait and binds) to find the sqls' and order in which they are fired, but there(tkprof) the overall cpu time consumed is mere 15sec. Tkprof was run for 5k wall clock sec.


that probably explains it - you started tracing AFTER the job was running - so you only see some of the sql and you do not see the plsql block itself. Do you see your plsql block in the trace, the *originating* plsql block? If not, you are just seeing the times for the sql cursors that were opened and closed during the time the tracing was taking place.

That might mean the cpu time is spent in the plsql code, not in the sql executed by the plsql code.

Cannot you run one of their jobs from the sqlplus command line so you can

a) turn on sql trace
b) run the plsql
c) exit (close the session, close all cursors)
d) then tkprof?
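The mismatch is visible directly in the numbers posted (a back-of-the-envelope sketch in Python, using the trace-file totals quoted above):

```python
# Totals from the tkprof above (seconds)
trace_window = 5608.0   # elapsed seconds covered by the trace file
sql_elapsed  = 15.21    # total elapsed time of the traced SQL
sql_cpu      = 15.70    # total CPU time of the traced SQL

# Time inside the trace window NOT accounted for by SQL execution -
# per the discussion above, it was spent either in PL/SQL code or in
# cursors that were opened before tracing began
unaccounted = trace_window - sql_elapsed
print(round(unaccounted, 2))   # ~5592.79 seconds unexplained by the SQL
```

Nearly the entire trace window is unexplained by SQL, which is what points the finger at the PL/SQL code (or at cursors opened before tracing started).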






Yes PL/SQL code may be culprit

A reader, November 07, 2011 - 6:05 am UTC

Tom,

We don't have much data in the dev environment to run the job manually and trace it as suggested. I am trying to get the DB populated, but it's stuck in approval cycles.

I know for sure that the SQLs executed aren't consuming significant CPU time, but the job as a whole is (I checked cpu_time from v$sqlarea for the job's sql_id and it shows 3 hrs of CPU time, yet the trace I ran for 90 min contained just 15 sec of SQL execution). It is most probably the PL/SQL code, which is wrapped - perhaps an operation like searching a PL/SQL table. Is there a way to find out what my session is executing in PL/SQL when it is not firing SQL? I don't think wrapped code can be profiled.

Thanks.
Tom Kyte
November 07, 2011 - 12:05 pm UTC

google

unwrap plsql


see what you see

RAW SQL Trace "tim" column

Rahul, November 11, 2011 - 5:26 am UTC

Hi Tom,

While reading a raw trace file I found the details below, which I am not able to understand. I have put "##########" around the lines where I found the anomaly.

PARSING IN CURSOR #47 len=326 dep=3 uid=33 oct=3 lid=33 tim=28446736188995 hv=2449322735 ad='7a4ab410'
/***  QUERY  ***/
END OF STMT
PARSE #47:c=0,e=42,p=0,cr=0,cu=0,mis=0,r=0,dep=3,og=1,tim=28446736188989
BINDS #47:
kkscoacd
 Bind#0
  oacdty=01 mxl=32(07) mxlc=00 mal=00 scl=00 pre=00
  oacflg=13 fl2=0001 frm=01 csi=871 siz=96 off=0
  kxsbbbfp=ffffffff79743288  bln=32  avl=07  flg=05
  value="DOV1001"
 Bind#1
  oacdty=01 mxl=32(29) mxlc=00 mal=00 scl=00 pre=00
  oacflg=13 fl2=0001 frm=01 csi=871 siz=0 off=32
  kxsbbbfp=ffffffff797432a8  bln=32  avl=29  flg=01
  value="9999,CISUSER.[J@MVQOFMOE9952%"
 Bind#2
  oacdty=01 mxl=32(01) mxlc=00 mal=00 scl=00 pre=00
  oacflg=13 fl2=0001 frm=01 csi=871 siz=0 off=64
  kxsbbbfp=ffffffff797432c8  bln=32  avl=01  flg=01
  value="U"
EXEC #47:c=0,e=636,p=0,cr=0,cu=0,mis=0,r=0,dep=3,og=1,tim=28446736190867
FETCH #47:c=0,e=61,p=0,cr=2,cu=0,mis=0,r=0,dep=3,og=1,tim=28446736191026
STAT #47 id=1 cnt=0 pid=0 pos=1 obj=0 op='NESTED LOOPS  (cr=2 pr=0 pw=0 time=119 us)'
STAT #47 id=2 cnt=0 pid=1 pos=1 obj=459351 op='TABLE ACCESS BY INDEX ROWID TAGS (cr=2 pr=0 pw=0 time=110 us)'
STAT #47 id=3 cnt=0 pid=2 pos=1 obj=459356 op='INDEX RANGE SCAN TAGSIDX3 (cr=2 pr=0 pw=0 time=102 us)'
STAT #47 id=4 cnt=0 pid=1 pos=2 obj=3074 op='TABLE ACCESS BY INDEX ROWID TAGDEFS (cr=0 pr=0 pw=0 time=0 us)'
STAT #47 id=5 cnt=0 pid=4 pos=1 obj=3075 op='INDEX UNIQUE SCAN TAGDEFS_PK (cr=0 pr=0 pw=0 time=0 us)'
FETCH #29:c=10000,e=150,p=0,cr=1,cu=0,mis=0,r=1,dep=3,og=1,tim=28446736191670
BINDS #23:
kkscoacd
 Bind#0
  oacdty=01 mxl=128(40) mxlc=00 mal=00 scl=00 pre=00
  oacflg=03 fl2=1206001 frm=01 csi=871 siz=160 off=0
  kxsbbbfp=ffffffff799cf588  bln=128  avl=34  flg=05
  value="9999,CISUSER.[J@MV]LLWEG001344.001"
 Bind#1
  oacdty=01 mxl=32(20) mxlc=00 mal=00 scl=00 pre=00
  oacflg=03 fl2=1206001 frm=01 csi=871 siz=0 off=128
  kxsbbbfp=ffffffff799cf608  bln=32  avl=07  flg=01
  value="DOV1001"
EXEC #23:c=0,e=351,p=0,cr=0,cu=0,mis=0,r=0,dep=3,og=1,tim=28446736192423
FETCH #23:c=0,e=36,p=0,cr=1,cu=0,mis=0,r=1,dep=3,og=1,tim=28446736192551
BINDS #22:
kkscoacd
 Bind#0
  oacdty=01 mxl=128(40) mxlc=00 mal=00 scl=00 pre=00
  oacflg=03 fl2=1206001 frm=01 csi=871 siz=160 off=0
  kxsbbbfp=ffffffff799cf588  bln=128  avl=34  flg=05
  value="9999,CISUSER.[J@MV]LLWEG001344.001"
 Bind#1
  oacdty=01 mxl=32(20) mxlc=00 mal=00 scl=00 pre=00
  oacflg=03 fl2=1206001 frm=01 csi=871 siz=0 off=128
  kxsbbbfp=ffffffff799cf608  bln=32  avl=07  flg=01
  value="DOV1001"
EXEC #22:c=0,e=383,p=0,cr=0,cu=0,mis=0,r=0,dep=3,og=1,tim=28446736193090
FETCH #22:c=0,e=360,p=0,cr=6,cu=0,mis=0,r=1,dep=3,og=1,tim=28446736193534
BINDS #49:
kkscoacd
 Bind#0
  oacdty=02 mxl=22(21) mxlc=00 mal=00 scl=00 pre=00
  oacflg=13 fl2=206001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=ffffffff799cf610  bln=22  avl=04  flg=05
  value=39030100
EXEC #49:c=0,e=229,p=0,cr=0,cu=0,mis=0,r=0,dep=3,og=1,tim=28446736193946
FETCH #49:c=0,e=552,p=0,cr=19,cu=0,mis=0,r=1,dep=3,og=1,tim=28446736194547
BINDS #38:
kkscoacd
 Bind#0
  oacdty=02 mxl=22(21) mxlc=00 mal=00 scl=00 pre=00
  oacflg=13 fl2=206001 frm=00 csi=00 siz=48 off=0
  kxsbbbfp=ffffffff799cf5f8  bln=22  avl=04  flg=05
  value=39030100
 Bind#1
  oacdty=02 mxl=22(21) mxlc=00 mal=00 scl=00 pre=00
  oacflg=13 fl2=206001 frm=00 csi=00 siz=0 off=24
  kxsbbbfp=ffffffff799cf610  bln=22  avl=02  flg=01
  value=9
EXEC #38:c=0,e=342,p=0,cr=0,cu=0,mis=0,r=0,dep=3,og=1,tim=28446736195083


######### FETCH #38:c=0,e=52,p=0,cr=2,cu=0,mis=0,r=0,dep=3,og=1,tim=28446736195184 ########### 
*** 2011-10-10 07:03:54.615
BINDS #40:
kkscoacd
 Bind#0
  oacdty=01 mxl=32(20) mxlc=00 mal=00 scl=00 pre=00
  oacflg=03 fl2=1206001 frm=01 csi=871 siz=32 off=0
  kxsbbbfp=ffffffff7974e760  bln=32  avl=11  flg=05
  value="F3Z_MC_IOQC"
#########EXEC #40:c=0,e=633,p=0,cr=0,cu=0,mis=0,r=0,dep=3,og=1,tim=28449786653016#########


FETCH #40:c=0,e=89,p=0,cr=2,cu=0,mis=0,r=1,dep=3,og=1,tim=28449786653245
BINDS #63:
kkscoacd
 Bind#0
  oacdty=01 mxl=32(20) mxlc=00 mal=00 scl=00 pre=00
  oacflg=03 fl2=1206001 frm=01 csi=871 siz=64 off=0
  kxsbbbfp=ffffffff7974e740  bln=32  avl=11  flg=05
  value="F3Z_MC_IOQC"
 Bind#1
  oacdty=01 mxl=32(20) mxlc=00 mal=00 scl=00 pre=00
  oacflg=03 fl2=1206001 frm=01 csi=871 siz=0 off=32
  kxsbbbfp=ffffffff7974e760  bln=32  avl=07  flg=01
  value="DOV1001"


1) Looking at the tim values on the two highlighted lines, there is a difference of roughly 3050 sec (("tim=28449786653016" - "tim=28446736195184")/1000000), but it is reflected nowhere in any "e" (elapsed) value.
2) From the highlighted lines I get the following information:
At time 28446736195184 a fetch was performed from cursor #38 taking < 100 microseconds (e).
At time 28449786653016 an execute of cursor #40 was performed taking < 650 microseconds. Between these two timestamps there is no information about what work happened. Does this mean no SQL query was fired?
Tom Kyte
November 11, 2011 - 8:22 am UTC

you were not tracing waits here. Do it again with waits on if you want to see ELA's.

You were likely waiting on "sql net message from client", meaning you were just sitting there idle in the database.
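The size of that silent gap can be computed from the two tim values (a sketch in Python; it assumes tim is in microseconds on this platform, as the e= values suggest):

```python
# "tim" values (microseconds) from the two highlighted trace lines above
fetch_cur38 = 28446736195184   # FETCH #38
exec_cur40  = 28449786653016   # EXEC #40

gap_seconds = (exec_cur40 - fetch_cur38) / 1_000_000
print(round(gap_seconds, 1))   # ~3050.5 s with no database activity recorded
```

Roughly 51 minutes between two trace lines, with nothing in between - consistent with the session sitting idle, waiting on the client, rather than any SQL being fired.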


Really Puzzling seeing your example

Andy, December 28, 2011 - 12:47 pm UTC

Tom - I was really confused when I tried your first example in our environment (or rather, tried to simulate it) but got a different TKPROF result. So you showed us an example of:

select count(subobject_name)
from
big_table.big_table

So I have a HUGE table, and I created an EMPTY table that looks just like the HUGE table - all triggers, constraints, the whole works on this EMPTY table mimicking the HUGE table. Now I tried to:

1. Started up trace

2.
INSERT INTO HUGE_TABLE
select 123456 id
from HUGE_TABLE
where id=123456;

3.
INSERT INTO EMPTY_TABLE
select 123456 id
from HUGE_TABLE
where id=123456;

4. Stopped trace.

The trace shows that it took more CPU and/or disk reads (db file sequential read) for the INSERT into HUGE_TABLE than for the INSERT into EMPTY_TABLE. I would have expected the same behavior, as it's reading from the same HUGE_TABLE, like you showed in your example where, once you indexed it, the db reads went way down. But in my case I already had the index?

Tom Kyte
December 29, 2011 - 11:07 am UTC

give the full example just like I did so we can see what you did step by step and replicate it.


but in short, it will take more IO to insert into an existing huge index than it would an empty index.

Really Puzzling seeing your example

A reader, December 29, 2011 - 3:55 pm UTC

OK, so here are the steps:

1. existing HUGE table :
CREATE TABLE BD
(
BID NUMBER(15),
DID NUMBER(15),
ORD NUMBER(10) DEFAULT 0,
TSTAMP TIMESTAMP(6) DEFAULT localtimestamp,
USERID NUMBER(15)
)
LOGGING
NOCOMPRESS
NOCACHE
NOPARALLEL
MONITORING;

This has 2 billion rows. PK on BID and DID.

CREATE UNIQUE INDEX BD_IX0 ON BD
(BID, ORD, DID)
TABLESPACE INDX
NOLOGGING
NOPARALLEL;

CREATE UNIQUE INDEX BID_IX1 ON BD
(DID,BID)
TABLESPACE INDX
NOLOGGING
NOPARALLEL;

CREATE INDEX BD_IX2 ON BD
(BID,USERID,TSTAMP)
TABLESPACE INDX
NOLOGGING
NOPARALLEL;

2. Create EMPTY_TABLE as HUGE TABLE

CREATE TABLE BD_TEST
(
BID NUMBER(15),
DID NUMBER(15),
ORD NUMBER(10) DEFAULT 0,
TSTAMP TIMESTAMP(6) DEFAULT localtimestamp,
USERID NUMBER(15)
)
LOGGING
NOCOMPRESS
NOCACHE
NOPARALLEL
MONITORING;

Empty table. PK on BID and DID. Same indexes as the HUGE table.

3. START TRACE
4. INSERT INTO BD (BID,DID)
select 123456 Bid
,DID
from BD
where bid=323456356;

5.
INSERT INTO BD_TEST (BID,DID)
select 123456 Bid
,DID
from BD
where bid=323456356;

6. Stop trace.

Trace for HUGE table :

INSERT INTO bd
( bid
,did)
select 123456 BID
,DID
from BD
where bid=323456356

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1    209.92    1356.32     838044     264545   14217193     1789191
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2    209.92    1356.32     838044     264545   14217193     1789191

Misses in library cache during parse: 1
Optimizer goal: ALL_ROWS
Parsing user id: 2871 (###########)

Rows Row Source Operation
------- ---------------------------------------------------
1789191 INDEX RANGE SCAN BD_IX0 (cr=10174 pr=224 pw=0 time=1789350 us)(object id 388960)


TRACE from EMPTY TABLE :

INSERT INTO bd_test
( bid
,did)
select 123456 BID
,DID
from BD
where bid=323456356

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1     40.77      67.12      11584     179902    8856931     1789191
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2     40.78      67.12      11584     179902    8856931     1789191

Misses in library cache during parse: 1
Optimizer goal: ALL_ROWS
Parsing user id: 2871 (###########)

Rows Row Source Operation
------- ---------------------------------------------------
1789191 INDEX RANGE SCAN BD_IX0 (cr=10174 pr=2514 pw=0 time=1789374 us)(object id 388960)
Tom Kyte
December 30, 2011 - 10:22 am UTC

this is relatively easy to explain.

when you insert into the table with lots of data - it already has "huge" indexes sitting out there on disk. As you insert more rows into this table - it will ultimately have to read those existing indexes into memory (probably, most of their blocks will be read in) to maintain them. The extra physical and logical IOs you see are for reading those existing indexes into memory.

All we have to read into memory index wise for the empty table are the empty blocks to hold the data we are inserting - that is a LOT less than the existing big indexes.


which is basically what I already said :)

but in short, it will take more IO to insert into an existing huge index than it would an empty index.
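The two tkprof outputs above quantify the point (a small sketch in Python comparing the Execute-line figures posted; same 1,789,191 rows inserted in both cases):

```python
# Execute-line figures from the two tkprof sections above
huge  = {"cpu": 209.92, "elapsed": 1356.32, "disk": 838044, "current": 14217193}
empty = {"cpu":  40.77, "elapsed":   67.12, "disk":  11584, "current":  8856931}

# Same rows inserted - the difference is index maintenance: the existing
# big index blocks must be read in before they can be updated.
print(round(huge["disk"] / empty["disk"]))        # ~72x more physical reads
print(round(huge["elapsed"] / empty["elapsed"]))  # ~20x more elapsed time
```

Roughly 72 times the physical IO and 20 times the elapsed time for the same row count - all of it attributable to maintaining the pre-existing big indexes.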

Bad query

Jrun, February 26, 2013 - 6:19 pm UTC

Tom,
DB: 11.2.0.3
We have a 3rd party app and the web app runs very slowly. We are pressing the 3rd party to fix the issue. For the app login process, I ran an AWR report and found the problem query; it runs 10 mins. Then I did the SQL trace; here it is:
select clndr_id , count(*)
from
task where (clndr_id = :"SYS_B_0") group by clndr_id union select clndr_id ,
count(*) from project where (clndr_id = :"SYS_B_1") group by clndr_id


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.00          0          0          0           0
Execute      1      0.01       0.00          0          0          0           0
Fetch        2     53.32     612.03      81650      58920          0           2
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4     53.34     612.04      81650      58920          0           2

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 34 (PRIVUSER)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         2          2          2  SORT UNIQUE (cr=58923 pr=81650 pw=22868 time=113329109 us cost=58277 size=24 card=2)
         2          2          2   UNION-ALL  (cr=58923 pr=81650 pw=22868 time=113329001 us)
         1          1          1    SORT GROUP BY NOSORT (cr=58330 pr=81070 pw=22868 time=104312437 us cost=58128 size=7 card=1)
   5589739    5589739    5589739     VIEW  index$_join$_003 (cr=58330 pr=81070 pw=22868 time=619784236 us cost=57240 size=38875249 card=5553607)
   5589739    5589739    5589739      HASH JOIN (cr=58330 pr=81070 pw=22868 time=617373467 us)
   5590158    5590158    5590158       INDEX RANGE SCAN NDX_TASK_CALENDAR (cr=21676 pr=21676 pw=0 time=113637058 us cost=11057 size=38875249 card=5553607)(object id 24749)
   6673774    6673774    6673774       INDEX FAST FULL SCAN NDX_TASK_PROJ_RSRC (cr=36651 pr=36526 pw=0 time=213370625 us cost=21921 size=38875249 card=5553607)(object id 217274)
         1          1          1    SORT GROUP BY NOSORT (cr=593 pr=580 pw=0 time=9016527 us cost=149 size=17 card=1)
    136390     136390     136390     INDEX FAST FULL SCAN NDX_PROJECT_CALENDAR (cr=593 pr=580 pw=0 time=165434 us cost=132 size=2315876 card=136228)(object id 154409)


Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   MODE: ALL_ROWS
      2   SORT (UNIQUE)
      2    UNION-ALL
      1     SORT (GROUP BY NOSORT)
5589739      TABLE ACCESS   MODE: ANALYZED (BY INDEX ROWID) OF 'TASK' (TABLE)
5589739       INDEX   MODE: ANALYZED (RANGE SCAN) OF 'NDX_TASK_CALENDAR' (INDEX)
5590158     SORT (GROUP BY NOSORT)
6673774      INDEX   MODE: ANALYZED (RANGE SCAN) OF 'NDX_PROJECT_CALENDAR' (INDEX)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  Disk file operations I/O                        2        0.00          0.00
  db file sequential read                     22235        1.61        138.66
  direct path write                            1620        3.25        177.42
  db file scattered read                       2313        1.89        238.98
  direct path read                              385        1.72         19.52
  SQL*Net message from client                     2        0.11          0.21

Please comment on the query; we will use your analysis to get the 3rd party's attention.
Tom Kyte
February 27, 2013 - 7:59 am UTC

oh great, you not only have performance issues but a huge security risk in using this 3rd party application.

It uses string concatenation to put everything together - no bind variables. You are using cursor sharing = force/similar to attempt to fix the problem, but that fix only marginally helps scalability; it does nothing (and can do nothing) to protect you from the inherent security bugs in the 3rd party application. Be careful with this application....

Your average physical IO is taking 0.1 seconds. That is beyond horrible. You have a serious misconfiguration there - an IO cannot be taking 1/10th of a second can it? really???? They should be taking on the order of 5ms (0.005) - the time spent waiting for db file scattered read should be around 11 seconds, not 238.

then look at your write times - you have a write that took OVER THREE SECONDS. Seriously? What are you using for IO - stone tablets?



that said, I cannot really comment on the query. It is obvious from the plan that one of your tables is a view - we don't know the table structure, we don't know what indexes are in place, we don't know what the counts (the answer to the query) are, we cannot judge whether the fast full scans are sensible or not.


but get on your IO there - just look at those numbers. When you see an individual IO taking over one second - you *know* for a fact you have a serious issue outside of the database, unrelated to the database.

1 second is three orders of magnitude greater than where these numbers should be - 3 orders of magnitude is astronomically *huge*
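The per-event averages Tom is reacting to can be recomputed from the wait-event summary. A small Python sketch using the numbers copied from this trace (the ~5 ms figure is the rule-of-thumb target Tom cites for a healthy single-block read, not something the trace itself reports):

```python
# Per-I/O averages from the wait-event summary in the tkprof report above.
waits = {
    # event: (times waited, total waited in seconds)
    "db file sequential read": (22235, 138.66),
    "direct path write":       (1620, 177.42),
    "db file scattered read":  (2313, 238.98),
    "direct path read":        (385, 19.52),
}

for event, (times, total) in waits.items():
    avg_ms = total / times * 1000
    print(f"{event:30s} avg {avg_ms:7.2f} ms per wait")

# db file scattered read averages ~103 ms per wait - roughly 20x a
# ~5 ms target - which is what makes the I/O setup look broken.
```

Note that multiplying the event count by a healthy 5 ms gives 2313 * 0.005 ≈ 11.6 seconds, which is where the "should be around 11 seconds, not 238" figure in the answer comes from.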

Doubt about your last answer

A reader, February 27, 2013 - 12:22 pm UTC

Tom, your analysis regarding the I/O response times is to the point and precise, but I had one doubt: could the cause of the abysmal I/O performance not be an awful lot of horrible SQL (in other sessions, but inside the database nonetheless) doing so many I/Os that the I/O bandwidth of the server is simply saturated (the I/O subsystem just can't cope with all the requests, so each and every I/O must queue and wait to be serviced)? Or does something in the tkprof output and/or the scenario above rule out this possibility and lead you to point to an outside-the-database issue, and if so, what is it?

Regards,

J. Laurindo Chiappa

Tom Kyte
February 27, 2013 - 1:36 pm UTC

that means they have a horrible configuration issue.

too many people doing too many things on too few hardware resources. Everyone suffers - they should back off greatly on the degree of concurrency if this is the case.



So, either hardware is totally broken

or - if it could perform well

overloaded


either way, it is a misconfiguration. If overloaded they either need to

a) get more hardware resources, you cannot get blood from a stone
b) have less concurrency - everyone will run faster if they limit the number of active sessions to a lower number

see:
http://www.youtube.com/watch?v=xNDnVOCdvQ0

for evidence that (b) is a true statement.



either way, it is outside the database - misconfigured connection pools in the app tier, lack of use of CMAN if you are client server.

Bad query

Jrun, February 28, 2013 - 5:31 pm UTC

"oh great, you not only have performance issues but a huge security risk in using this 3rd party application.

It uses string concatenation to put everything together - no bind variables,
"
Could you please elaborate?: 1) security issue of the query 2) why it does not use bind variable?
Tom Kyte
February 28, 2013 - 5:53 pm UTC

it is obvious it does not use bind variables. I can see clearly that cursor sharing=force/similar is being used based on:

task where (clndr_id = :"SYS_B_0")


*we* see the :"SYS_B_0" stuff - that is cursor sharing = force/similar at play. It means the developer built a unique sql statement using string concatenation - and their statement looked something like:

task where (clndr_id = 4321423) ...


with literals. When you do that, you are subject to sql injection.


1) security issue = sql injection risks on every single sql statement

2) not using bind variables = they didn't use bind variables, they used literals and are trying to mask that fact by using cursor sharing = force/similar.
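The difference is easy to demonstrate outside Oracle. A minimal sketch using Python's built-in sqlite3 module (the table, column, and attack string are made up for illustration, not taken from the 3rd party app): concatenating a user-supplied value into the SQL text lets the user rewrite the statement, while a bind variable treats the input purely as data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task (clndr_id INTEGER)")
conn.executemany("INSERT INTO task VALUES (?)", [(1,), (1,), (2,)])

malicious = "1) OR (1=1"   # attacker-controlled "calendar id"

# String concatenation: the input becomes part of the SQL text, so the
# attacker's OR clause is parsed and executed - every row is counted.
sql = "SELECT count(*) FROM task WHERE (clndr_id = " + malicious + ")"
print(conn.execute(sql).fetchone()[0])

# Bind variable: the input is only ever data; a non-numeric string
# matches no integer clndr_id, so nothing leaks.
sql = "SELECT count(*) FROM task WHERE (clndr_id = ?)"
print(conn.execute(sql, (malicious,)).fetchone()[0])
```

cursor_sharing=force rewrites the literal into :"SYS_B_0" only after the attacker's text has already been glued into the statement, which is why it cannot fix the injection problem.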

Bad query

Jrun, March 01, 2013 - 5:52 am UTC

"
2) not using bind variables = they didn't use bind variables, they used literals and are trying to mask that fact by using cursor sharing = force/similar.
"
How much difference is there between faked bind variables (cursor_sharing) and real, enforced bind variables?

We have learned that they are not willing to look into the bind variable issues.

Tom Kyte
March 01, 2013 - 7:08 am UTC

tell them that they have security bugs then - sql injection. better yet, take a little time and find one :) then tell them that they need to fix it or you might write a blog or something showing the world how to attack their application (and not just fix the ONE occurrence, but all of them)



The difference is:

a) they turned a hard parse problem into a soft parse problem. That is good on the face of it, but still not where they should be. A well written system that parses only when it needs to will be able to do almost 50% more work than a system that soft parses like mad (and a system that hard parses will be able to achieve about 10% of the workload of a well written system that minimizes parsing).

http://www.youtube.com/watch?v=1oddFEyUAjs


b) they have a security bug, sql injection

c) they have negatively impacted the entire database - not just their own application - by demanding that cursor sharing be set to a non-default value. Every parse takes longer now because of them; all of the issues inherent in cursor sharing=force/similar are thrust upon even the correctly written applications.
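The "parse only when it needs to" point in (a) is a client-side coding pattern. A hedged sketch in Python (sqlite3 again standing in for the Oracle client API; table and data are invented): build one statement with bind placeholders and execute it many times, instead of rebuilding a new literal-laden statement per call, which forces a parse per call.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, val TEXT)")

# Anti-pattern: a brand-new SQL text per call means a parse per call
# (a hard parse without cursor_sharing, a soft parse with it).
for i in range(100):
    conn.execute(f"INSERT INTO t VALUES ({i}, 'row{i}')")

# Parse-once pattern: one SQL text, many executions with bind values.
# The database (and the driver's statement cache) sees a single cursor.
rows = [(i, f"row{i}") for i in range(100)]
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

print(conn.execute("SELECT count(*) FROM t").fetchone()[0])
```

Both loops insert the same 100 rows, but the first generates 100 distinct statements for the server to parse while the second generates exactly one.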

Bad Query

Jrun, March 02, 2013 - 8:39 am UTC

Tom,
Can we say a no-bind-variables app is 100% subject to sql injection?
Thanks,
Jrun
Tom Kyte
March 04, 2013 - 4:11 pm UTC

You can say that an application that uses string concatenation to build SQL has the opportunity to have sql injectable code, and every bit of it must be inspected to death to see if it has a bug.

in my experience, if you have many developers working on a project and they are not using bind variables, you have sql injection bugs - it just takes one developer making one tiny mistake to have it.

tkprof information not consistent with actual execution time

A reader, December 05, 2013 - 6:22 am UTC

hi Tom,

below is the matrix from my 10046 trace file:

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.46       0.58          0          0          0           0
Execute      1      0.74       8.23          0         88          0           0
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      1.21       8.82          0         88          0           0


As per my understanding, the total "execution time" shown here is approximately 10 seconds (8.82 + 1.21), but the query takes 5 minutes to complete.

Am I calculating the "execution time" the wrong way, i.e. is the calculation of approximately 10 seconds wrong?


more data

A reader, December 05, 2013 - 8:52 am UTC

This is the TKPROF command used to format the trace file:

TKPROF EBPAPROD1_ora_72133_trends.trc trends.prf SORT = (PRSDSK, EXEDSK, FCHDSK) PRINT = 10

Difference between cpu time and elapsed time

fjlee, July 10, 2014 - 3:10 pm UTC

According to a previous entry on this page, if statistics_level=all, there should not be much difference between CPU time and elapsed time in a tkprof report of a 10046 trace of a SQL statement.

I set statistics_level=all and created a tkprof report from a 10046 trace of a test session. The test was run on an AIX LPAR running RDBMS 11.2.0.3.

Looking at the tkprof output, I see a large difference between the CPU time and elapsed time for some statements. E.g. SQL ID: 1mnuj5ybbuz3d:

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      4     89.91     214.43        282     532640       4722       12758
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        5     89.91     214.43        282     532640       4722       12758

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file sequential read                       282        0.03          0.57

I.e. CPU (89.91) + Waits (0.57) = 90.48, which is only 42% of the elapsed time of 214.43 secs.

I don't know why it is different from the previous claim on this page.

Please explain the reason.

Thanks.

fjlee, July 14, 2014 - 1:20 pm UTC

I found another article on "Big Difference between CPU and ELAPSED in TKPROF output", version 10.2.0.3.

https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:296449500346066515

It explains that the time spent waiting for a CPU, while all the other concurrently running sessions compete for it, should be taken into account, and that this CPU wait time can't be "timed" by the database. That accounts for the "Big Difference between CPU and ELAPSED in TKPROF output".

Thanks.
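The gap discussed in the two posts above is the report's "unaccounted-for" time. A small sketch of the arithmetic using the numbers from the tkprof output above (attributing the gap to CPU run-queue time is the explanation from the linked answer, not something the trace states directly):

```python
elapsed = 214.43   # total elapsed seconds from the call summary
cpu     = 89.91    # total CPU seconds
waits   = 0.57     # sum of timed wait events (db file sequential read)

accounted   = cpu + waits
unaccounted = elapsed - accounted

print(f"accounted:   {accounted:.2f} s ({accounted / elapsed:.0%} of elapsed)")
print(f"unaccounted: {unaccounted:.2f} s")
# ~124 s is attributed neither to CPU nor to any timed wait event; on a
# busy host the usual suspect is time spent queued waiting for a CPU.
```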

