Skip to Main Content
  • Questions
  • How can we diagnose a process that does not report to the wait interface?

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, Kim.

Asked: September 02, 2015 - 8:54 pm UTC

Last updated: September 14, 2015 - 4:19 pm UTC

Version: 11.2.0.3, Aix

Viewed 1000+ times

You Asked

From time to time, we see a process that uses a full cpu core, but does (almost) not log any waits.
A 10046 trace shows fx an i/o wait roughly every minuts, taking few milliseconds. A hanganalyze trace shows the exact same picture.

Are there a note or similar, we can follow to diagnose this behavior?

and Connor said...

Common causes for such behaviour are:

A- excessive logical IO, eg, you're spinning around in a (say) a tightly bound nested loop, so no physical IO (and hence minimal waits) are detected

B- logic bug (eg, endless loop in PLSQL)

C- or database bug (spinning on parse etc).

If you look at SQL_ID in V$SESSION, you'll get an idea of whether its running a SQL when you observe this behaviour. If there is (and its consistently the same SQL) then target that SQL.

If there is an SQL_ID, and its changing constantly, then you've got a potential logic error, ie, something is running lots of SQL's over and over (but you'd expect to see EXEC calls in the 10046 if that was the case).

If that gets you no joy, then you really have to start looking at deeper level trace tools.

Some good information about that here:

https://db-blog.web.cern.ch/blog/luca-canali/2014-12-life-oracle-io-tracing-logical-and-physical-io-systemtap

Or perhaps consider Dtrace if your platform supports it. (I dont *think* AIX does, but I'm not 100% sure).

Hope this helps.

Rating

  (1 rating)

Is this answer out of date? If it is, please let us know via a Comment

Comments

But....(?!?)

Kim Anthonisen, September 03, 2015 - 6:32 am UTC

This is not a question about (pl)sql optimization.

The session/process is running a single sql statement (according to v$session), and waiting for reading the exact same block. This picture stays for about a minute, and then it shifts to reading another block.

I have never seen a session not reporting to the wait interface before. If it spinning on logical (cached) i/o, it should still report to the wait interface, right? Or am I missing something?

I would like to be able to investigate the process from the O/S, so truss/dtrace kind of stuff would be usefull, I guess.

Can You point me to any good source for doing this?
Connor McDonald
September 04, 2015 - 3:31 am UTC

The wait interface is about when you are *not* on CPU, ie, you are waiting for something to complete (eg a physical IO) before you can then consume CPU.

If you are *already* on CPU, eg, logical IO, then no waits will be recorded because you are not waiting for anything.

For example,

SQL> drop table T purge;

Table dropped.

SQL> create table T (x int primary key);

Table created.

SQL> insert into T values (1);

1 row created.

SQL> commit;

Commit complete.

SQL> select event, time_waited_micro
  2  from v$session_event
  3  where sid = sys_context('USERENV','SID');

EVENT                                                            TIME_WAITED_MICRO
---------------------------------------------------------------- -----------------
Disk file operations I/O                                                     51359
control file sequential read                                                  2269
enq: RO - fast object reuse                                                   3780
log file sync                                                                 1289
db file sequential read                                                      62818
db file scattered read                                                        3487
SQL*Net message to client                                                      472
SQL*Net message from client                                              133222790
SQL*Net break/reset to client                                                  122
events in waitclass Other                                                     2891

10 rows selected.

SQL> select count(*)
  2  from (
  3  select x
  4  from T
  5  connect by x <= x
  6  )
  7  where rownum < 1000000;

  COUNT(*)
----------
    999999

SQL> select event, time_waited_micro
  2  from v$session_event
  3  where sid = sys_context('USERENV','SID');

EVENT                                                            TIME_WAITED_MICRO
---------------------------------------------------------------- -----------------
Disk file operations I/O                                                     51359
control file sequential read                                                  2269
enq: RO - fast object reuse                                                   3780
log file sync                                                                 1289
db file sequential read                                                      62818
db file scattered read                                                        3487
SQL*Net message to client                                                      477
SQL*Net message from client                                              133286910
SQL*Net break/reset to client                                                  122
events in waitclass Other                                                     2891

10 rows selected.



Notice how none of the wait times went up (besides sqlnet coming back to my screen), because the query I ran was simply burning CPU on a single row table (which was already in the buffer cache).

Dont forget that the SECONDS_IN_WAIT time in v$session (and v$session_Wait) is either the time spent waiting *or* the time since the *last* wait. So you might be doing a physical IO intermittently (which you see in the wait details in v$session) and then burning a lot of CPU (none of which will record any waits).

So my inclination is still toward a tightly bound logical IO style execution of an SQL.

Hope this helps.

A nice dtrace example here http://blog.tanelpoder.com/2009/04/24/tracing-oracle-sql-plan-execution-with-dtrace/

More to Explore

Analytics

Analytic SQL got you confused? Check out Connor McDonald's complete video course.