Skip to Main Content
  • Questions
  • SQL*Net message from client - in SQLTRACE shows a huge wait event

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Ronald.

Asked: April 17, 2007 - 4:59 am UTC

Last updated: February 01, 2011 - 3:35 pm UTC

Version: 9.2.0.7

Viewed 10K+ times! This question is

You Asked

Hi Tom,
In my trace file there is a large wait event.

I need to findout what is this wait event. Can you help me to find what is this event.

Background:
We run certain job from application. Once the job is issued, no more the application from client side is required(We can shut the application also).

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse   131833     84.85      82.83          4       1886          0           0
Execute 221963     41.12      68.20       1790     187883     145386       75689
Fetch   153811     53.91      94.48     159204    1205403          0     1162237
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   507607    179.88     245.52     160998    1395172     145386     1237926

Misses in library cache during parse: 122235

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message from client                703579       59.83       1919.68
  SQL*Net message to client                  703578        0.00          0.35
  SQL*Net more data to client                349684        0.00          7.71
  db file sequential read                    158923        0.14         50.84
  log file sync                               63923        0.18         95.93
  SQL*Net more data from client               11652        0.06        534.86
  latch free                                     40        0.00          0.02
  buffer busy waits                              11        0.00          0.00
  undo segment extension                         43        0.00          0.00
  db file scattered read                         81        0.02          0.11



I need to find what exactly is the fllowing wait event.


Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message from client                703579       59.83       1919.68


Regards,
Ronald.

and Tom said...

sqlnet message from client is the wait event we increment while.....

waiting for the client to tell us to do something.

It is time we spent just hanging out - waiting.


For example, go into sqlplus and do the following:

ops$tkyte%ORA10GR2> alter session set events '10046 trace name context forever, level 12';

Session altered.

ops$tkyte%ORA10GR2> !date
Tue Apr 17 13:46:40 EDT 2007

ops$tkyte%ORA10GR2> !date
Tue Apr 17 13:47:02 EDT 2007

ops$tkyte%ORA10GR2> select * from dual;

D
-
X

ops$tkyte%ORA10GR2> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.2.0 - Production
With the Partitioning, OLAP and Data Mining options



My tkprof shows:

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        3      0.00       0.00          0          0          0           0
Execute      4      0.00       0.08          0          0          0           2
Fetch        2      0.00       0.00          0          3          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        9      0.00       0.09          0          3          0           3

Misses in library cache during parse: 1

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       9        0.00          0.00
  SQL*Net message from client                     9       27.86         28.59


that max wait - that was me just sitting there watching the cursor blink.


so, your application at one point did something for 59.83 seconds - without bothering the database at all - the database was just hanging out while the client application did it's thing.


This could be indicative of a client application inefficiently processing data - if you think the client should be

a) getting data
b) doing something very fast
c) getting more data and repeat

this would indicate the client is not achieving B)

Rating

  (29 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Ronald or Brayan?

Michel Cadot, April 17, 2007 - 1:15 pm UTC

This is what I said him 9 hours ago on OraFAQ:
http://www.orafaq.com/forum/m/231462/102589/#msg_231441

Regards
Michel

Tom Kyte
April 18, 2007 - 11:06 am UTC

welcome to my world :)

Ronald Nazareth, April 18, 2007 - 1:01 am UTC

Hi Tom,

Thanks for your reply. Infact it was very informative.

I have a doubt,
Suppose my application is not doing anything with database, and there is CPU consumption(load on CPU). That means the time logs as "SQL*Net message from client "

Regards,
Ronald.
Tom Kyte
April 18, 2007 - 12:03 pm UTC

huh?

Perhaps a useful example..

G Forestieri, October 12, 2007 - 5:01 pm UTC

Something we see from time to time. Stats that look very much like Mr. Nazareth's, running an Oracle Report on Oracle E-Biz suite.

Same report run several times in a day on rarely changing data. Usually runs "fast" (say under 1 minute), but occasionally goes to the dogs (not that there is anything wrong with dogs) and runs for half an hour or more.

Trace analyzer shows the high "SQL*Net message from client" numbers in Mr. Nazareth's example.

There is no "client" entering data, there is no network to traverse. The report is not calculating pi to 4000 places. Perhaps this is an example Mr. Nazereth would have in mind. If so what might be a probable area or two to investigate?



Tom Kyte
October 14, 2007 - 2:32 pm UTC

there is a client - it is running the report

there is a network - always, that is the only way for the client to get stuff.

So, I would say the report is getting hung up doing something, perhaps it was starved for cpu, perhaps it get stuck writing to disk.

sqlnet message from client - we are waiting to be told what to do next in the database. we are not currently doing anything in the database.


Thx

G Forestieri, October 22, 2007 - 1:09 pm UTC

There is always a network involved? Think Oracle e-biz suite....with all tiers (save the end user) on one box. The "client" in this case is Oracle reports on the same box in this case. Hence my statement that there is no network. Unless I'm totally missing something. Wouldn't be the first time.

After writing the question I thought of a possible wait time writing report output to the disk...if that disk is near-totally occupied doing something else, the database would have to wait. Its a start.

Thanks for your stab at it.

Tom Kyte
October 23, 2007 - 6:57 pm UTC

there is ALWAYS A NETWORK LAYER INVOLVED

there are two processes, the client - and the server.

they use a network protocol of some sort, it might be IPC (inter process communcation), it might be the loopback link

but SQLNET IS ALWAYS THERE.

there is a network, always.

and sqlnet message from client shows up with IPC just as often as with "not IPC"

it means the client is doing stuff and NOT ASKING THE DATABASE to do stuff.

Sorry to get you yelling

G Forestieri, October 26, 2007 - 2:53 pm UTC

Sorry to get you so exasperated you felt compelled to yell. In my silliness when I say network I'm thinking of things like those latency-inducing pieces like cables and cards and protocols, WANs, etc. There certainly is a bit of interprocess communication going on in a server, no one doubts that.

Thanks again for your effort.

Tom Kyte
October 29, 2007 - 12:23 pm UTC

IPC has latency inducing things - it is a "networking thing" as well in a true sense.

and incorrect IPC settings can make IPC slower than a loopback 127.0.0.1 link or even hitting the network.


Just because they are on the same machine does not mean "little or no latency", not at all - it can be many times slower than a real network...

IPC settings - a guide?

G Forestieri, October 31, 2007 - 5:54 pm UTC

Thanks for the followup. Can you recommend some texts covering IPC settings?

thx
Tom Kyte
November 02, 2007 - 11:49 am UTC

very OS dependent. So, no, not really.

Is my interpretation of "SQL*Net message from client" correct?

Dakshinamurty, April 03, 2008 - 5:36 am UTC

We were testing scalability of a Bank's loan interest calculation batch job.

The stats pack for 9 processes run in parallel having 9 branches each, each having 7500 accounts is as follows:

Elapsed: 13.04 (mins)
DB Time: 17.87 (mins)

Event Waits %Time -outs Total Wait Time (s) Avg wait (ms) Waits /txn
SQL*Net more data from client 6,525 0 2 0 0.28
SQL*Net message from client 3,133,184 0 7,477 2 134.06

And the application team was telling that each process in turn does some application server side processing in parallel, which can be controlled through configuration.

There was 7477/60/9=13.85 Min of wait of SQL*Net message from client. But this was more than the elapsed time which is understandable as some parallel process is going on in each process.
So, to understand how much time is spent in DB and how much time outside, I told to do a single process run for 100 accounts not using application server side parallelism.

The stats pack got is:

Elapsed: 1.07 (mins)
DB Time: 0.08 (mins)

Event Waits %Time -outs Total Wait Time (s) Avg wait (ms) Waits /txn
SQL*Net more data from client 473 0 0 0 3.01
SQL*Net message from client 12,654 0 116 9 80.6
It was proved that outside DB the job is running (1.07-0.08)/1.07*100 = 92.52% of time. So, by tuning APP we will be getting more benefit.

But APP people were telling that here also 116/60=1.93 Min of wait of SQL*Net message from client, which was more than elapsed time(1.07 min). So, there will be something wrong in interpretation. I was telling them that they didn't swith off parallelism properly. Is my interpretation of "SQL*Net message from client" correct?
Tom Kyte
April 03, 2008 - 8:05 pm UTC

... So, by tuning APP we will be getting more benefit.
...

that is ALWAYS true. but we digress...


how many connections to the database were "open" during this "test"

it would take just one open - but idle - sqlplus session - that did something during the snapshot - to through this off.

Re: Is my interpretation of "SQL*Net message from client" correct?

Dakshinamurty, April 07, 2008 - 11:50 pm UTC

Thank you. You answered my query. In both cases that I mentioned, sqlplus sessions ..."idle - sqlplus session - that did something during the snapshot" would have been there.

Since from my second case, it is evident that:
Time spent outside DB = 1.07 - 0.08 (Elapsed time - DB Time) = 0.99 Sec.
So, sqlplus sessions idle wait should have been:
1.93 (116 sec wait of SQL*Net message from client) - 0.99 = 0.94 Min. And that's why this difference was coming.

(..."how many connections to the database were "open" during this "test"": 48 and 55 respectively for both cases)

"select *" while connected "/" or "/@<name>"

Matt Miller, August 18, 2009 - 2:54 pm UTC

We're seeing big "SQL*Net message from client" delays -- over 1.5 seconds per fetch -- from something like:

sqlplus /
SQL> select * from <a_table> where rownum <= 300000;


This is in a script that directs its output to /dev/null. The purpose is to monitor the 'db file ... read' times so we can track disk performance, but suddenly we're seeing these huge "SQL*Net message from client" waits. For example:

WAIT #1: nam='SQL*Net message from client' ela= 1626354 driver id=1413697536 #bytes=1 p3=0 obj#=44827 tim=46279574112001
WAIT #1: nam='SQL*Net message to client' ela= 3 driver id=1413697536 #bytes=1 p3=0 obj#=44827 tim=46279574112213
FETCH #1:c=0,e=272,p=0,cr=2,cu=0,mis=0,r=15,dep=0,og=1,tim=46279574112398
WAIT #1: nam='SQL*Net message from client' ela= 1625980 driver id=1413697536 #bytes=1 p3=0 obj#=44827 tim=46279575738440
WAIT #1: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=44827 tim=46279575738608
FETCH #1:c=0,e=287,p=0,cr=3,cu=0,mis=0,r=15,dep=0,og=1,tim=46279575738814


So, are you saying I should probably blame IPC on this Unix box?
Tom Kyte
August 24, 2009 - 4:45 pm UTC

well, if you are using @name - that wouldn't be IPC would it?

need a bit more detail - something to work with

Matt Miller, September 28, 2009 - 2:31 pm UTC

I didn't use "@name", the connect string shown was "/".

I was able to reduce the problem wait to nearly nothing by doing:

SQL> set autotrace traceonly statistics

So, that solved my problem (and makes more sense than just directing output to /dev/null), although it didn't explain why this query suddenly started seeing the increased wait times.

Tom Kyte
September 30, 2009 - 7:33 am UTC

I based that on your comment title

"select *" while connected "/" or "/@"


the sql net message from client is the database waiting for the client to do something.

In this case - the client (sqlplus) spent lots of time doing IO to the terminal - The time spent by sqlplus formatting the string to print on the terminal and the time spent printing it on the terminal were large and measurable - those were your waits.

wait events

A reader, January 03, 2010 - 12:13 am UTC

Tom:

Can you tell why it is taking 20 seconds to do a simple password validation where user enters a userid/password and submit it via http. the database has a sha1 java stored procedure and a smal table to check if we have a match..

The client is not doing anything. it is sending this statement to database. The issue also happens on the first request during the day and then it becomes quick. is this a netowrk issue. how can i resolve it.


TKPROF: Release 9.2.0.2.0 - Production on Sun Jan 3 00:02:17 2010

Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

Trace file: tips_ora_11084216.trc
Sort options: default

********************************************************************************
count = number of times OCI procedure was executed
cpu = cpu time in seconds executing
elapsed = elapsed time in seconds executing
disk = number of physical reads of buffers from disk
query = number of buffers gotten for consistent read
current = number of buffers gotten in current mode (usually for update)
rows = number of rows processed by the fetch or execute call
********************************************************************************

alter session set events '10046 trace name context forever, level 12'


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 0 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 1 0.00 0.00 0 0 0 0

Misses in library cache during parse: 0
Misses in library cache during execute: 1
Optimizer goal: CHOOSE
Parsing user id: 419

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 1 0.00 0.00
SQL*Net message from client 1 14.36 14.36
********************************************************************************

SELECT COUNT (*)
FROM tipsuser
WHERE userid = 'mike'
AND password = SHA1_PKG.F_SHA1('secret')

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 1.52 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.25 16.15 0 2 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.26 17.68 0 2 0 1

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 419

Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE (cr=135 r=0 w=0 time=16570058 us)
1 TABLE ACCESS BY INDEX ROWID OBJ#(233561) (cr=135 r=0 w=0 time=16570043 us)
1 INDEX UNIQUE SCAN OBJ#(233562) (cr=1 r=0 w=0 time=70247 us)(object id 233562)


Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 2 0.00 0.00
SQL*Net message from client 2 36.58 36.59
********************************************************************************

select longname
from
javasnm$ where short = :1


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.11 0 0 0 0
Execute 10 0.00 0.00 0 0 0 0
Fetch 10 0.00 0.18 0 40 0 10
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 21 0.00 0.30 0 40 0 10

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: SYS (recursive depth: 1)

Rows Row Source Operation
------- ---------------------------------------------------
10 TABLE ACCESS BY INDEX ROWID OBJ#(385) (cr=30 r=0 w=0 time=167475 us)
10 INDEX UNIQUE SCAN OBJ#(386) (cr=20 r=0 w=0 time=115744 us)(object id 386)

********************************************************************************

select longdbcs
from
javasnm$ where short = :1


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.04 0 0 0 0
Execute 0 0.00 0.00 0 0 0 0
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 1 0.00 0.04 0 0 0 0

Misses in library cache during parse: 0
Parsing user id: SYS (recursive depth: 1)
********************************************************************************

SELECT max(version)
FROM
"SYS"."JAVA$POLICY$SHARED$TABLE"


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 2 0.00 0.01 0 0 0 0
Execute 2 0.00 0.00 0 0 0 0
Fetch 2 0.00 0.05 0 6 0 2
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 6 0.00 0.06 0 6 0 2

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: SYS (recursive depth: 1)

Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE (cr=3 r=0 w=0 time=53648 us)
2 TABLE ACCESS FULL JAVA$POLICY$SHARED$TABLE (cr=3 r=0 w=0 time=53622 us)

********************************************************************************

select privilege#,level
from
sysauth$ connect by grantee#=prior privilege# and privilege#>0 start with
(grantee#=:1 or grantee#=1) and privilege#>0


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 3 0.00 0.00 0 0 0 0
Execute 3 0.00 0.05 0 0 0 0
Fetch 180 0.00 0.06 0 87 0 177
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 186 0.00 0.12 0 87 0 177

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: SYS (recursive depth: 1)

Rows Row Source Operation
------- ---------------------------------------------------
59 CONNECT BY WITH FILTERING (cr=29 r=0 w=0 time=36524 us)
24 NESTED LOOPS (cr=26 r=0 w=0 time=224 us)
24 CONCATENATION (cr=2 r=0 w=0 time=96 us)
0 INDEX RANGE SCAN I_SYSAUTH1 (cr=1 r=0 w=0 time=35 us)(object id 110)
24 INDEX RANGE SCAN I_SYSAUTH1 (cr=1 r=0 w=0 time=41 us)(object id 110)
24 TABLE ACCESS BY USER ROWID SYSAUTH$ (cr=24 r=0 w=0 time=103 us)
31 NESTED LOOPS (cr=3 r=0 w=0 time=321 us)
45 BUFFER SORT (cr=0 r=0 w=0 time=147 us)
45 CONNECT BY PUMP (cr=0 r=0 w=0 time=43 us)
31 INDEX RANGE SCAN I_SYSAUTH1 (cr=3 r=0 w=0 time=102 us)(object id 110)




********************************************************************************

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 1.52 0 0 0 0
Execute 2 0.00 0.00 0 0 0 0
Fetch 2 0.25 16.15 0 2 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 5 0.26 17.68 0 2 0 1

Misses in library cache during parse: 1
Misses in library cache during execute: 1

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 3 0.00 0.00
SQL*Net message from client 3 36.58 50.95


OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 7 0.00 0.17 0 0 0 0
Execute 15 0.00 0.06 0 0 0 0
Fetch 192 0.00 0.29 0 133 0 189
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 214 0.00 0.53 0 133 0 189

Misses in library cache during parse: 0

2 user SQL statements in session.
7 internal SQL statements in session.
9 SQL statements in session.
********************************************************************************
Trace file: tips_ora_11084216.trc
Trace file compatibility: 9.00.01
Sort options: default

1 session in tracefile.
2 user SQL statements in trace file.
7 internal SQL statements in trace file.
9 SQL statements in trace file.
6 unique SQL statements in trace file.
379 lines in trace file.



Tom Kyte
January 04, 2010 - 11:54 am UTC

... The
issue also happens on the first request during the day and then it becomes
quick....

jvm start up time
plus
cache warm up time (for all of the jvm/java stuff you loaded)
plus
shared pool warm up time


we've been over this sam.

sql java

A reader, January 04, 2010 - 12:16 pm UTC

Tom:

Is this then *NORMAL* oracle processing warm-up time that no SQL or INSTANCE tuning is required? is it going to happen every day. I thought once you load the thing in cache it stays in LMU area.

Can you tell me if there is a way to avoid this as some users complain.

would an early morning job calling all the procedures make sense.

Tom Kyte
January 04, 2010 - 12:42 pm UTC

we've had this discussion sam.

something is happening over night, the system does something different - every day is like a new day. It is being shutdown, or does a workload that is considerably different, or a connection pool has been recycled, something.


if you identify it as being the java start up time, then you can answer this yourself:

... would an early morning job calling all the procedures make sense....

performance

A reader, January 04, 2010 - 1:17 pm UTC

Tom:

They do not shut it down everynight. Startup time is in V$instance and it shows a date a few months old.

All they do is create a backup using export every night. But i also told DBA to stop that on TEST instance to check whether it that was causing the issue and nothing really changed.


I have taken snapshots of V$sql on daily basis just to see if things are getting flushed but I did not see much difference.


How would you advide to do further troubleshooting on this ?


How can you confirm that connection pool is recycled? It still should not take a few seconds to establish a connection. Also, a conectio in mod_plsql will stay alive in P1 for only 15 minutes if there was no activity. so I assume http processes in the morning will create new connections to DB.

I believe TKPROF withe Level 12 trace has given us the most we can get out of it.

thanks a lot,
Tom Kyte
January 04, 2010 - 1:25 pm UTC

sounds like connection pool recycling, during the night, the connections go away, the jvm startup - coupled with caches emptying of their own accord (inactivity).

If it happens every morning
and then it goes away for the day

it would appear to be the JVM startup + cache loading, as stated.

... How can you confirm that connection pool is recycled?...

by talking your your application server administrators.


... It still should not take
a few seconds to establish a connection....

it won't take long to establish a connection, no one is even talking about that.


... I believe TKPROF withe Level 12 trace has given us the most we can get out of
it.
....

that'll show you sql, if sql is your issue - it would, but based on what you posted, there isn't any sql going on, so.... it won't

performance

A reader, January 04, 2010 - 2:43 pm UTC

Tom:

<<<sounds like connection pool recycling, during the night, the connections go away, the jvm startup - coupled with caches emptying of their own accord (inactivity). >>>

You are right on this . The admins do not know much about oracle internals or tuning so they are not much help .

Is there a way for me to technically *PROVE* that the above is happening in the system using some oracle dynamic views or taking snapshots at night and in morning, etc. or any other way? This is what i am trying to get to.

This also happens for non-java (100% plsql code) procedures- so there is no JVM involved.


Tom Kyte
January 04, 2010 - 4:28 pm UTC

get a test system.

shut it down.
start it up.

execute your function, time it.
do it again, with same inputs.
time it.

the differentials in timing will be "what it takes to get started when everything is cold"

When I haven't used dbms_metadata.get_ddl in a while on a test system for example - it takes a good 5+ or more seconds to get started - lots of "support library" stuff (in java) gets loaded up for that one.


... This also happens for non-java (100% plsql code) procedures- so there is no JVM
involved.
....

we'd need to see a tkprof of that one - would be expecting lots of recursive sql for that - to get loaded up, get into a test machine that you control and do things "cold" versus "hot".

and compare.

performance delay

A reader, January 06, 2010 - 9:55 am UTC

Tom:

DBA says this might be caused by two things:

1. oracle has "deferred write" on flushing dirty buffers to disk.
so if a transaction makes a change to record, the block containing the changed record stays in
the dirty list of the buffer cache until the next process.
When the next process tries to access (read) the data block it will cause the db writer to
write the dirty buffer to disk.

2. when oracle idles for a while the undo segment may be offlined.

Is this correct for 9i R2, Can this be the cause of the morning delay in evey procedure and how can i verify this (i.e TKPROF, statspack, V$views, configs).


Tom Kyte
January 06, 2010 - 1:15 pm UTC

Need a new DBA do you?

1) so what. is your thing even creating dirty blocks? did you see any waits for that?

no
and
no

are the answers.

and even so, it would not account for that level of "wait".


2) ummm, again, no. But even so - so what - you are running a......... query. All undo segments would NOT be offlined, you only need one to get going - and it wouldn't take anywhere near seconds to bring a new one about if we wanted it.



I'm trying really hard Sam to tell you what I believe it to be. re-read this page please.

performance

A reader, January 06, 2010 - 3:48 pm UTC

Yes, this is actually is for the regular pl/sql procedures (not Java function JVM).
I will post a TKPROF on this so you can see the waits.


If i understand this correctly, when a user does an update to a record in oracle it becomes
a dirty block until you commit and then it write it to disk in a few seconds.
It would not keep the dirty block there overnight waiting for a commit or the next morning transaction before writing to disk.

If the cache is not available it will write the dirty blocks to disk.


http://www.dba-oracle.com/t_oracle10g_417_buffer_cache_.htm


Is this correct?

You mentioned "connection recycling" and "cache emptying" issues.
But I see the delay too after i connect using windows sql*plus and run the procedure too
-- no connection pool here (using dedicated server).


Tom Kyte
January 06, 2010 - 4:12 pm UTC

... If i understand this correctly, when a user does an update to a record in
oracle it becomes
a dirty block until you commit and then it write it to disk in a few seconds.
It would not keep the dirty block there overnight waiting for a commit or the
next morning transaction before writing to disk.
...


the block becomes dirty as soon as you update (the commit has nothing to do with anything here). dbwr, in the background, will write it out sometime (regardless of commit status). It might be a few seconds, minutes or hours (so yes, it could be tomorrow morning). It depends on how the DBA has things setup.

Now, that I've said that, I have to say "but SO WHAT??? what does that have to do with this issue here. Nothing I say. What do you say - and if you say anything - say it clearly, precisely and definitely"

You are not waiting for a buffer busy wait, you are not waiting for a free buffer, you are therefore not waiting on a block being written from the cache. So, I say "so what" - it isn't what you are waiting on.


I'm not getting what is so hard here - "the first time I run something in the morning - after it hasn't been run for a while - it is slow - but then it is fast"


... But I see the delay too after i connect using windows sql*plus and run the
procedure too ...

do you understand that using sqlplus is like using the ultimate in connection pool recycling? It would see the same thing - if the stuff ain't loaded in the shared pool, if the jvm ain't started up, if X ain't available and needs to be made available - I don't care if you use sqlplus toad jdbc, whatever - it'll have to be done.

The obvious answers are: it isn't in the cache, we need to start something up, things have been flushed out. Not sure how many more different ways I can say that - so I won't, I'm done here.

SQL*Net message from client

Florin, January 22, 2010 - 9:01 am UTC

Hello Tom,

At one of my clients I have the same wait event on top:
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
SQL*Net message from client 259,711,599 17,914,149 56.98
pipe get 743,956 3,579,265 11.38
jobq slave wait 913,906 2,216,280 7.05
virtual circuit status 84,779,385 2,187,649 6.96
SQL*Net message from dblink 92,961,398 1,934,497 6.15
------------------------------------------------------------

When I run vmstat the CPU runqueue is between 30 and 40 with the system having 10 CPU`s. CPU Utilization is almost 100% every time with Oracle CPU Utilization about 20% of the total CPU. The user and system CPU from vmstat are always between 40% and 60%.

I also have to mention that it is a 9.2.0.6.0 db version on a HP-UX B.11.11 U 9000/800 64bit.

I have read this article and the 2 main ideas are that it can be an application issue or a network problem.
The same in my case?

Thanks


Tom Kyte
January 25, 2010 - 1:49 am UTC

they are all pretty much idle waits. Are the clients running on this machine too or just the database??

A reader, January 25, 2010 - 8:21 am UTC

Yes.
I have encountered many UGA memory leaks in the trace files and also a SQL statement from Patrol created by BMC software that is on top as buffer gets and physical reads.
Reading the My Oracle Support note 31056.1 “UGA Space Leak” I think that we have encountered the bug no 4451759. In this case the recommendation is to put the latest patchset (9.2.0.8.0).
I don`t know if this wait event can be related with this bug but I will recommend the installation of the latest patchset.


improvement by changing sqlplus' settings?

SEan, February 05, 2010 - 11:15 pm UTC

I have the query which runs 20+ seconds by sqlplus on the db server, and it runs hours by sqplus remotely. I have put the query in a script and eliminated the results' display on the screen by SET TERM OFF in the script.
I understood that the network's latency may play a role for the slowness as running it remotely. Before I look into the network relevant I would like to know if we can make the improvement by changing the sqlplus environment settings? What are the sqlplus environment settings shown from the GUI sqlplus Options tab could impact the speed of sending the data to the db server?


Executing from remote (still going)
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited

--------------------------------------------------------------------------------
Waited
--------------------------------------------------------------------------------
 
--------------------------------------------------------------------------------
SQL*Net message to client 80986 0.00 0.04
direct path write temp 316 0.08 9.78
direct path read temp 575 0.01 0.04
<b>SQL*Net message from client 80985 14.44 5589.73</b>


executing from the db server
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited

--------------------------------------------------------------------------------
Waited
--------------------------------------------------------------------------------
 
--------------------------------------------------------------------------------
SQL*Net message to client 76970 0.00 0.02
direct path write temp 316 0.10 10.45
direct path read temp 576 0.58 0.63
SQL*Net message from client 76970 0.00 24.71

Tom Kyte
February 08, 2010 - 9:38 pm UTC

set arraysize larger, that'll cut down on the round trips - if the problem is that remote is on a high bandwidth - high latency network (like a satellite connection), then this will help.

If you are on a low bandwidth network, nothing sort of 'do not move that much data' will help too much.


Sean, February 06, 2010 - 8:32 am UTC

the arrayszie setting determines the roundtrips between sqlplus and the db server.

SQL*Net Message from client

Eduardo Claro, April 19, 2010 - 10:02 pm UTC

Tom, please help. I have an EBS installation, Oracle 9i, and one of the concurrents (batch processes) has a long-running query that spends most of its time on "SQL*Net Message from client" event. Client and Server are on the same machine, and I did a test case running the same query with Event 10046 level 12, varying arraysize to see if it would change something. Unfortunatelly, nothing changed. The query itself is reported to spend about 30-40s, but it really spends more than 8 minutes because of SNMFC events. Here is a tkprof excerpt:

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.02 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 81 7.51 36.07 23729 207054 0 40000
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 83 7.53 36.09 23729 207054 0 40000

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 173

Rows Row Source Operation
------- ---------------------------------------------------
40000 COUNT STOPKEY
40000 TABLE ACCESS BY INDEX ROWID GL_BALANCES
156790 NESTED LOOPS
43327 TABLE ACCESS FULL GL_CODE_COMBINATIONS
113463 INDEX RANGE SCAN GL_BALANCES_N1 (object id 33916)


Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 81 0.00 0.00
SQL*Net more data to client 837 0.00 0.01
SQL*Net message from client 81 6.22 452.42
db file sequential read 21912 0.31 29.61
db file scattered read 258 0.33 0.96
SQL*Net more data from client 2 0.00 0.00

This was for an arraysize of 500.

Changing arraysize to 10000, it is almost the same:

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.02 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 9 5.68 7.08 16435 206842 0 40000
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 11 5.70 7.10 16435 206842 0 40000

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 173

Rows Row Source Operation
------- ---------------------------------------------------
40000 COUNT STOPKEY
40000 TABLE ACCESS BY INDEX ROWID GL_BALANCES
156790 NESTED LOOPS
43327 TABLE ACCESS FULL GL_CODE_COMBINATIONS
113463 INDEX RANGE SCAN GL_BALANCES_N1 (object id 33916)


Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 9 0.00 0.00
SQL*Net more data to client 869 0.00 0.01
SQL*Net message from client 9 57.58 443.45
db file sequential read 13741 0.06 2.07
db file scattered read 354 0.06 0.37
SQL*Net more data from client 2 0.00 0.00

There are now less occurrences, but total time is equivalent.

Any clue how could I speed up things here?

The client is a C Program, core from Oracle EBS, so I could not change it easily.

The query used for the tests is:

SELECT /*Array Size 500 - Exec 1*/
nvl(bal.PERIOD_TYPE, ''),
nvl(bal.PERIOD_YEAR, -1),
bal.PERIOD_NAME,
nvl(bal.PERIOD_NUM, -1),
nvl(bal.PERIOD_NUM, -1),
bal.ACTUAL_FLAG,
decode(cc.TEMPLATE_ID, NULL, 'N', 'Y'),
-1,
nvl(bal.BUDGET_VERSION_ID, -1),
-1,
nvl(bal.ENCUMBRANCE_TYPE_ID, -1),
bal.CURRENCY_CODE,
bal.SET_OF_BOOKS_ID,
nvl(bal.TRANSLATED_FLAG, ''),
nvl(bal.PERIOD_NET_DR, 0) - nvl(bal.PERIOD_NET_CR, 0),
nvl(bal.PERIOD_NET_DR, 0),
nvl(bal.PERIOD_NET_CR, 0),
nvl(bal.QUARTER_TO_DATE_DR,0)-nvl(bal.QUARTER_TO_DATE_CR, 0),
nvl(bal.QUARTER_TO_DATE_DR, 0),
nvl(bal.QUARTER_TO_DATE_CR, 0),
nvl(bal.BEGIN_BALANCE_DR, 0) - nvl(bal.BEGIN_BALANCE_CR, 0),
nvl(bal.BEGIN_BALANCE_DR, 0),
nvl(bal.BEGIN_BALANCE_CR, 0),
nvl(bal.PROJECT_TO_DATE_DR, 0) - nvl(bal.PROJECT_TO_DATE_CR, 0),
nvl(bal.PROJECT_TO_DATE_DR, 0),
nvl(bal.PROJECT_TO_DATE_CR, 0)
,
nvl(SEGMENT2,''),
nvl(SEGMENT3,''),
nvl(SEGMENT4,''),
nvl(SEGMENT5,''),
nvl(SEGMENT6,''),
nvl(SEGMENT7,''),
nvl(SEGMENT8,''),
nvl(SEGMENT9,''),
nvl(SEGMENT10,''),
nvl(SEGMENT1,'')
FROM
GL_BALANCES bal,
GL_CODE_COMBINATIONS cc
WHERE bal.CODE_COMBINATION_ID = cc.CODE_COMBINATION_ID
AND cc.CHART_OF_ACCOUNTS_ID = 50245
AND bal.SET_OF_BOOKS_ID = 1
AND nvl(bal.TRANSLATED_FLAG, 'x')in ('Y','N','x')
AND cc.TEMPLATE_ID is NULL and rownum <= 40000
AND ( (nvl(SEGMENT2,'') >= '3110101001' AND nvl(SEGMENT2,'') <= '3110101999')
OR (nvl(SEGMENT2,'') >= '3110102001' AND nvl(SEGMENT2,'') <= '3110102999')
OR (nvl(SEGMENT2,'') >= '3110103001' AND nvl(SEGMENT2,'') <= '3110103999')
OR (nvl(SEGMENT2,'') >= '3110104001' AND nvl(SEGMENT2,'') <= '3110104999')
OR (nvl(SEGMENT2,'') >= '3110105001' AND nvl(SEGMENT2,'') <= '3110105999')
OR (nvl(SEGMENT2,'') >= '3110106001' AND nvl(SEGMENT2,'') <= '3110106999')
OR (nvl(SEGMENT2,'') >= '3110190001' AND nvl(SEGMENT2,'') <= '3110190999')
OR (nvl(SEGMENT2,'') >= '3110201001' AND nvl(SEGMENT2,'') <= '3110201999')
OR (nvl(SEGMENT2,'') >= '3110202001' AND nvl(SEGMENT2,'') <= '3110202999')
OR (nvl(SEGMENT2,'') >= '3110203001' AND nvl(SEGMENT2,'') <= '3110203999')
OR (nvl(SEGMENT2,'') >= '3110204001' AND nvl(SEGMENT2,'') <= '3110204999')
OR (nvl(SEGMENT2,'') >= '3110205001' AND nvl(SEGMENT2,'') <= '3110205999')
OR (nvl(SEGMENT2,'') >= '3110206001' AND nvl(SEGMENT2,'') <= '3110206999')
OR (nvl(SEGMENT2,'') >= '3110207001' AND nvl(SEGMENT2,'') <= '3110207999')
OR (nvl(SEGMENT2,'') >= '3110208001' AND nvl(SEGMENT2,'') <= '3110208999')
OR (nvl(SEGMENT2,'') >= '3110209001' AND nvl(SEGMENT2,'') <= '3110209999')
OR (nvl(SEGMENT2,'') >= '3110210001' AND nvl(SEGMENT2,'') <= '3110210999')
OR (nvl(SEGMENT2,'') >= '3110290001' AND nvl(SEGMENT2,'') <= '3110290999')
OR (nvl(SEGMENT2,'') >= '3110301001' AND nvl(SEGMENT2,'') <= '3110301999')
OR (nvl(SEGMENT2,'') >= '3110401001' AND nvl(SEGMENT2,'') <= '3110401999')
OR (nvl(SEGMENT2,'') >= '3110402001' AND nvl(SEGMENT2,'') <= '3110402999')
OR (nvl(SEGMENT2,'') >= '3110403001' AND nvl(SEGMENT2,'') <= '3110403999')
OR (nvl(SEGMENT2,'') >= '3110404001' AND nvl(SEGMENT2,'') <= '3110404999')
OR (nvl(SEGMENT2,'') >= '3110405001' AND nvl(SEGMENT2,'') <= '3110405999')
OR (nvl(SEGMENT2,'') >= '3110406001' AND nvl(SEGMENT2,'') <= '3110406999')
OR (nvl(SEGMENT2,'') >= '3120101001' AND nvl(SEGMENT2,'') <= '3120101999')
OR (nvl(SEGMENT2,'') >= '3210101001' AND nvl(SEGMENT2,'') <= '3219999999')
OR (nvl(SEGMENT2,'') >= '3220101001' AND nvl(SEGMENT2,'') <= '3229999999')
OR (nvl(SEGMENT2,'') >= '3230101001' AND nvl(SEGMENT2,'') <= '3230199999')
OR (nvl(SEGMENT2,'') >= '3230201001' AND nvl(SEGMENT2,'') <= '3230201999')
OR (nvl(SEGMENT2,'') >= '3230202001' AND nvl(SEGMENT2,'') <= '3230202999')
OR (nvl(SEGMENT2,'') >= '3230203001' AND nvl(SEGMENT2,'') <= '3230203999')
OR (nvl(SEGMENT2,'') >= '3230204001' AND nvl(SEGMENT2,'') <= '3230204999')
OR (nvl(SEGMENT2,'') >= '3230205001' AND nvl(SEGMENT2,'') <= '3230205999')
OR (nvl(SEGMENT2,'') >= '3230206001' AND nvl(SEGMENT2,'') <= '3230206999')
OR (nvl(SEGMENT2,'') >= '3230207001' AND nvl(SEGMENT2,'') <= '3230207999')
OR (nvl(SEGMENT2,'') >= '3240101001' AND nvl(SEGMENT2,'') <= '3249999999')
OR (nvl(SEGMENT2,'') >= '3250101001' AND nvl(SEGMENT2,'') <= '3250499999')
OR (nvl(SEGMENT2,'') >= '3250501001' AND nvl(SEGMENT2,'') <= '3250501999')
OR (nvl(SEGMENT2,'') >= '3250502001' AND nvl(SEGMENT2,'') <= '3250504999')
OR (nvl(SEGMENT2,'') >= '3250505001' AND nvl(SEGMENT2,'') <= '3250505003')
OR (nvl(SEGMENT2,'') >= '3250506001' AND nvl(SEGMENT2,'') <= '3258999999')
OR (nvl(SEGMENT2,'') >= '3259001001' AND nvl(SEGMENT2,'') <= '3259001999')
OR (nvl(SEGMENT2,'') >= '3259990001' AND nvl(SEGMENT2,'') <= '3259999999')
OR (nvl(SEGMENT2,'') >= '3260101001' AND nvl(SEGMENT2,'') <= '3269999999')
OR (nvl(SEGMENT2,'') >= '3270101001' AND nvl(SEGMENT2,'') <= '3270199999')
OR (nvl(SEGMENT2,'') >= '3270201001' AND nvl(SEGMENT2,'') <= '3270299999')
OR (nvl(SEGMENT2,'') >= '3270301001' AND nvl(SEGMENT2,'') <= '3279999999')
OR (nvl(SEGMENT2,'') >= '3299901001' AND nvl(SEGMENT2,'') <= '3299999999')
OR (nvl(SEGMENT2,'') >= '3520101001' AND nvl(SEGMENT2,'') <= '3520101006')
OR (nvl(SEGMENT2,'') >= '3530101001' AND nvl(SEGMENT2,'') <= '3539999999')
OR (nvl(SEGMENT2, '') = '3520201001')
)
AND ( (nvl(SEGMENT4,'') >= '000000' AND nvl(SEGMENT4,'') < '501111')
OR (nvl(SEGMENT4,'') > '501111' AND nvl(SEGMENT4,'') <= 'ZZZZZZ')
)
AND ( (nvl(SEGMENT1,'') >= '001' AND nvl(SEGMENT1,'') <= '999')
)
AND (
(bal.period_name in ('MAR-10')
AND ( (nvl(bal.period_name,'') = 'MAR-10'
AND ( (bal.CURRENCY_CODE = 'BRL'
AND bal.ACTUAL_FLAG = 'A'
AND (bal.PERIOD_NET_DR != 0 or bal.PERIOD_NET_CR != 0))
OR
( (bal.CURRENCY_CODE = 'BRL'
AND bal.ACTUAL_FLAG = 'A'
)))))
) )
/

(I put a ROWNUM to restrict number of rows fetched).

If you can give me any clue, it would be appreciate.

Thanks in advance.
Tom Kyte
April 20, 2010 - 8:07 am UTC

... has a long-running query that spends most of its
time on "SQL*Net Message from client ...


read that far and stopped. That means..... drum roll..... problem not in database, problem in client, we are (database) are waiting for it (client) to send us a request.

message from client - we are waiting to be told to do something.


the client is spending a lot of time processing those records - we cannot make client go faster. Maybe move client to a machine running faster CPU's.

YET - SQL*Net Message from client

Eduardo Claro, April 21, 2010 - 9:25 pm UTC

ok, Tom, it was always clear for me that this event is an IDLE event for the database point of view. But, in this case, I did a teste myself, running the query directly from SQL*Plus in the database server. And the symptoms were exactly that exposed before: most of the time were spent on that idle event. So, for this specific case completely reproduced thru SQL*Plus, what could this event mean? SQL*Plus was in someway processing rows for each fetch?
Tom Kyte
April 22, 2010 - 8:03 am UTC

show me, run this script:

connect user/password
set autotrace TRACELONLY statistics;
set arraysize 1000
alter session set events '10046 trace name context forever, level 12';
select ....
exit

run tkprof on that - make sure to use the connect, immediately followed by alter, immediately followed by exit


otherwise, you'll be timing the time you sat at the prompt (sql net message from client) and the time sqlplus spent printing to the screen (want to make sqlplus run faster - just minimize it! or better yet, use set autotrace traceonl to have sqlplus just fetch the data)

YET - SQL*Net Message from client

A reader, April 28, 2010 - 10:03 am UTC

Ok, did exactly what you said. The results of tkprof are following:

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.02          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch       41     11.19      49.00      40126     211274          0       40000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       43     11.20      49.03      40126     211274          0       40000

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 173  (APPS)

Rows     Row Source Operation
-------  ---------------------------------------------------
  40000  COUNT STOPKEY
  40000   TABLE ACCESS BY INDEX ROWID OBJ#(33876)
 156790    NESTED LOOPS
  43327     TABLE ACCESS FULL OBJ#(33834)
 113463     INDEX RANGE SCAN OBJ#(33916) (object id 33916)


Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   GOAL: CHOOSE
  40000   COUNT (STOPKEY)
  40000    TABLE ACCESS   GOAL: ANALYZED (BY INDEX ROWID) OF
               'GL_BALANCES'
 156790     NESTED LOOPS
  43327      TABLE ACCESS   GOAL: ANALYZED (FULL) OF
                 'GL_CODE_COMBINATIONS'
 113463      INDEX   GOAL: ANALYZED (RANGE SCAN) OF 'GL_BALANCES_N1'
                 (NON-UNIQUE)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                      41        0.00          0.00
  SQL*Net more data to client                   853        0.00          0.02
  db file sequential read                     40047        0.31         41.72
  SQL*Net message from client                    41       12.07        454.49
  db file scattered read                         24        0.01          0.09
  log file switch completion                      1        0.00          0.00
<code>
</code>
Tom Kyte
April 28, 2010 - 12:44 pm UTC

can you email me the RAW TRACE file - thomas.kyte@oracle.com


YET - SQL*Net Message from client

A reader, April 28, 2010 - 10:06 am UTC

oops, I have just realized that I forgot to include SET AUTOTRACE in the script. Should I run it again, or the data uploaded in the before review is enough?
Tom Kyte
April 28, 2010 - 12:44 pm UTC

run this:



connect user/password
set autotrace TRACELONLY statistics;
set arraysize 1000
alter session set events '10046 trace name context forever, level 12';
select ....
exit



and send me the raw TRACE file. yes, the autotrace is extremely relevant.

YET - SQL*Net Message from client

Eduardo Claro, April 28, 2010 - 10:24 am UTC

ran it again, now with SET AUTOTRACE TRACEONLY STATISTICS. Look how different are the results:

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 41 11.02 64.27 37910 206932 0 40000
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 43 11.02 64.27 37910 206932 0 40000

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 173 (APPS)

Rows Row Source Operation
------- ---------------------------------------------------
40000 COUNT STOPKEY
40000 TABLE ACCESS BY INDEX ROWID OBJ#(33876)
156790 NESTED LOOPS
43327 TABLE ACCESS FULL OBJ#(33834)
113463 INDEX RANGE SCAN OBJ#(33916) (object id 33916)


Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
40000 COUNT (STOPKEY)
40000 TABLE ACCESS GOAL: ANALYZED (BY INDEX ROWID) OF
'GL_BALANCES'
156790 NESTED LOOPS
43327 TABLE ACCESS GOAL: ANALYZED (FULL) OF
'GL_CODE_COMBINATIONS'
113463 INDEX GOAL: ANALYZED (RANGE SCAN) OF 'GL_BALANCES_N1'
(NON-UNIQUE)


Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 41 0.00 0.00
SQL*Net more data to client 853 0.00 0.03
db file sequential read 37562 0.24 56.25
SQL*Net message from client 41 0.03 0.90
db file scattered read 51 0.11 0.32

Tom Kyte
April 28, 2010 - 12:48 pm UTC

bingo - don't send me the trace file.

You were waiting for sqlplus to PRINT ON THE SCREEN.

using that autotrace, the only thing you changed was to make it so sqlplus did NOT print on the screen.

You have just shown that it was client processing time, nothing more.

Should the "Times Waited" column always match?

Richard, October 20, 2010 - 1:51 pm UTC

Hi tom,

I notice in these follow-ups that the "Times Waited" column for "SQL*Net message from client" and "SQL*Net message to client" seem to be a 1:1 ratio.

Should this always be the case? Say we have 1 additional "SQL*Net message from client"

I notice in the OP's tkprof output he has 703579 "from" and 703578 "to". While yours is 9:9. When I look through tkprof's I've performed I believe I see a 1:1 ratio 99% of the time -- in my unscientific "study" :-)

Is it fair to say Oracle returned the results of the query but did not receive anything ("acknowledgement"?) from the client? Might this happen when there are no rows returned?

I'm also trying to troubleshoot a "database performance" issue a vendor is complaining about and I see very large "SQL*Net message from client" waits when I see an "n:n-1" ratio

Thanks again for the great site.

Regards,
Rich

Tom Kyte
October 25, 2010 - 3:12 pm UTC

the sqlnet message from clients are assigned to the statement following the wait. So as you switch from statement to statement (open this cursor, fetch that cursor, fetch this cursor, update that table, etc) the waits counted against a statement might come from the "prior" statement. It is hard to assign a wait that happens in the middle of statements to a single statement - so it just goes to the next statement.


Think about it this way - waiting for IO - you KNOW what statement that is, you are in the call.

Waiting for a statement to be submitted to you - you don't KNOW what statement that is going to be.

If you interleave a lot of statements, the waits for "from client" will get assigned to the statement following the wait.



ignore that ratio, it means nothing.

The large wait is the result of us (the database) waiting and waiting and waiting for the slow client to tell us to do something. that is all.


Also, remember you turned on tracing at some point. It is very easy to missing a wait that way - it fell outside of the time you were tracing.

Thanks...

Richard, October 26, 2010 - 2:13 pm UTC

Thanks Tom (sorry, didn't notice until too late that I didn't capitalize your name in the previous post!)


Question

SOY, January 11, 2011 - 10:10 am UTC

HI TOM,

We did debugging of clarify 2Tier . It shows the action select customer.




· 1348 requests for a this action seems to be pretty high

· The DB waits 43.51 secs for answer from the client which might be our problem WHAT is the root cause for this.



OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS



call count cpu elapsed disk query current rows

------- ------ -------- ---------- ---------- ---------- ---------- ----------

Parse 1348 0.01 0.04 0 0 0 0

Execute 1348 0.80 0.77 0 45 41 6

Fetch 3847 4.05 6.65 534 99449 0 62661

------- ------ -------- ---------- ---------- ---------- ---------- ----------

total 6543 4.86 7.47 534 99494 41 62667



Misses in library cache during parse: 0



Elapsed times include waiting on following events:

Event waited on Times Max. Wait Total Waited

---------------------------------------- Waited ---------- ------------

SQL*Net message to client 3973 0.00 0.00

SQL*Net message from client 3973 7.46 43.51

db file sequential read 534 0.27 2.70

SQL*Net more data to client 6543 0.00 0.31

log file sync 5 0.00 0.01

latch: cache buffers chains 1 0.00 0.00

SQL*Net more data from client 53 0.00 0.00

single-task message 1 0.12 0.12

SQL*Net message to dblink 20 0.00 0.00

SQL*Net message from dblink 20 0.00 0.02

SQL*Net more data from dblink 10 0.00 0.00

SQL*Net break/reset to client 4 0.00 0.00

Is it an issue with our application design or with network/system?
My e-mail :kishumit@gmail.com

Tom Kyte
January 12, 2011 - 10:11 am UTC

... We did debugging of clarify 2Tier ...

No idea what that means.



re-read the original answer above please - I told you what the cause of sql net message from client is as clear as I could there.

sqlnet message from client is the wait event we increment while.....

waiting for the client to tell us to do something.


Basically - it is the time the APPLICATION spent doing stuff - outside of the database, without asking the database to do anything. It is CLIENT PROCESSING time - not database time.

question

A reader, January 28, 2011 - 3:25 pm UTC

Tom,

I am working on tuning one applicaiton. it is taking long time and tkprof report

select codemappin0_.ID as ID4_0_, codemappin0_.CODE as CODE4_0_,
codemappin0_.MEANING as MEANING4_0_, codemappin0_.TYPE as TYPE4_0_
from
CODE_MAPPING codemappin0_ where codemappin0_.ID=:1


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1115 0.00 0.02 0 0 0 0
Execute 1115 0.00 0.14 0 0 0 0
Fetch 1115 0.00 0.03 0 2230 0 1115
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3345 0.00 0.20 0 2230 0 1115

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 69

Rows Row Source Operation
------- ---------------------------------------------------
1 TABLE ACCESS BY INDEX ROWID CODE_MAPPING (cr=2 pr=0 pw=0 time=0 us cost=1 size=67 card=1)
1 INDEX UNIQUE SCAN SYS_C0013339 (cr=1 pr=0 pw=0 time=0 us cost=0 size=0 card=1)(object id 24783)


Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 1115 0.00 0.00
SQL*Net message from client 1115 28.08 65.38


OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 78740 0.10 2.10 0 0 0 0
Execute 78739 2.20 24.87 1 12587 39738 8844
Fetch 72325 218.10 387.74 0 18256304 0 47592
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 229804 220.40 414.72 1 18268891 39738 56436

Misses in library cache during parse: 0

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 101501 0.00 0.32
SQL*Net message from client 101501 420.57 2993.30
log file sync 8616 0.12 15.64
SQL*Net more data from client 4519 0.00 0.24
SQL*Net more data to client 9038 0.00 0.37
latch: cache buffers chains 40 0.00 0.00
buffer busy waits 53 0.00 0.00
latch free 2 0.00 0.00
db file sequential read 1 0.00 0.00
buffer deadlock 1 0.00 0.00


OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 8 0.00 0.00 0 0 0 0
Execute 8 0.00 0.00 0 1 3 1
Fetch 9 0.00 0.00 0 18 0 5
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 25 0.00 0.00 0 19 3 6

Misses in library cache during parse: 0

78741 user SQL statements in session.
7 internal SQL statements in session.
78748 SQL statements in session.
********************************************************************************
Trace file: aisu1_ora_2003114.trc
Trace file compatibility: 11.1.0.7
Sort options: exeela
1 session in tracefile.
78741 user SQL statements in trace file.
7 internal SQL statements in trace file.
78748 SQL statements in trace file.
41 unique SQL statements in trace file.
2305860 lines in trace file.
3474 elapsed seconds in trace file.


I fount that total elspsed time in trace file is 3474 while SQL*Net message from client event it self is 2993.30. Also the Times Waited for SQL*Net message from / to client is high for majoriry statements are high. Can you help me for this probelm ?
Tom Kyte
February 01, 2011 - 3:35 pm UTC

... SQL*Net message from client 101501 420.57 2993.30
...

we spent ALL of our time waiting for the client to tell us to do something.

looking at the one sql, it is all generated code (ugly, ugh, bad) - which indicates to me that the programmers don't know much about databases- however in this case it doesn't matter because the performance issue is NOT in the database, it is in the code.

Looks like typical slow by slow code (that query that is executed 1,115 times to look up a single silly record by a primary key indicates that to me). And the processing time for each record they fetch IN THE CLIENT is long.

The code isn't asking us to do anything.

we spent 387 seconds executing SQL.
we spent 2,993 seconds waiting to be told to execute SQL.

Even if we executed NO SQL - it would still take 3,000 seconds to run this.

More to Explore

VLDB

If you are new to partitioning, check out Connor McDonald's introduction series here.

VLDB

Documentation set on VLDB and Partitioning.