Session killed from killed from the client side, but alive from the server side
Nag, October 29, 2001 - 5:56 pm UTC
Tom
I killed quite a few sessions while an update was going on( the same update), as it was taking too much of time.
Apart from that a few other session were also killed by me.
Now, I find that one of my process which usually takes 20 secons is taking 130 seconds.
I called up the dba , and was informed that I have round about 15 sessions showing up on the server side.
Was this the reason for my process to slow down.
Suppose say I have 50 sessions killed from the client side, but alive from the server side, what kind of impact it will have on my work(DML, selects, updates, deletes, and inserts) i.e. the work Iam doing in my current session.
October 29, 2001 - 8:29 pm UTC
depends, if the killed sessions were still in the process of rolling back, it could slow things down. Check out USED_UBLK in v$transaction to see if there are outstanding transactions and how fast they are rolling back.
I cannot say "yes that was the reason". A DBA should have been able to tell you (using OS tools and peeking at whats going on in the database).
'Kill session' priority
Femi, November 28, 2001 - 1:06 pm UTC
Tom,
what actually does USED_UBLK in v$transaction tell me. Would it tell me if the user that was killed has any transaction being rolled back?
November 28, 2001 - 1:34 pm UTC
If USED_UBLK > 0, the session is in the process of being rolled back.
If USED_UBLK is 0, the session is effectively KILLED, resources have been released, we are just waiting for their client to send a request so we can tell them "you have been killed". Locks are gone at this point, its rolled back.
If you see used_ublk > 0 (say 1000) and 15 seconds later its at 800, you can guess that it will take about another minute (15 seconds/ 200 blocks, 800 more to go) to finish rolling back.
Killing the OS server process.
Mark J. Bobak, November 28, 2001 - 3:16 pm UTC
Hi Tom,
Have had luck in most cases w/ the following steps:
1.) kill session
2.) kill -9 <OS_PID_OF_SERVER_PROCESS>
3.) ORADEBUG WAKEUP <ORACLE_PID_OF_PMON>
That seems to always immediately make the session go away.
Any negatives in doing that? Seems to me that it just gets
deferred.
Thanks,
-Mark
PS. Above the window I'm typing in now, in BIG RED letters
it says "Please do not use this to ask a question." But,
clearly, people do, particularly when the question is
directly related to the original question. So, I'm
wondering....is the policy changed? Evolving? Is it ok
to ask directly related followups? Is the warning intended
to dissuade people from asking unrelated questions? Or
are we all just breaking the rules? (Including you, since
it also says "I will not be able to answer your questions
here.") ;-)
November 28, 2001 - 3:27 pm UTC
I personally like alter system kill session all by itself.
1) it is soooo infrequent that I have to do this (i mean -- really infrequent). There is a BUG in your system if you find you must do this frequently.
2) it lets the client know "nicely" that the session was killed. Otherwise they get a nastly "3113" or "packet writer failure" when they come back.
3) normally, the speed with which the killed session gets rolled back is more then sufficient (and monitorable).
but yes, kill -9 is something that is done (orakill on NT)
(i changed the "please don't ask" to "keep it in line with the question". used to be I didn't have a way to followup a comment ;)
Excellent, thanks!
Mark J. Bobak, November 28, 2001 - 3:51 pm UTC
Thanks Tom. In the above, my #1, I meant to write
'alter system kill session'. And I agree, that's almost
always enough.
However, in cases (fairly rare on my system) where that's
not enough, my #2 and #3 above have (almost?) never failed
to zap the session.
Anyhow, thanks, and thanks for clearing up the policy.
-Mark
Column details
Gururaj Kulkarni, August 06, 2002 - 2:15 am UTC
Tom,
I referred to user_col_comments to get the information about used_ublk of v_$transaction(SYS). But there is nothing.
Where would I get the information about columns and the contents?
Thanks
Abnormally terminated application and rbs
Vera, October 24, 2003 - 7:14 pm UTC
Hi, Tom,
Your Expert One-To-One is the book I read over and over again as I used to read The Lord of the Rings. Right now I am searching through it and prepare paper mostly based on it - in order to understand the below issue and also to lay out reasons for my team why we cannot and should not disable some of Oracle functionalities.
We have huge application that takes a long long time to run. It was literally translated from COBOL, it processes records row by row, inserts them in 3 "target" tables and log error table and commits for every row. I hope I am close to convincing my superiors of the idea of rewriting it, but right now my problem is this:
One of our users started the application on the server through the front-end GUI (Visual Basic). Then his computer got frozen and he manually rebooted it. At that time I happened to look at trace files and alert log and saw them rapidly growing with messages (end-of-file, resource busy --the application was trying to truncate tables in a loop (Tom, I did not write this code!)-- and "logical" error messages related to the fact that non-truncated tables had more than one records etc. I called the user and found out about reboot. I looked at the "target" tables and saw that they still were loaded - slowly, at the rate of 1 record per 7 minutes. Our error log table was slowly growing too - with the same messages as in the files. It was late and our DBAs who could reboot the instance were gone for the day. I killed the user's session from my TOAD. Session got marked "KILLED" not INACTIVE and the files were still growing and tables were still (very very slowly) loaded. In a couple of hours, after I already left and while this was (I am sure) still going on, another user kicked out the same job so the next day I found our error log table populated with millions of records (remember, application commits every record in a loop - Tom, I did not write this code!). Well, somebody stipulated that all this mess happened because Oracle was rolling back and my superior (very technical and smart, but he is not Oracle person) asked me if we could get rid of rollback segments, redo and other Oracle functionalities ("we have batch jobs and we do not need all these wonderful things"). When I said no, I was asked to prove it and that is what I am trying to do now. I am digging and will dig through your book and other sources, and I already did, but I - and many many other developers I am sure - would be extremely interested in how you see that: why Oracle continued to load tables and why we should not somehow disable rollback (I am not talking about NOLOGGING AND INSERT/*+APPEND*/, I am talking about completelly disabling rb), redo etc.
At this very moment I do not have an access to our system, but I am sure we use Dedicated server mode.
October 25, 2003 - 8:05 am UTC
ummm, just give him the concepts guide and say "read it"
it would become obvious that the only reason you use oracle is to have redo, rollback, and other related functionality.
just point him/her to this page -- bookmark this followup.
The answer to their question is "NO", you cannot turn off redo, you cannot turn off rollback, and "YES" it sounds as if the code has been done as wrong and inefficiently as possible.
the entire reason they bought a database would be obviated by turning off these things.
What happens in the Background? Is V$SESSION_LONGOPS useful?
Olga, January 13, 2004 - 6:25 am UTC
Hello Tom,
I can't see what happens in V$TRANSACTION (I have no DBA rights) but i could not imagine that a rollback is the reason for my KILLED session state.
When I do an INSERT /*+APPEND*/ INTO nologging_table FROM big_query I could watch the following steps in V$SESSION_LONGOPS:
1. Table Scan
2. Hash Join
3. Sort Merge
4. Sort Output
Ok, I cancel my session and let the DBA kill the session. When I cancel it at the "Table Scan"-Level the Session will be killed immediate, but when I cancel it at the "Hash Join"-Level then the session runs further (TIME_REMAINING in V$SESSION_LONGOPS goes down) in the background and my session is marked as KILLED for a long time.
What happens there?
Many thanks for your great site!
January 13, 2004 - 5:39 pm UTC
I'll guess (your dba should be able to verify this -- watching used_ublk in v$transaction)
has used_ublk started going up when you get to the hash -- perhaps you have started inserting during that phase and we are rolling back modifications to an index.
Kill a CREATE AS SELECT
Olga, March 15, 2004 - 12:43 pm UTC
Hi Tom,
it's me again. I have a running CREATE MATERIALIZED VIEW-Statement in a SQL*Plus session. Because I forgot a join condition I canceled the SQL*Plus Session by closing the Window and then I killed the session. But the session is only marked as "KILLED" and I can see a enormous running
Hash-Operation in v$session_longops.
What "cleanup" happens there in the background? I thought I could kill a running CTAS or Create Matview-Statement immediately because it would not generate rollback data.
Bye,
Olga
March 15, 2004 - 2:24 pm UTC
by closing the window, the server has no idea the connection is broken, that there is no client (windows tcp/ip doesn't tell the server "i've gone away")
It'll run till the call completes and then undo what it did. In the future, a simple "ctl-c" is what you want. But now, you can ask the DBA to help you out and kill the process.
pid
jasdeep, May 05, 2004 - 10:25 am UTC
how to find pid from oracle to use kill -9.
i have related pid of unix to process of v4session
but couldn't find more than one similar process.
i am using topas to find top 5 processes that are using my resources.
May 05, 2004 - 2:33 pm UTC
why? you want the people that are doing serious work to not be able to do it?
something wrong here -- something really wrong.
Use alter system kill session if you like (that'll kill their session, much nicer than kill -9
seems you already know how to find the pid's -- not sure what you mean by "more than one similar process"
A reader, May 06, 2004 - 6:03 am UTC
one similar processes means the top 5 pid's i am getting in topas command.
i am not finding them in oracle.
May 06, 2004 - 7:48 am UTC
*leave them be*
I really don't understand why you would be doing what you are asking to do anyway.
A reader, May 06, 2004 - 6:04 am UTC
one similar processes means the top 5 pid's i am getting in topas command.
i am not finding them in oracle.(they r non existing)
Kill -9 on Parallel query servers?
Michael, May 24, 2004 - 3:16 pm UTC
We have a cold backup every Sat. at 0900. The shutdown shell script I wrote is simple:
#!/bin/ksh
export ORACLE_SID=DEVL
export ORACLE_HOME=/u01/oracle/product/9.2.0
/bin/rm -f shutdevl.log
lsnrctl stop
#The following finds LOCAL connections to this instance and
#kills them
DEVL_ORASESS=`ps -fuoracle |grep DEVL|grep LOCAL |awk '{print $2}'`
if [ "$DEVL_ORASESS" != "" ]; then kill -9 $DEVL_ORASESS
sleep 15
fi
sqlplus /nolog @shutdevl.sql
#the following emails the log to my pager
/usr/bin/mailx 9999999999@myairmail.com </export/home/oracle/shell/shutdevl.log
exit
The sql script that is called (shutdevl.sql) logs in as an externally identified sysdba and does a shutdown immediate. This past weekend, there was parallel query server process, p003, that was ACTIVE for some reason, even though the local connection that had been using it was killed by above kill -9 section. So, the database did not shut down, even after trying for about an hour. The backup was no good.
We have just been discussing putting an extra set of logic in the script that would check for pmon and if it was still there after 5 minutes, do a shutdown abort. The other alternative is to grep for parallel query processes in the above kill -9 logic (grep LOCAL |grep DEVL |grep p00) and do a kill -9 on those as well.
Which is better, kill -9 on p00n processes just in case one of them is active for some reason, or shutdown abort? With the shutdown abort option (I hate doing those) we would have to have another set of logic that knows a shutdown abort was done, and does a startup/shutdown/startup. Thanks for your input, this is a great site.
May 24, 2004 - 4:10 pm UTC
hot backup, it is all about hot backup
unattended shutdown/startups = 99.99% of all "sorry, database not available" messages in my experience.
database either doesn't shutdown without a kick in the pants, or upon coming up wants a little love and attention.
what is best is a hot backup which is not any harder (or easier) to restore with than a cold backup in archivelog mode.
In that fashion, you just let it keep on trucking.
that is my input. I never shutdown for something as mundane as a backup. Never.
Agreed, but......
Michael, May 24, 2004 - 5:00 pm UTC
right now I am stuck with cold backups. This is a decision that was made before I came. The management feels that due to the nature of the database, it would be faster to recover with a restore from a cold backup and then running the load scripts up to the day of the failure (this is a warehouse) than to use archived log files. In addition, there would be the matter of LOTS of archived redo log files to manage and apply for a recovery. That does not bother me, but that is their decision, over my protests.
My inclination is to just put in another section of the shell script...if it sees pmon after the initial shutdown immediate (after 5 minutes) it calls a different sql script to do a shutdown abort, then a startup, then a normal shutdown. I just was not sure if killing the parallel query processes was okay to do, or if it was preferable to shutdown abort.
May 24, 2004 - 7:10 pm UTC
shutdown abort;
startup with restricted session;
shutdown immedidate;
is what I would do -- but then it will fail coming up sometime.
What about using read only tablespaces instead???? flip the tablespaces into read only mode, transport the tablespaces off using exp, flip them back read write for a load?
or.....
do the load
make the ts's read only
copy their files
take a hot backup of just system and rbs
now all you need to roll forward through would be system and rbs...
Thanks for new ideas
Michael, May 25, 2004 - 9:14 am UTC
I figured out the first one, but the other tips you gave are great ideas. Sometimes, I can get too close to the forest to see the trees. When I read your suggestions, my reaction was to whack myself in the head as say "DOH!". Guess I was doing my best imitation of an ostrich.
i got this err
A reader, May 28, 2004 - 8:21 pm UTC
i got this error in my alert log file and the instance hanged
smon:parallel transaction recovery tried
i have made query for getting how many locks are there in the system
that was not working
the users were not able to work and ultimately we have to restart the instance.
plz suggest what can be the reason
May 29, 2004 - 10:58 am UTC
Please contact support and file a tar -- they'll help you diagnose your issue
Very helpful
Ora-Fan, May 31, 2004 - 6:14 am UTC
But sometimes there may reall y be an issue, right?
Dratz, September 16, 2004 - 3:08 pm UTC
I have a user that tried to "delete from table_x" three days ago, it took a while so he cancelled and tried again and again and again.
My DBA killed all the sessions. Locks are still being held and the user's other killed sessions are still being blocked. Running that query (from above) has returned the same values for two days without change:
------
15
127
1
1
This is a 9i database and I've never seen earlier versions take THIS long to rollback a delete of 73k rows. Is Oracle treating this like Java does garbage collection? I must be missing something because I just can't believe I have to bounce the database before anyone can do DML on this table.
This user is the only one with locks on this table and the only one waiting for locks (except my test delete). Obviously, queries are fine. Nothing in the logs to indicate a problem.
September 16, 2004 - 3:25 pm UTC
what does used_ublk show and what is the status of the session and how did the user "cancel"? and what exactly was canceled?
database connection errors in logs
jharvu, September 16, 2004 - 4:50 pm UTC
The java application running on one of our development databases (version 9.2.0.3) started reporting 'database connection not available' errors in the logs for some users. Others were happily connected and did not face any problems. Since no code changes were made, we decided to bounce the DB and that looks to have solved the problem, because after restarting the Db, there were no such errors.
I would like to know what would have caused the 'unable to connect to database' error before restart. Could you throw some light , please?
September 16, 2004 - 7:54 pm UTC
totally insufficient data
whose logs
what logs
what was the oracle error message
I assume the db restart was accompanied by a mid tier restart too -- was it that perhaps.
those numbers are from used_ublk
Dratz, September 16, 2004 - 5:34 pm UTC
15
127
1
1
User (on Windows) called to complain, said he had rebooted his machine and was encountering same problem. DBA found that all of his previuos sessions were still marked as active. He killed them then "kill -9" them.
The same user has caused the same problem (currently) with an insert into another table.
Rollback segments are not busy, nor very full.
One important correction (94 instances here), this database is actually Oracle8 Enterprise Edition Release 8.0.6.3.0 - Production.
September 16, 2004 - 8:00 pm UTC
kill -9 ! what was wrong with alter system kill session?
are those used_ublks associated with those sessions?
are those sessions visible in v$session?
killed session due to disconnected client
Menon, June 30, 2005 - 7:19 pm UTC
Hi Tom
Say I have a Java program runnig a PL/SQL procedure.
If the JVM crashes in between - I think the following
is what happens. Please confirm or correct.
If the client is disconnected, the
database session it was using to
execute the procedure itself will get killed.
Which means you will no longer be
connected to Oracle. (Similar
to when you press Ctrl C
in sqlplus in the middle of an operation
or a PL/SQL call.)
This will cause your procedure
to automatically rollback its changes (unless it
has already committed some changes - but
you should not have commit within PL/SQL
anyways.) Note that
rollback itself could take a long time.
Since rollback is supposed to be less
frequent than commit, Oracle "optimizes"
for commit (it does lot of work in the
backgeround so when you commit
most work is already done).
But when you rollback - it is costly
since it has to undo all the good work
it did in anticipation of commit.
Thanx!
June 30, 2005 - 9:04 pm UTC
if the jvm crashes, the server probably won't notice until it tries to write the response back to the now crashed client -- meaning if the plsql routine has transactional control, and it commits, it'll commit.
The plsql procedure won't know the client dies until it tries to talk to it.
Thanx!
Menon, June 30, 2005 - 11:33 pm UTC
This is back to basics for me. So an example of
plsql routine talking to the client is when your
routine consists of returning some results through
a result set and the client has done another fetch
in the middle? Any other examples?
July 01, 2005 - 9:35 am UTC
normally, a client calls a plsql routine and client is suspended until the plsql procedure returns whatever output it has to return.
So, plsql talks back to the client when it is "done" normally (although it can talk sooner with PIPE ROW() and pipelined functions)
Thanx for the clarification!
Menon, July 01, 2005 - 2:49 pm UTC
followup on the previous question
Menon, July 05, 2005 - 12:44 pm UTC
"normally, a client calls a plsql routine and client is suspended until the plsql
procedure returns whatever output it has to return.
So, plsql talks back to the client when it is "done" normally (although it can
talk sooner with PIPE ROW() and pipelined functions) "
Two followup questions:
1. JVM executing the PL/SQL procedure has crashed ->
the client has disconnected. But the PL/SQL procedure
is still executing till such time that it tries
to contact the client. Does this mean that the
session is still active? Looks like it has to be
since the PL/SQL is executing!:)
2. If you don't want to wait for the PL/SQL routine
to "contact" the client, I take it that you have to
kill the session before "restarting" the PL/SQL procedure
(executed through Java)? In other words, is there a reason
for the client to wait for the rollback to finish or can the client simply re-execute the procedure? This assumes
that there are no commits in the procedure. If there
are, then by definition the procedure is not "re-startable" and we are in trouble:)
Thanx!
July 05, 2005 - 1:11 pm UTC
1) yes, it would be.
2) the client was the jvm?? but anyway, it depends on the logic of the procedure here doesn't it.
say the procedure was:
as
...
begin
lots of stuff, updates and all.
COMMIT;
return;
end;
the procedure will commit not rollback and then discover the client went "away"
Thanx!
Menon, July 05, 2005 - 3:15 pm UTC
"the client was the jvm??"...
well, you knew what I meant:)
Thanx - that is what I thought...
spoke too soon...
Menon, July 05, 2005 - 3:23 pm UTC
So I take it that if the procedure is still
executing (has not contacted the client yet and that
contact will happen an hour later because the procedure
is long), and I want to kill the procedure then I need
to kill the database session. Once the session
is killed, rollback of the work done so far
will start.
If the procedure was restartable, then I can simply
reexecute my procedure from my Java program. If not,
then well, I have to find out at what stagee my procedure
stopped executing and hopefully will be able to
do that - not a good position to be in :-)
July 05, 2005 - 3:50 pm UTC
correct.
correct.
Thanx Tom!
Menon, July 05, 2005 - 7:48 pm UTC