Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Roger.

Asked: February 01, 2010 - 10:30 am UTC

Last updated: March 01, 2010 - 9:36 am UTC

Version: 10.1.0.5

Viewed 1000+ times

You Asked

Dear Tom,

How can a session be killed within a RAC cluster without causing automatic failover to second instance?

This is needed to prevent long running select statements from continually consuming resources (clearly tuning SQL is the correct term answer, but often such fixes cant be introduced immediately in the operational environment)

Running the following kills the session;

ALTER SYSTEM KILL SESSION SID, SERIAL#  immediate;


This however causes the RAC cluster to failover the SELECT statement onto the second instance which then requires additional alter system commands to kill.

Unfortunately killing the client process cannot be performed until after hours. Modifying tnsnames or not using Oracle services isnt really an option (eg normal behaviour should be to failover SQL unless overridden by DBA)

Thanks

Roger

and Tom said...

use the resource manager.

With that you can either set up a proactive rule (if you think a query is going to exceed N units of time - don't even START the query)
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:5955622922067

or use a resource plan that limits the amount of resources they can use.



Rating

  (3 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

How about killing the server proccess?

Brad, February 19, 2010 - 12:47 am UTC

We address that issue (unwanted failover) by getting the server PID from gv$session and killing that process on the RAC node via "kill -9"

The resource manager approach seems problematic, although I may just not understand how it works.

It seems like using resource manager to prevent run-aways from starting would depend on the optimizer correctly identifying it ahead of time. You may add a hint to a query that makes it run okay despite having an expensive plan. In that case you wouldn't want resource manager to prevent it from running.

Also, putting an upper limit would be tricky for us because some of our queries can run a very long time even when they are correct and people would get upset if a legit query was killed after running for 4-5 hours.

I haven't used it, so I'm really just guessing how it works. Is resource manager smart enough to handle those issues?
Tom Kyte
February 25, 2010 - 12:20 am UTC

kill -9 is a very bad idea. Use at your own risk, expect the unexpected.

You can use resource manager to not run a query at all, you can use PROFILES to stop a long running query (create profile), you can use the resource manager to "slow down" a process that is consuming a lot of resources (eg: it runs full steam for the first minute, then, it is 'niced' down to a lower priority - getting fewer and fewer resources over time - letting it finish, but slowly).

You should read up on the resource manager AND profiles - both are pretty old features of the database - been around for a long time.


... I haven't used it, so I'm really just guessing how it works. ...

might be time to read about it?

http://blog.tanelpoder.com/2010/02/17/how-to-cancel-a-query-running-in-another-session/

Sokrates, February 25, 2010 - 5:43 am UTC

... might be a good start point for reading
Tom Kyte
March 01, 2010 - 9:36 am UTC

I saw that before, I don't agree, not a good idea.

The author agrees with that as they wrote:

A word of warning – this stuff is not for your everyday production usage!

Isn't it just a bug?

Oleksandr Alesinskyy, March 01, 2010 - 3:56 am UTC

I wonder if the fail-over behavior described in the original question is intended and not just a bug of the fail-over implementation? It seems to me that an intentionally killed session should not be a subject to the fail-over (or at least it should be configurable).
Did I missed something?
ALTER SYSTEM KILL SESSION SID, SERIAL# immediate;

This however causes the RAC cluster to failover the SELECT statement onto the second instance which then requires additional alter system commands to kill.