A small, innocent mistake this week led to an unscheduled production outage. I had SSH sessions going for both my dev and my prod instances. I wanted to bounce my dev and entered SHUTDOWN IMMEDIATE. A few minutes later the customer called.... Yes, I had mistakenly used the production session.
I've since searched for a sanity check solution that would require operator confirmation to proceed with the shutdown. IOW some procedure or trigger that would execute prior to the SHUTDOWN. If the IP address was production, and if the time was within normal business hours, prompt the user for an on-screen confirmation before proceeding.
The sticky bit (sorry for the pun) is that the client ssh session loses control once the SHUTDOWN statement executes. So I thought to drop the interactive control with a simpler file read. Have the BEFORE trigger shell out to host, and look for a particular one-line file. A simply Boolean check, if the file exists, don't do the shut down.
As always, thanks to all participants for any guidance.
A database trigger is not going to help there, eg
SQL> create or replace trigger NOSHUT
2 before shutdown on database
3 begin
4 raise_application_error(-20000,'no no no no');
5 end;
6 /
Trigger created.
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
You can see that the trigger did not halt the shutdown. You can see that the trigger did fire from the alert log.
Shutting down instance (immediate)
Stopping background process SMCO
Shutting down instance: further logons disabled
Sun Mar 12 10:59:07 2017
Stopping background process CJQ0
Stopping background process MMNL
Stopping background process MMON
Sun Mar 12 10:59:09 2017
Errors in file C:\ORACLE\diag\rdbms\np12\np12\trace\np12_ora_7004.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-20000: no no no no
ORA-06512: at line 2
but we shut down anyway.
So you really need a *process* driven mean to do this. For example, an approach could be to prohibit shutdown commands in SQL Plus via PRODUCT_USER_PROFILE. Then if you have grid installed, you can "enforce" that such operations are done by srvctl not sqlplus. If you dont have grid installed, then you might have a custom script which does:
- temporarily remove the row from PRODUCT_USER_PROFILE
- then shutdown
- reinstate the row on startup
or
- connect as a special user with a once only password that you dont know
- it does the shutdown
etc etc...Basically a means via which the level of confidence you have in not being able to make a mistake goes up.
Others welcome to add their thoughts.