Catastrophic Database Failure -- Deletion of Control and Redo Files

Question and Answer

Chris Saxon

Thanks for the question.

Asked: April 08, 2024 - 1:37 pm UTC

Last updated: April 11, 2024 - 1:10 pm UTC

Version: 19.3.0.0.0

Viewed 1000+ times

You Asked

We recently had a database failure that resulted in data loss after an Oracle 19.3.0.0.0 database had both its control files and its redo log files deleted. Please note that I am not a DBA, but simply an analyst who supports the system that sits on this Oracle database. Any amount of data loss is fairly serious, and I am wondering how we can avoid this in the future.

Before the control and redo files were deleted, we had an event in which the drive this database is on became full. This caused the database to stop writing transactions and prevented users from accessing the application. Once space was made on this drive, the database operated normally for several hours until... the redo and control files were deleted.

What would have caused the control and redo files to be deleted?

In trying to figure out what happened, it was noted that if we had expanded the drive's capacity in response to its becoming full, the later data loss would not have happened. Does Tom agree with that sentiment? Are these two events linked (disk drive nearly full and later data loss), or are they symptoms of two different things?

and Chris said...

This is only something you can answer with an internal investigation. Oracle Database itself won't delete these files!

My guess is whoever made space on the drive accidentally deleted these files. So the two events are probably linked; it's unlikely anyone would delete files on the server when there's plenty of space.

But it is possible that there was a routine clean-up that just happened to do this shortly after the drive was full.

Ultimately, anyone with admin access to the database server can log on to it and carry out destructive actions like this.

To reduce the chances of this happening again you need to put good processes and procedures in place, such as:

- Ensure as few people as possible have access to the database server
- Ensure everyone who does have (admin) access to the DB server understands at least the basics of Oracle Database file structures, so they know which files must not be removed
- Audit access to the database server, so if something like this happens again you can figure out who did it
- Make sure you have backups in place and a (tested!) recovery procedure
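
One way to back up the "know which files must not be removed" point is to have any clean-up script check its delete list against a list of protected database files first. The following is only a minimal sketch in Python, not an Oracle-supplied tool: the function names and the example file names are hypothetical, and it assumes a DBA has already exported the protected paths, for example from the NAME column of V$CONTROLFILE and the MEMBER column of V$LOGFILE.

```python
from pathlib import Path


def split_cleanup_candidates(candidates, protected):
    """Split candidate paths into (safe, blocked).

    `protected` is assumed to be a list of paths that must never be
    deleted (control files, redo log members, and so on), supplied
    by a DBA. Both names here are hypothetical, for illustration only.
    """
    # Resolve to absolute paths so "./redo01.log" and its full path match.
    protected_set = {Path(p).resolve() for p in protected}
    safe, blocked = [], []
    for candidate in candidates:
        path = Path(candidate).resolve()
        if path in protected_set:
            blocked.append(path)  # never hand these to a delete step
        else:
            safe.append(path)
    return safe, blocked


def missing_protected_files(protected):
    """Return the protected paths that no longer exist on disk."""
    return [Path(p) for p in protected if not Path(p).exists()]
```

A clean-up job would then delete only what comes back in `safe` and raise an alert if `blocked` is non-empty. Running `missing_protected_files` on a schedule is a cheap extra check that nothing on the protected list has vanished. Neither function replaces backups or restricted server access.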


Comments

Kristen, April 10, 2024 - 1:42 pm UTC

If the redo and control files were deleted when the space was made on the database's drive, would the database be symptomatic right afterwards, or could this take hours/many, many transactions to become apparent?

The deletion of a control file would be immediately apparent, I would think, but I thought I would ask.
Chris Saxon
April 11, 2024 - 1:10 pm UTC

From the docs:

"The control file must be available for writing by the Oracle Database server whenever the database is open."

You're going to notice pretty quickly if this is deleted. Same for the redo logs - the database writes all changes to these, so you're in trouble fast if they're removed.

Depending on how the deletion process was carried out, it's possible that it cleared some space quickly, so the database was usable again, but then kept running for several hours before it removed the control files and redo logs.

