'Corrupted tablespace' - missing datafile
A reader, May 30, 2004 - 2:33 pm UTC
I realize the buffer cache bit and how data gets flushed to disk during checkpoints, but am not clear about the below.
<QUOTE>
"In your case, the mv was just a rename -- anyone that HAD IT OPEN would still have it open. It would only be when they went to open it again they would hit the error."
</QUOTE>
I kinda understand how a move works in Unix. Say I "touch" a file "test". Open it with vi editor in one session. Move "test" to "test2" in another session. And then make changes in the first 'vi' session, and save. We then end up with 2 files at the OS level - test and test2. While test has saved the changes, test2 is empty. That's somewhat expected.
But in the case that I'm simulating, I don't see why you say that I "HAD IT OPEN" all the while. Because I was able to simulate the same thing (was able to successfully issue inserts/commits and even issue a 'alter system checkpoint;') by opening a separate session immediately after having moved the file.
Could you explain why you say that "I HAD IT OPEN" all the while? Does opening a separate session ensure a new "file open" request?
The above is simply of academic interest to me. I wouldn't be doing whacky stuff on a database I care about :).
Thanks again,
Anand
May 30, 2004 - 2:49 pm UTC
don't use vi in the analogy -- vi opens a file by copying it to /tmp!
the process, the oracle process that did the checkpoint had already opened the file, had an open file descriptor to it.... erasing that file at the OS level, renaming it, moving it -- nothing would have any effect on the fact that the process already had it open.
so, that file was opened by whatever process did the checkpoint.
you renamed the file.
you said "please checkpoint"
process doing the checkpoint said "be my pleasure, oh look file I need is already opend, cool -- write to it"
eventually, file was not opened -- needed to open it, had the old name, no such file exists -- bamm -- error.
file descriptors
A reader, May 30, 2004 - 3:18 pm UTC
Thanks. My knowledge of file descriptors is very limited. I guess I ought to read up, before I can absorb what you said.
As I understand it - the previous checkpoint had open the file (had an open file descriptor). Then I physically moved the file. This didn't matter to the oracle process since the file descriptor was already open. So the checkpoint was successful. But once the file descriptor is closed, and the checkpoint needs to open a NEW one; that's when it fails.
<QUOTE>
eventually, file was not opened -- needed to open it, had the old name, no such file exists -- bamm -- error.
</QUOTE>
When does a checkpoint actually open and close a file descriptor? Obviously, not when every "alter system checkpoint;" is issued. Because, I was able to successfully issue multiple "checkpoints" for a certain period of time one after the other (after having moved the underlying datafile). Is this where the concept of 'lazy' checkpointing comes in?
May 30, 2004 - 3:48 pm UTC
you didn't really even physically move the file here -- you simply renamed it -- the file data on disk never got touched!
processes open files. something a process does might need a file, but processes open and close files as needed. they might already have them open, maybe not. If not -- they open() them and that is when it started failing. files are being opened and closed for lots of reasons, all of the time in the database.