Directories
mark, November 26, 2005 - 3:41 am UTC
"I spread the files over 500 subdirectories" Is there a benefit to this, other than administrative one? Especiall if you're on SAN...Where I work there used to be a standard to have /oradata01 through /oradata05 to hold the files....I changed it to just /oradata.
November 26, 2005 - 12:28 pm UTC
the limit is in the OS and the directory structure itself. Directories are themselves "files" that are read to find other files. Too many files in a directory can definitely affect file open and directory traversal operations.
Quadro, November 26, 2005 - 9:12 am UTC
Kathrin,
if you are trying to squeeze the last possible bit out of backup time -- consider using 10g and the block change tracking feature for RMAN. With BCT, tablespace size is virtually irrelevant (only the number of changed blocks matters).
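A sketch of what that looks like (the tracking file location here is made up):

alter database enable block change tracking
  using file '/u01/app/oracle/bct/change_tracking.f';

-- confirm it is enabled
select status, filename from v$block_change_tracking;

Once enabled, incremental backups read only the blocks recorded as changed instead of scanning entire datafiles.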
Quadro, November 26, 2005 - 9:23 am UTC
Mark,
there are file system overhead issues when you have too many files in the same directory.
thanks so much
A reader, November 28, 2005 - 3:33 am UTC
Thanks so much for putting evidence behind this. I had a bad feeling about this design from the start, but was not able to pin down the exact points. With this test scenario I'll have a fair chance of getting my colleagues to redesign before we are completely doomed.
regards,
Kathrin
Number of tablespaces
SS, January 13, 2010 - 8:55 am UTC
Hi Tom,
There was a proposal on the table to move files such as MS Word and Excel documents out of the BLOBs where they are currently stored and into the OS file system. The reason was that it was getting really expensive to store GBs of data in the database.
I proposed that we consider moving the blobs into a separate tablespace and put the tablespace in a cheaper storage area. Do you think that is a valid idea?
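Something along these lines, I imagine (the tablespace name is made up):

alter table documents move lob (document)
  store as (tablespace cheap_lob_ts);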
Also, we have a multi-tenant database architecture using VPD. One of the requirements is to limit the "total size of files" per blob by customer. So, for example, consider a documents table with the structure:
create table documents
( document_id  number primary key,
  document     blob,
  customer_id  number references customer(customer_id)
);
Is there a good way to enforce a restriction that the document BLOBs cannot exceed 20 GB per customer_id?
Also, thanks for your talk in Pittsburgh yesterday. You have a very good way of explaining things.
January 18, 2010 - 4:25 pm UTC
.. The reason was that it was getting really expensive to store GBs of data in the database. ...
and storing them in the file system will solve this how? If you want to talk expensive - explain what happens when someone does an "rm" - you know, to free up some space. Sure, you can restore from backups (maybe), but can you do so consistently - with respect to the database?
What about security - how are you going to give the level of protection in the file system you have in the database?
and more.
Especially given your multi-tenant approach - security and recoverability might be relevant - your customers might expect that.
For the 20GB per customer limit, I'd probably use a materialized view/summary table to hold the output of a summation by customer id. Maybe that would be maintained in the background (so as to not affect the current application), and when a customer exceeded 20GB you would set a flag that prevents future inserts.
You might (*MIGHT*) consider storing the file size as an attribute rather than calling dbms_lob.getlength() - since getlength would actually have to read the blob to figure that out. If you know the length when you add the document (compute it once), maintaining the aggregate is a very "low impact" thing to do.
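A minimal sketch of that approach, assuming the documents table carries that file_size column (all of these names are made up):

-- summary table: one row per customer with current total bytes
create table doc_usage
( customer_id    number primary key references customer(customer_id),
  total_bytes    number default 0 not null,
  quota_exceeded varchar2(1) default 'N' not null
);

-- a background job refreshes the totals and sets the flag
merge into doc_usage u
using ( select customer_id, sum(file_size) as total_bytes
          from documents
         group by customer_id ) d
   on ( u.customer_id = d.customer_id )
 when matched then update set
      u.total_bytes    = d.total_bytes,
      u.quota_exceeded = case when d.total_bytes > 20 * 1024 * 1024 * 1024
                              then 'Y' else 'N' end
 when not matched then insert ( customer_id, total_bytes )
      values ( d.customer_id, d.total_bytes );

The merge could run from a scheduler job every few minutes; the application then only has to check quota_exceeded before accepting another upload.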
Thank you
SS, January 19, 2010 - 9:54 am UTC
Expansion on directory file counts
Mathew, May 29, 2016 - 5:13 pm UTC
As Tom stated, a directory is itself a file. One thing to keep in mind, though, is that it is a special file the kernel's file system driver assumes will fit cleanly into a very small portion of RAM. Once a directory accumulates a certain number of files (in my environment it was 50,000), ls -l and other stat operations in it start to take dramatically longer - seconds, minutes, even hours.
In addition, the directory is a flat-file database, so it doesn't "shrink": it keeps a high-water mark of its total size until you either delete and recreate the directory or use a third-party tool to reclaim the directory blocks. Each file system handles the details differently, but the ramifications are the same across all of them.
May 30, 2016 - 5:15 am UTC
Good input.
A "common" problem people have in this area is if they dont clean up their standard audit files (*.aud). Once you get tens of thousands of these, any operation (eg connect) that needs to write an audit file tends to slow down badly.