Home>Question Details



Steve -- Thanks for the question regarding "Storing Files in an Oracle Database", version Oracle 9.2

Submitted on 10-Jul-2008 16:41 Central time zone
Last updated 2-May-2009 12:04

You Asked

Hi Tom, if possible could you please explain the advantages and dis-advantages of storing files within an Oracle database. We are currently using 9.2 and 10g. We have no immediate plans to move to 11g. Thanks, we all appreciate your answers!

and we said...

I know of no advantages to storing data I want to keep for a long time outside of a database.

If it is in the database I can

be sure it is professionally managed

backed up

recoverable (with the rest of the data)

secured

scalable (try putting 100,000 documents in a single directory, now, put them in table - which one 'scales' - it is not the directory)

I can undelete (flashback) easily

I have locking

I have read consistency


I honestly cannot think of an advantage to storing anything of importance in the filesystem. 
Reviews    
4 stars What about the disk storage amount?   July 10, 2008 - 10pm Central time zone
Reviewer: Amin Adatia from Ottawa, ON Canada
I am trying to determine the actual space taken up by having the files in each of the environments. 
If anyone else has that information I would be grateful. At my project, the System Administrators 
always give the low performing disks for file storage. So there is an immediate hit on performance 
if you do anything other than read the file one time and extract the "text" and save it in a CLOB 
-- which defeats the whole purpose of not putting the files in the database table.


Followup   July 11, 2008 - 8am Central time zone:

if you store them in the database, they are not files anymore - they are data.

So the comparison is meaningless - they become important data in the database, in the file system - they are just boring files.

I'd be amazed though if you had disks of such differing performance that you actually could tell they were on the "low performing" disks unless you were a really high end sort of system with thousands of transactions/second or minute against these files. And if you were - that is all the more reason to put them in the database so they are backed up, recoverable, secured, audited, etc etc etc (eg: they must be important)
2 stars depends entirely on your situation   July 11, 2008 - 12pm Central time zone
Reviewer: Ben from Illinois, USA
I find that a mix of file system & database storage works best.  

For retaining images (high end photos, x-rays, etc), MS Office documents for general use, and other 
very large files I recommend file system.  Some imaging systems can add multiple GB of storage 
daily.  Do you really want your db archives to grow like this?  There are also frequently issues 
with getting 3rd party software to render documents properly.  We use a server based control for 
rendering tiff & jpg through a webpage and it can not use a db based repository.  Not to mention 
all of the archiving/retention products out there that work with file level storage (except for 
Oracle's products, of course...)

For central file storage, when the files themselves will be relativly small but you will have a lot 
of them, database works well.  It does scale well, until the users start trying to upload 300MB pdf 
files & then the archive dest fills up every couple of hours...


Followup   July 11, 2008 - 1pm Central time zone:

yes, you do really want your database archives to grow like this.

Do I want you to keep MY medical stuff in a directory, unsecured, unmanaged, not backed up like the database is - we have no idea if recovery is possible, who accessed it last? Who has access to it.

The document is stored on a server, the document gets to a client for display. Documents in the database can actually appear as a network share on a client - just like a file, looks like a file, behaves like a file - but we have everything - everything the database offers as far as data features.


The only use I have for files is to give them to the database to store important stuff.
4 stars data on network shares   July 14, 2008 - 5am Central time zone
Reviewer: Donat from Brussels Belgium
> Documents in the database can actually appear
> as a network share on a client - just like a file,
> looks like a file, behaves like a file - but we
> have everything - everything the database offers
> as far as data features.

Tom, could you please elaborate on how this can be achieved?

Thanking you in anticipation and best regards,
Donat Callens

Followup   July 15, 2008 - 9am Central time zone:

http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14259/partpg5.htm

4 stars To Donat - Re access files in DB via network share   July 15, 2008 - 5am Central time zone
Reviewer: Kim Berg Hansen from Middelfart, Denmark
Regarding accessing files in DB via network share.

One way is using WEBDAV. For example we have a system where we upload files to the XML repository in Oracle - here we can add XML metadata to the files (tags and other info so we can sort of "attach" the files to records in our relational data.) The XML repository can then be published via WEBDAV and the files accessed via HTTP, FTP or network shares. Or you can install the ODrive client, which gives you a more "advanced" network share for collaborating on the files.

That way (using the XML repository in Oracle) works fine - to integrate it into our application we have had to code a bit ourselves. Another more "full-fledged" solution would have been to get Oracle Content Services, which (as far as I know) provides most of your needs for files in the DB "out of the box." :-)


4 stars file system   July 15, 2008 - 5pm Central time zone
Reviewer: sam 
Tom:

1. Do you agree that most IT managers or architects prefer to store files on the filesystem. I 
guess becuase they are afraid of the DB or hey treat it as black box or data dump.

2.  If i have my files in the DB, can users see them like on a windows share drive (drive/directory 
tree) or you have to create an application that displays that for them.

3. What did you mean by scalable? try puttin 100,000 documents in one directory versus one table.

4. Is not there one disadvantage of uploading the files from CD or disk to the ORacle which can 
take considerabel time if you compare it to a copy to a unix filesystem or windows.

thanks,


Followup   July 15, 2008 - 8pm Central time zone:

1) they are therefore not architects if they think of the database as a black box.

Those that value their data, the security of their data, the availability of their data store their data in a database.

2) see above, we just talked about that.

3) go ahead, put 100,000 documents into a directory and then start to try to open the last one. See how long it takes. Now triple that, and again and again. File System Directories are not made, not designed to hold thousands of files, they do not scale. You would have to find some magic way to distribute them logically over many hundreds/thousands of directories yourself and pray that you can find them later when you need to.

4) No, there isn't. You still have to UPLOAD the file regardless of where you store it, we can write lobs into the buffer cache if you want.

Another disadvantage of the file system - you would upload the file to the appilcation server - which people tend to run N of instead of just one. Therefore, you have to cross mount all of the file systems to every application server - yet another single point of failure, contention, whatever.
5 stars filesystem   July 15, 2008 - 9pm Central time zone
Reviewer: A reader 
Tom:

I did not understand your last comment. Can you explain what you mean. what do they run N. do you 
mean when 10 people wants to download a file from the web server file system.



Followup   July 16, 2008 - 10am Central time zone:

... the file to the appilcation server - which people tend to run N of instead of just one. ....


run N application servers.


people tend to run more than one application server.
5 stars Oracle File system outperforms windows file system   July 15, 2008 - 10pm Central time zone
Reviewer: Balaji Chellappa from Stamford, CT USA
The entire argument below is based on keeping/maintaining the files in Oracle 11g XDB.

I was under the impression, keeping the file system inside oracle is no way better than keeping it in the Windows share. But it is not true. It is just mind set and it takes some time to come out of it. Just think about it. If you access a file through windows it checks your permission on the same and then make it available to you. In Oracle file is stored as the same chunks but the difference is in the tablespace datafiles. When you try to access file through Oracle it does the same thing. It just goes to the physical location where the file is stored and gives it to you. I always tell myself (may or may not be true, but makes me feel more comfortable with Oracle file system) Oracle handles the files in a special way. It doesn't cache the big files in the memory like regular tables and then made it available to the User. Rather as soon as it realizes User requested for the file it just goes and read it from the location where it was stored physically in the tablespace and gives it to the user. Obviously it will do smart things like caching the physical location, security point view whether the current user has access to it etc.

Actually I am working on a project migrating our sales force application running on Oracle 10g R2/Apache web server/Apex 2.0 environment to Oracle 11g/XDB-Webdav/Apex 3.1 environment. Our application is distributed across 350 laptops out in the field and synchronizing with homeoffice database every day. We are successfully running it for the past 3 years. We had numerous issues while copying resource files (like images, css, pdf etc) to local machine from a centralized windows share. Especially if the User is connected to home office through a telephone line VPN connection most of the time the file copy process was failed due to instable connection and no way we will know whether the process succeeded or not. Also windows does numerous security checks before it allows the User to access it and was slow. Hence we decided to give it a shot on Oracle 11g-XDB-webdav as a replacement for Apache, keep everything in the database and get away with windows file share/copy.

Our original file copy process (We have around 1100 files, most of them are Apex images of size 1-50 kbs, some pdfs, xls etc total size 100 MB) from windows share to local machine took about 3 minutes. That too laptops connected through local network(LAN).

Then I actually stored those files as a BLOB in a Physical table in a centralized database and used database links to read the BLOB and copy the same to Oracle webdav. Believe it or not it took less than a minute to copy everything. It is not just that, it is more reliable and I am 200% sure the process went through fine if my commit succeeds.

Also you can make the Oracle file system look like a regular windows explorer file system using webdav (Copy entire folder between webdav and windows/ paste / delete will work like regular windows explorer copy paste). Moreover FTP can also be used to access the Oracle file system.

Till now I talked about the advantage. The only disadvantage that I am seeing right now is file edit. I don¿t have any tools that allow me work directly (open/edit/save) on the webdav files. Might be some tools available in the web. Yet to google it for the same.

PS 1: I had lot issues while copying the BLOB to webdav because of new security features (ACL) introduced in Oracle 11g. It took good amount of time to understand and make it work. I did everything by myself and none of the forums talks about. I will share the same in detail when I get some time.

PS 2:We are considering further process improvement by compressing the files using utl_compress.lz_compress while storing it in the centralized database and uncompressing the same in the local notebook after downloading it from the centralized DB Server.

5 stars filesystem   July 16, 2008 - 5pm Central time zone
Reviewer: A reader 
Tom:

1. How does mounting a filesystem to N application servers will be not good compared to storing 
files in oracle. Would they mount one filesystem on app server to all other app servers?

2. I like your ideas about storing files in DB. In order for me to sell that I need to get more 
educated on the benefits which you list breifly. However, is there a book or article that goes into 
more depth on the benefits.

3.  I have a system now where a vendor uploads a bunch of electronic files for a book (3 giga 
bytes) to a remote server A. Then a bunch of perl prgoram run some tests and if they pass it moves 
the files across network from Server A to server B for production control. there they do some QA 
checks and then move files to server C for archival and server D for website.

Each book is about 10 files (mp3 and xml) but there will be thousands of books on the filesystem in 
directories and subdirecotries.

a} How would you design this in oracle? the way i am thinking i can have one database or three 
databases (instead of servers) on one machine and let the vendor upload files to one table. Then I 
can flag the book in that table or copy the record to another table for production control, etc.

So instead of having several servers/filesystems, i have one database/tables. Staff can easily 
query info or access files on any book and run them.

b} Would I run into size problem here if the table became 100 tera bytes or whatever. would it 
create a 100% dependency on full time DBA (one of the issues)?

c) how can some external progam like perl run tests on files stored inside oracle?


4 stars   July 17, 2008 - 2am Central time zone
Reviewer: A reader 
If it is not recommanded now to use CLOB and BLOB to store data in a talbe and only use Oracle XML 
DB Repository Data for better performance.

If we only need to upload only the scan data and dont need to update this then we use CLOB and BLOB.


Followup   July 17, 2008 - 11am Central time zone:

where did you see that recommendation.

as with anything, the only reasonable answer is "it depends, what do you want to do, what features/functions do you need - then we can tell you the best way to do it"
3 stars Differences in XDB's handling of LOB and XML data   July 18, 2008 - 8am Central time zone
Reviewer: Max 
As far as I know one can't have LOB data "sent" to XDB *AND* get it stored in *user-defined* tables 
by XDB (without any additional coding) -- as opposed to what can be achieved with XDB's processing 
of XML data (by registering XML schemas to define mappings of XML elements to user-defined tables).

Or is there any to "move" LOB data to LOB columns of user-defined tables instead of XDB tables like 
that ...?


5 stars LOBs and performance   July 29, 2008 - 1pm Central time zone
Reviewer: Stew Ashton from Paris, France

Tom, I accept (and repeat in my company) all your arguments for storing files in the database; however, developers and software companies generally object that File System access is much faster than access to LOBs in the database.

Oracle has submitted proof that this is true for versions prior to 11G:
http://www.oracle.com/technology/products/database/securefiles/pdf/securefiles_oow2007.pdf

On slide 10 they show that using SecureFiles, LOB access is about as fast as File System access, while on slide 11 they show that it is about 3 times faster than pre-11G LOB access. It is easy to deduce from this that pre-11G LOB access is about 3 times slower than File System access.

I gather from this that for 11G storing files in the database has huge advantages and no drawbacks, whereas prior to 11G it has great advantages and one drawback.

By the way, I wonder how many files per directory they had in their File System test ??

Followup   August 1, 2008 - 9am Central time zone:

and i wonder if they enabled any sort of caching for the lobs myself. OS file system writes/reads - buffered. LOB writes and reads, by default, not buffered.

buffer it and you'd see a huge difference
4 stars buffered?   August 9, 2008 - 11pm Central time zone
Reviewer: ap from Los Angeles
Tom

Could you elaborate on your last comment?

"And i wonder if they enabled any sort of caching for the lobs myself. OS file system writes/reads 
- buffered. LOB writes and reads, by default, not buffered. Buffer it and you'd see a huge 
difference"

How would the results from the Oracle performance paper change?


Followup   August 12, 2008 - 4am Central time zone:

if you write a lob, right now - by default, that is not done into the buffer cache, it is done as an IO to disk that must complete before the application gets control back.

if you go to read that same lob, right now - by default, that is a physical IO to disk - no buffer cache.

The file system typically buffers
Lobs by default - do not


5 stars ap   August 12, 2008 - 11am Central time zone
Reviewer: ap from Los Angeles
Tom

If I understand you correctly, are you saying that the actual performance of SecureFiles compared 
to a filesystem is even better than what is mentioned in the Oracle paper, given that their results 
were based on a buffered filesystem versus unbuffered Oracle?

Also, given that they used a SAN in their test, I would assume it was setup with a write-back cache 
(which is the default). In this case, both types of tests would be sort of buffered, right?


5 stars filesystem   August 14, 2008 - 11pm Central time zone
Reviewer: A reader 
Tom:

IT seems there are two camps on how to store files

http://en.wikibooks.org/wiki/Programming:WebObjects/Web_Applications/Development/Database_vs_Filesys
tem

The biggest thing about using filesystems is:

a.  faster performance
b.  cheaper disk space (than hiring dba)

Are you saying i can store 100,000 files in one table or one million and my performance will never 
change?

What about if you build this nested directory structure and put those 100,000 files. would that 
keep filesystem yield good performance?


Followup   August 20, 2008 - 8am Central time zone:

Think about this...

cheaper is better right? is it? what happens when you have the data not being managed and something bad happens. How do you secure, manage, backup, recover, protect this data in your file systems?


I'm saying you can store billions of documents in a single table and I would expect the access time to be approximately the same to get to a single document as I would if there were one row in that table.

With a table, you have the ability to scale up like that.

Now with the file system you have to write code, lots and lots of code to spread the files out - try to remember where you put them - adding directories left and right as the documents arrive. And you have no point in time recovery, you have no rollback, you have no commit, you have no real ability to secure the files, you have no audit trail really to tell you what has been done to them, etc etc etc.


Databases were 'born' to protect data - that is what they do, they are called database Management systems - they were born to manage data.


If you value these documents, if you want them around for a long long time, you'll protect them.
3 stars no such thing as a perfect solution   August 25, 2008 - 4pm Central time zone
Reviewer: Ben 
everyone will have an opinion on this & on Oracle forums I would expect them to lean one way & 
other vendor (non-DB) I'd expect them to lean the other way.  Storing files in the data does have 
all of the advantages of storing any other data in a database.

That said, I can sum up why not to in two words : other vendors.  Archiving systems, image 
manipulation, etc, etc, etc. are all are written to 1st work with a file system & maybe 2nd to work 
with a DB.

Yes, webdav is good, yes it can give a user the experience of a file system.  User experience is 
_not_ system experience & sometimes it just won't work & the only way to find out for certain is to 
try it.  

If you can make your complete system work with storing the files in the DB, you're better off in 
the long run.  But be warned that it could influence future software choices.

"If it is in the database I can"
be sure it is professionally managed" : ??? are you slightly systems admins saying DBAs are better?
"backed up" : been doing this with file systems for decades before databases existed.  Pretty sure 
it can be done still
"recoverable (with the rest of the data)" : see above
"secured" : again, see above
"scalable" : you need a caveat.  I frequently here this from DB people. I must chime in to say this 
is in fact absolutly FALSE.  It is not the file system that gets slower, but rather the interface 
that bogs down.  For users this is a file list from a shell, command prompt, Windows Explorer, or 
whatever interface they are using.  Modern file systems operate at the same speed if 10 files, 
100,000 files or a million.  It's all about how you interact with it.  
"undelete (flasback) easily)" : 1) not all systems can allow delete to exist.  2) again, its' all 
about how you interact with it, but generally yes, the db recovers faster.
"locking" : 
"read consistency" :

We store millions of images in the files system, with a max of 5,000 per directory and store file 
path refrences in our database.  The only reason we keep it to 5,000 per folder is so that with the 
average size of image we can archive the directory to an optical archive system.  We use a 3rd 
party rendering server to offer up any file format to a web client, but it must use a file system.  
We have successfully used this setup with over 10,000 files in a folder (average size 200KB) and 
actually get FASTER times that the 3KB text files we store in our database (10gR1).  The DB 
response isn't slow, it's just the file system is faster.  If we were to put 100,000 files in the 
same directory, the response time wouldn't change for the system (from the user's perspective).  
Due to regulations, no file ever imaged is allowed to be deleted.  This is easily enforced with 
file permissions, confirmed with 3rd party change managment software, logged & audited by other 3rd 
party managment software.  Yes, I know you can log & audit et. all in a db, but the key here is the 
3rd party change management system is enterprise wide, monitoring shares, ldap/ActiveDirecoty 
servers, system configs, yada, yada.  If it was stored in the DB, that would be the limit of the 
auditing.  Here we can store the file once, use it in many systems, & audit it with the same tool 
we use to audit non-database information.




Followup   August 26, 2008 - 9pm Central time zone:

... are you slightly systems admins
saying DBAs are better?
...

yes, 100%. DBA's manage data, they care about data, they understand the need to perform a consistent point in time recovery, they know about auditing, version control and many other things that frankly - SA's don't think about. This is about data, not about getting eth0 plumbed correctly.

... been doing this with file systems for decades before databases
existed. Pretty sure it can be done still
....

see above, there is something more than just backing up in data management - especially when it comes to "files". Non-repudiation, auditing, securing, version control, yadda, yadda yadda - you know, the features you "buy" in those version control things that are really trying to be - get this - a database on top of the file system! Yes, that is what they actually do, they are * a database*....



... It is not the file system
that gets slower, but rather the interface that bogs down....

hhhmmmmmmm

interesting

so you have put 1,000,000 files in a directory?

... with a max of 5,000 per
directory and store file path refrences in our database. ...

oh, no, you don't. You know it is a "bad idea"



It is true, YOU can write a lot of software to do a lot of things.

Or, you can use what you paid for already.

You are correct, and it is entirely up to you. Me, I prefer to .... not write code, as often as possible. I am a bad coder (we all are), I prefer to use what I can that has come before and is used heavily by others.
5 stars database   August 26, 2008 - 10pm Central time zone
Reviewer: A reader 
Tom:

if you call yourself a "bad" coder i do not know who is good in this world.

Is there a way when you use a system to tell whether a file is stored in the DB or filesystem.

I noticed that my bank stores all the previous monthly statements (PDF) in the web app. I am just 
wondering if it is in DB.

Do you know if facebook or youtube stores all those files in the DB.

I think majority of applications out there store files on filesystem due to lack of API and 
slowiness in LOB access. 

However, i think the trend is starting to reverse and many enterprises are storing unstructured 
data into DB. But this all depends on what database they use. I do not think if you use Sybase, 
informix, my sql, DB2, sql server you would want to store files in the DB.

I think Oracle is "years" ahead of all those players in this area.


Followup   August 27, 2008 - 8am Central time zone:

we are all bad coders, we get lucky sometimes.

... Is there a way when you use a system to tell whether a file is stored in the DB
or filesystem.
...

depends on how it was setup, you can make it so that you cannot tell whether the file is in a file system or the database. I can mount my "file share" as a network disk, ftp in and out of it, or use a web interface to down/upload a file - among many other access methods. And all of the files are in.... the database.


... I noticed that my bank stores all the previous monthly statements (PDF) in the
web app. I am just wondering if it is in DB.
...

probably not - and that is probably OK. Many times the pdf's are generated on the fly (you know, there are no pages stored for asktom, the page you see is generated every time you access it). And even if not, they are generated and can be easily regenerated from the source data that IS in the database - then they expire in 12-18 months.


3 stars try this   August 27, 2008 - 10am Central time zone
Reviewer: Ben 
Tom,
 Databases have their place in the enterprise, but they are not the enterprise as you seem to 
suggest.

Scaling : File system vs Database
Please try this experiment and determine for yourself if a file system does not scale
Create a script/application that creates 100,000 files named fileXXX.txt in a single folder (any 
OS)
Store all of these files also in a database, by your prefered method.
do a direcotory list via filesystem
do a full table scan in the database to get a list of all of the filenames
How do the times relate?  I'll bet the DB is faster, but this will come down to setup.
Now : open a file from the database & open a file from the file system.  Does the one from the file 
system open any slower than when you are storing only one file?

now create a database, turn on archiving & flashback (because we're good little DBA's) and load 
16TB of high resolution medical images into the database.  How much storage do you need?  How long 
does it take?  Back up the database any way you like.  Put it in the file system on a SAN LUN & use 
a mirror tech to copy.  Which backup is faster?  Thought so.

You can purchase a lot more software already battle tested than you can write & it all works with 
files stored in a file system.  Also, you can scale up file system capacity far easier than you can 
a DB (NAS, SAN, Nfs, etc)

FYI - System admins are not the high school MS certified lackies sent to worry about "eth0".  
That's helpdesk.  A System Admin has in depth knowlege of all applications that run on a system & 
this MUST include any Oracle application (RDBMS, OAS, anything) on the host.  A DBA is just a 
limited subset of a System Admin.

I am a bad coder (we all are) : no, we are not all bad coders.  When you go to a hospital & they 
hook up all kinds of diagnostic equipment, do you think the software on that equipment has tons of 
bugs?  There is a BIG difference between a Software Engineer & people who code.


Followup   August 28, 2008 - 9am Central time zone:

... but they are not the enterprise
as you seem to suggest.
...

interesting, I agree they are not "*THE*" enterprise, just one of the biggest components of it. Erase my applications and I'm slowed down. Erase my data and I'm *out of business*.


so, why did you limit your implementation to 5,000. Why do that?


Your backups are not even remotely equivalent to our backups and you know what? I'd be using partitioning and disk based backup and backing up a fraction of the database - so, mine would be much faster - most of the image data would over time become read only, we'd ignore it entirely. And with block change tracking and the ability to catch our backups up - using incrementals only - I'd always have a full database backup ready to recover with and one that only took a short period of time to create.


we are all bad coders, some of us have to pass more hurdles than others - and yes that diagnostic stuff has lots of hurdles. I'll bet your imaging storage and retrieval system did not (there are many cases of machines over dosing patients and doing 'bad' things because of software glitches. Heck there is even a virus on the space station right now - that software was written by some engineers, not just coders)


I prefer, whenever possible, to not reinvent the wheel - if the product I already purchase has the feature set I need, I'll use it. Backup, recovery, flashing back, partitioning, scaling to billions of rows in a single table (directory), etc... I'm there.
5 stars filesystem   August 27, 2008 - 12pm Central time zone
Reviewer: A reader 
Ben:

Most system admins i have seen do not know much about the applications running on the machine.
All they know is Linux, Unix, or windows stuff. If there are some, they are very few.

Regarding loading 16 TB into DB. What difference would it make from a storage prespective. 
You still need 16 TB on your filesystem to load it.

Concerning backup i am not expert. But, do not you need to backup the 16TB from filesystem. 
Is it much faster to back up 16T bytes from filesystem than backing up the DB?

I think the question is "which database" you are testing with? Are you using the 11g LOB or new 
SECureFiles data type.


5 stars "Software Engineering"   August 27, 2008 - 4pm Central time zone
Reviewer: S from Canada
You were saying Ben?

http://en.wikipedia.org/wiki/Therac-25
http://www.wired.com/software/coolapps/news/2005/11/69355


3 stars code   August 27, 2008 - 5pm Central time zone
Reviewer: rc from The Netherlands
>>I am a bad coder (we all are), I prefer to use what I can that has come before and is used 
heavily by others. 

That is why I use Windows instead of Linux. Windows is more heavily used. 


Followup   August 29, 2008 - 1pm Central time zone:

ahh, but windows as a server platform is not necessarily more heavily used.

And the comparison would be between:

a) an OS you write
b) versus and off the shelf OS

I know which one I would almost certainly choose, and it would not be a) in the year 2008.
1 stars   August 28, 2008 - 7am Central time zone
Reviewer: Oleksandr Alesinskyy from Germany
Really, for most cases storage of the files in the databases is not very wise solution. Declared 
advantages are mostly imaginable. Backup? There are tons of products that perform backup of the 
filesystem (and often database backups arewritten to external devices by means of those filesystem 
backup products).

Flashback? Version control systems are viable (and for many cases preferrable) alternative.

Presentation of datain the database as files to client? Yes, attempts have been made, but not very 
succesful one. What is with OFS now? Is it dead, yeh?

Performance? For most cases write/update performance would be order of magnitude higher for FS then 
for DB. And BTW most OS supports several FS, so optimal FS may selected for a task. And migration 
from one FS to another one is much simplier then one DB to another. 

And so on, and so on. DB is optimal to store and process data with a structure known to DB, when 
you can leverage the full power of SQL. But for "black-box" data ... things are different.


Followup   August 29, 2008 - 4pm Central time zone:

"mostly imaginable" huh. interesting.

The problem isn't "could it be done in a file system", sure with lots of work - it absolutely could be.

Backing up a database - large databases - well know, done all of the time, bread and butter for most DBA's, it is what they do. Data management - it is what they do.

Enter the file system and - well, it is really not being managed by the same mind set, the same group of people.

Flashback is for short term "whoops" operations - version control systems do not necessarily lend themselves for that - version control would still be built into the file itself. OFS exists, superceded, but exists - same underpinning - different upper layer infrastructure - to say "it is gone or not successful" would be somewhat misleading.


I prefer the security, the backup, the recoverability, the concurrency, the openess of the database - the database makes it look like a file system - if you start getting into version control systems - you are not talking a filesystem anymore.


for ANY data that is relevant to you, the database is the right place to store, organize, index, retrieve, search, version, backup, recover - whatever.
4 stars filesystem   August 28, 2008 - 8am Central time zone
Reviewer: A reader 
Are not XML documents text files? You can use xquery just like sql with xml documents in DB. Can 
you do that if xml file is on filesystem.

what about security? can you secure your filesystem like a DB.

You are also counting that your file pointers in the DB will always be in sync with the filesystem 
because if they are not you will have some issues that wont occur if the file is in a ROW/TABLE.


Followup   August 29, 2008 - 10pm Central time zone:

besides - people do seem to enjoy setting up many application servers - now you have to create a distributed file system - available to all servers and mediate shared access and ....
3 stars RE   August 28, 2008 - 8am Central time zone
Reviewer: Ben from USA
A Reader : 
This topic is a little off the initial subject, so I'd understand it being bumped.

  The fastest backup you can get will be a SAN to SAN FC mirror copy, which duplicates the entire 
disk while it is live.  The key is recovery.

If it is hosting database files, you now have a copy of one or more datafiles, but not neccessary 
in the form of a transportable tablespace unless you've done some other work.  Which process do you 
use to recover?
There are pleny of excellent forums/Metalink articles that go over what to do with datafiles for 
recovery if you don't have a TT, but suffice to say you need a better than average DBA.

If it is hosting native file system files, how do you recover?
Here it is easy.  Just dynaically add the mirror copy to any SAN attached host, recover the file(s) 
you need & you're done.

An alternate that is becoming very popular is never backing it up (collective gasp here)  These 
systems use Content Addressed Storage and multi-node duplication/sync to ensure that as every 
document is written, it is written to multiple disks & synced off site.  If a node goes down, just 
put a new node "off the shelf" & configure it to replicate from your off site node(s)  Properly 
configured, you can lose entire sites without any loss of data & rebuild in minutes (& then it 
resync automatically)  EMC's DiskExtender/Centera system is an example.

The best storage solution depends entirely on the knowledge of your staff & the amount of money 
you've got.

As to the systems admins who know only the OS, please, please, please call them somthing else (I 
call them Bobs - think Office Space)  The word system implicitly defines an entire process, which 
should NOT refer to an isolated host, but rather all physical hosts, their OS's, and ALL software 
in the stack. 

The major point I keep trying to make is that under most circumstances I agree with Tom & you'd be 
better off storing files in the db, but unlike Tom I can see plenty of situations where storing in 
the database would be a poor choice.  Which fits can only be answered by taking the entire system 
into account.

S:  What is the % fail on medical equipment vs business code?  I'll bet medical equipment software 
is of higher quality.

rc: there's more quality code for POSIX OS's than for Windows, so I must disagree.  The choice of 
OS should also be situational.

Oleksandr Alesinskyy : Flashback is a recovery tool, not a versioning tool.  Where do version 
control tools store data?  Many use a database.  I'ved used some that run on an Oracle DB that 
happens to have Flashback enabled.
Presentation of the data is only part of a justification for how to store.  How searchable is the 
data?  Legal Discovery laws are being passed in many US states & other countries.  Sometimes the 
lawmakers are clueless about anything electronic (such as the new "anti-hacker" law in Germany 
effectivily killing the security research industry there)  Know your laws!!!
Performance should be taken int account, but it should be from a user acceptance level, not just 
the fastest read.  Does everyone need a formula one race car to get to work?


4 stars some advantages with filesystems   August 29, 2008 - 5am Central time zone
Reviewer: Peter Köhler / Awaro GmbH from Germany
Hi Tom,

some problems are better solved with filesystems:

1.: If You download files no database call has to be made. With big files and clients having a 
really slow or unreliable internet connection (we have clients in Africa and Middle East!) this 
means that no expensive database connection is occupied over many minutes.

2.: Architecturally it seems strange to me that the view or the controller should deal with 
database connections. They belong to the service layer.

Just my two cents.

Greetings from Frankfurt, Germany

Peter Köhler


Followup   August 29, 2008 - 11pm Central time zone:

1) hmm, wonder how all of our files in Oracle are available. You make it sound like the 10's of terabytes of data we have stored in our single network file server would be almost always unavailable?

And we have developers / sales people / support / whatever in almost every country, including the two geographies you mention.

We must be totally hosed, we never get our files....

2) eh? you just lost me entirely, the first point - you lost most mostly, the second one - entirely. Who said anything about a database connection? We can ftp, smb, nfs, webdav, whatever. Under the covers, way under the covers, there be a database.
5 stars MVC   August 29, 2008 - 5am Central time zone
Reviewer: rc from The Netherlands
@Peter Köhler 

>> "2.: Architecturally it seems strange to me that the view or the controller should deal with 
database connections. They belong to the service layer."

You mean the model view-controller-design-pattern? Nobody in this thread has said that the view or 
the controller should connect to the database (correct me if I'm wrong). So I don't understand what 
you want to say. 

In MVC the model should connect to a service and this service connects to the database. 




4 stars @rc   August 29, 2008 - 7am Central time zone
Reviewer: Peter Köhler / Awaro GmbH from Germany
Sorry, i was not clear enough. 
What I meant was that if you download a file which is in the database than you have to deal with 
the http request which sould only be present in the view or in the controller layer of a web 
application AND you need to deal with a database connection (for the file).
In Java this is often hidden by some strange proxy objects, but to to me this has a bad taste.

Greetings from Frankfurt / Germany

Peter Köhler



Followup   August 29, 2008 - 11pm Central time zone:

not correct.
5 stars layers   August 29, 2008 - 9am Central time zone
Reviewer: rc from The Netherlands
I think there should be one layer for data access. Data access can be accessing a database, files 
on the file system or calling a web service. Data is data, the other layers should not know the 
origin of the data. 

If you change your mind and want to store the files in the database instead of the file system you 
should only have to change one layer in your code. 

Maybe that is what you mean? 


Followup   August 30, 2008 - 9am Central time zone:

well, if you change the data layer, you are changing TWO layers in your code - no?


and if you decided for whatever reason to put the files in the file system, you can still access them from the database without really changing the "data access" layer.


Regardless of where you put your "data access code", if you remove or replace the entire data layer (go from database to file system, from file system to database) you are going to reimplement that *data access layer* - regardless.

I prefer to put my data access layer in stored procedures because DATA goes into a DATA BASE.
3 stars RE   August 29, 2008 - 10am Central time zone
Reviewer: Ben 
A reader,
    "Regarding loading 16 TB into DB. What difference would it make from a storage prespective. You 
still need 16 TB on your filesystem to load it."
    
    You actually need quite a bit more.  With archiving enabled, the archives will grow at a rate 
determined by the inflow of data.  With Flashback, whatever blocks get updated will get logged into 
your flashback location.  With RMAN to disk, a good backup will backup both the datafiles & 
archives.  So a "worst case" scenario would be in a system where there is a high insert & high 
update of these files with routine backups.  In this example it would be 16TB of datafiles, 16TB of 
archives, ??TB of Flashback, 16TB+16TB for Rman.
    
Tom,
    Sorry I didn't respond to you in my last post.  The 5000 file limit was a purely arbitrary 
decision made by a senior exec (non-IT) who had read an article about how with a fancy new CD write 
drive you could take a Write Once Read Many approach.  So we bought us a fancy new 2x CDR drive & a 
CDrom tower.  Since then the CD archive has been abandonded (1st DVD, now CAS based archiving with 
auto off site syncing), but there has been no business reason to change this arbitrary limit (why 
change existing code...)
    FS backups are capable of doing _everything_ you describe for DB backups.  Most times an FS 
backup is faster than a DB backup & use a simple enough interface you can hand the task off to a 
Bob.  Don't get me wrong, Rman came a loooong way in 10g, but it's still not user friendly enough 
I'd trust an operator with it.
    I believe the virus on the laptops on the spacestation (not on the space station systems 
themselves) was the result of some Bob who should have his contract revoked for incompetence.
    
    Even Oracle recognizes that at times a FS storage approach is a better fit (note fit, not 
solution)  In OAS, the files for J2EE & Forms apps are stored -> in the FS, not DB.  You could 
argue this is because these could be deployed without an infrastructure DB, but that argument fails 
for the time when someone does deploy an infrastructure.  There is NO option for storing 
application files in the DB (except of course for PSP pages & ApEx...)
    
"interesting, I agree they are not "*THE*" enterprise, just one of the biggest components of it. 
Erase my applications and I'm slowed down. Erase my data and I'm *out of business*."
actually, most enterprises can exist without their data for a surprising amount of time (hours to 
days).  They can't go minutes without the communication infrastructure (email, im, phone, etc).  
The db has information critical to business.  Communication is critical to people.  Customers don't 
work with businesses, they work with people.  

I believe in the theory that a chain is only as strong as the weakest link.  Everything a user 
needs to perform their duties is part of this chain.  Yes the DB is critical, but so is the 
application, the network, & the user's workstation, etc.  Data is USELESS without the tools to 
interpret it.

RE : Peter Köhler / Awaro GmbH 
There are setups where the user's end doesn't do any DB connections & just sees the files as any 
other mapped drive/share.  As far as network traffic goes it's almost identical.  However, there is 
a performance hit.  Most systems can easily accomdate the speed drop from storing a file in the DB, 
but this is on a per system requirement decision.


Followup   August 30, 2008 - 9am Central time zone:

... You actually need quite a bit more. With archiving enabled, the archives
will grow at a rate determined by the inflow of data. ...

excellent, so we can actually do a point in time recovery, this is what databases are all about.

and actually, if you didn't have that need, you can do these things non-logged, caveat emptor - penny wise - pound foolish as they say.


... With Flashback, whatever
blocks get updated will get logged into your flashback location.  ....

you are confusing the flash recovery area for FLASHBACK DATABASE with flashback query which uses undo (or the pctversioned data). Don't confuse the two, you do NOT NEED a flashrecovery area for flashback query, flashback table, flashback versions query. Only for flashback database and that isn't what we were talking about - we were talking about flashback query to quickly undo a 'whoops' operation.

and truth be told, a flash recovery area isn't a bad idea if you have serious uptime requirements. And, well, many a file system these days do things very very very similar.




... Most times an FS backup is faster than a DB backup & use a simple enough
interface you can hand the task off to a Bob. ...


anything like that you hand off to a "bob" as you call it is something that'll fail. I would never in a million years consider handing off the single most important thing ever to a person that is not competent. That would be a CLM (career limiting move).

and database backups take 0 seconds. They are totally a background thing, don't get hung up on the time (but if you do, let's race - let me set up a proper database with block change tracking and a flashrecovery area and I'll use incrementals to catch my backup up)....

... There is NO option for storing application
files in the DB (except of course for PSP pages & ApEx...)
...


I do not believe anyone proposed storing application (exe's, zip's, war's, whatever) in the database. Applications come and go - they are source code controlled somewhere else (one hopes). We have been talking about the core data you need for your business.

Lose an application - slow me down.
Lose my data - SHUT ME DOWN (out of business).


... actually, most enterprises can exist without their data for a surprising amount
of time (hours to days). ...

which ones? I'm pretty sure I wouldn't want my doctor to diagnose me without having access to my records. I'm pretty sure Oracle would be hard pressed to operate without it's support/tar database. What industry could suffer the loss of data for hours or days without having something bad happen? Remember - if the already stored data has gone unavailable the ability to CREATE NEW data is gone too pretty much. You cannot work with existing customers, you cannot really create new customers (because the data needed to do so is - well - not there)

The only thing email (hey, wait, that is actually data isn't it - what if the datastore is the IMAP repository that you decided to store in a file system....) so well, actually email is gone.

The only thing IM would do is let us all sit around and say "hey, remember the days before computers, we were never down like this, we had that paper stuff"

The phone would do the same thing, assuming the phones work (eg: our phones are sort of data driven these days, if we suffered a major data loss, we could well lose our phones).

Communication - without anything to talk about - would just be a big "complain session". "hey when will the systems be up".




5 stars   August 30, 2008 - 4am Central time zone
Reviewer: LJ 
I've been using the Oracle database as a content repository for over 2 years now, including the 10g 
and now 11g versions. I've also been a Documentum Administrator for 7 years - and after comparing 
both experiences, I would never go back to using a file system as a content repository if given the 
choice.  Even if it meant going back to a 10g database.  

I'll admit not having the technical ability to do a "SAN to SAN FC mirror copy" without expensive 
software - but I also found that I don't *have* to do a SAN to SAN FC mirror copy because I use 
RMAN, a block change tracking file, and applied incremental Image Copies to do backups on RAID 10 
disks in separate enclosures from the Live storage.  This is by design, not accident.
In addition to not having to do SAN to SAN FC mirror copying, I also don't have to worry about 
securing a separate content server(s) in addition to the database, I don't have to worry about 
losing transactional control, or integrity, or reliability, or scalability.  I don't have to worry 
about whether or not I'll be able to restore content files and metadata either complete or 
incompletely to the exact point in time I need. 
I can use caching on the application server to make up for some (maybe not all) of the performance 
loss from not using a file system.  But performance isn't everything.  Performance sucks when your 
site is down for 3 days trying to match up separate metadata and content.   

And to reiterate one of Tom's mantra's ... there is no "one size fits all" answer for anything.  
There is no perfect solution for everyone.  The right decision will completely depend on the other 
components in your environment.  For Ben who commented above, he's obviously a talented Sys. Admin. 
and can do advanced maneuvers like SAN to SAN FC mirroring as a backup.  For his projects, perhaps 
this is the best way of doing things.  For me, I'd screw it up in under 10 seconds since I don't 
have those specific skills Sys. Admin. But since I have to manage the storage, backups and database 
myself, it's OK because I have ASM and SecureFiles and Paritioning and RMAN to accomplish the same 
goal, just in a different way.



1 stars data access layer   August 30, 2008 - 11am Central time zone
Reviewer: rc from The Netherlands
The data access layer is often simply the layer that calls the stored procs. But you can also call 
a web service to retrieve data in the data access layer. And I think it is a good place to access 
files on your file system too. 

I don't understand what Tom and Peter Köhler want to say.

Anyway, the view and the controller should not access the file system or a database or a web 
service. Those two layers should not have to know the exact origin of the data.  


Followup   August 31, 2008 - 9am Central time zone:

and you can have a web service that calls a stored procedure.


If you store data in the database, files even, you can

a) pretend it is a file system and access it that way
b) use web services to access it
c) access the information in SQL
d) etc etc etc etc


meaning - put it in a database and you can access this same information from multiple environments in an open fashion.

And your "view" or "controller" layers need not be bothered with the details.
4 stars Performance Difference in Files vs CLOB   August 30, 2008 - 12pm Central time zone
Reviewer: Amin Adatia from Ottawa, ON Canada
I would like to store my data in the database. However, we are finding that the read performance 
between the CLOB and OS_File is about 3 times in favour of the File. Is there something we can do 
in our code which is using the Java streams function. Are we stuck with the slow performance 
because of Java?


Followup   August 31, 2008 - 9am Central time zone:

tell us about your test and your needs here.

Are you needing to serve these files up over the "internet" for example, does your test including a web brower downloading the files, do you have to use java as the language in the middle (eg: mod_plsql can get files out of the database pretty easily - one line of code actually)

also how did you create your clob, do you read and re-read the clob over and over, have you enabled caching of the clob (read only caching or read/write caching - you decide)
1 stars data access layer   August 31, 2008 - 11am Central time zone
Reviewer: rc from The Netherlands
???


Followup   September 1, 2008 - 12pm Central time zone:

eh? did you have a question or comment?


3 stars DB or FS   September 2, 2008 - 10am Central time zone
Reviewer: Ben 
I don't want to get pulled into a "anything you can do, I can do better" debate.  For every 
situation that I can create to justify moving files into a database, I can create another scenario 
to justify using the file system.  

Below are some of the questions that I would typically ask to help determine which is a better fit 
(in no particular order):

What is the expected growth rate of data storage?  Remember that unstructured data tends to grow 
much faster than structured data.
What is the source of the files : user (such as word processing/spreadsheet)? (this is another way 
to ask about growth rate.  I've seen managers claim slow expected growth rates, but in reality when 
users are unleashed on the system...)
Does the DB server have sufficient capacity to allow for at least 3 years of this growth rate?
Assuming Archiving is turned on in the DB : Does the server have sufficient capacity in it's 
archive destination? (apply same for Flashback DB)
How does the expanded DB size fit into your backup systems and procedures?  
Can your Disaster Recovery plan accomodate the increased DB size?
Do any other vendor products have to interact with these files?  If so, can these products work 
with them stored in the DB?
What is the budget?
Is performance on file read critical?
What technologies are already in place & paid for?
What are the skillsets for your current admins?
What are the skillsets for your users?

If Tom's mantra is there is no "one size fits all"? Then why is he advocating that the database is 
ALWAYS the better fit?  I am simply advocating that there are plenty of situations where FS or a 
blended storage solution is a better fit.


Followup   September 2, 2008 - 12pm Central time zone:

I'll say it as simply as I can:

if this data is valuable to your business, if this data is necessary for your business, if the loss of this data would damage your ability to do business - it has no business NOT being in the database.

The only place I know data can be secured, properly backed up and recovered, safe - is a database.

A file is just data.


If the files are static images for your website, if the files are generated reports - reports that can easily be regenerated - put them where ever you want.

If the files contain information crucial to your business, there is only one place for them - a database.
3 stars respond   September 2, 2008 - 12pm Central time zone
Reviewer: Ben 
And I'll respond as simply as I can

"The only place I know data can be secured, properly backed up and recovered, safe - is a 
database."

Whereas I can not attest to your knowledge, the idea that an FS is limited compared to a DB for 
security, backup, recover, etc is simply not true.  

Name one way in which data can be secured in the database that has no equiv in the FS.

Name one way that a DB can't be backed up that can't be done in an FS

Same for recovery


If these are your predominant arguments for the DB as the sole file repository, then I respectfully 
suggest that you're suffering from multiple logical facilcies, predominatly the Appeal to Belief 
http://www.nizkor.org/features/fallacies/appeal-to-belief.html


Followup   September 2, 2008 - 1pm Central time zone:

backup/recovery:

do a consistent point in time recovery of your structured (relational) and unstructured data (files) - I need it to be restored to 11:59:23am last Tuesday.

And it obviously needs to be consistent. The meta data that describes where and how the files are needs to be consistent with the files themselves (using just a file system).


securing:

I would like an exhaustive list of everyone that has accessed or modified the file for the last two years - and for compliance reasons, I need all of the intermediate versions of the document (using just a file system).




And please bear in mind, database administrators - they do backup, recovery, securing as part of their day to day job. It is expected of them. System admins, frankly, they are *not* data oriented people - it is just not what they do. You want good, safe, secured data - you'll put it in a database.

You want files to go accidentally missing, not noticed they are missing, go ahead, put them in to the file system.

Sorry, but every time, every single time, I see people store things outside of the database - it eventually goes missing, parts of it at least. Or the need to consistently restore a file system with a database so the related data bits and bytes all "align" - it just isn't there.

And too many times - people do not even think about this stuff when they "design" their system.

For the same reason I hate

o triggers (do more harm than good in real life)
o autonomous transactions
o when others

I will always and adamantly say "put it in the database"

Could you build something in the file system yourself to achieve most/many/all of your goals? Probably.

That isn't the point, the point is - most implementations don't; don't do it right, don't think it through - and they don't realize it until it is too late.


This is not an appeal to belief, this is "I've seen it go horribly wrong so many times, I've wasted too many weekends and evenings trying to fix these kinds of mishaps"


1 stars response   September 2, 2008 - 2pm Central time zone
Reviewer: Ben 
"Could you build something in the file system yourself to achieve most/many/all of your goals? 
Probably."

already done, off the shelf products exist to do this.  yes, including point in time recovery.  No, 
it's not native to the OS, but that means companies COMPETE to create products & the end result is 
better than if one vender did it & said "this is the best"

Does Oracle do it well in the DB? Yes, but it has it's own limitations.

FYI - I seen "DBAs" drop data left and right, have incorrect update scripts updating millions of 
rows with bad data, etc.  This is human error, not system limitations.


Followup   September 2, 2008 - 2pm Central time zone:

Ok, say I install this "off the shelf product" - and I as an admin go around the product and just access files. Can you tell? What if I modify them - can you tell? What if I take a whole bunch home with me - can you tell?




Data loss, when data is managed in a database, is far less than the data loss when data isn't stored in a database. When you put it into a database - people tend to treat it a little more seriously. And that million row botched update - we have the tools to easily, readily, quickly put it back right.

but anyway...

If someone asks me "store in file system or store in database", I sort of presume also they are sitting down to WRITE this stuff - they are not in the market to buy yet another thing. And if they put it in the file system - they have basically nothing. They try to write it all. And that is just "not smart"
1 stars update   September 2, 2008 - 2pm Central time zone
Reviewer: Ben 
what happens when the db crashes?  Can users still process spreadsheets, wordprocessing ,etc?  
Surely you don't suggest that Oracle's DB is bug free?  

put in on an FS & you divorce yourself from db stability issues.

Every solution has pros and cons & I hope that in the future you are capable of taking your 
blinders off and re-evalutating your stance on this topic.


Followup   September 2, 2008 - 3pm Central time zone:

Ben

let me put it this way - by the time you get the feature set you say comes with being on a file system along with a layer:

...
already done, off the shelf products exist to do this. yes, including point in time recovery. No,
it's not native to the OS, but that means companies COMPETE to create products & the end result is
better than if one vender did it & said "this is the best"
.....

you know what you have? You have *a database*. It might not be Oracle (heck, you know, we use a file system too - we are a layer on top of a file system typically, although we can provide our own). But by the time you have the feature set of backup, recover, any sort of security, auditing - etc you have either a) bought a database (call it what you will, it is a database - access to the data must be through that layer - else the recovery is not there, else the security it not there) or b) written your own database

But a rose by another other name....

My database stays up, file systems and applications on top of them - that is the height of instability to me.

Surely you don't suggest that *anything* is bug free. What about all of the 3rd party products you mention? They are bug free? What happens when the nfs server freezes, veritas decides to go awry, the network becomes unavailable, the OS hosting the file system blue screens or panics, etc etc etc. The database is not inherently "unstable".



And in Oracle, we actually do run a single file server for the entire company, and a single email database for the entire company and so on... And if email and documents were unavailable - we'd not be able to work - virtually everything in the company runs on it.

Systems are only as "instable" as the people managing them ultimately.

And actually by the time you have your backup capability, recovery capability, versioning capability, auditing capability, <most any> capability - you have either installed a database or written your own.

So, actually, point made. To get the feature set you need - you install a database of some name and use it.


Data goes into a database. Stuff you don't care about can go willy nilly on the file system. And if someone in the future writes to me:

... if possible could you please explain the advantages and dis-advantages of storing files within an Oracle database. We are currently using 9.2 and 10g. We have no immediate plans to move to 11g. Thanks, we all appreciate your answers! ....

The answer would be the same: "DO NOT WRITE your own database, it is already written, put data you value into the database"

Because the person asking it is not asking "should we use product X or put it into Oracle", they are asking "should we just stuff our files into the file system and other stuff in the database". If the files mean a thing, they will put them in the database.

You choose your database, but ultimately, you choose a database.

All I'm saying is "data you value, goes into a database".
5 stars   September 4, 2008 - 10am Central time zone
Reviewer: A reader 
"what happens when the db crashes?  Can users still process spreadsheets, wordprocessing ,etc?  
Surely you don't suggest that Oracle's DB is bug free?  "

Uh, if your database crashes, how on earth are your users going to know which file is theirs 
without the associated metadata from that database?   Unless you give your users access directly to 
the FS and let them poke around through all the files until they find the one their looking for? 
(nice security model)

If your database crashes, then your files are useless whether or not your files are in the DB or 
not.  However, if your files are in the database and it crashes, we can recover the database - 
which includes the files - to whatever point-in-time.  If the files were on a file system, we'd 
need to hope we could restore the database to the same poin-in-time that the files could also be 
restored to on the another file server.  Without the metadata stored in the database, you just have 
a bunch of files on a FS with no way to tell anything except "here is a PDF file, wonder who 
created it and what's in it".

 


5 stars File system interface to the database?   November 18, 2008 - 9am Central time zone
Reviewer: Stew Ashton from Paris, France

Tom, with the advent of SecureFiles there is growing interest among my colleagues for storing files in the database; however, in most cases we are using purchased software that only knows how to store files on a file system. We were wondering if an interface exists to access an Oracle DB as if it were a file system? The idea would be to use the database completely "under the covers".

My searches only found rather old stuff (IFS), or else services that seemed to be wrapped up in other products such as Content Management.

Any pointers?

By the way, I am not saying this is a good solution. It would be infinitely better to store all the data together in business transactions. However, this might be a step in the right direction and it would at least show that SecureFiles performance meets our needs.

Followup   November 18, 2008 - 7pm Central time zone:

will this suffice?
http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14259/xdb22pro.htm#i1026728

5 stars Re: file-system interface   November 19, 2008 - 4am Central time zone
Reviewer: Stew Ashton from Paris, France

So you can type in search keywords better than I; we all knew that ;)

"used to provide users with a file-system interface"

Unfortunately, we would need something that provides programs with what appears as a local file-system interface: you know, fopen(), fclose() and the like. I can see where it would be much more feasible to serve higher-level protocols like FTP or WebDav, since they only deal with entire files and they are protocols, not interfaces.

Maybe providing a low-level "read from / write to a file" interface is just too much of a reach technically. In any case there would have to be specific interface code installed as a "file system" on the calling program's side.

4 stars WinFS   December 1, 2008 - 2am Central time zone
Reviewer: Parthiban Nagarajan from Coimbatore, India
Hi Tom

You are suggesting to store IMPORTANT files in the database rather than in FileSystem for the sake of Backup, Recovery and Security. Though its acceptable, its not practical. Inspite of database, we can depend on Third Party Softwares for Recovery, and User/Group privileges provided by the OS for Security. This should be enough for most of the IMPORTANT files, I believe.

If you are not satisfied with the above paragraph, what if the FileSystem itself acts as a database instead of making Database to act as FileSystem? Yes, I am talking about the WinFS from MicroSoft.

http://en.wikipedia.org/wiki/WinFS


I hope, it will be shipped with the next version of Windows OS.

I wonder, what would Oracle or any other Technology vendor's answer for this ?!!!

Followup   December 1, 2008 - 7am Central time zone:

gee, not practical. better not tell many of our customers or even or internal IT staff that - they might be very surprised.

Many people, people who really value their data, place data that is important to their business in the ....... database.


And it is hugely practical.


I don't believe in what you believe in.


My answer to something like 'winfs' (which you hope will be shipped some years into the future) is the same I have for distributed databases


distributed complexity is just that. If you want to query documents and relational data (and spatial and xml and image and audio and whatever) your best bet is to have.....

a single data repository - so they are all as of the same point in time, with consistent centralized security polices, managed by the same team - providing recovery and backup in a consistent fashion.



You give no factual or even "in my opinion" reasoning behind why you believe this is not practical. Perhaps that means you haven't read far enough into the documentation - haven't used the database in enough situations to know what it is actually capable of doing?
3 stars Storing files in the database is always a bad bad idea.   April 30, 2009 - 7pm Central time zone
Reviewer: Victor Logan from Chicago, IL USA
Tom,

Sure you would come up with every reason in the world to use a DB -- duh. Any reason why you would 
suggest not to?...:)

When you have large files to manage retrieving them from a webapp is a pain. read this here for the 
why http://faler.wordpress.com/2009/04/23/storing-imagesfiles-from-in-a-database-is-bad-mmmkay/

The best suggestion would be to use a content management like documentum, which still uses a 
database (this should make Tom happy) but uses the file system for storing files, al which is 
transparent for the users.




Followup   April 30, 2009 - 9pm Central time zone:

you know, whenever you say "is always", I know one thing for a fact

You are wrong.

never say never, never say always, I always say.



did you know that a file system is..... a database (a database with a really poor feature set but a database none the less)

did you know that every content management system - every single one - is a database


Don't tell me storing files in a database is always bad - or hard - or whatever, it just means to me that you haven't been around long enough.

All of our files in Oracle - terabytes and terabytes of them - are in.... a database (a popular relational database with lots of other stuff like text capabilities, backup, recovery, security, etc)

I read that blog entry and see:

... What does this mean? It means that a single user getting a single page will require 21 connections out of your database pool! Why? Because browsers and other clients will do a separate ¿GET¿ for each perceived ¿file¿ (such as images) that is referenced on the page. ....

well - I don't think anyone suggested putting your trivial images into the database - what is suggested is:

if your data means something to you, if the data is critical - hard to replace or irreplaceable - then the only responsible sensible place for them is a database that provides for all kinds of contigencies

On asktom, my images - simple file system. I can get them a million times over if they get mucked up. My files - database. Need them backed up and recoverable to the same point in time as my application data. Need them secured in the same fashion as my application data. If I decided to add more application servers, I need ubiquitous access to them (without having to cross mount a file system all over the place).

The patent trade office for example - millions and millions and millions of documents, images, drawings - guess where they are? They are not in a file system - they are professionally managed in....

a database (just happens to be an Oracle database)

as an example.


I did like that the blog entry says "almost all of the time." - very wise. To caveat like that. Never say never, never say always, I always say.


That blog entry doesn't say "do not store critical data in the database"

It does say "don't store trivial things that could easily be reproduced at the drop of a hat in there"

And it does not say why "retrieving them from a webapp is a pain", that is an entirely false statement. Goto my files tab, click on it, select a file, click on it - retrieved. You know what would be a pain? IF I STORED THAT ON A FILE SYSTEM.

What happens when you go to scale out your application server - which stored all of these files. Do you really want to cross mount your single disk with your *critical* data all over the place?

sorry, you present nothing here to really argue against.
5 stars fielsystem   May 1, 2009 - 6pm Central time zone
Reviewer: A reader 
Tom:

I am not a unix or windows expert. But is cross mounting a single disk with critical data to other 
app servers a complex task? I thought it ia simple unuix command.



Followup   May 2, 2009 - 12pm Central time zone:

there will be limits to how many places you can cross mount it and now you have your left hand all over your right hand messing with stuff.

There will be performance issues as well - filesystems are not built the same way a multi-user database is


A common phrase heard an environment like this (with lots of cross mounts) will be "oh, you needed those files, I needed more space, they looked old - sorry about that"


5 stars But all the big kids are doing it...   May 4, 2009 - 3am Central time zone
Reviewer: Ken from Brussels, Belgium
To Sam,

Let's assume that what you say about YouTube and FaceBook is correct. If you're one of those guys, 
it might be worthwhile to have a higly-customized system. 

I think that what Tom is saying is that it is possible to construct a fast, filesystem-based 
repository; it is just not the wisest choice for most of us.

If YouTube loses a video or two, who's going to notice? Even if they lost 10%, would they really 
suffer? Would they ever need to roll forward to a point in time? Do they need a history of document 
changes?

The point is to think about the needs of a business: speed, cost, security, recoverability, 
traceability, staffing needs, etc. The web sites you mention are not typical businesses, but rather 
exceptions which prove the rule.


4 stars Advantages of Oracle Database   May 18, 2009 - 5am Central time zone
Reviewer: Tunji Olaniyan from Lagos, Nigeria
I quite agree with Tom that there 're a lot of advantages to be derived from storing data in Oracle 
database. That 's why today, all critical business data of most companies world over are stored in 
Oracle. Those hard-to-beat features have made Oracle the hottest database.

Thank you.



Write a Review
 


All information and materials provided here are provided "as-is"; Oracle disclaims all express and implied warranties, including, the implied warranties of merchantability or fitness for a particular use. Oracle shall not be liable for any damages, including, direct, indirect, incidental, special or consequential damages for loss of profits, revenue, data or data use, incurred by you or any third party in connection with the use of this information or these materials.

About Oracle | Legal Notices and Terms of Use | Privacy Statement