You Asked
Hi Tom I need your opinion on this subject.
My input data is XML. Currently my requirement is to produce database reports out of this data using Oracle Reports and Oracle Discoverer. The XML structure itself is pretty complex. The database schema I have created from my XML schema has several master detail tables (> 50).
My questions are these: would I be better off not storing this data is relational tables, and store it instead in a few tables with XMLType columns?
How hard or easy would it be to generate reports off of this data if I had it all in XMLType columns?
From the litterature I have read, the XMLType enables storage of XML, and searching with xpath in pl/sql. Can this easily be translated in a new way of writing reports over large amounts of relational data? What about performance?
As always, thanks for your insight.
Yves
and Tom said...
I asked Sean Dillon, our local XML guy to take a look at this. Here's what he had to say:
----------------------------------------
Hi Yves,
To convert or not to convert, that's always the question. There are as many opinions on this as there are people out there willing to talk about it. Let me tell you what I think, and you can draw your own conclusions.
There are a variety of factors you should take into consideration when determining how to store and access your data. There are a few different ways you can store your data, and each different way offers trade-offs you need to consider.
First of all, you can store XML data as NATIVE XML in Oracle. in 8i and below, you'll be doing this in CLOBs or NCLOBs. In 9i and above, XMLType. Doing this offers XML-purists the ability to store completely unstructured data in the database with no prior knowledge of the format of the XML data. This also allows you to maintain document fidelity, so things like whitespace, processing instructions, etc are maintained along with the original document. Using XMLType columns gives you the ability to perform XPath searching, XPath extracts, and the existsnode function. With XMLType or CLOBs, you can use Oracle Text (Oracle interMedia) to index your XML data for high speed XML-centric searching. (show me all those documents in my database where the "DESCRIPTION" tag contains the words "really groovy". In 9i, the Oracle Text indexing capabilities have been improved significantly to offer more XPath-like high-speed indexing.
Just because you store your data in pure XML does not mean you can't use relational access. In Oracle9i, you can define views on top of your XML document storage to enable high speed relational access to your XML data using a healthy mix of XPath expressions, pipelined functions, and function-based indexes. This works, and can be very fast, but has a considerable amount of administrative overhead and is overkill if you will always be accessing your data relationally. The only reason I would advocate storing your data in XML only is if you publish the data in XML the VAST MAJORITY of the time. If you only need to access the data relationally every once in a while, this approach can work. If you need to access your data relationally most of the time, read on.
Next, you can store your data in relational tables, as you have pointed out. You can map elements in your XML data to columns in a table (or object table), and parse the XML document when you put it into the database into these rows and columns. This would be the obvious solution for Oracle Reports and Oracle Discoverer access. If you know the structure of your XML data, and you won't be dealing with other XML Schemas or DTDs for the majority of your documents, this is an efficient way to store your data. XML is then thought of as a data-transfer mechanism, important when importing data or exporting data, but within the confines of the database and reporting interface, it's just DATA. NOT XML. There are quite a few people that like to use XML simply because it's XML (ie, snazzy, cool, fun, exciting, etc). In reality, you needs to use the technology that best meets your requirements. If the majority of your data access is relational, store your data relationally.
People worry that because the data is stored relationally, they've lost their XML. This isn't a problem, however. Oracle has a variety of ways of constructing/generating XML data based on relational queries. You can also programmatically construct XML data if you need to. There are many ways to do this, you just need to determine the best mechanism depending on your environment and data model.
Another approach you can take is to store BOTH. When you put the native XML document into the database, parse it as well. This satisfies the relational data access needs, as well as giving you XPath queries, Oracle Text XML indexing, XML document fidelity, an audit trail (by storing the original XML you loaded) if needed, etc. Yes, there is more data storage in the database, but again depending on your needs this is an option available depending on your requirements.
Ultimately, it depends. I hope that the information here helps you understand some of the pros & cons for the different ways you can do it. Good luck!
Rating
(6 ratings)
Is this answer out of date? If it is, please let us know via a Comment