Hello Tom,
I am looking for some inputs on local versus global partitioning on indexes. Appreciate your
help.Thank you.
Ours is a batch processing application (OLTP), with transaction tables each more than 10 million
rows. By batch processing I mean we receive files which we stage in tables perform calculations which are in turn stored in some of our transaction tables. For calculations we need to store 1 year full of data in our transaction tables.
The transaction tables are currently monthly range partitioned, however the rows in recent partition of last year are going over a million now.
The transaction tables have indexes which are locally partitioned. Most of our queries which go
against these transaction tables have date predicates in them, however there are some instances
where a date is not available to be passed in.
In most of the places the date predicates passed is a single date, however there are few places
where the date predicates are range hence can go beyond a single day to even 30 days or more.
My questions: - 1. In such cases what should be the points to consider to decide whether to go for
a local partitioned index or a global non partitioned (normal) index.
2. If I have to join 2 (range partitioned) transaction tables on columns that are indexed.
(Parent table A joined with child table B and table B has a foreign key for table A. Foreign key indexed in table B).
If the join on these tables on this foreign key fetches less than 10 rows from the child table,
then does it make sense to create a locally partitioned index on the foreign key column? Or a
normal index (non-partitioned global) ?
Excerpt from Oracle docs I read
(
http://docs.oracle.com/cd/E18283_01/server.112/e16541/part_oltp.htm#CEGECIGF ): - “With partitioned
indexes, there are always multiple segments. Whenever Oracle Database cannot prune down to a single
index segment, the database has to touch multiple segments. This potentially leads to higher I/O
requirements (n index segment probes compared with one probe with a nonpartitioned index) and can
have an impact (measurable or not) on the run-time performance. This is true for all partitioned
indexes.”
I am bit confused and looking for some advice/pointers as in what points to consider while making
the decision to use local indexes vs global indexes. Is the no-separate maintenance required while
performing DDL’s on partitions (MERGE/SPLIT/DROP) the main reason to go with locally partitioned
indexes as against non-partitioned global indexes?
I suppose my first point would be - that doesn't really sound like OLTP to me ? It sounds like your are doing intermittent loads of data using a batches of data, which seems to resemble more a warehouse model (regular loads of lots of data) as opposed to an OLTP model (millions of loads of a single row at a time).
Having said that, here's an excerpt from Tom's book going back a while:
Data Warehousing and Global Indexes
It is my opinion that these two things are mutually exclusive. A data warehouse implies certain things; large amounts of data coming in and going out, high probability of a failure of a disk somewhere, and so on. Any data warehouse that uses a 'sliding window', where old data (say anything over five years old) is 'slid' off one end of a table and new data is added to the other end, would want to avoid the use of global indexes. .... (the reason is that partitioning maintenance would invalidate the global index and you do lots of partition level operations in a data warehouse, you would have to rebuild them often)
....
OTLP and Global Indexes
An OLTP system is characterized by many small to very small read and write transactions occurring frequently. Sliding windows of data is not something you are concerned about in general. Fast access to the row, or couple of rows, you need is paramount. Data integrity is vital. Availability is very important as well. Global indexes make sense in OLTP systems. Table data can only be partitioned by one key ? one set of columns. You may need to access the data in many different ways however. You might partition EMPLOYEE data by location. You still need fast access to EMPLOYEE data however by:
o DEPARTMENT ? Departments are geographically dispersed, there is no relationship between a department and a location.
o EMPLOYEE_ID ? While an employee ID will determine a location, you don't want to have to search by EMPLOYEE_ID and LOCATION, hence partition elimination cannot take place on the index partitions. Also EMPLOYEE_ID by itself must be unique.
o JOB_TITLE
o And so on...
......
Having said all of that, the world changes over time, and we are now a lot better at maintaining global indexes across partitioned tables (especially in 12c).
Ultimately, its always going to be tradeoff between your maintenance requirements and your performance requirements.
For example, a single key lookup into a global index might be 12 times quicker than a single key lookyp into a 12-partition local index. Is that a problem ? Well..only you can answer that question. That query might be run once per day, so who cares if it takes 24milliseconds instead of 2milliseconds. But if its run 100,000 times per day...then thats a big drama.
Conversely, if you are not regularly doing any partition maintenance, then a global index is unlikely to need rebuilding except in unusual circumstances, so if particular (important) queries get good value from such an index, then why not.
So there's no "hard and fast" rule - the key is - understand the performance requirements of your queries, and index accordingly.
And dont forget about the value of a simple full table scan. A full scan of a small partition is often a perfectly adequate approach to low frequency queries.
Hope this helps.