Skip to Main Content
  • Questions
  • Local versus Global indexes on daily changing partitioned table

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Jon.

Asked: April 17, 2007 - 1:49 am UTC

Last updated: September 08, 2011 - 5:50 pm UTC

Version: 10.2.0.2

Viewed 10K+ times! This question is

You Asked

I have this logging table which has a few million records per partition.

CREATE TABLE LOG_EXAMPLE
(TRANSACTION_ID NUMBER PRIMARY KEY NOT NULL,
APPLICATION_ID NUMBER NOT NULL,
LOG_TIME DATE DEFAULT sysdate NOT NULL,
ERROR_MESSAGE VARCHAR2(2000),
LOG_TYPE VARCHAR2(10) NOT NULL,
WARNING_MESSAGE VARCHAR2(2000),
USER_ID VARCHAR2(30) NOT NULL)
PARTITION BY RANGE (LOG_TIME)
(
PARTITION TRAN_PART_2007_03_18 VALUES LESS THAN (TO_DATE('2007-03-19', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_03_19 VALUES LESS THAN (TO_DATE('2007-03-20', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_03_20 VALUES LESS THAN (TO_DATE('2007-03-21', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_03_21 VALUES LESS THAN (TO_DATE('2007-03-22', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_03_22 VALUES LESS THAN (TO_DATE('2007-03-23', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_03_23 VALUES LESS THAN (TO_DATE('2007-03-24', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_03_24 VALUES LESS THAN (TO_DATE('2007-03-25', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_03_25 VALUES LESS THAN (TO_DATE('2007-03-26', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_03_26 VALUES LESS THAN (TO_DATE('2007-03-27', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_03_27 VALUES LESS THAN (TO_DATE('2007-03-28', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_03_28 VALUES LESS THAN (TO_DATE('2007-03-29', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_03_29 VALUES LESS THAN (TO_DATE('2007-03-30', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_03_30 VALUES LESS THAN (TO_DATE('2007-03-31', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_03_31 VALUES LESS THAN (TO_DATE('2007-04-01', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_04_01 VALUES LESS THAN (TO_DATE('2007-04-02', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_04_02 VALUES LESS THAN (TO_DATE('2007-04-03', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_04_03 VALUES LESS THAN (TO_DATE('2007-04-04', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_04_04 VALUES LESS THAN (TO_DATE('2007-04-05', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_04_05 VALUES LESS THAN (TO_DATE('2007-04-06', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_04_06 VALUES LESS THAN (TO_DATE('2007-04-07', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_04_07 VALUES LESS THAN (TO_DATE('2007-04-08', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_04_08 VALUES LESS THAN (TO_DATE('2007-04-09', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_04_09 VALUES LESS THAN (TO_DATE('2007-04-10', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_04_10 VALUES LESS THAN (TO_DATE('2007-04-11', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_04_11 VALUES LESS THAN (TO_DATE('2007-04-12', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_04_12 VALUES LESS THAN (TO_DATE('2007-04-13', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_04_13 VALUES LESS THAN (TO_DATE('2007-04-14', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_04_14 VALUES LESS THAN (TO_DATE('2007-04-15', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_04_15 VALUES LESS THAN (TO_DATE('2007-04-16', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_04_16 VALUES LESS THAN (TO_DATE('2007-04-17', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_04_17 VALUES LESS THAN (TO_DATE('2007-04-18', 'YYYY-MM-DD')),
PARTITION TRAN_PART_2007_04_18 VALUES LESS THAN (TO_DATE('2007-04-19', 'YYYY-MM-DD')));

CREATE SEQUENCE TRANSACTION_ID_SEQ;

CREATE OR REPLACE TRIGGER T_LOG_EXAMPLE_BI
BEFORE INSERT
ON LOG_EXAMPLE
REFERENCING OLD AS OLD NEW AS NEW
FOR EACH ROW
begin
if :new.transaction_id is null then
select transaction_id_seq.nextval
into :new.transaction_id
from dual;
end if;

end t_transaction_bi;

CREATE INDEX LOG_TIME_IDX ON LOG_EXAMPLE (LOG_TIME, APPLICATION_ID);
--users want to sort by date within a parition and only get the top 100 results.

CREATE INDEX APPLICATION_IDX ON LOG_EXAMPLE (APPLICATION_ID);

There is row level security based on the application_id. That is why the indexes include the application_id column. Plus there are some other fields and foreign keys I removed to make this simple.

On a daily basis, a job runs to drop the oldest partition and add a new one.

--drop partition snippet
for v_rec in cur_drop_partitions loop
execute immediate ('alter table LOG_EXAMPLE drop partition ' || v_rec.partition_name || ' UPDATE GLOBAL INDEXES');
end loop;


--add partition snippet
for i in 1 .. v_diff loop
execute immediate('ALTER TABLE LOG_EXAMPLE ' ||
'ADD PARTITION TRAN_PART_' || to_char(v_max_date + i, 'yyyy_mm_dd') || ' ' ||
'VALUES LESS THAN (''' || to_char(v_max_date + i + 1, 'dd-mon-yyyy') || ''') ' ||
'TABLESPACE ' || v_tablespace);
end loop;

We are finding out that most users are searching for one particular day or one log type or combination of log type and day.

I understand the benefit of adding "update global indexes" so users don't have an outage. My question is, which is better given the fact that my table is changing daily plus, most queries are limited to a single partition?

Maybe the solution is a mix of local and global indexes?

and Tom said...

neither is "better"

the question you have to answer is:

a) do I want fast ddl, but an outage
b) can I deal with slow ddl, but provide continuous access

<quote src=Expert Oracle Database Architecture>

¿Live¿ Global Index Maintenance

Starting in Oracle9i, another option was added to partition maintenance: the ability to maintain the global indexes during the partition operation using the UPDATE GLOBAL INDEXES clause. This means that as you drop a partition, split a partition, perform whatever operation necessary on a partition ¿ Oracle will perform the necessary modifications to the global index to keep it up to date. Since most partition operations will cause this global index invalidation to occur, this feature can be a boon to systems that need to provide continual access to the data. You¿ll find that you sacrifice the raw speed of the partition operation, but with the associated window of unavailability immediately afterward as you rebuild indexes, for a slower overall response time from the partition operation but coupled with 100 percent data availability. In short, if you have a data warehouse that cannot have downtime, but must support these common data warehouse techniques of sliding data in and out, then this feature is for you¿but you must understand the implications.
Revisiting our previous example, if our partition operations had used the UPDATE GLOBAL INDEXES clause when relevant (in this example, it would not be needed on the ADD PARTITION statement since the newly added partition would not have any rows in it)
ops$tkyte@ORA10G> alter table partitioned
  2  exchange partition fy_2004
  3  with table fy_2004
  4  including indexes
  5  without validation
  6  UPDATE GLOBAL INDEXES
  7  /
Table altered.
 
ops$tkyte@ORA10G> alter table partitioned
  2  drop partition fy_2004
  3  UPDATE GLOBAL INDEXES
  4  /
Table altered.
 
ops$tkyte@ORA10G> alter table partitioned
  2  add partition fy_2006
  3  values less than ( to_date('01-jan-2007','dd-mon-yyyy') )
  4  /
Table altered.
 
ops$tkyte@ORA10G> alter table partitioned
  2  exchange partition fy_2006
  3  with table fy_2006
  4  including indexes
  5  without validation
  6  UPDATE GLOBAL INDEXES
  7  /
Table altered.

we would have discovered the indexes to be perfectly valid and usable both during and after the operation:
ops$tkyte@ORA10G> select index_name, status from user_indexes;
 
INDEX_NAME                     STATUS
------------------------------ --------
FY_2006_IDX                    VALID
FY_2004_IDX                    VALID
PARTITIONED_IDX_GLOBAL         VALID
PARTITIONED_IDX_LOCAL          N/A
 
6 rows selected.
 
ops$tkyte@ORA10G> set autotrace on explain
ops$tkyte@ORA10G> select count(*)
  2  from partitioned
  3  where timestamp between sysdate-50 and sysdate;
 
  COUNT(*)
----------
      6750
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=9 Card=1 Bytes=9)
   1    0   SORT (AGGREGATE)
   2    1     FILTER
   3    2       INDEX (RANGE SCAN) OF 'PARTITIONED_IDX_GLOBAL' (INDEX) (Cost=9¿

But there is a tradeoff: we are performing the logical equivalent of INSERT and DELETE operations on the global index structures. When we drop a partition, we have to delete all of the global index entries that might be pointing to that that partition. When we did the exchange of a table with a partition, we had to delete all of the global index entries pointing to the original data and then insert all of the new ones that we just slid in there. So the amount of work performed by the ALTER commands was significantly increased.
In fact, using runstats and a slightly modified version of the preceding example, we can measure the amount of ¿extra¿ work performed to maintain the global indexes during the partition operation. We¿ll slide out FY_2004 and slide in FY_2006 as before, adding in the requisite index rebuild. This will be the sliding window implementation that causes the data to become unavailable due to the need to rebuild the global indexes. We¿ll also then slide out FY_2005 and slide in FY_2007 using the UPDATE GLOBAL INDEXES clause, to emulate the sliding window implementation that provides for full data availability, even during the partition operations. In that manner, we can measure the same operations using the different techniques and compare them. Our expected outcome will be that the first approach will consume fewer database resources and therefore perform ¿faster¿ but incur a measurable period of ¿downtime.¿ The second approach, while consuming more resources and perhaps taking longer overall, will not incur downtime. As far as the end users are concerned, their ability to work never ceased. They might have been processing a bit slower (since we were competing with them for resources), but they were still processing, and they never stopped.
So, if we take the earlier example, but also create an empty FY_2005 table like FY_2004, and a full FY_2007 table like FY_2006, we can measure the differences between the index rebuild approaches, starting first with the ¿less available approach¿:
exec runStats_pkg.rs_start;

alter table partitioned exchange partition fy_2004
with table fy_2004 including indexes without validation;

alter table partitioned drop partition fy_2004;

alter table partitioned add partition fy_2006
values less than ( to_date('01-jan-2007','dd-mon-yyyy') );

alter table partitioned exchange partition fy_2006
with table fy_2006 including indexes without validation;

alter index partitioned_idx_global rebuild;

exec runStats_pkg.rs_middle;
Here is the highly available UPDATE GLOBAL INDEXES approach:
alter table partitioned exchange partition fy_2005
with table fy_2005 including indexes without validation 
update global indexes;

alter table partitioned drop partition fy_2005
update global indexes;

alter table partitioned add partition fy_2007
values less than ( to_date('01-jan-2008','dd-mon-yyyy') );

alter table partitioned exchange partition fy_2007
with table fy_2007 including indexes without validation
update global indexes;

exec runStats_pkg.rs_stop;

We might observe the following:
ops$tkyte@ORA10G> exec runStats_pkg.rs_stop;
Run1 ran in 81 hsecs
Run2 ran in 94 hsecs
run 1 ran in 86.17% of the time

Name                                  Run1        Run2        Diff
...
STAT...CPU used when call star          39          59          20
...
STAT...redo entries                    938       3,340       2,402
STAT...db block gets                 1,348       5,441       4,093
STAT...session logical reads         2,178       6,455       4,277
...
LATCH.cache buffers chains           5,675      27,695      22,020
...
STAT...table scan rows gotten       97,711     131,427      33,716
STAT...undo change vector size      35,100   3,404,056   3,368,956
STAT...redo size                 2,694,172   6,197,988   3,503,816

The index rebuild approach did run faster, both as observed by the elapsed time and the CPU time. This fact has caused many a DBA to pause and say, ¿Hey, I don¿t want to use UPDATE GLOBAL INDEXES¿it¿s slower.¿ That is too simplistic of a view, however. What you need to remember is that while the operations overall took longer, processing on your system was not necessarily interrupted. Sure, you as the DBA might be looking at your screen for a longer period of time, but the really important work that takes place on your system was still taking place. What you need to do is see if this tradeoff makes sense for you. If you have an eight-hour maintenance window overnight in which to load new data, then by all means, use the rebuild approach if that makes sense. However, if you have a mandate to be available continuously, then the ability to maintain the global indexes will be crucial.
Looking at the redo generated by each approach, we can see that the UPDATE GLOBAL INDEXES generated considerably more¿over 230 percent more¿and we would expect that to only go up as we add more and more global indexes to the table. The redo generated by the UPDATE GLOBAL INDEXES is unavoidable and cannot be turned off via NOLOGGING, since the maintenance of the global indexes is not a complete rebuild of their structure but more of an incremental ¿maintenance.¿ Additionally, since we are maintaining the live index structure, we must generate undo for that¿in the event the partition operation fails, we must be prepared to put the index back the way it was. And remember, undo is protected by redo itself, so some of the redo you see generated is from the index updates and some is from the rollback. Add another global index or two and you would reasonably expect these numbers to increase.
So, UPDATE GLOBAL INDEXES is an option that allows you to trade off availability for resource consumption. If you have the need to provide continuous availability, it will be the option for you. But you will have to understand the ramifications and size other components of your system appropriately. Specifically, many data warehouses have been crafted over time to use bulk, direct path operations, bypassing undo generation and, when permitted, redo generation as well. Using UPDATE GLOBAL INDEXES cannot bypass either of those two elements. You¿ll need to examine the rules you used to size these components before using this feature, so you can assure yourself it can work on your system.
</quote>

Rating

  (6 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Jon, April 17, 2007 - 1:49 pm UTC

I am a little more interested in when I would pick a local index versus a global one. In one OTN document I read, it stated that for a warehouse, you don't want global indexes because of partitioning. So at what point do you pick a local index over a global one? Just when you are adding/removing partitions frequently? Or maybe when you are adding/removing partitions in a Warehouse?

Global Indexes Pros
* Faster than local indexes when not doing partition elimination

Global Indexes Cons
* Requires "update global indexes" which then creates more redo
* The optimizer may pick a global index over partition elimination
* May require rebuilding as we drop and add a partition daily

Local Indexes Pros
* Local to each partition
* Less redo generated as we drop and add partitions
* Partition elimination not avoided when using the index

Local Indexes Cons
* Slower than a global index when not doing partition elimination

I am leaning towards moving to local indexes because most times, people search for one partition at a time and we are adding and removing partitions on a daily basis.

Thanks!!!
Tom Kyte
April 18, 2007 - 11:11 am UTC

those pro/con's are not universally true. IT DEPENDS

and using todays software, global indexes do not need go invalid - they are a crucial part of warehouses. really old papers, when partitioning first came out, yeah - they would say avoid them if you can.


Update Global Indexes

Jignesh, April 18, 2007 - 4:54 am UTC

.. And for Partition Exchange.. we need primary key on source and target tables should be same.

Consider scenario when we do not have any local indexes on target table. In this case, we need to build the indexes on source table, do exchange partitions and then UPDATE GLOBAL INDEXES again ! ! bit of overhead isn't it?
Tom Kyte
April 18, 2007 - 12:07 pm UTC

so - it is "online"


Update global index clause

satin satin, March 09, 2010 - 10:53 am UTC

While creating the partition table or global index can we use "Update global index clause" .So that it will prevent in feature if drop any partitions.Because everymonth we have to manage the partititons which are really bothering us and jobs are abending because of the indexes are disabled . Any suggestions will be helpfule

some more help in understanding global nonprefixed indexes

jv, June 29, 2010 - 1:27 pm UTC

Dear Tom,

good day to you, if you can please help me in understanding the below statement from one of the Oracle documents.

Quote
Global Nonprefixed: it is not possible to create a nonprefixed global index, because there are no management or performance benefits when compared to a non-partitioned index.
Unquote

I believe until Oracle 8i Global Nonprefixed indexes were not supported but in later releases they are supported, correct me if I am wrong here.

Also if they are supported is it true "they offer no management or performance benefits...". I am confused with this statement from the document.

Thanking you for your help on this.

Regards,
jv
Tom Kyte
July 06, 2010 - 1:09 pm UTC

No, it is not

ops$tkyte%ORA11GR2> create table t ( x int, y int, z int )
  2  partition by range(x)
  3  (partition p1 values less than (1),
  4   partition p2 values less than (2),
  5   partition p3 values less than (3)
  6  );

Table created.

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> create index global_idx on t(x) global partition
  2  by range(x)
  3  (partition pi1 values less than (2), partition pi2 values less than (3), partition junk values less than (MAXVALUE ));

Index created.

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> create index global_idx2 on t(y,x) global partition
  2  by range(x)
  3  (partition pi1 values less than (2), partition pi2 values less than (3), partition junk values less than (MAXVALUE ));
by range(x)
          *
ERROR at line 2:
ORA-14038: GLOBAL partitioned index must be prefixed



the partition key for the global index must be on the leading edge.

The comment refers to the fact that if the second index DID WORK, then a query of the form

where y = ?
where y > ?
...

and others would have to scan each and every partition, there would be nothing gained from partitioning this index by X.

How about non-prefixed local indexes?

Kulkarni, August 10, 2011 - 7:49 am UTC

Hi Tom,
Good day. I have a scenario where I need to partition a table on date column but most of the queries do not use that column as the leading column. Also there is no unique constraint on the column that is used in the predicate of most of the queries hence I can not make it a primary key. Here is the script I used to create a partitioned table and later inserted rows from the original non-partitioned table into this new partitioned table.

CREATE TABLE DATASTREAM.DS_PRICES_PART
( DS_ID VARCHAR2(9) NOT NULL ENABLE NOVALIDATE,
UPDATE_STAMP DATE NOT NULL ENABLE NOVALIDATE,
MVALUE NUMBER(14,3),
NOSH NUMBER(15,3),
PRICE NUMBER(15,3),
PHIGH NUMBER(15,3),
PLOW NUMBER(15,3),
POPEN NUMBER(15,3),
VOLUME NUMBER(13,0),
PRICE_US NUMBER(15,3),
PHIGH_US NUMBER(15,3),
PLOW_US NUMBER(15,3),
POPEN_US NUMBER(15,3)
)
PARTITION BY RANGE (UPDATE_STAMP)
(PARTITION DS_PRICES_p1 VALUES LESS THAN (TO_DATE('31-MAR-1975 23:59:59', 'DD-MON-YYYY HH24:MI:SS')) tablespace DATASTREAM_LARGE,
PARTITION DS_PRICES_p2 VALUES LESS THAN (TO_DATE('31-MAR-1980 23:59:59', 'DD-MON-YYYY HH24:MI:SS')) tablespace DATASTREAM_LARGE,
PARTITION DS_PRICES_p3 VALUES LESS THAN (TO_DATE('31-MAR-1985 23:59:59', 'DD-MON-YYYY HH24:MI:SS')) tablespace DATASTREAM_LARGE,
PARTITION DS_PRICES_p4 VALUES LESS THAN (TO_DATE('31-MAR-1990 23:59:59', 'DD-MON-YYYY HH24:MI:SS')) tablespace DATASTREAM_LARGE,
PARTITION DS_PRICES_p5 VALUES LESS THAN (TO_DATE('31-MAR-2000 23:59:59', 'DD-MON-YYYY HH24:MI:SS')) tablespace DATASTREAM_LARGE,
PARTITION DS_PRICES_p6 VALUES LESS THAN (TO_DATE('31-MAR-2005 23:59:59', 'DD-MON-YYYY HH24:MI:SS')) tablespace DATASTREAM_LARGE,
PARTITION DS_PRICES_p7 VALUES LESS THAN (TO_DATE('31-MAR-2010 23:59:59', 'DD-MON-YYYY HH24:MI:SS')) tablespace DATASTREAM_LARGE,
PARTITION DS_PRICES_p8 VALUES LESS THAN (TO_DATE('31-MAR-2015 23:59:59', 'DD-MON-YYYY HH24:MI:SS')) tablespace DATASTREAM_LARGE);

In most of the queries ds_id is referred as the leading column along with update_stamp. The combination of these two columns is unique. Now what kind of index do you suggest as I can not create global non-prefixed indexes?

Please help.
Tom Kyte
August 13, 2011 - 4:28 pm UTC


In most of the queries ds_id is referred as the leading column along with
update_stamp.


what do you mean by "leading column" - in a predicate?

if you have a where on update_stamp and ds_id - and both use equals, just use a local index on the two columns in any order.

if you have a where clause on those table and use a range operator on either - put the one you use equals on first, then the other. We'll still be able to partition eliminate so a local might be fine.

If you need to use a global index, so be it - just index as you would normally index?

without more information, I cannot really say anymore.

Local Indexes

jloayza, September 08, 2011 - 4:11 pm UTC

Hi Tom,

Regarding to the last question, what kind of index should I use when most of the queries does not include the partition key. Is it correct to use local non prefixed?

Regards

J.L

Tom Kyte
September 08, 2011 - 5:50 pm UTC

If most of the queries that want to use that index are not going to use the partition key, you almost certainly want a global index or a non-partitioned index.

If you have an emp table that is hash partitioned into 128 partitions by empno and you have an index on LAST_NAME, you almost certainly want the index on last name to be GLOBALLY partitioned - if you used a locally partitioned index - we'd have to do 128 - one hundred and twenty eight - index range scans every time we used it!!!!

More to Explore

VLDB

If you are new to partitioning, check out Connor McDonald's introduction series here.

VLDB

Documentation set on VLDB and Partitioning.