Hash partitioning in OLTP, a way of suicide?

Question and Answer

Tom Kyte

Thanks for the question, pedro.

Asked: December 06, 2003 - 11:53 am UTC

Last updated: July 02, 2012 - 7:40 am UTC

Version: 9.2.0.2

You Asked

Hello

I am currently working with Peoplesoft CRM v8

One of our consultants decided to partition the tables using hash partitioning, and he also partitioned the indexes locally. Obviously he just showed how clueless he is by making the indexes local to the tables, and time has shown I am right: we had to re-partition all the indexes as global.

I asked him a couple of days ago what the technical reason for using hash partitioning was (it has never worked well for me in OLTP). He said:

1. We increase availability.
-- I would agree if he had stored the partitions in different tablespaces; however, clueless as he is, he stored everything in the same tablespace. So availability is not applicable here.

2. Easier administration.
-- Well, to a certain degree I would agree, since dealing with a 50-million-row table is harder than dealing with 64 small partitions. Again he showed how clueless he is by creating 64 partitions in an OLTP environment. (I would not create more than 8.)

3. Increased performance and reduced contention.
-- I never believed hash partitioning increases performance, and reality and time have shown I am right.
I do agree about reducing contention... BUT... see my comment at the end.


I proceeded to make a benchmark comparison. We have two identical databases, one hash partitioned and the other not. I took two statspack snapshots in each session in each instance to get the session statistics, and ran identical SQL statements which share the same execution plan in each session/instance. By the way, I was the only user on the whole server and in each database.

Here are the relevant statistics,

Hash Partitioned results:
sesstats
-------------------------------------------------------------------
consistent gets 601,857 963.0
consistent gets - examination 193,513 309.6
cpu used by this session 8,593 13.8
cpu used when call started 8,593 13.8
parse count (hard) 1 0.0
parse count (total) 13 0.0
parse time cpu 2 0.0
parse time elapsed 2 0.0
physical reads 492,418 787.9
pinned buffers inspected 73 0.1
prefetched blocks 10 0.0
recursive calls 258 0.4
redo entries 18 0.0
redo size 8,960 14.3
rows fetched via callback 64,503 103.2
session cursor cache count 7 0.0
session cursor cache hits 2 0.0
session logical reads 601,877 963.0
session pga memory 379,608 607.4
session pga memory max 379,608 607.4
session uga memory 4,264 6.8
session uga memory max 104,616 167.4
shared hash latch upgrades - no w 84,288 134.9
sorts (memory) 15 0.0
sorts (rows) 36 0.1
table fetch by rowid 5,142,160 8,227.5
table scan blocks gotten 170 0.3
table scan rows gotten 5,024 8.0
table scans (short tables) 16 0.0
user calls

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
db file sequential read 493,086 504 84.65
CPU time 89 14.89
control file parallel write 203 2 .28
db file scattered read 23 0 .05
process startup 1 0 .05

No-Hash Partitioned results:
sesstats
-------------------------------------------------------------------
consistent gets 593,514 948.1
consistent gets - examination 192,675 307.8
cpu used by this session 6,766 10.8
cpu used when call started 6,766 10.8
parse count (hard) 3 0.0
parse count (total) 14 0.0
parse time cpu 1 0.0
parse time elapsed 3 0.0
physical reads 422,118 674.3
recursive calls 265 0.4
recursive cpu usage 1 0.0
redo entries 18 0.0
redo size 8,392 13.4
rows fetched via callback 64,224 102.6
session logical reads 593,533 948.1
session pga memory 248,352 396.7
session pga memory max 248,352 396.7
session uga memory 2,024 3.2
session uga memory max 98,448 157.3
shared hash latch upgrades - no w 25,609 40.9
sorts (memory) 15 0.0
sorts (rows) 36 0.1
table fetch by rowid 5,141,482 8,213.2
table scan blocks gotten 16 0.0
table scan rows gotten 236 0.4
table scans (short tables) 16 0.0
user calls 25 0.0

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
db file sequential read 422,162 346 83.06
CPU time 70 16.81
control file parallel write 204 0 .10
log file parallel write 26 0 .02
control file sequential read 177 0 .01


From all the results, the non-hash-partitioned database is better.

My question is:

Is it worth losing performance just to gain a reduction in contention...?

IMHO hash partitioning is a no-no in OLTP, a way of suicide.










and Tom said...

(Hash partitioned tables) PLUS (global range partitioned INDEXES) EQUALS (success in OLTP)


It is funny -- this is almost verbatim the example I used in my new book "Effective Oracle By Design". I was trying to drive home that

a) partitioning is NOT fast=true
b) you MUST understand the physics behind the data, what's happening.

Say you took a table T with columns ( ID primary key, CUST_ID, ... )

You hash partitioned into 64 partitions by ID.
You have a local index on CUST_ID.

MOST of your queries are "where cust_id = :x"

Guess what you just accomplished.

You accomplished the feat of increasing your IO by a factor of 64!!

why? well, we have 64 tiny little index segments to range scan -- your customer id could be in any, all or none of them.

Solution -- hash partition the table, range partition the index by cust_id. Now you will NOT have affected read performance at all (probably; it could be a tiny bit better with partitioning, but nothing phenomenal), but you might find you've reduced contention on modifications, since you have N index segments and N table segments (and hence at least N freelists, and so on).
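
To make the arithmetic concrete, here is a minimal sketch of the two layouts described above (the column list and partition boundaries are made up for illustration):

-- Problem layout: hash partition by ID, LOCAL index on CUST_ID.
-- A query "where cust_id = :x" cannot eliminate partitions, so all 64
-- local index partitions must be probed.
create table t
( id       number primary key,
  cust_id  number,
  data     varchar2(100)
)
partition by hash (id) partitions 64;

create index t_cust_idx on t(cust_id) local;

-- Suggested layout: keep the table hash partitioned, but range partition
-- the CUST_ID index globally so a lookup touches exactly one index partition.
drop index t_cust_idx;

create index t_cust_idx on t(cust_id)
global partition by range (cust_id)
( partition p1   values less than (1000000),
  partition p2   values less than (2000000),
  partition pmax values less than (maxvalue)
);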


If you have my new book -- you'll laugh at how closely your example above mirrors the one in the book, almost scary (but I only did 8 partitions, to show an 8-times increase in IO).


Comments

hello and thx for reply

A reader, December 06, 2003 - 2:15 pm UTC

Hello again

Glad you replied so fast; I wasn't expecting this kind of help from you on weekends. I wish Metalink support had this sort of quality... BUT that's another story.

Range partitioned indexes... I understand we could use them. The thing is, if I could have range partitioned in the first place I would not have used hash partitioning at all; I would have range partitioned the tables. The problem that worried me is that Peoplesoft has these funny ID columns, varchar2(25), and they go like this:

'1000000000000000000040001'
'1000000000000000000040002'
'1000000000000000000040003'

I also thought of range partitioning the indexes, but it involves maintenance. Our production environment is outsourced to some company, and as you may know, in these situations the production guys just want to sit and do nothing except receive maintenance scripts from development (outsourcing is bad; I hope people will understand in the next couple of years). Anyway, the tables involved have around 100 indexes... I am sure if I say I am going to range partition those indexes I will start getting attacks from everywhere hehe

I will give this a try tomorrow and I will post the statspack results again.

Thank you

Regards



forgot to say

A reader, December 06, 2003 - 2:22 pm UTC

I forgot to say that the indexes are already globally partitioned in the test case I posted, and the performance on the hash-partitioned table is still poor.

Will index range partitioning at least equal the non-hash-partitioned table results?

In god we trust :-)

Regards

Note:
Just want to give a real-life example so people can use it to decide on partitioning tables and indexes.

When I used local indexes and hash partitioning, one query's statistics were:

123 minutes

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
12936039 consistent gets
775258 physical reads
0 redo size
341 bytes sent via SQL*Net to client
460 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed


After making the index global:

0.5 seconds

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
123 consistent gets
0 physical reads
0 redo size
341 bytes sent via SQL*Net to client
460 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed

Both used the same plan.

However, the global index PLUS hash partitioned table is still worse than plain tables and plain indexes.

Tom Kyte
December 06, 2003 - 2:46 pm UTC

using globally partitioned indexes will meet or beat the non-partitioned table (he says with a high degree of confidence)

it is all about being able to eliminate all but ONE index partition from consideration. If you can do that -- it'll be just like it wasn't partitioned.

At the simplest level, a single global index.

I don't think this is good to range partition

pedro, December 06, 2003 - 3:36 pm UTC

I just looked at the indexes on the hash-partitioned tables. Many of them are composite indexes of 2 to 4 columns... BIG PROBLEM

Range partitioning multi-column keys is way too complicated, plus some of them are foreign keys, which means that if I don't equi-partition the referenced PKs I am not sure how the joins in the Peoplesoft batch jobs are going to work. In other words, if I have this column:

000001
000002 etc to 001000

If I partition the FK into

000001 ~ 000100
000101 ~ 000200 etc to 001000

I would have to partition my PKs in the same fashion as well; otherwise I bet this is going to be very, very slow when the batch jobs run joins between big tables.

Hmm, so complicated; guess I will stick with my simple global partitioned indexes :-\



Tom Kyte
December 06, 2003 - 3:52 pm UTC

just range on the leading edge OR use a single partition.

where index_c1 = :x and index_c2 = :y and index_c3 = :z

will use index partition elimination purely on the index_c1 = :x part.

I meant this is NOT good to range partition

pedro, December 06, 2003 - 3:37 pm UTC


stick with plain tables pedro

Marko, December 07, 2003 - 4:39 am UTC

Hi Pedro

My suggestion is to stick with plain heap tables in OLTP and in an application such as Peoplesoft, which is a mixed OLTP/batch workload.

Some good indexes should make your Peoplesoft run fast enough. Actually Peoplesoft's default indexes are quite good already.

If you start range partitioning your indexes in Peoplesoft as Tom says, on only the leading columns, you will make things much worse. As you may know, the composite indexes are Peoplesoft defaults and are only selective as a whole, so if you go and index (and range partition) only the leading column, this is going to be a disaster.

Hash partitioned tables + range partitioned indexes are only good if your application is simple, and Peoplesoft is way too complicated. When those massive batch jobs join 3 or 4 20-million-row tables in Peoplesoft, you will feel the wrath of partitioning in a hybrid system: it probably runs OK in OLTP, but your batch jobs will take days to finish :-)


do this in SQL

pedro, December 07, 2003 - 1:05 pm UTC

Thank you Marko for your comments.

I am still going to give this a try, partitioning the index into ranges.

Tom, I am trying to determine the ranges of values. I wrote a PL/SQL block; I would like to know if it is possible to do with plain SQL (the FOR LOOP part)?

declare
    l_number     number;
    l_no_blanks  number := 0;
    l_start      varchar2(20) := '0000000000';
    l_stop       varchar2(20) := '1000000000';
begin
    for i in 1 .. 15
    loop
        select count(*)
          into l_number
          from sysadm.PS_RF_INST_PROD
         where parent_inst_prodid >= l_start
           and parent_inst_prodid < l_stop;
        l_start := to_number(l_start) + 1000000000;
        l_stop  := to_number(l_stop)  + 1000000000;
        l_no_blanks := l_no_blanks + l_number;
        -- note: l_start/l_stop have already been incremented at this point,
        -- so the printed range labels are shifted by one bucket
        -- (corrected results are posted further down)
        dbms_output.put_line('From ' || l_start || ' to ' || l_stop || ': ' || l_number);
    end loop;

    select count(*)
      into l_number
      from sysadm.PS_RF_INST_PROD
     where parent_inst_prodid = ' ';

    dbms_output.put_line('No values (blanks): ' || l_number);
    dbms_output.put_line(lpad('-', 50, '-'));
    dbms_output.put_line('Total number of rows is: ' || to_char(l_no_blanks + l_number));
end;
/

The results are as follows; as you can see the data is screwed :-)


From 1000000000 to 2000000000: 696
From 2000000000 to 3000000000: 1418631
From 3000000000 to 4000000000: 1429009
From 4000000000 to 5000000000: 1406610
From 5000000000 to 6000000000: 1396820
From 6000000000 to 7000000000: 1429143
From 7000000000 to 8000000000: 1433470
From 8000000000 to 9000000000: 616147
From 9000000000 to 10000000000: 395680
From 10000000000 to 11000000000: 0
From 11000000000 to 12000000000: 1418631
From 12000000000 to 13000000000: 0
From 13000000000 to 14000000000: 0
From 14000000000 to 15000000000: 0
From 15000000000 to 16000000000: 0
No values (blanks): 5077656
--------------------------------------------------
Total number of rows is: 16022493

greets


Tom Kyte
December 07, 2003 - 2:37 pm UTC

Freudian slip?  "data is 'screwed'" -- do you mean "skewed"? :)

if you would like to "bin" the existing data into more or less equal ranges, ntile is very handy:

  1  select nt, min(id), max(id), count(*)
  2    from (
  3  select object_id id, ntile(4) over (order by object_id) nt
  4    from all_objects
  5         )
  6*  group by nt
ops$tkyte@ORA9IR2> /
 
        NT    MIN(ID)    MAX(ID)   COUNT(*)
---------- ---------- ---------- ----------
         1          3       7515       7340
         2       7516      14854       7339
         3      14855      22195       7339
         4      22196      30906       7339
 
ops$tkyte@ORA9IR2>


is that what you are trying to do?
 

sorry results should be like this

pedro, December 07, 2003 - 1:08 pm UTC

From 0000000000 to 1000000000: 696
From 1000000000 to 2000000000: 1418631
From 2000000000 to 3000000000: 1429009
From 3000000000 to 4000000000: 1406610
From 4000000000 to 5000000000: 1396820
From 5000000000 to 6000000000: 1429143
From 6000000000 to 7000000000: 1433470
From 7000000000 to 8000000000: 616147
From 8000000000 to 9000000000: 395680
From 9000000000 to 10000000000: 0
From 10000000000 to 11000000000: 1418631
From 11000000000 to 12000000000: 0
From 12000000000 to 13000000000: 0
From 13000000000 to 14000000000: 0
From 14000000000 to 15000000000: 0
No values (blanks): 5077656
--------------------------------------------------
Total number of rows is: 16022493

hmm, don't get your SQL

A reader, December 07, 2003 - 2:48 pm UTC

Hi

never used ntile before...

what I am trying is

select count(*)
into l_number
from sysadm.PS_RF_INST_PROD
where
parent_inst_prodid >= l_start
and parent_inst_prodid < l_stop;


make l_start and l_stop dynamic, but using SQL; not sure if that's possible

this is an academic question by the way :-)

Tom Kyte
December 07, 2003 - 7:12 pm UTC

ok, my question is "why", what is the goal

looks like you are trying to figure out "how should I range partition, by what values, to get even segment sizes"

if so, NTILE does that.
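
For the academic part -- the FOR loop itself can be folded into one query by generating the 15 ranges inline and joining. A sketch (it keeps the same >= / < string comparisons the loop uses, so the counts should match; ranges with no rows simply do not appear, and the blank count would still be a separate query or a UNION ALL):

select 'From ' || r.l_start || ' to ' || r.l_stop as range_label,
       count(*) as cnt
  from (select to_char((rownum - 1) * 1000000000) as l_start,
               to_char( rownum      * 1000000000) as l_stop
          from all_objects
         where rownum <= 15) r,
       sysadm.PS_RF_INST_PROD p
 where p.parent_inst_prodid >= r.l_start
   and p.parent_inst_prodid <  r.l_stop
 group by r.l_start, r.l_stop
 order by to_number(r.l_start);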

how about range range

A reader, December 08, 2003 - 2:33 pm UTC

quote:
(Hash partitioned tables) PLUS (global range partitioned INDEXES) EQUALS (success in OLTP)

How about

Range partitioned tables PLUS Global range partitioned INDEXES

Range partitioned tables PLUS Global INDEXES

Range partitioned tables PLUS locally range partitioned INDEX on the partition key AND Global range partitioned INDEXES

Range partitioned tables PLUS locally range partitioned INDEXES

Anyone with some experience?

Tom Kyte
December 09, 2003 - 5:48 am UTC

ok, let me change it, simplified

global partitioned indexes equals success.

but even that is too simplified -- since locally partitioned indexes work nicely if the data is partitioned by the index keys.

but -- really the equation should be:

knowledge of the physics behind partitioning = success.


period. it is a physical structure; once you look at how it is stored, it becomes rather "intuitive" at times how Oracle will need to access it. Once you have this knowledge down, you can use these tools to achieve

o easier admin
o more available
o more concurrent
o perhaps even better performance on some queries




Global partitioned index

A reader, December 09, 2003 - 6:09 am UTC

I think global partitioned indexes are only feasible if the indexes are NOT composite.

For example, a composite 3-column index is very, very difficult to partition!


Tom Kyte
December 09, 2003 - 6:42 am UTC

why do you think that? (when something starts with "I think" and is not followed by "for these reasons" or "because as this example shows" -- well, I do not believe it)

I think they are feasible, because as this example shows:

ops$tkyte@ORA9IR2> create table t nologging
  2  partition by hash(object_id) partitions 8
  3  as
  4  select * from all_objects;
 
Table created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create index t_idx on t(owner,object_type,object_name)
  2  global partition by range (owner)
  3  ( partition values less than ( 'F' ),
  4    partition values less than ( 'M' ),
  5    partition values less than ( 'T' ),
  6    partition values less than ( MAXVALUE )
  7  ) compute statistics;
 
Index created.



You partition on the LEADING EDGE, and partition elimination kicks in:

ops$tkyte@ORA9IR2> create table t nologging
  2  partition by hash(object_id) partitions 8
  3  as
  4  select * from all_objects;
 
Table created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> analyze table t compute statistics for table;
 
Table analyzed.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create index t_idx on t(owner,object_type,object_name)
  2  global partition by range (owner)
  3  ( partition values less than ( 'P' ),
  4    partition values less than ( 'S' ),
  5    partition values less than ( 'U' ),
  6    partition values less than ( MAXVALUE )
  7  ) compute statistics
  8  /
 
Index created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> delete from plan_table;
 
8 rows deleted.
 
ops$tkyte@ORA9IR2> explain plan for select * from t where owner = 'SCOTT' and object_type = 'TABLE';
 
Explained.
 
ops$tkyte@ORA9IR2> select * from table(dbms_xplan.display);
 
PLAN_TABLE_OUTPUT
--------------------------------------

 
------------------------------------------------------------------------------
| Id  | Operation                      |  Name  |Rows|Bytes|Cost|Pstrt| Pstop |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |        |   3| 288 |  3 |     |       |
|   1 |  TABLE ACCESS BY GLBL INDX ROWD| T      |   3| 288 |  3 |ROWID| ROW L |
|*  2 |   INDEX RANGE SCAN             | T_IDX  |   1|     |  2 |   3 |     3 |
------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("T"."OWNER"='SCOTT' AND "T"."OBJECT_TYPE"='TABLE')
 
Note: cpu costing is off
 
15 rows selected.



 

example

alan, December 09, 2003 - 7:26 am UTC

PK is formed by

a =====> 2 distinct values (AXA and SHARE)
b =====> 55000 distinct values

If we partition this PK then just partitioning a will do no good!

If we partition the PK we would have to do this

CREATE INDEX y_idx1 ON y(a, b)
GLOBAL PARTITION BY RANGE(a, b)
( partition arbor_a0 values less than ('AXA', 4000),
partition arbor_a1 values less than ('AXA', 8000),
partition arbor_a2 values less than ('AXA', 12000),
partition arbor_a3 values less than ('AXA', 16000),
partition arbor_a4 values less than ('AXA', 20000),
partition arbor_a5 values less than ('AXA', 24000),
partition arbor_a6 values less than ('AXA', 28000),
partition arbor_a7 values less than ('AXA', 32000),
partition arbor_s0 values less than ('SHARE', 4000),
partition arbor_s1 values less than ('SHARE', 8000),
partition arbor_s2 values less than ('SHARE', 12000),
partition arbor_s3 values less than ('SHARE', 16000),
partition arbor_s4 values less than ('SHARE', 20000),
partition arbor_s5 values less than ('SHARE', 24000),
partition arbor_s6 values less than ('SHARE', 28000),
partition arbor_s7 values less than ('SHARE', 32000),
partition arbor_max values less than (maxvalue, maxvalue)
);




Tom Kyte
December 09, 2003 - 8:21 am UTC

or simply flip-flop the columns in the index and partition by b alone, of course.
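
A sketch of that alternative, reusing the bounds from the example above (partition names made up); for a global partitioned index the partition key must be a leading column of the index, which b now is:

CREATE INDEX y_idx1 ON y(b, a)
GLOBAL PARTITION BY RANGE(b)
( partition arbor_b0  values less than (8000),
  partition arbor_b1  values less than (16000),
  partition arbor_b2  values less than (24000),
  partition arbor_b3  values less than (32000),
  partition arbor_max values less than (maxvalue)
);

An equality lookup on (a, b) still uses the index, and elimination now works on b, the column that actually has selective values.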

A reader, December 09, 2003 - 11:27 am UTC

Very nice discussion!

Regarding
o easier admin ---> CLEAR
o more available ---> CLEAR
o more concurrent
o perhaps even better performance on some queries

More concurrent on what?
What queries can improve their performance?


hash partition local indexes

A reader, December 26, 2003 - 6:56 pm UTC

Hi

Is it a good idea to locally partition primary keys? I think it should be fast, because if I join with other tables it's like doing

where PK in (............)

which is some sort of equality operation (IN-List), am I right?

Tom Kyte
December 26, 2003 - 7:15 pm UTC

it is a function of "what is the PARTITION KEY"

if you partition by X
but you have a primary key Y
the locally partitioned approach, well, it won't even fly.....

Your partition key is pretty safe being indexed locally in both DW and OLTP.

If your partition key is your primary key, then the primary key "inherits" this niceness.
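
A minimal sketch of why it "won't even fly" for a unique index (hypothetical table and index names; Oracle requires the partitioning key to be part of any unique local index's key):

create table t_demo ( x number, y number )
partition by hash (x) partitions 8;

-- fails with ORA-14039: partitioning columns must form a subset of key
-- columns of a UNIQUE index
create unique index t_demo_pk on t_demo (y) local;

-- this is allowed, because the partition key X is part of the index key
create unique index t_demo_pk on t_demo (y, x) local;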

how about local non-prefixed composite index

A reader, December 27, 2003 - 1:12 pm UTC

If we partition by, let's say, DEPT_ID, but the PK is a composite key of two columns, set_id and dept_id, where set_id has just one distinct value, and the unique index is created as

create unique index x_idx1 on x(set_id, dept_id)
local;

Will this affect performance since this is non-prefixed? (although the leading column has only one distinct value)

Tom Kyte
December 27, 2003 - 1:30 pm UTC

the optimizer won't use the index in as many cases perhaps -- there are times when a local prefixed index is called for -- but the raw performance of the index, when used, will be the same.

local index no good

A reader, January 03, 2004 - 2:51 pm UTC

this is a test case

select a.bo_id ps_bo_id, b.bo_id b_bo_id, b.cm_id, a.cust_status
from sysadm.ps_bc a,
(select X.BO_ID, X.CM_ID
from (select bo_id, cm_id, PROFILE_CM_SEQ
from sysadm.PS_BO_CM_PROFL_DTL
where CM_TYPE_ID in (1, 2)
and PRIMARY_IND ='Y'
and EFFDT < sysdate) X) b
where a.bo_id = b.bo_id
and b.cm_id between 2102032000000007960641 and 2102032000000008500000;


LOCAL
==================================================================================================
CREATE UNIQUE INDEX SYSADM.PS_BC ON SYSADM.PS_BC
(BO_ID)
TABLESPACE PSINDEX
NOLOGGING
LOCAL STORE IN (PSINDEX)
PARALLEL 2;
ALTER INDEX SYSADM.PS_BC LOGGING;
ALTER INDEX SYSADM.PS_BC NOPARALLEL;

146696 rows selected.

Elapsed: 00:00:14.79

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=7 Card=1 Bytes=54)
1 0 NESTED LOOPS (Cost=7 Card=1 Bytes=54)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'PS_BO_CM_PROFL_DTL' (Cost=5 Card=1 Bytes=39)
3 2 INDEX (RANGE SCAN) OF 'PSDBO_CM_PROFL_DTL' (NON-UNIQUE) (Cost=3 Card=2)
4 1 PARTITION HASH (ITERATOR)
5 4 TABLE ACCESS (BY LOCAL INDEX ROWID) OF 'PS_BC' (Cost=2 Card=1 Bytes=15)
6 5 INDEX (UNIQUE SCAN) OF 'PS_BC' (UNIQUE) (Cost=1 Card=1)

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
633661 consistent gets
36035 physical reads
0 redo size
8290653 bytes sent via SQL*Net to client
108224 bytes received via SQL*Net from client
9781 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
146696 rows processed

GLOBAL
==================================================================================================
CREATE UNIQUE INDEX SYSADM.PS_BC ON SYSADM.PS_BC
(BO_ID)
TABLESPACE PSINDEX
NOLOGGING
PARALLEL 2;

ALTER INDEX SYSADM.PS_BC LOGGING;
ALTER INDEX SYSADM.PS_BC NOPARALLEL;

146696 rows selected.

Elapsed: 00:00:13.02

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=7 Card=1 Bytes=54)
1 0 NESTED LOOPS (Cost=7 Card=1 Bytes=54)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'PS_BO_CM_PROFL_DTL' (Cost=5 Card=1 Bytes=39)
3 2 INDEX (RANGE SCAN) OF 'PSDBO_CM_PROFL_DTL' (NON-UNIQUE) (Cost=3 Card=2)
4 1 TABLE ACCESS (BY GLOBAL INDEX ROWID) OF 'PS_BC' (Cost=2 Card=1 Bytes=15)
5 4 INDEX (UNIQUE SCAN) OF 'PS_BC' (UNIQUE) (Cost=1 Card=1)

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
496745 consistent gets
36346 physical reads
0 redo size
8290653 bytes sent via SQL*Net to client
108224 bytes received via SQL*Net from client
9781 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
146696 rows processed


basically with local indexes you need roughly one more consistent get per row than with a normal global index... now don't tell me -- why would someone use a local index at all in OLTP?

Tom Kyte
January 03, 2004 - 3:41 pm UTC

think "primary key" and then tell us what you think about using local indexes for example...

there are valid reasons, a test case showing that in a single example you produce more LIO's does not mean something is "evil" -- far from it. It just shows "don't use it in this case"


Take a primary key, say populated via a sequence, that is used to define the HASH key for the table. Hmmm, that local index is looking pretty good -- instead of a single index structure (eg: a point of contention) I have N of them -- I've spread the contention way out, I've perhaps reduced the height of my b*trees (you might find a keyed read doing LESS LIO), hmmm. Local indexes must be good for OLTP....
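
A minimal sketch of that scenario (table, index and sequence names are made up):

-- the sequence-populated primary key is also the hash partition key, so
-- the unique LOCAL index spreads right-hand-side insert contention across
-- 16 index partitions instead of one hot index
create table orders
( order_id   number,
  order_data varchar2(100)
)
partition by hash (order_id) partitions 16;

create unique index orders_pk on orders(order_id) local;

alter table orders add constraint orders_pk primary key (order_id);

create sequence orders_seq;
-- inserts populate ORDER_ID from orders_seq.nextval

And a query "where order_id = :x" still hits exactly one index partition, because the predicate is on the partition key.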

we are right back to my original statement:

It is funny -- this is almost verbatim the example I used in my new book
"Effective Oracle By Design". I was trying to drive home that........


But your logic is doubly flawed - you cannot prove something is "evil" in that fashion -- you just had a bad design decision there. Used in other ways, that local index is going to be simply AWESOME and a way to increase scalability, benefit performance, all of the good things.

to the test reader

A reader, January 04, 2004 - 10:07 am UTC

Hi

I think your statistics are screwed up; the cardinality doesn't seem to match at all -- the query returns over 140,000 rows and the plan says a cardinality of 1 LOL :DDDD

full table scanning hash partitioned tables

Ryan Gaffuri, August 27, 2004 - 10:31 am UTC

I did some tests and noticed significant performance reductions on full table scans of hash partitioned tables vs. the same data in a normal table. Is this verifiable? Could it just be my hardware?

I understand why hash partitioning can adversely affect performance for the reasons stated above, but why would it affect a full table scan?



Tom Kyte
August 27, 2004 - 10:39 am UTC

it would not affect the performance of a full scan.

use a 10046 level 12 trace and tkprof and show us the measurable differences (with waits)
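
A sketch of one way to gather that trace (the trace file name and report name are whatever you choose):

alter session set timed_statistics = true;
alter session set events '10046 trace name context forever, level 12';

-- run the full scan of the hash partitioned table, then of the normal table

alter session set events '10046 trace name context off';

-- then, at the OS prompt, format the trace file from user_dump_dest, e.g.:
-- tkprof <tracefile>.trc report.txt sys=no

Level 12 records both bind values and wait events, so the report will show where the elapsed time actually went.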

just looking

Nayan, December 02, 2004 - 8:55 am UTC

I was looking for info on partitioning. I can't say I understood a lot, but I got a good insight.
Thanks Tom

partitioning.....

whizkid, May 20, 2005 - 10:26 am UTC

tom,

we have an accounting table AC_INTERNAL_ACCOUNTS containing about 14 million rows in our OLTP system. This table is very frequently used in our accounting queries. 

SQL> desc ac_internal_accounts
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 COMPANY_ID                                NOT NULL NUMBER(10)
 SWIFT_CODE                                NOT NULL VARCHAR2(3)
 ACCOUNT_CATEGORY_CODE                     NOT NULL VARCHAR2(10)
 SUBFACTOR_1_VAL                                    VARCHAR2(10)
 SUBFACTOR_2_VAL                                    VARCHAR2(10)
 SUBFACTOR_3_VAL                                    VARCHAR2(10)
 SUBFACTOR_4_VAL                                    VARCHAR2(10)
 INTERNAL_ACCOUNT_ID                       NOT NULL NUMBER(10)
 EXPL_GL_MPNG_ONLY_IND                              CHAR(1)
 AUTO_ASSIGN_ACCT_IND                      NOT NULL CHAR(1)
 GL_MPNG_ACCT_IND                          NOT NULL CHAR(1)
 CREATION_DATE                             NOT NULL DATE
 SUBFACTOR_1_EXTN_VAL                               VARCHAR2(500)
 SUBFACTOR_2_EXTN_VAL                               VARCHAR2(500)
 SUBFACTOR_3_EXTN_VAL                               VARCHAR2(500)
 SUBFACTOR_4_EXTN_VAL                               VARCHAR2(500)

the queries are typically like

SELECT c.subfactor_1_val, b.receipt_no receipt_no, SUM(a.posting_amt)*-1 balance
FROM   ACC_POSTINGS a , BJAZ_ACX_TCODE b , AC_INTERNAL_ACCOUNTS c
WHERE  c.account_category_code = '3110111105'
AND    a.internal_account_id = c.internal_account_id
AND    a.event_no = b.event_no
AND    a.posting_no = b.posting_no
AND    a.batch_id = b.batch_id
AND    a.effective_date <= to_date('30-APR-2005')
GROUP BY c.subfactor_1_val, b.receipt_no

is it a good idea to partition the table on the basis of account_category_code? the queries will be similar to the above, with different codes.. but the data for account_category_code is highly skewed..

ACCOUNT_CATEGORY_CODE    COUNT(*)
1063232001    1
1063232002    1
1063232003    1
1072100001    1
1072100003    1
1072100006    1
1072100007    1
1072100008    1
1072100009    1
1072100010    1
1072100011    1
1072100012    1
1072100013    1
1072100014    1
1072100015    1
1072100016    1
1072100017    1
1072100018    1
1072100019    1
1072100020    1
1072100021    1
1072100022    1
1072100023    1
1072100024    1
1072100025    1
1072100026    1
1072100027    1
1072100028    1
1072100029    1
1072100030    1
1072100031    1
1072100032    1
1072100033    1
1072100034    1
1072100035    1
1072100036    1
1072100037    1
1072100038    1
1072100039    1
1072100040    1
1072100041    1
1072100042    1
1072100043    1
1072100044    1
1072100045    1
1072100046    1
1072100047    1
1072100048    1
1072100049    1
1072100050    1
1072100051    1
1072100052    1
1072100053    1
1072100054    1
1072100056    1
1072100057    1
1072100058    1
1072100061    1
1072100063    1
1072100064    1
1072100065    1
1072100070    1
1072100073    1
1072100074    1
1072100077    1
1072100078    1
1072100081    1
1072100084    1
1072100088    1
1072100089    1
1072100100    1
1072100103    1
1072200000    1
1072300000    1
1074100001    1
2111100001    1
2115220007    1
2115220013    1
2115220019    1
3110411110    1
3110711114    1
3110711120    1
3110712228    1
3110712300    1
3110712301    1
3110712303    1
3110712304    1
3110712305    1
3110712306    1
3110712359    1
3110712362    1
3110712917    1
3130503202    1
3130509205    1
3130509206    1
MISCSUPPUW    1
3110312001    2
3110314004    2
MISCSUPCON    2
2111300012    3
3110712351    4
1061300012    5
2053120002    6
3110111102    8
1061300010    13
1061300008    17
2111300008    22
1061300009    29
1061300011    42
1061300014    42
2111300007    50
2111300014    55
2111200500    71
2111300010    118
1061300007    119
2111300011    129
2115220004    136
1061100000    172
1061200000    227
2111300004    242
2111300001    431
2111300009    462
3110711116    516
1061300005    563
1074100000    583
3110312000    584
2111300003    627
1061300003    647
1061300001    730
1061300006    745
1061300004    1121
2111300005    1125
2111200002    1304
2111300006    1500
3130503201    2092
1061300002    2174
2115220002    3605
2111300002    3632
2111200001    3635
2111200000    3915
2112000002    4727
2115220008    5696
3110711130    5841
3110314002    7321
3130212000    10541
1061100001    12795
1063232005    21595
2111100000    41810
3110411125    44998
2115220010    100156
3110111104    116315
3110314000    116481
3110411123    121369
3110111101    204646
3110551001    220345
2053120000    236021
3110411111    236152
3110412120    236344
2053113000    430288
3110412113    430358
3110711111    494019
3110711112    833558
3110111105    1952345
3110111100    2746747
2115220001    2760610

what do you suggest? what if we hash partition this table on account_category_code and subpartition on ac_internal_account_id (this id is used when joining with other tables)? I'm going to create it and test it out; I just had a doubt because the data is so highly skewed.

Tom Kyte
May 20, 2005 - 6:25 pm UTC

what are you hoping to gain by partitioning (that is what we must start from -- that is the only way to decide how or what to partition on)

partitioning..... reasons....

whizkid, May 21, 2005 - 2:43 am UTC

we have identified performance problems 

SQL> l
  1  select object_name, statistic_name, value
  2  from v$segment_statistics
  3* where object_name = 'AC_INTERNAL_ACCOUNTS'
SQL> /

OBJECT_NAME                    STATISTIC_NAME                                                        VALUE
------------------------------ ---------------------------------------------------------------- ----------
AC_INTERNAL_ACCOUNTS           logical reads                                                       4467728
AC_INTERNAL_ACCOUNTS           buffer busy waits                                                     14716
AC_INTERNAL_ACCOUNTS           db block changes                                                       4048
AC_INTERNAL_ACCOUNTS           physical reads                                                      2180137
AC_INTERNAL_ACCOUNTS           physical writes                                                         272
AC_INTERNAL_ACCOUNTS           physical reads direct                                                     0
AC_INTERNAL_ACCOUNTS           physical writes direct                                                    0
AC_INTERNAL_ACCOUNTS           global cache cr blocks served                                           236
AC_INTERNAL_ACCOUNTS           global cache current blocks served                                   143650
AC_INTERNAL_ACCOUNTS           ITL waits                                                                 0
AC_INTERNAL_ACCOUNTS           row lock waits                                                            0

11 rows selected.

Elapsed: 00:00:00.37

SQL> select * from v$system_event where event like '%read%' or event like '%get%' or event like '%buffer%';

EVENT                TOTAL_WAITS TOTAL_TIMEOUTS TIME_WAITED AVERAGE_WAIT TIME_WAITED_MICRO
-------------------- ----------- -------------- ----------- ------------ -----------------
ges LMON to get to F          20             11          28            1            277538
control file sequent      373172              0       11260            0         112598582
buffer deadlock              221            216           0            0               221
buffer busy waits         215275           3525      428686            2        4286857723
buffer busy global c         611              0        1026            2          10264518
buffer busy global C      169437              0       51034            0         510339247
log file sequential         1781              0        2085            1          20850389
log buffer space              34              0          21            1            208399
db file sequential r     6643159              0     2089555            0        2.0896E+10
db file scattered re     3774175              0     1232425            0        1.2324E+10
db file parallel rea      229078              0       48059            0         480594451
direct path read           34944              0        9197            0          91965280
direct path read (lo           9              0           3            0             30456
PX Deq Credit: need        99644              4        1847            0          18469518

14 rows selected.

Elapsed: 00:00:00.11

SQL> select to_char(startup_time, 'dd-mon-yyyy hh24:mi:ss') from V$instance;

TO_CHAR(STARTUP_TIME
--------------------
20-may-2005 22:20:42

Elapsed: 00:00:00.00

the table is experiencing a high amount of logical reads & buffer busy waits... so we thought partitioning would help.

Tom Kyte
May 21, 2005 - 8:51 am UTC

why do you believe that partitioning would help alleviate physical io? (you either have to read it or not. it does not matter if the block is in partition 1, partition 2 or a single segment table).

the buffer busy waits could be because 5 sessions want to read the same block -- one session will register a db file wait, the other 4 a buffer busy wait.

or the buffer busy waits could be because 5 sessions want to get the block in current mode simultaneously -- 4 will get buffer busy waits.

or the buffer busy waits could be because dbwr isn't cleaning out the buffer cache fast enough, and before we can do the physical IO we have to wait....


but I don't see how partitioning will reduce your logical io, or physical io against this segment.


and you are looking at statistics over many many many hours.

(but then again, you just started up too -- one expects the most physical IO in the hours immediately following startup)

but......

whizkid, May 22, 2005 - 5:39 am UTC

wouldn't partitioning help in reducing the IO? if we partition by account category code with a local index on it, given the kind of queries mostly used, wouldn't that mean fewer blocks to be read? would it not be better than an index full scan? partition elimination would kick in and reduce the number of blocks to be read... this is what I thought would happen... if I'm wrong, then I have to re-read the docs...

I had read somewhere on this site that buffer busy waits can be eliminated by hash partitioning (will search & post the link), though I couldn't understand why... in our case it's the first case: 5 sessions want to read the same block. I see the block in the p2 column of v$session_wait being read by someone else. What I thought could help was to increase INITRANS on the table, which would help reduce this contention... this is also more of a problem because it is RAC. Will this help in minimizing the buffer busy waits?

I had not taken a level 7 statspack during that time, so I couldn't give you the segment-level wait events for a particular period and gave them for the entire duration the instance was up.

Tom Kyte
May 22, 2005 - 8:22 am UTC

are you actually doing index full or index fast full scans?

You said:
.... rows in our OLTP system......

if your OLTP system is full scanning frequently, you have a problem there.

Also, the query has
WHERE c.account_category_code = '3110111105'


there are MANY reasons for buffer busy waits -- I listed but a few above, and none of those would be affected.

and I would say those cumulative numbers above, from the v$ view, are teeny tiny numbers. they are small, I would need more detail before saying "you have identified a performance bottleneck"




The comment you might have read was "if you have a right hand side index (say on a sequence) AND you hash partition on that key, you will have say 16 indexes now, the inserts will be spread over 16 indexes and concurrency on the right hand side will be 1/16th of what it was -- potentially reducing buffer busy waits".

but you can achieve the same effect with a reverse key index.
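
For example (hypothetical table and index names):

-- scatters sequence-based inserts across the whole index instead of piling
-- them onto the right-hand-side leaf block; the trade-off is that range
-- scans on the key are no longer possible
create index t_id_idx on t(id) reverse;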

details..

whizkid, May 22, 2005 - 10:24 am UTC

Ok Tom.. let me get a bit organised and plan out everything properly before starting anything...

what details would you require to identify the performance problems? I look at statspack and take the top 5 segments by physical IO / buffer busy waits to find the problem spot.. is this the correct approach? before giving you the details, maybe I'll do the research myself so as to not take your time..

another thing.. maybe this is a new question.. if so, do not answer and I'll wait till the time you take new questions..
Once we have identified the problem spot, how do we dig deeper? this is a very general question.. for eg. if we have identified one segment suffering from buffer busy waits.. like you said there are lots of reasons why this can happen.. how do we narrow down and find the exact reason for it? currently I use statspack for query optimization.. but when it comes to the real stuff, I really don't know how to drill down to find the root cause.. have read some notes in Metalink on reading the statspack report but no help... your suggestions???

Tom Kyte
May 22, 2005 - 11:02 am UTC

my initial approach:

a) unless there is something plumb obvious in the statspack, go directly to the application that isn't running as fast as we'd like
b) trace it
c) tkprof it
d) see what it is actually waiting on.

good idea..

whizkid, May 23, 2005 - 12:43 am UTC

I use a small script to monitor the wait events

SQL> l
  1  select
  2  s.sid, s.serial#, p1, p2, p.spid, s.program, s.terminal, s.username,
  3  s.osuser osuser, w.event event
  4  from v$session_wait w,
  5       v$session s,
  6       v$process p
  7  where s.sid = w.sid
  8  and p.addr=s.paddr
  9  and (event not like '%SQL*Net%'
 10        and event not like '%timer%'
 11        and event not like '%ipc%'
 12        and event not like '%dblink%'
 13        and event not like '%jobq slave%'
 14        and event not like '%message%'
 15        and event not like '%SENDREQ%'
 16        and event not like '%wakeup time manager%'
 17        and event not like '%async disk IO%'
 18      )
 19* order by event, program
SQL>

here I can see the events for each session.. if I see some long running query, I trace it, but then the trace will miss the query that is currently running.. ours is a combination of third-party & home-built applications.. nobody in the application team knows where the query is getting fired from.. what would you do in this case?
 

Tom Kyte
May 23, 2005 - 8:28 am UTC

trace the application from the get go (trace the entire application)? I mean, what do you want to do?

you have all of the session information -- can you not, from that, figure out "who" is running the application and ask them, perhaps?

how do hash partitioning and the number of partitions affect DML

A reader, March 02, 2006 - 6:20 am UTC

Hi

Yesterday I hash partitioned a couple of tables, one to 16 partitions and the other 4.

I have a purge process which runs every night; after partitioning, this process fails with Snapshot Too Old. This did not happen with plain heap tables.

What is causing this error, hash partitioning or the number of partitions?

The delete statement is as follows:

delete from
( select ID
    from t1,  -- 4 partitions
         t2   -- 16 partitions
   where t1.id = t2.regid
     and t1.status in (2, 3, 4, 5)
);

This ran in 20 minutes on the plain tables; with hash partitions it took almost 4 hours and ended up with Snapshot Too Old.



Tom Kyte
March 02, 2006 - 12:25 pm UTC

gee, I don't know.

given we don't know what the table is.
what indexes you have.
how it is partitioned (beyond "it is now hash")

basically - nothing.

partitioning is not fast=true, in fact, in most cases you must take care not to NEGATIVELY affect performance. Do you understand what you've done physically to the table/indexes and what Oracle must now do to access this information?

If you have access to my current book Expert Oracle database architecture - I go over the "physics of partitioning" (or effective oracle by design)

Where does additional get come from?

A reader, May 19, 2006 - 10:00 am UTC

>> basically with local indexes you need roughly one more consistent get per row than with a normal global index... now don't tell me -- why would someone use a local index at all in OLTP?

I have recently questioned the index strategy on a hash partitioned table in our system (local PK index). In some tests I observe the same behaviour as that which led to the above quote. My question is where does the additional consistent get come from?

Thanks.

Tom Kyte
May 19, 2006 - 12:16 pm UTC

throw me a bone guys.  EXAMPLES please.

IF partition elimination kicked in
AND we scan one local index partition
THEN
    the global index should have the same (or MORE since it could be bigger)
ELSE
IF partition elimination did not kick in and hence we hit many local indexes
THEN
    one should expect more lio's from the local index as you are doing N 
    range scans, not one
END IF


ops$tkyte@ORA10GR2> CREATE TABLE t
  2  (
  3    dt  date,
  4    x   int,
  5    y   char(25)
  6  )
  7  PARTITION BY RANGE (dt)
  8  (
  9    PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
 10    PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
 11    PARTITION junk VALUES LESS THAN (MAXVALUE)
 12  )
 13  /

Table created.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> insert into t
  2  select to_date( '12-mar-2003','dd-mon-yyyy' )+mod(rownum,3), rownum, rownum
  3    from all_objects;

50083 rows created.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> exec dbms_stats.gather_table_stats(user,'T');

PL/SQL procedure successfully completed.

ops$tkyte@ORA10GR2> create index t_idx on t(x,y) LOCAL;

Index created.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> set autotrace traceonly statistics;
ops$tkyte@ORA10GR2> select * from t where x = 1 and dt = to_date('12-mar-2003','dd-mon-yyyy');

no rows selected


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          2  consistent gets
          1  physical reads
          0  redo size
        373  bytes sent via SQL*Net to client
        374  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

ops$tkyte@ORA10GR2> select * from t where x = 1 and dt = to_date('12-mar-2003','dd-mon-yyyy');

no rows selected


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          2  consistent gets
          0  physical reads
          0  redo size
        373  bytes sent via SQL*Net to client
        374  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

ops$tkyte@ORA10GR2> set autotrace off
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> drop index t_idx;

Index dropped.

ops$tkyte@ORA10GR2> create index t_idx on t(x,y) GLOBAL;

Index created.

ops$tkyte@ORA10GR2> set autotrace traceonly statistics;
ops$tkyte@ORA10GR2> select * from t where x = 1 and dt = to_date('12-mar-2003','dd-mon-yyyy');

no rows selected


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          2  consistent gets
          1  physical reads
          0  redo size
        373  bytes sent via SQL*Net to client
        374  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

ops$tkyte@ORA10GR2> select * from t where x = 1 and dt = to_date('12-mar-2003','dd-mon-yyyy');

no rows selected


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          2  consistent gets
          0  physical reads
          0  redo size
        373  bytes sent via SQL*Net to client
        374  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

ops$tkyte@ORA10GR2> set autotrace off
 

No bone needed...

A reader, May 20, 2006 - 6:05 pm UTC

The example you needed was already on the thread and it showed roughly one extra consistent get per fetched row with a local index over a global index. I was simply following this example up to say I'd experienced the same and was wondering why it happened. Hence the review starting with a quote from further up the thread. So follow up the quote and you'll see the example.

Tom Kyte
May 21, 2006 - 10:30 am UTC

you do understand that I read these fast, I read your comment, not the entire page. I don't have the context you have. I cannot read the entire page every time for every comment.

so, if you give me your example, I'll be glad to take a peek. I've given you mine. It is not true that there is always one extra IO. Give me something to work with and I'll be glad to respond.

but when I have to hit page up over 18 times (i've got a big big screen, so you would likely have to hit it more than 18) to try and figure out what the context is, I move on...

Give me a test case to work with, I haven't gotten one yet. Neither from you or from the original poster.


And I'll likely come right back to what I said:

"But your logic is doubly flawed - you cannot prove something is "evil" in that
fashion -- you just had a bad design decision there. Used in other ways, that
local index is going to be simply AWESOME and a way to increase scalability,
benefit performance, all of the good things. "



Test case

A reader, May 21, 2006 - 7:26 pm UTC

Here's a test case. Nested loop into a hash partitioned table. Both tables have 100K rows. Autotrace once with local unique index and once with global unique index. Local has nearly 100K more consistent gets. Question is why? What is the partition iterator step in the plan doing to generate the additional get?

SQL> create table t1 as select rownum as x from dual connect by rownum < 100000;

Table created.

SQL> create table t2 partition by hash(x) partitions 16 as select x, x as y from t1;

Table created.

SQL> create unique index it2 on t2(x) local;

Index created.

SQL> exec dbms_stats.gather_table_stats(user,'T1');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(user,'T2',cascade=>true);

PL/SQL procedure successfully completed.

SQL> set autotrace traceonly
SQL> select /*+ ordered use_nl(t2) */
  2         t1.x, t2.y
  3  from   t1, t2
  4  where  t1.x = t2.x;

100000 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1872442621

------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |      |   100K|  1269K|   100K  (1)| 00:20:03 |       |       |
|   1 |  NESTED LOOPS                       |      |   100K|  1269K|   100K  (1)| 00:20:03 |       |       |
|   2 |   TABLE ACCESS FULL                 | T1   |   100K|   390K|    39   (3)| 00:00:01 |       |       |
|   3 |   PARTITION HASH ITERATOR           |      |     1 |     9 |     1   (0)| 00:00:01 |   KEY |   KEY |
|   4 |    TABLE ACCESS BY LOCAL INDEX ROWID| T2   |     1 |     9 |     1   (0)| 00:00:01 |   KEY |   KEY |
|*  5 |     INDEX UNIQUE SCAN               | IT2  |     1 |       |     0   (0)| 00:00:01 |   KEY |   KEY |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("T1"."X"="T2"."X")


Statistics
----------------------------------------------------------
        385  recursive calls
          0  db block gets
     306883  consistent gets
        214  physical reads
          0  redo size
    1918220  bytes sent via SQL*Net to client
      73695  bytes received via SQL*Net from client
       6668  SQL*Net roundtrips to/from client
         12  sorts (memory)
          0  sorts (disk)
     100000  rows processed

SQL>
SQL> set autotrace off
SQL>
SQL> drop index it2;

Index dropped.

SQL> create unique index it2 on t2(x) compute statistics;

Index created.

SQL>
SQL> set autotrace traceonly
SQL> select /*+ ordered use_nl(t2) */
  2         t1.x, t2.y
  3  from   t1, t2
  4  where  t1.x = t2.x;

100000 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 707801250

------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |      |   100K|  1269K|   100K  (1)| 00:20:03 |       |       |
|   1 |  NESTED LOOPS                       |      |   100K|  1269K|   100K  (1)| 00:20:03 |       |       |
|   2 |   TABLE ACCESS FULL                 | T1   |   100K|   390K|    39   (3)| 00:00:01 |       |       |
|   3 |   TABLE ACCESS BY GLOBAL INDEX ROWID| T2   |     1 |     9 |     1   (0)| 00:00:01 | ROWID | ROWID |
|*  4 |    INDEX UNIQUE SCAN                | IT2  |     1 |       |     0   (0)| 00:00:01 |       |       |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("T1"."X"="T2"."X")


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
     213492  consistent gets
        264  physical reads
          0  redo size
    1918220  bytes sent via SQL*Net to client
      73695  bytes received via SQL*Net from client
       6668  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     100000  rows processed 

Tom Kyte
May 21, 2006 - 8:32 pm UTC

well, if we let the CBO do its job in this case - neither index would be used...

but TKPROF shows us what is different here:

select /*+ ordered use_nl(t2) local */ t1.x, t2.y from t1, t2 where t1.x = t2.x

Rows Row Source Operation
------- ---------------------------------------------------
99999 NESTED LOOPS (cr=306821 pr=0 pw=0 time=3700415 us)
99999 TABLE ACCESS FULL T1 (cr=6824 pr=0 pw=0 time=300112 us)
99999 PARTITION HASH ITERATOR PARTITION: KEY KEY (cr=299997 pr=0 pw=0 ti..
99999 TABLE ACCESS BY LOCAL INDEX ROWID T2 PARTITION: KEY KEY (cr=299997 ..
99999 INDEX UNIQUE SCAN IT2 PARTITION: KEY KEY (cr=199998 pr=0 pw=0 t...
********************************************************************************
select /* local */ * from t2 where t2.x = 1

Rows Row Source Operation
------- ---------------------------------------------------
1 PARTITION HASH SINGLE PARTITION: 8 8 (cr=3 pr=0 pw=0 time=142 us)
1 TABLE ACCESS BY LOCAL INDEX ROWID T2 PARTITION: 8 8 (cr=3 pr=0 pw=0
1 INDEX UNIQUE SCAN IT2 PARTITION: 8 8 (cr=2 pr=0 pw=0 time=74
********************************************************************************
select /*+ ordered use_nl(t2) global */ t1.x, t2.y from t1, t2 where t1.x = t2.x

Rows Row Source Operation
------- ---------------------------------------------------
99999 NESTED LOOPS (cr=213490 pr=264 pw=0 time=2600410 us)
99999 TABLE ACCESS FULL T1 (cr=6824 pr=0 pw=0 time=200088 us)
99999 TABLE ACCESS BY GLOBAL INDEX ROWID T2 PARTITION: ROW LOCATION ROW LOCATION (cr=206666 pr=264 pw=0 time=1965070 us)
99999 INDEX UNIQUE SCAN IT2 (cr=106667 pr=264 pw=0 time=963424 us)(object id 65900)
********************************************************************************
select /* global */ * from t2 where t2.x = 1

Rows Row Source Operation
------- ---------------------------------------------------
1 TABLE ACCESS BY GLOBAL INDEX ROWID T2 PARTITION: 8 8 (cr=3 pr=0 pw=0 time=58 us)
1 INDEX UNIQUE SCAN IT2 (cr=2 pr=0 pw=0 time=36 us)(object id 65900)



The extra work was done as we hip-hopped from index partition to index partition. We lost the ability to hang onto the index root - we had to restart the range scan as we flipped from row to row during the full scan.

It is not that you always have to do "more", just in this sort of case - especially with the massive hip hopping and the lack of partition elimination - it is true.




Alexander the ok, June 06, 2006 - 3:20 pm UTC

Tom,

Can you grant to partitions or does that have to be done at the table level? The docs lead me to believe you would have to create a view of the partition.

Also, for restricting who can see what data, would you recommend partitioning the tables or using FGAC for a small OLTP? Thank you.

Tom Kyte
June 06, 2006 - 3:27 pm UTC

you would have to create a view of the partition, yes.


partitioning and FGAC are complimentary technologies??

partitioning cannot restrict who can see what. fgac can.

partitioning can be used to create physical separation/clustering of related data. fgac cannot.
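
A minimal sketch of the "view of the partition" approach - SALES, P_2006 and the grantee are made-up names:

-- expose a single partition through a view, then grant on the view
create or replace view sales_2006 as
  select * from sales partition (p_2006);

grant select on sales_2006 to some_user;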

Partitioning is now complimentary? Wow!!!

Vladimir Andreev, June 07, 2006 - 10:01 am UTC

<quote>
partitioning and FGAC are complimentary technologies??
</quote>

My local Oracle Sales rep insists that they are not, so may I quote you when I go to renew my Oracle license? ;-)

complimentary

adj 1: conveying or resembling a compliment; "a complimentary remark" [ant: uncomplimentary] 2: costing nothing; "complimentary tickets"

Flado
(jokingly)

Alexander the ok, June 07, 2006 - 2:23 pm UTC

Tom,

I have researched FGAC a little and I was able to create a test case to suit our needs. However, it looks as though the add_policy procedure basically just throws an additional predicate onto whatever query hits your table. Won't this dramatically affect the performance of your application?

Say we have a big table t. User a queries select * from t where sport = 'basketball' and that returns most of the rows from t; we get a FTS, all is well.

Now we have to implement our new policy on t for user b. User b queries select * from t where sport = 'basketball' but the policy also sticks "and player = 'Paul Pierce'" at the end. Suppose there are few records returned for Paul; now an index would be needed and the query is sluggish. I hope that is the correct example, but you probably get my drift?

Is this where you say, "And this is why you would deploy this in your test environment first....."?

Tom Kyte
June 07, 2006 - 3:37 pm UTC

...
Won't this
dramatically affect the performance of your application?
...

or dramatically speed it up
or do nothing at all to performance

"sure".


look at your example, the changing of:

select * from t where sport = 'basketball'

into
select * from t where sport = 'basketball' and player = 'Paul Pierce'

would almost certainly run FASTER - regardless of the indexing scheme since so much less data is transferred to the client.


And further - you have the opportunity now to index player and make it even better.


Last paragraph - words to live by :)
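
To make the mechanics concrete, here is a minimal sketch of the policy being discussed - the table T, the column PLAYER and the user B come from the example above; the function and policy names are made up:

-- the policy function receives the schema and object name and returns
-- the predicate string to append for the current session
create or replace function player_predicate
( p_schema in varchar2, p_object in varchar2 ) return varchar2
as
begin
  if sys_context('userenv','session_user') = 'B'
  then
     return 'player = ''Paul Pierce''';
  end if;
  return null;   -- no extra predicate for everyone else
end;
/

begin
  dbms_rls.add_policy
  ( object_schema   => user,
    object_name     => 'T',
    policy_name     => 'T_PLAYER_POLICY',
    function_schema => user,
    policy_function => 'PLAYER_PREDICATE' );
end;
/

Every query against T is then rewritten with whatever string the function returns - which is exactly why the effect on performance can go either way, as noted above.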


partition pruning

Ryan, July 10, 2006 - 3:43 pm UTC

If I have a tkprof output that is accessing a global temp table joined to a hash partitioned table and I see the following row source, am I getting partition pruning?

INDEX RANGE SCAN HASH_TEST PARTITION: KEY KEY

The Data Warehousing Guide says that KEY(I) in the explain plan means that Oracle will decide at run time which partitions to eliminate. However, most of the time I see a partition number when a partition has been eliminated?

Tom Kyte
July 11, 2006 - 7:24 pm UTC

it doesn't know at parse time what partition(s) will be used since the data used to eliminate the partitions doesn't exist then! (it is in the temporary table at runtime right...)

so, it is saying "we'll eliminate, but we don't know what we will eliminate yet"

thanks

Ryan, July 12, 2006 - 1:19 pm UTC

So do I have to go look at the blocks in the raw trace file to see if partition elimination occurred? Is there an easier way?

Tom Kyte
July 12, 2006 - 3:58 pm UTC

it *is* occurring, it says so right there in the plan!

depending on the data used by the query - different partitions will be eliminated (eg: sometimes maybe all of them are involved, sometimes NONE OF THEM, sometimes 1, sometimes 2 - it is all about the data encountered at runtime)

looking at blocks in the trace won't be conclusive (LOGICAL IO)
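
One easy way to see the difference for yourself - HASH_TAB and its columns are made-up names:

create table hash_tab ( id number, payload varchar2(40) )
partition by hash (id) partitions 8;

-- value unknown at parse time: Pstart/Pstop show KEY / KEY
explain plan for select * from hash_tab where id = :b1;
select * from table(dbms_xplan.display);

-- literal known at parse time: Pstart/Pstop show a single partition number
explain plan for select * from hash_tab where id = 42;
select * from table(dbms_xplan.display);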

global vs. local indexes on hash partitioned tables.

Ryan, July 17, 2006 - 5:08 pm UTC

Here is my test

1. global temp table with 5,000 rows.
2. hash partitioned table. Partitioned on primary key with 8 partitions with 20 million rows.
3. All the rows in a GTT are rows from partition 1 of the hash partitioned table.
4. The primary key values are perfectly ordered in both the GTT and the hash partition table.

insert into gtt2
select cols...
from gtt a, hash_table b
where a.pk = b.pk

I ran the following test:
1. Local index on primary key
2. global index on primary key.

I found that the global index produces about 2/3 of the logical IO that the same query run against a local index does. We have repeated this several times.
I get the same plan except for the hash iterator.

Nested Loop
Fast Full scan of GTT
Index Range scan of Hash Partition table

If all the rows in the GTT are in one hash partition why would I get more LIOs with a global index? I also found that when I explicitly tell Oracle what partition to use in the from clause I do not get a reduction in LIOs for the local indexes which tells me that Oracle is not having any problem with partition pruning.

Tom Kyte
July 18, 2006 - 7:59 am UTC

I don't see a test case to work with?

in any case, use tkprof, statistics_level set to all and see where the additional IO's are coming from, what part of the steps is doing it.
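
In 10g and above, one way to do that without reading the raw trace file is row source statistics via DBMS_XPLAN - a sketch, run once for the local-index query and once for the global-index query:

alter session set statistics_level = all;

-- run the statement being compared here, then, immediately afterwards
-- and in the same session:
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

The Buffers column shows the logical IO per row source, so the step responsible for the extra gets stands out.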

Part of your advice is being used but...

Reader, November 26, 2007 - 11:24 am UTC

Tom,
We work on an Oracle Retail product and we have an item_loc table:
CREATE TABLE ITEM_LOC
(
  ITEM                  VARCHAR2(25 BYTE)       NOT NULL,
  LOC                   NUMBER(10)              NOT NULL,
  ITEM_PARENT           VARCHAR2(25 BYTE),
  ITEM_GRANDPARENT      VARCHAR2(25 BYTE),
  LOC_TYPE              VARCHAR2(1 BYTE)        NOT NULL,
  UNIT_RETAIL           NUMBER(20,4),
  SELLING_UNIT_RETAIL   NUMBER(20,4),
  SELLING_UOM           VARCHAR2(4 BYTE),
  PROMO_RETAIL          NUMBER(20,4),
  PROMO_SELLING_RETAIL  NUMBER(20,4),
  PROMO_SELLING_UOM     VARCHAR2(4 BYTE),
  CLEAR_IND             VARCHAR2(1 BYTE)        NOT NULL,
  TAXABLE_IND           VARCHAR2(1 BYTE)        NOT NULL,
  LOCAL_ITEM_DESC       VARCHAR2(100 BYTE)      NOT NULL,
  LOCAL_SHORT_DESC      VARCHAR2(20 BYTE),
  TI                    NUMBER(12,4),
  HI                    NUMBER(12,4),
  STORE_ORD_MULT        VARCHAR2(1 BYTE)        NOT NULL,
  STATUS                VARCHAR2(1 BYTE)        NOT NULL,
  STATUS_UPDATE_DATE    DATE                    NOT NULL,
  DAILY_WASTE_PCT       NUMBER(12,4),
  MEAS_OF_EACH          NUMBER(12,4),
  MEAS_OF_PRICE         NUMBER(12,4),
  UOM_OF_PRICE          VARCHAR2(4 BYTE),
  PRIMARY_VARIANT       VARCHAR2(25 BYTE),
  PRIMARY_COST_PACK     VARCHAR2(25 BYTE),
  PRIMARY_SUPP          NUMBER(10),
  PRIMARY_CNTRY         VARCHAR2(3 BYTE),
  RECEIVE_AS_TYPE       VARCHAR2(1 BYTE),
  CREATE_DATETIME       DATE                    NOT NULL,
  LAST_UPDATE_DATETIME  DATE                    NOT NULL,
  LAST_UPDATE_ID        VARCHAR2(30 BYTE)       NOT NULL,
  SOURCE_METHOD         VARCHAR2(1 BYTE),
  SOURCE_WH             NUMBER(10)
)
PARTITION BY HASH (LOC) 
(  
  PARTITION ITEM_LOC_PH1,  
  PARTITION ITEM_LOC_PH2,  
  PARTITION ITEM_LOC_PH3,  
  PARTITION ITEM_LOC_PH4,  
  PARTITION ITEM_LOC_PH5,  
  PARTITION ITEM_LOC_PH6,  
  PARTITION ITEM_LOC_PH7,  
  PARTITION ITEM_LOC_PH8,  
  PARTITION ITEM_LOC_PH9,  
  PARTITION ITEM_LOC_PH10,  
  PARTITION ITEM_LOC_PH11,  
  PARTITION ITEM_LOC_PH12,  
  PARTITION ITEM_LOC_PH13,  
  PARTITION ITEM_LOC_PH14,  
  PARTITION ITEM_LOC_PH15,  
  PARTITION ITEM_LOC_PH16
)
NOCOMPRESS 
NOCACHE
NOPARALLEL
NOMONITORING;


Prompt Index PK_ITEM_LOC;
CREATE UNIQUE INDEX PK_ITEM_LOC ON ITEM_LOC
(LOC, ITEM)
  LOCAL (  
  PARTITION ITEM_LOC_PH1,  
  PARTITION ITEM_LOC_PH2,  
  PARTITION ITEM_LOC_PH3,  
  PARTITION ITEM_LOC_PH4,  
  PARTITION ITEM_LOC_PH5,  
  PARTITION ITEM_LOC_PH6,  
  PARTITION ITEM_LOC_PH7,  
  PARTITION ITEM_LOC_PH8,  
  PARTITION ITEM_LOC_PH9,  
  PARTITION ITEM_LOC_PH10,  
  PARTITION ITEM_LOC_PH11,  
  PARTITION ITEM_LOC_PH12,  
  PARTITION ITEM_LOC_PH13,  
  PARTITION ITEM_LOC_PH14,  
  PARTITION ITEM_LOC_PH15,  
  PARTITION ITEM_LOC_PH16
)
NOPARALLEL
REVERSE;

I know we have 435 possible values for loc. So that's about 27 locs per partition.
Then the index is global on loc, item.
There are 403,000,000 plus rows in production for this table.

The DBAs do the below for stats
ownname (schema owner)
tabname (table name)
partname (partition name) - default value of null
estimate_percent - 10%
degree - 5
cascade - true

If I have predicates that are "where item_loc.loc = :x" then I should see partition elimination in my autotrace results, shouldn't I? As long as I lead with loc I should be fine?

We have massive batch jobs that run nightly (most in Pro*C)

Tom Kyte
November 26, 2007 - 1:10 pm UTC

... If I have predicates that are "where item_loc.loc = :x" then I should see partition elimination in my autotrace results, shouldn't I? As long as I lead with loc I should be fine? ...

depends on which autotrace you are using - 9i, or 10g and above. The output is very different.

use DBMS_XPLAN to explain (which is what 10g sqlplus uses) and you will see it in 9i too.

Composite partitioning

Amit, February 05, 2010 - 1:12 am UTC

Hi Tom
Your posting is really helpful.
I am working on a partitioning design for my OLTP table, which has 2 billion records and is around 1.6 TB in size.
I have a table transaction which has
transaction_id number primary key
cache_item_id number
create_date date
last_updated date
account_id number

I am looking at 2 option for partitioning

1) hash partition on cache_item_id, as this is the column that is used in most places, but it does not give me the flexibility to drop any partition

2) composite range-hash
I want to partition by create_date and subpartition by cache_item_id. Here I am getting confused looking at the explain plan for the query

select * from transaction where cache_item_id=57575757

| Id  | Operation                            | Name                           | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
|   0 | SELECT STATEMENT                     |                                |     1 |   394 |    18   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE ALL                 |                                |     1 |   394 |    18   (0)| 00:00:01 |     1 |     8 |
|   2 |   PARTITION HASH SINGLE              |                                |     1 |   394 |    18   (0)| 00:00:01 |     3 |     3 |
|   3 |    TABLE ACCESS BY LOCAL INDEX ROWID | TRANSACTION                    |     1 |   394 |    18   (0)| 00:00:01 |       |       |
|*  4 |     INDEX RANGE SCAN                 | BANKTXN_CIDLSTUPDTTXNCATID_INT |     1 |       |    17   (0)| 00:00:01 |       |       |
In this explain plan I see PARTITION RANGE ALL but it is doing a PARTITION HASH SINGLE. Is the range-all scan bad, or is it fine?
Tom Kyte
February 07, 2010 - 11:53 pm UTC

you didn't tell me about the index :(

how about a small create table and create index to detail what is really structurally in place..

A reader, February 09, 2010 - 3:59 pm UTC

The index is on the (cache_item_id, last_updated) columns.
Tom Kyte
February 15, 2010 - 2:58 pm UTC

read last lines above again
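
For what it is worth, the sort of self-contained test case being asked for might look like this - all names, boundaries and the subpartition count are assumptions based on the description above:

create table transaction
( transaction_id  number primary key,
  cache_item_id   number,
  create_date     date,
  last_updated    date,
  account_id      number )
partition by range (create_date)
subpartition by hash (cache_item_id) subpartitions 8
( partition p2009 values less than (to_date('01-01-2010','dd-mm-yyyy')),
  partition p2010 values less than (to_date('01-01-2011','dd-mm-yyyy')),
  partition pmax  values less than (maxvalue) );

create index transaction_idx
  on transaction (cache_item_id, last_updated) local;

With only cache_item_id in the predicate there is nothing to prune the range partitions with, so a local index like this is probed in every range partition (the PARTITION RANGE ALL step), even though hash subpartition pruning (PARTITION HASH SINGLE) happens within each one - which is what the plan above is showing.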

Hash partitioning

Sam, June 22, 2010 - 9:59 pm UTC

I have a table A (Person_id, Account_no, ...)

Person_id, Account_no, ...
1 , 100
1 , 300
1 , 400
2 , 500
2 , 600
3 , 700

The table has 200 million records.
What partition scheme makes the proper design?
HASH PARTITION ON PERSON_ID
and a local index on ACCOUNT_NO,

so that when the account numbers are searched
for a PERSON_ID, the data will be retrieved from the same partition.

Thanks

Tom Kyte
June 24, 2010 - 6:28 am UTC

what is the where clause? It is not clear.

do you use "where account_no = ?"

or do you use "where account_no = ? and person_id = ?"


If the former, then you want a global index on account_no, else we'd have to search EVERY local index partition (N range scans per query where N = number of partitions, instead of ONE range scan per query)

If the latter, then a local index on account_no would be fine, we'd be able to perform local index partition elimination, range scan a single index partition and get the data.
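
In DDL terms, the two alternatives look roughly like this - the table, index names and partition count are made up:

create table a
( person_id  number,
  account_no number
  -- , ... other columns ...
)
partition by hash (person_id) partitions 16;

-- former case ("where account_no = :x" alone): one global index, one range scan
create index a_acct_idx on a (account_no);

-- latter case ("where account_no = :x and person_id = :y"): a local index is enough,
-- because person_id identifies the single partition to range scan
-- drop index a_acct_idx;
-- create index a_acct_idx on a (account_no) local;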

partioning

babloo, July 06, 2011 - 9:42 am UTC

Hi Tom,
we have a table in an OLTP system where most of the queries are on release_id, which is a sequence-generated number. The table will be heavily used for inserts and updates in real time from UI screens. The table also has batch jobs reading from it for unprocessed data.
Finally, I need to purge this data.
I am thinking of partitioning the table by create_date to enable the purge (older partitions can be dropped) and to help the batch jobs (almost all unprocessed data would be from the last two days).
I want to have a global index on release_id.
Should I have the global index hash partitioned on release_id?
It seems to me that I should not partition it, as select * from table where release_id = :rel_id may need to go to n hash partitions.
The other option is to partition by vendor_id, as a release_id always belongs to a vendor. But there will be thousands of vendors in our system. Maybe a range partition on vendor will work. What do you think?

Tom Kyte
July 08, 2011 - 11:30 am UTC

if you use "release_id = :rel_id", you may hash partition that global index - we will index partition eliminate all but one index partition - get a global index - and access the data.

if you use "release_id between ? and ?" or release_id > or release_id < (range scans) then the hash partitioning would not be a good idea.


if you want to partition for purging, I don't see why you would consider to partition by vendor? What would that accomplish for you?
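
A sketch of that combination - the table name, columns, range boundaries and partition counts are illustrative only:

create table releases
( release_id  number,
  vendor_id   number,
  create_date date
  -- , ... other columns ...
)
partition by range (create_date)
( partition p_2011_06 values less than (to_date('01-07-2011','dd-mm-yyyy')),
  partition p_2011_07 values less than (to_date('01-08-2011','dd-mm-yyyy')),
  partition p_max     values less than (maxvalue) );

-- a global index, itself hash partitioned on release_id:
-- "release_id = :rel_id" probes exactly one index partition
create index releases_rel_idx on releases (release_id)
  global partition by hash (release_id) partitions 8;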

number of partitions

babloo, July 08, 2011 - 7:52 pm UTC

Thank you Tom,
we do not use any where clause with release_id checking for a range,
so we will go with a global index on release_id and hash partition it,
while for the purge we will partition the table on create_date.

I left your book at work, so I apologize if I am asking obvious questions:

1. any criteria on number of hash partitions?
2. the global index would become unusable and we will use UPDATE INDEXES: how much is the performance impact of that?
3. if the index on release_id were local, that would create the scenario where release_id = :x needs to look at n partitions. Am I right?
Tom Kyte
July 11, 2011 - 1:42 pm UTC

1) powers of two, 2, 4, 8, 16, 32, ... and so on

2) it is an online operation - so from the end user's perspective it will not impact them (other than the fact that the index is being maintained, which is extra work on the system, of course). It will generate extra redo and undo and will consume CPU - the same as it would to maintain the index if you did a delete or whatnot. It is a function of the amount of work being done.

3) yes, since you do not include the partition key in the where clause, we'd have to look at every index.
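
And for the purge itself, point (2) in practice looks like this - the table and partition names are carried over from the sketch above:

-- without the clause, global indexes are marked UNUSABLE and must be rebuilt;
-- with it, they are maintained online at the cost of extra redo, undo and CPU
alter table releases drop partition p_2011_06 update global indexes;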

Partition Split and Global Indexes

Rajeshwaran, Jeyabal, July 01, 2012 - 2:45 pm UTC

Tom:
We have a table like the one below with 7 global indexes.
We got some 10 distinct values (each value having half a million to one million records) into the default partition, which we are planning to split now (over a weekend).

approach#1
a) alter all the global indexes unusable.
b) do the pmax partition split (for all 10 values).
c) then rebuild all the global indexes once.

approach#2
a) do the pmax partition split with the UPDATE GLOBAL INDEXES clause

Which approach would you prefer? and why?

create table t
( c1 number ,
  c2 date,
  c3 varchar2(700),
  c4...c17  )
partition by list(c1)
( partition p1 values (1),
  partition p2 values (2),
  partition p3 values (3),
  partition pmax values (default) );

Tom Kyte
July 02, 2012 - 7:40 am UTC

Which approach would you prefer? and why?


it depends of course, if one were always preferable, we would not have invented the others.


Do you need continuous uptime? Then approach 2

Is it ok to take an outage long enough to do the entire split operation and index rebuild? Then approach 1 (a modified approach 1)... Use a create table as select to pull the rows from the default partition into a new partitioned table (or insert /*+ APPEND */ from it), or use a multi-table insert append to load 10 tables - don't split 10 times - then just exchange them. Better than splitting over and over and over.
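
In syntax terms, the two approaches boil down to something like this - T and PMAX are from the create table in the question; the value 4 and the index name T_IDX1 are made up:

-- approach 2: one split per value, global indexes kept usable throughout
alter table t split partition pmax values (4)
  into ( partition p4, partition pmax )
  update global indexes;

-- approach 1: take the outage - mark the indexes unusable, do all of the DDL
-- (or the load-and-exchange variant described above), then rebuild once
alter index t_idx1 unusable;
-- ... split / load-and-exchange for all 10 values ...
alter index t_idx1 rebuild;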

