When should a table be partitioned? Version 8.1.7.0.0
April 29, 2002 - 3pm Central time zone
Reviewer: Drew from Trevose, PA USA
Hi Tom,
We have some tables that have a few million records and keep growing. We may need to partition
these tables. My question is: based on your knowledge, when do we need to think of partitioning a
table, e.g., about howmany records? How can we tell if a table needs partitioning or performance
problems are caused by not partitioning the table? Thank you very much!
Followup April 29, 2002 - 8pm Central time zone:
It is not necessarily a number of records or even a size in gigabytes -- it is whether the things
you do would be more easily done with it.
do you need to age data off? classic partitioning example...
could you benefit from partition elimination? eg: all or most of your queries always use some date
range, or some criteria. Rather then adding that to each index, you could partition by it.
are your tables getting larger then you feel comfortable managing? eg: it might take longer to
restore a 100gig tablespace then 1-10 gig tablespace (and the other 90gig of data is online whilst
doing this)
and so on. It is when the benefits of partitioning become tangible to you -- that is when
partitioning should be used.

February 19, 2003 - 1pm Central time zone
Reviewer: A reader
Hi Tom,
Am reading Datawarehousing guide as per your suggestion and its helping me a lot. Thanks tom. I
have some doubts.
http://otn.oracle.com/docs/products/oracle9i/doc_library/release2/server.920/a96520/transfor.htm#131
38
As per docs
<quote>
Parallel Load Scenario
This section presents a case study illustrating how to create, load, index, and analyze a large
data warehouse fact table with partitions in a typical star schema. This example uses SQL*Loader to
explicitly stripe data over 30 disks.
The example 120 GB table is named facts.
The system is a 10-CPU shared memory computer with more than 100 disk drives.
Thirty disks (4 GB each) are used for base table data, 10 disks for indexes, and 30 disks for
temporary space. Additional disks are needed for rollback segments, control files, log files,
possible staging area for loader flat files, and so on.
The facts table is partitioned by month into 12 partitions. To facilitate backup and recovery, each
partition is stored in its own tablespace.
Each partition is spread evenly over 10 disks, so a scan accessing few partitions or a single
partition can proceed with full parallelism. Thus there can be intra-partition parallelism when
queries restrict data access by partition pruning.
Each disk has been further subdivided using an operating system utility into 4 operating system
files with names like /dev/D1.1, /dev/D1.2, ... , /dev/D30.4.
Four tablespaces are allocated on each group of 10 disks. To better balance I/O and parallelize
table space creation (because Oracle writes each block in a datafile when it is added to a
tablespace), it is best if each of the four tablespaces on each group of 10 disks has its first
datafile on a different disk. Thus the first tablespace has /dev/D1.1 as its first datafile, the
second tablespace has /dev/D4.2 as its first datafile, and so on, as illustrated in Figure 13-5.
Step 1: Create the Tablespaces and Add Datafiles in Parallel
The following is the command to create a tablespace named Tsfacts1. Other tablespaces are created
with analogous commands. On a 10-CPU machine, it should be possible to run all 12 CREATE TABLESPACE
statements together. Alternatively, it might be better to run them in two batches of 6 (two from
each of the three groups of disks).
CREATE TABLESPACE TSfacts1
DATAFILE /dev/D1.1' SIZE 1024MB REUSE,
DATAFILE /dev/D2.1' SIZE 1024MB REUSE,
DATAFILE /dev/D3.1' SIZE 1024MB REUSE,
DATAFILE /dev/D4.1' SIZE 1024MB REUSE,
DATAFILE /dev/D5.1' SIZE 1024MB REUSE,
DATAFILE /dev/D6.1' SIZE 1024MB REUSE,
DATAFILE /dev/D7.1' SIZE 1024MB REUSE,
DATAFILE /dev/D8.1' SIZE 1024MB REUSE,
DATAFILE /dev/D9.1' SIZE 1024MB REUSE,
DATAFILE /dev/D10.1 SIZE 1024MB REUSE,
DEFAULT STORAGE (INITIAL 100MB NEXT 100MB PCTINCREASE 0);
...
CREATE TABLESPACE TSfacts2
DATAFILE /dev/D4.2' SIZE 1024MB REUSE,
DATAFILE /dev/D5.2' SIZE 1024MB REUSE,
DATAFILE /dev/D6.2' SIZE 1024MB REUSE,
DATAFILE /dev/D7.2' SIZE 1024MB REUSE,
DATAFILE /dev/D8.2' SIZE 1024MB REUSE,
DATAFILE /dev/D9.2' SIZE 1024MB REUSE,
DATAFILE /dev/D10.2 SIZE 1024MB REUSE,
DATAFILE /dev/D1.2' SIZE 1024MB REUSE,
DATAFILE /dev/D2.2' SIZE 1024MB REUSE,
DATAFILE /dev/D3.2' SIZE 1024MB REUSE,
DEFAULT STORAGE (INITIAL 100MB NEXT 100MB PCTINCREASE 0);
...
CREATE TABLESPACE TSfacts4
DATAFILE /dev/D10.4' SIZE 1024MB REUSE,
DATAFILE /dev/D1.4' SIZE 1024MB REUSE,
DATAFILE /dev/D2.4' SIZE 1024MB REUSE,
DATAFILE /dev/D3.4 SIZE 1024MB REUSE,
DATAFILE /dev/D4.4' SIZE 1024MB REUSE,
DATAFILE /dev/D5.4' SIZE 1024MB REUSE,
DATAFILE /dev/D6.4' SIZE 1024MB REUSE,
DATAFILE /dev/D7.4' SIZE 1024MB REUSE,
DATAFILE /dev/D8.4' SIZE 1024MB REUSE,
DATAFILE /dev/D9.4' SIZE 1024MB REUSE,
DEFAULT STORAGE (INITIAL 100MB NEXT 100MB PCTINCREASE 0);
...
CREATE TABLESPACE TSfacts12
DATAFILE /dev/D30.4' SIZE 1024MB REUSE,
DATAFILE /dev/D21.4' SIZE 1024MB REUSE,
DATAFILE /dev/D22.4' SIZE 1024MB REUSE,
DATAFILE /dev/D23.4 SIZE 1024MB REUSE,
DATAFILE /dev/D24.4' SIZE 1024MB REUSE,
DATAFILE /dev/D25.4' SIZE 1024MB REUSE,
DATAFILE /dev/D26.4' SIZE 1024MB REUSE,
DATAFILE /dev/D27.4' SIZE 1024MB REUSE,
DATAFILE /dev/D28.4' SIZE 1024MB REUSE,
DATAFILE /dev/D29.4' SIZE 1024MB REUSE,
DEFAULT STORAGE (INITIAL 100MB NEXT 100MB PCTINCREASE 0);
</quote>
My question is should'nt the second tablespace's first datafile be on /dev/D2.2
Am not getting the math here how come /dev/D4.2
Thanks for all your help.
Followup February 19, 2003 - 3pm Central time zone:
kidding right?
if it were me, there would be one big device and someone else would worry about striping it for me.
The example is a little botched as well -- a 100mb extent size in a datafile of 1024mb ? hmm, 24mb
of wastage.
But anyway, it is just an example - I'm not really going to dig into their theoretical disk layout
if you don't mind ;)

February 20, 2003 - 9am Central time zone
Reviewer: A reader
Tom,
Thanks for the repy.Can you please explain what do you mean by one big device.
Followup February 20, 2003 - 10am Central time zone:
just give me a big mount point -- /tom/put/your/data/here -- with inifinite storage.
don't give me
/tom/put/it/here
/tom/or/put/it/here
/tom/or/maybe/over/here
..........
/tom/now/your/really/confused
;)
big mount point with inifinite storage and the hardware guys in the background moving disks or
whatever they use these days to get all of the little lights on the disk array flashing equally.
I'll put all of my data there. I'll still think about redo and arch and control files (don't ever
want to lose them and need them to be really super optimized in many cases -- no contention
desired)....

February 20, 2003 - 10am Central time zone
Reviewer: A reader
Thanks Tom. I like your style of answering questions ;)
? re: Table Partition
April 21, 2003 - 7pm Central time zone
Reviewer: VJ from Freedom Land
Hi Tom,
I want to create a table which has two fields id number, flag char(1) with partition on flag. The
values in the flag will be 'R' and 'T'. so the data would look like
Id Flag
-----------
1 R
2 R
3 R
4 R
5 R
6 R
7 R
8 R
9 R
10 R
11 T
12 T
13 T
14 T
15 T
.............
So i want two partitions one for flag='R' and one for flag='T'. Can you give me an example how to
achieve this.
Thanks a million in advance
VJ
Followup April 21, 2003 - 10pm Central time zone:
ops$tkyte@DEV816> CREATE TABLE table2000
2 ( x int,
3 y int,
4 z DATE
5 )
6 PARTITION BY RANGE (z)
7 (
8 PARTITION p1 VALUES LESS
9 THAN('S'),
14 PARTITION p2 VALUES LESS
15 THAN('U')
16 )
17 /
Mixed datatype in partition key
April 22, 2003 - 2am Central time zone
Reviewer: A reader
This table creation in 8.1.7.4 gives an ORA-01858: a non-numeric character was found where a
numeric was expected.
Followup April 22, 2003 - 7am Central time zone:
so use a little initiative, take the obvious cut and paste, fix the datatype of Z to be appropriate
for you.
My point here was -- the question was already answered above, i had an example of a very very
simple partition clause -- you need to read about partitioning.
I can see what is going to happen here -- you will apply partitioning, not understand much about
it, system performance will go down the tubes, you will blame partitioning and another myth starts
(partitioning is 'complex', partitioning is 'slow').
This is a trivial syntax question ...
ops$tkyte@ORA920LAP> CREATE TABLE table2000
2 ( x int,
3 y int,
4 z varchar2(5)
5 )
6 PARTITION BY RANGE (z)
7 (
8 PARTITION p1 VALUES LESS
9 THAN('S'),
10 PARTITION p2 VALUES LESS
11 THAN('U')
12 )
13 /
Table created.
suggest you read a ton more before going down this path.

June 16, 2003 - 4am Central time zone
Reviewer: sara from Canada
Hi
I am a biginer to Oracle. As I am trying to partitioning a table it says :
ORA-00439: feature not enabled: Partitioning
how to enable the partitioning feature.
sorry for the basic question.
Followup June 16, 2003 - 8am Central time zone:
you need to
a) be on the enterprise edition
b) have purchased it
if you have a & b, then using OUI (the Oracle Universal Installer) is the way to enable it.
if either of a or b is not true, you don't have it.
indexes while partitioning
June 30, 2003 - 1pm Central time zone
Reviewer: Vipin from New york
Hi Tom,
Could you please tell me when should one consider about the Global indexes and when about the local
indexes while partitioning there tables.
Followup June 30, 2003 - 2pm Central time zone:
well, think about how you use them. but search for
global local performance
to read some articles about them.
Partitioning Strategy
September 21, 2003 - 3am Central time zone
Reviewer: A reader
Hi Tom,
Say if I have two very large tables transactions and transaction items, which I want to partition.
txns(txnid, txndate, txntype)
txnitems (txnid, amount, accountid)
Most queries are for the sum of amount within certain date range.
I can range partion transactions table by transaction date, but how should I partition the
transaction items table? Is it possible that I partition it using the txndate in the txns table? Or
do you think it is helpful to partition the txns table by range partition (txndate) hash
subpartition (txnid) and hash partition (txnid) on the txns items table?
Followup September 21, 2003 - 9am Central time zone:
"Most queries are for the sum of amount within certain date range."
sounds like a screaming call for a teeny tiny bit of denormalization doesn't it?
think about the amount of work you could save every day if you made it so that most queries consume
50% or less resources!!! You might carry that date down into the details.
I do not blindly recommend partitioning schemes. I wrote about this in my last book "Effective
Oracle By Design". Partitioning tables, without knowledge of how they are used (for real, not just
a single sentence), is a recipe for certain disaster!
given as it stands, range partitioning txns by date SEEMS to make sense and it would follow that
hash partitioning TXNITEMS by TXNID MIGHT make sense -- but one cannot say for sure.
How to get the archived off data back to a partitioned table
September 24, 2003 - 6pm Central time zone
Reviewer: AD from UK
Tom,
I have a partitioned table with partitioned key on the year_month.
e.g. 200301 - for jan 2003's data
200302 - for feb 2003's data etc.
Partitioning is implemented in such a way that anything older than 2 yrs of data is archieved off
to a table.
e.g. A_arch_200106 - archived data for jun 2001
A_arch_200107 - archived data for jul 2001 etc.
Where A is the table name.
Now, I have a requirement to get the archived off data back to the partitioned table A, so that
A_arch_200106 will represent data for partitioned 200106, A_arch_200107 will represent data for
partitioned 200107 etc. in the same way they were in the table before the archiving took place.
What should be my approach to handle this?
Thanks & Regards
Followup September 25, 2003 - 5am Central time zone:
to archive the data you did an alter table exchange partition right -- so you swapped a totally
empty table with the full partition -- ending up with a full table and an empty partition.
Now, you could have either:
a) left the empty partition -- in which case -- you just alter exchange back and you are done
b) dropped the empty partition -- in which case you must now SPLIT the last partition in two,
resulting in an empty and full partition and then exchange the new empty slice for the full table
you've restored
c) merged the empty partition in to a "mega" empty partition which can be split very fast since it
is empty -- repeat step b) above.
Partition Granule
October 23, 2003 - 7pm Central time zone
Reviewer: A reader
Hi Tom,
Our database stores history transactions of about 100 years. The typical queries are reports
searching data within range of 1 year, 5 years, 10 years, and 20 years, start from any calendar
year.
I am considering to partition the table by transaction date.
What factors will you consider on how big the partition granule should be?
I am considering to partition the data by 1 year or 5 years. Which one do you recommend?
Followup October 23, 2003 - 8pm Central time zone:
maybe by the month.
else many queries might have to hit 24 months of data -- instead of just 12.
Partitioned table and Replication
October 27, 2003 - 4am Central time zone
Reviewer: Alvin from Philippines
I have both your books. And i just implemented partitioning on one big table. My problem is since
partitioning my table i didn't define any columns as unique or as primary key.
My problem now is i want to replicate that table on another machine for read only purposes and
oracle 8i docs states that
"Considerations for Replicated Tables
The following sections discuss considerations for tables you plan to use in a replication
environment.
Primary Keys
If possible, each replicated table should have a primary key. Where a primary key is not possible,
each replicated table must have a set of columns that can be used as a unique identifier for each
row of the table. If any of the tables that you plan to use in your replication environment do not
have a primary key or a set of unique columns, alter these tables accordingly. In addition, if you
plan to create any primary key snapshots based on a master table, that master table must have a
primary key.
"
Do i need PK defined ? or does oracle documentation stated this just to ease on conflict resolution
?
Followup October 27, 2003 - 7am Central time zone:
why do you have a table with no primary or unique constraints???????
Partitioned table, Primary Keys and Replication
October 27, 2003 - 9pm Central time zone
Reviewer: Alvin from Philippines
Sorry i got confused. I re-read on pp643 (Local indexes and Uniqueness)of Expert one-on-one that PK
and UK CAN be enforced but only per partition (never across)and the partitioning key must be
included.
Suppose i have a table
Create table t
( bill_num number <- PK
, bill_date date )
partition by range ( bill_date)
....
and my query uses a lot of bill_num in its predicate i can create a non-prefixed composite pk
constraint like below
alter table t add constraint t_pk primary key ( bill_num, bill_date);
By doing so i can ensure uniqueness on a partition level and i can 'age-out' some data without the
need of rebuilding any indexes ? Right ?
Followup October 28, 2003 - 7am Central time zone:
If you want a unique constraint (primary key, whatever), you will either:
o use that column(s) in the partitioning scheme so that a local index can be used
o not use the column(s) in the partitioning schema and then use a global index on that column(s)
those are your choices. If you use a global index, you can still perform partition level
operations in 9i using the maintain global indexes clause. It will not cause a rebuild, it will
maintain the index as you perform the partition level operations.
Partiioning
August 7, 2004 - 1am Central time zone
Reviewer: A Reader from canada
I have large table with column ID, ......
I am planning to partition it on ID column, I can either go with Range or List.
Can you please let me know the benefit of one over the other(range vs list)
Followup August 7, 2004 - 10am Central time zone:
depends on how you plan on using partitioning totally.
depends on how you admin it
depends on how you age it
depends on how you query it
depends entirely on how you use it. Look at the options
a) range. where x < N part1, where N <= x < M part2, where M <= x < O part3 and so on.
what if X was a state code. Might want Alaska, Rhode Island and Wisconsin in one partition (few
people), California gets it's own partition (composite actually, hashed underneath maybe) --
california is big.
range would not be very good for that -- well, actually, range would not work unless we encoded the
state codes as something sortable to facilitate that.
what if X was a date, and we purged data old then 3 years. ranging on X would be penultimate.
b) list. where x in ( a,q,r) part1, where x in (z) part2, where x in (l,m,n,o) part3 -- see a)
not knowing what your goals are, what ID represents, the only thing I can suggest is you understand
the physics of the data (read the data warehousing guide, they have good info on partitioning, in
each of my books I went into the physics of partitioning as well)....
partition
August 7, 2004 - 7pm Central time zone
Reviewer: A Reader from canada
I apologize, I forgot to mention that the ID will be numeric value from from 1 to 6.
The ID = 1 has say 10million records
ID = 2 has 8 mill
..
...
..
ID = 6 has 5 million recs.
In this case I can have the range partion created with the ID being the Partition key
and also by using the new feature of 9.2 I can use the LIST partition.
If I consider the performance which route should I go
range vs list
which has more advantages performance wise.
many users slect the data based on the specic id ( ID = 1 0r 2 0r .... 6) ---> partition
elimination on ID
Only few users access the entire data
(ID in (1,2,3,4,5,6)) --> scanning all the partitions.
I will really appreciate if you could give some insight about the performance for range vs list in
my scenerio
Followup August 8, 2004 - 9am Central time zone:
still insufficient data -- "performance"
well, unless you are full scanning this table *frequently* by ID = somevalue, ID < somevalue, ID in
(some list) -- partitioning is probably going to be something you have to use with extreme care so
as to not *negatively* impact performance!
You see, you really need to understand how you access your data and then apply "common sense" to
that.
Say you index range scan this data via an index. How would chopping it into 6 partitions help?
Sure, *maybe* it might reduce the height of an individual index by 1 -- so on a range scan of 20
rows that right now takes 15 LIO's, it might take 14 instead.
If you FULL SCAN this table frequently, AND your users use ID=value or ID in (values) range and
list will perform equivalently -- both can eliminate partitions.
Partitioning
August 8, 2004 - 3pm Central time zone
Reviewer: Hans from USA
I have a similar situation
I have a table say customers(deptid, customerid, ,,,,)
deptid is numeric with values from 1 to 7, each deptid has millions of records.
The table is partioned by range on deptid and has local indexes on customerid and some date
columns.
the users access the data using deptid where partitioning elemination takes place, local indexes
are used for accessing the data.
based on the earlier message, if you say that the performance of Range and LIST are same then I
would like to change my table to LIST partition using new Oracle feature in 9i.
Your advice is appreciated
Followup August 8, 2004 - 6pm Central time zone:
as stated, as long as you are using "where partition_key = :x" or "where partition_key in ( :x, :y,
:z)", range and list will perform equivalently.
it is when you get into ranges (between, <, >) that range has more opportunities than list.
HOWEVER, seems you have a table
IT is already populated
the work is done
you would gain nothing other than "prettier syntax" from a reorg.
So, unless you can come up with some benefit of going to list, i would "stay" right where I was.
Partitioning hundreds of millions of rows
August 9, 2004 - 6pm Central time zone
Reviewer: Vince Mallet from St. Louis, MO, USA
Tom,
I have the following situation (names simplified):
drop table props;
drop table items;
create table items(item_id number, run_id number, ref_id number);
alter table items add constraint items_fk primary key (item_id);
create table props(prop_id number, item_id number, type_id number, value number);
alter table props add constraint props_pk primary key (prop_id);
alter table props add constraint props_fk foreign key (item_id) references items;
(other constraints to run and reference tables omitted)
It's a batch-oriented system. We perform 'runs' 2 to 3 times a month. Each run generates around 10
million items, each item having between 5 and 15 properties of different types. That means every
time we insert data into these tables, it is
- about 10 millions in items
- about 50-150 millions in props
A run has never more than 15 distinct property types.
When we query we usually focus on one or two runs at a time and one or two property types at a
time.
insert into other_table (...)
select i.ref_id, max(p.value)
from items i, props p
where i.item_id = p.item_id
and i.run_id = :x
and p.type_id = :y
group by i.ref_id;
or
(export all properties for all items of run x)
At this point I am trying to reduce querying time. It appears to me that this problem would be a
good candidate for partitioning but after reading the chapters in your two books and all the
answers here, I figured I'd ask a question or two first.
What I had in mind:
- list partition the items table on run_id, one partition per run,
- denormalize the props table and add a run_id column to it,
- list partition the props table on run_id, one partition per run,
- every time we add a new run, create a new partition for it (the application would take care of
that)
and possibly,
- list-subpartition every props partition on type_id
That would give me one partition per type per run which could end up being 1,500 partitions and
subpartitions after three years if the rate of 3 runs per month stays the same.
1. Is that even realistic? With or without subpartitioning?
2. Do you have enough information to answer the question? :)
3. Would it be following the "Good Oracle Practices"?
Any advice would be very valuable. I'm no Oracle expert and still have a great deal to read before
I can pretend 'knowing' about all the Oracle concepts, let alone 'understand' them :-)
Thanks,
Vince.
Followup August 9, 2004 - 8pm Central time zone:
I don't see how partitioning will help:
select i.ref_id, max(p.value)
from items i, props p
where i.item_id = p.item_id
and i.run_id = :x
and p.type_id = :y
group by i.ref_id;
that looks like (to me, just guessing) a query plan of something like either:
sort/aggregate
use index on items(run_id,item_id,ref_id) to
nested loops join to index on props(item_id,type_id,value)
or
sort/aggregate
use index on props(type_id,value,item_id) to
nested loops join to index in items(items_id,run_id,ref_id)
partitioning *might* decrease an index height by 1 (might) but really -- given no full scans, big
joins, data warehouse type of stuff -- how could it dramatically affect anything?
How would you envision partitioning helping here in query performance?
Re: Partitioning hundreds of millions of rows
August 10, 2004 - 12pm Central time zone
Reviewer: Vince Mallet from St. Louis, MO, USA
Thanks for such fast feedback.
I am seeing a lot of full scans on PROPS which lead me to investigate the partition option. Because
the most often used queries consider runs in their entirety (for one or two types), the full scans
didn't seem out of place and removing a large part of the work by pruning partitions looked
appealing.
For example for the query mentioned previously (select ref_id, max(value) ... ):
240510 rows selected.
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=121052 Card=265860 Bytes=6646500)
1 0 SORT (GROUP BY) (Cost=121052 Card=265860 Bytes=6646500)
2 1 HASH JOIN (Cost=116630 Card=1085931 Bytes=27148275)
3 2 TABLE ACCESS (BY INDEX ROWID) OF 'ITEMS' (Cost=2381 Card=755336 Bytes=10574704)
4 3 INDEX (RANGE SCAN) OF 'ITEMS_FK2' (NON-UNIQUE) (Cost=905 Card=755336)
5 2 TABLE ACCESS (FULL) OF 'PROPS' (Cost=91226 Card=19318815 Bytes=212506965)
(items_fk2 is index on run_id)
From your answer I understand that the queries shouldn't even touch the tables and that the indexes
should be enough to answer the questions. Is that right?
In that case, I need to seriously revisit the indexes. Right now, the only indexes defined are:
items: PK, index on run_id, index on ref_id
props: PK, index on item_id, index on type_id (planned for removal)
There is no composite index.
With an average row length of 23 bytes on PROPS and most rows needing to be accessed through large
ranges of item_id's, my feeling was that indexes would hurt more than help compared to reducing the
set of data that queries have to work with in the first place.
By denormalizing and adding run_id to PROPS for the partition key, I was also able to not have to
join against ITEMS anymore to answer a few questions and I figured this would be a plus, too.
I'm in the process of getting rid of 'feelings' and setting up some tests that will provide some
real numbers. Your feedback already helped look at things differently.
Thanks,
Vince.
Followup August 10, 2004 - 3pm Central time zone:
look at the selectivity of the proposed indexes -- are they selective?
Re: Partitioning hundreds of millions of rows
August 12, 2004 - 3pm Central time zone
Reviewer: Vince Mallet from St. Louis, MO, USA
You got me thinking on that one. For some reason I believe that Oracle will be faster at reading
all partition blocks sequentially than scattered-reading almost as many index blocks.
I believe the indexes are selective given that they are unique. (item_id is unique in items;
item_id+type_id is unique in props).
I now have four indexes:
exp_items_u1 on items(run_id, item_id, ref_id)
exp_items_u2 on items(item_id, run_id, ref_id)
exp_props_u1 on props(item_id, type_id, value)
exp_props_u2 on props(type_id, value, item_id)
I have created and populated a new table props_pt which is a copy of props partitioned on run_id
(one partition per run_id) with no index and no subpartition.
create table props_pt(prop_id, run_id, item_id, ref_id, type_id, value) ....
I have analyzed 'for table for all indexes for all indexed columns' all three tables (items, props,
props_pt);
Now here's the same query against items+props first and props_pt second (this is tested on 8174):
select count(*) from props_pt;
COUNT(*)
----------
100000000
select count(*) from props_pt where run_id = 44;
COUNT(*)
----------
9620400
select count(distinct ref_id) from props_pt where run_id = 44;
COUNT(DISTINCTREF_ID)
----------------------
240510
select type_id, count(*) from props_pt where run_id = 44 group by type_id;
TYPE_ID COUNT(*)
---------- ----------
41 2405100
42 2405100
43 2405100
47 2405100
select i.ref_id, max(p.value) from items i, props p where i.pose_id = p.pose_id and run_id = 44 and
type_id = 42 group by i.ref_id;
240510 rows selected.
Elapsed: 00:14:857.90
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=47219 Card=285341 Bytes=5992161)
1 0 SORT (GROUP BY) (Cost=47219 Card=285341 Bytes=5992161)
2 1 MERGE JOIN (Cost=34799 Card=3413847 Bytes=71690787)
3 2 INDEX (RANGE SCAN) OF 'EXP_ITEMS_U1' (UNIQUE) (Cost=3963 Card=2405100
Bytes=26456100)
4 2 SORT (JOIN) (Cost=30836 Card=8115300 Bytes=81153000)
5 4 INDEX (RANGE SCAN) OF 'EXP_PROPS_U2' (UNIQUE) (Cost=12599 Card=8115300
Bytes=81153000)
Statistics
----------------------------------------------------------
0 recursive calls
234 db block gets
18086 consistent gets
69227 physical reads
2 sorts (memory)
2 sorts (disk)
240510 rows processed
select ref_id, max(value) from props_pt where run_id = 44 and type_id = 42 group by ref_id;
240510 rows selected.
Elapsed: 00:00:45.07
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=11551 Card=248903 Bytes=2986836)
1 0 SORT (GROUP BY) (Cost=11551 Card=248903 Bytes=2986836)
2 1 TABLE ACCESS (FULL) OF 'PROPS_PT' (Cost=5368 Card=2405100 Bytes=28861200)
Statistics
----------------------------------------------------------
28 db block gets
22413 consistent gets
23472 physical reads
1 sorts (memory)
1 sorts (disk)
240510 rows processed
A little more consistent reads but a third physical reads and 20 times faster by the clock wall!
Just some precisions about the data:
Each run performed against 1 million ref_ids generates 10 millions items (10 items per ref_id),
each item coming with k properties (k fixed per run).
1 run -> n ref_ids -> n * 10 items -> n * 10 * k properties (4<=k<=15, k fixed per run).
If you could explain how you see indexes helping me out better in this situation it would be great.
I can't seem to see why.
Now apart from performance I can see how partitioning could help:
- backing up: older partitions should rarely change
- deleting a run: drop related partitions as opposed to massive deletes
- loading in a new run: might be faster because of fewer indexes to update. And potentially, if my
indexes are local then I could build partition-indexes after all the data is loaded.
Your comments as always are greatly appreciated.
Thanks,
Vince.
Followup August 12, 2004 - 4pm Central time zone:
yes, with those cardinalities -- partitioning would help, indexes not so good (would have hoped for
a fast full scan maybe)
normally when joining and looking for a max, there are "relatively few" rows involved. there were
not in this case. there were "lots"
so yes, partition elimination would be beneficial (very much so) for the full scans.
Re: Partitioning hundreds of millions of rows
August 12, 2004 - 5pm Central time zone
Reviewer: Vince Mallet from St. Louis, MO, USA
Thanks for your feedback.
I am still wondering about those aspects:
1. Is programmatically creating a new partition for every new run at load time an acceptable
solution in terms of administration and/or practices?
2. Should I consider subpartitioning on types or not? (or 'test and see'?)
Thanks again,
Vince.
Followup August 13, 2004 - 8am Central time zone:
1) absolutely -- done all of the time. end of month, end of day -- people do the "partition slide"
using automated procedures all of the time.
2) if it can result in further pruning, sure -- if you use hash subpartitions, "TYPE" needs to have
*lots* of values (to hash evenly) and use a power of 2 for the number of subpartitions
(2,4,8,16,32,..)

August 13, 2004 - 11am Central time zone
Reviewer: A reader
Tom,
I want to create a table t with 31 partitions(range) for each day, can i do it or there is some
catch to it.
Thanks.
Followup August 13, 2004 - 5pm Central time zone:
no catches?
you need a column that has the "day" in it
you partition on that column
Partitioned select
August 23, 2004 - 6pm Central time zone
Reviewer: Jim Shave from MI USA
I'm selecting from a table with 2.7 million rows, partitioned by mdl_year; partitions: PART_2005 (
values < 2005 ) and PART_MAXVALUE ( values < MAXVALUE )
Instead of 2.7 million rows returned, I get 5.4 million, and the select is only on this one table.
It seems that it's reading both partitions. How can I limit my select to reading just one set of
data?
Select count(*) returns 2.7 million, and checking the results shows duplication of every single
row.
Thanks.
Followup August 23, 2004 - 7pm Central time zone:
er? not following you here, can you be more explicit?
Is "multiple table" partitioning possible?
August 24, 2004 - 8am Central time zone
Reviewer: Oleksandr Alesinskyy from Bad Soden a. Ts., Germany
Hi Tom,
I want to ask a question that is very similar to one of already asked ("Partitioning strategy"),
but more general. I have a master table that can be partitioned on date and a lot (tens) of detail
tables that has only relation to master based on master ID (no dates in details). Is it possible to
ensure that records for details go to the same partition as their master record? I need this not so
for improving performance, as for simplicity of maintenance of aged data. And denormalization is
clearly not an option.
Followup August 24, 2004 - 8am Central time zone:
why is denormalization "clearly not an option" -- seems very very fuzzy to me that it isn't?
How to findout whether a query is using a particular partition or not?
August 24, 2004 - 12pm Central time zone
Reviewer: RB from NJ
Tom:
Assume that I have N number of range partitions (ex: 12 partitions, 1 per month ). Now when I do
select, where I have specified the range which should only search the partition where the data
within that range.
ex: select * from tableA where dt between 1-Jan-2004 and 30-Mar-2004.
How do I verify whether the system is only doing search in the first three partiions - Jan, Feb and
Mar?
I am not planning to send the partition key in the where clause.
Followup August 24, 2004 - 3pm Central time zone:
delete from plan_table;
explain plan for your_query;
@utlxpls
you'll see "pstart" and "pstop" for partition start/stop keys.
i don't know what you mean by "I am not planning to send the partition key in the where clause."
why is denormalization "clearly not an option"?
August 29, 2004 - 6am Central time zone
Reviewer: Oleksandr Alesinskyy from Bad Soden a. Ts., Germany
Denormalization is clearly not an option because, as I have mentioned, several records in in each
of dozens details tables hang on single record of master table. And some details has details of
their own. As result, in case of denormalization database size will go to terabyte range (now it
dozens GBs). One possible solution is to propagate date information to details, details of details
and so on (really I have to admit, this is some type of denormalization anyway). But eve nthis
solution is not very viable as date in master may change, and details may migrate from one master
to another on any level (data by nature of this system come to it in not very clean state, so a
lot of corrections may be done). so this propagation may be cumbersone.
Followup August 29, 2004 - 11am Central time zone:
do the math, get back to me. people way overestimate.
that -- and you didn't really present anything above that would make anything "obvious" to
"anyone".
adding a little tiny date column would not "obviously" make a multiple gigabyte sized database grow
into terabytes. so, make it obvious for me.
If you want to partition a table
then
The column you want to partition on must be in the table
So if you want to partition by this date column from a master table in the child tables
YOU WILL propagate that value down.
Otherwise, you will not be partitioning on that value.
that the propagation is "cumbersome" is not really meaningful IF your goal is to partition by this
value. You write a "update this date" stored procedure once that makes it "trivial" to accomplish
(programmer calls "update_this_date( pk => ?, new_date => ?);" it is less than cumbersome.
my suggestion
April 20, 2005 - 3am Central time zone
Reviewer: riyaz from india
my suggestion for the above query: Ofcourse the below text is based on Tom's book only (+ my own
experience).
a- Header (single row per transaction) table is having date field, (primary key doc_no)
b- Detail table is NOT having date field (multiple rows per transaction)- primary key doc_no, prod
code
For the above:
Solution - 1
1- range partion with date column for header table.
2- datail table - HASH partion by primary key column of header table. ie., doc_no
3- Create a view with both header and detail table,which will have DATE column (of header) table
also.
4- Geneate all reports (which require detail information based on date range) using the view.
Solution - 2
De-normalize the table by adding date column in the DETAIL table.
Note: On top of it, I have tried both methods in test server and bench marked the performance
(using trace/tkprof) & then implemented 1st method in production, since it suits my application.
Tom, please correct me if anything wrong.
please correct
April 22, 2005 - 5am Central time zone
Reviewer: riyaz from india
Hi Tom,please confirm my understanding. (above)
Followup April 22, 2005 - 10am Central time zone:
I frankly lost track of the thread here (1/2 year gap and all)
so I'm not really sure which part of this page you were addressing.

November 22, 2005 - 4pm Central time zone
Reviewer: Smita Acharya from MO USA
Hi Tom,
We have a data warehouse of about 50GB which is expected to grow to about 150gb
by next year.
The storage being used is Hitachi Data Systems Lightning 9980V connected over a
Fibre Channel SAN. We are using RAID 5.
My question here is:
Do we need to create different tablespaces (each having files on different mount
points) for indexes, partitions considering the above storage.
OR
Can we keep them all in one tablespace or separate them into different tablespaces taking only
database
backup and recovery into account, without considering performance.
I think putting them into a single tablespace will not be a performance problem( Can be a recovery
problem though).
Please correct me if I am wrong and please explain why.
Followup November 23, 2005 - 9am Central time zone:
50-150gb is relatively small.
raid-5 isn't the best choice for warehousing (lots of direct path writes and reads in general, not
buffered IO).
But given this size, it'll likely be "ok".
Use tablespaces to make your life easier - what makes your life "better". You have a valid point
on the recovery issue - you could just rebuild indexes rather than back them up if something goes
wrong.
Since a tablespace may contain files from many mountpoints....
And your mountpoints are already striped file systems (raid-5)....
You would not really be using tablespaces for distributing IO - use tablespaces to make your
administrative jobs easier for you.
Tablespaces in data warehouse
November 23, 2005 - 10am Central time zone
Reviewer: Smita Acharya from MO USA
Hi Tom,
Thanks a lot for the answers. They were really useful. I will use a different tablespace for
indexes, mainly considering the recovery part.
I do not have much idea about RAID systems. Could you please point me to some easy to understand
documentation which describes RAID in detail. I wish everyone could explain things as simply and
easily as you do.

November 26, 2005 - 1pm Central time zone
Reviewer: Chris from Fairbanks Alaska
We have a partitioned table (range partitioned by week..each partition has around 70 million
rows..3 years of data);
All the partitions except the most recent partitions are compressed.Now we have a situation wherein
we need to add some
fields to the table.Since its not possible to add fields to a compressed table and we cannot
uncompress the table because
of space limitations, Is there any way to get around this problem?
I was thinking along the lines of creating a new table(with additional fields) and start populating
it going forward
and somehow exchange the partitions from old table to new table? but the old and new table doesnt
have the same table
structures...any ideas??
thanks
Followup November 26, 2005 - 4pm Central time zone:
excellent reason for considering a new release ;)
ops$tkyte@ORA10GR1> CREATE TABLE t
2 (
3 dt date,
4 x int,
5 y varchar2(25)
6 )
7 compress
8 PARTITION BY RANGE (dt)
9 (
10 PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
11 PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
12 PARTITION junk VALUES LESS THAN (MAXVALUE)
13 )
14 /
Table created.
ops$tkyte@ORA10GR1>
ops$tkyte@ORA10GR1> insert /*+ append */ into t select to_date( '12-mar-2003', 'dd-mon-yyyy') +
mod(rownum,5), rownum, rownum from all_users;
28 rows created.
ops$tkyte@ORA10GR1>
ops$tkyte@ORA10GR1> alter table t add z int;
Table altered.
Else, you'll be looking at doing this a partition at a time, create a new partitioned (empty) table
with the column.
insert /*+ APPEND */ into partition 1 the old data from the old partition1. Truncate old partition
1. Repeat.

March 16, 2006 - 7am Central time zone
Reviewer: Murali from Singapore
Tom:
I have table with 4 millions and table is done hash partition.no i want to find out howmany records
in each partition.
when i give select count(*) from table_name <partition_name> . record countt is always same for
with partition clause and with out partition clause .
why?
Followup March 16, 2006 - 2pm Central time zone:
prove it?
cut and paste an example, unless you hashed on something with very very low distinct values - one
would expect a spread of data.
Or perhaps you literally issued:
select count(*) from table_name partion_1;
(same as "select count(*) from table_name XXXX") which is very very different from
select count(*) from table_name PARTITION(partition_1);

March 17, 2006 - 4am Central time zone
Reviewer: A reader
Tom:
You are great.
Its my mistake, sorry. correct one is
SQL> select COUNT(1) from table_name PARTITION(partiton_name);
for example , emp table is done 10 HASH partition's and
if i give select statement with out PARTITION clause
i.e select emp_no from emp where emp_no=134;
how oracle scan's emp table ? i.e it will search all partition until it find emp_no=134 or is it
always starts first sacning in first partition if value not found then search next
partition..and soo on until result found .How oracle scans table if i don't give partition clause
in select statement?
Followup March 17, 2006 - 5pm Central time zone:
do not do 10 partitions, do
2, 4, 8, 16, 32, 64, 128, .... (*2)
10 will not distribute the data very well.
If you hashed the table on emp_no - then "where emp_no=134" will take the 134, hash it, figure out
the only partition it could be in and look there.
Should I Partition?
March 21, 2006 - 6pm Central time zone
Reviewer: RG
Hi:
I am trying to find out if I should partition a table and if so how?
This table will be used to track who accesses certain tables.
Below are the details:
drop table t;
create table t
(
id_type number
,cd_type char(1)
,ts_cr date
,id_user number
);
insert into t values (1,'C',SYSDATE,1);
insert into t values (1,'B',SYSDATE,1);
insert into t values (1,'P',SYSDATE,1);
commit;
id_type identifies the unique id of the row being accessesd on a different table.
cd_type identifies the table being accessed by the user and it can only have values ('B','C' or
'P').
ts_cr the time the table was accessed
id_user id of the user
A report will be generated to determine if a particular row in a table has been accessed (based on
id_type, cd_type and ts_cr).
select * from t where id_type = 1 and cd_type = 'C' and ts_cr between
to_date('01/01/2006','MM/DD/YYYY') AND to_date('12/31/2006','MM/DD/YYYY');
ID_TYPE C TS_CR ID_USER
---------- - --------- ----------
1 C 21-MAR-06 1
or based on id_user and ts_cr
select * from t where id_user = 1 and ts_cr between to_date('01/01/2006','MM/DD/YYYY') AND
to_date('12/31/2006','MM/DD/YYYY');
ID_TYPE C TS_CR ID_USER
-------- - --------- ----------
1 C 21-MAR-06 1
1 B 21-MAR-06 1
1 P 21-MAR-06 1
The table will be populated every week with 26,000 rows (about 1.2 million rows every year) as
follows:
CD_TYPE = 'C' ~ 15,000 rows
CD_TYPE = 'B' ~ 2,000 rows
CD_TYPE = 'P' ~ 9,000 rows
Also, data from this table will be archived (no details on how often, any suggestions)
create table t_archive
(
id_type number
,cd_type char(1)
,ts_cr date
,id_user number
);
Thanks for your help!!
Followup March 22, 2006 - 3pm Central time zone:
why do you want to partition - what do you NEED that partitioning can PROVIDE.
It can make purging of data easier. (ease administration)
It can make the data more available in some cases.
It can make certain specific classes of queries potentially faster.
It can help increase concurrency in a high insert environment sometimes.
But most of the above may be mutually exclusive. What is your number one important goal.
Re: Should I Partition?
March 23, 2006 - 12pm Central time zone
Reviewer: RG
My number one important goal is to make the queries for the reports I had mentioned (below) fast
(if it is possible through partitions).
.....
A report will be generated to determine if a particular row in a table has been
accessed (based on id_type, cd_type and ts_cr).
select * from t where id_type = 1 and cd_type = 'C' and ts_cr between
to_date('01/01/2006','MM/DD/YYYY') AND to_date('12/31/2006','MM/DD/YYYY');
ID_TYPE C TS_CR ID_USER
---------- - --------- ----------
1 C 21-MAR-06 1
or based on id_user and ts_cr
select * from t where id_user = 1 and ts_cr between
to_date('01/01/2006','MM/DD/YYYY') AND to_date('12/31/2006','MM/DD/YYYY');
Followup March 23, 2006 - 1pm Central time zone:
do you plan on having your report full scan or index range scan to find these rows.
full scan => partitioning could be employed to help speed up the full scan via partition
elimination
index range => partitioning would have to be very carefully employed so as to not make it go
slower, the odds of it speeding up the query are slim to none.
Re: Should I Partition?
March 23, 2006 - 3pm Central time zone
Reviewer: RG
Normally I use indexes to make queries fast. I have never used partitions and since I know that
this table is going to be huge over time I wanted to find out if partitioning the table right when
it is created would be better for performance when compared to having indexes on the table. I ran
the following sql for test:
SQL> create table t
2 (
3 id_type number
4 ,cd_type char(1)
5 ,ts_cr date
6 ,id_user number
7 );
Table created.
SQL>
SQL> insert into t values (1,'C',SYSDATE,1);
1 row created.
SQL> insert into t values (1,'B',SYSDATE,1);
1 row created.
SQL> insert into t values (1,'P',SYSDATE,1);
1 row created.
SQL> commit;
Commit complete.
SQL> set autotrace traceonly;
SQL> select * from t where id_type = 1 and cd_type = 'C' and ts_cr between
2 to_date('01/01/2006','MM/DD/YYYY') AND to_date('12/31/2006','MM/DD/YYYY');
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 TABLE ACCESS (FULL) OF 'T'
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
8 consistent gets
0 physical reads
0 redo size
581 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
SQL> select * from t where id_user = 1 and ts_cr between
2 to_date('01/01/2006','MM/DD/YYYY') AND to_date('12/31/2006','MM/DD/YYYY');
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 TABLE ACCESS (FULL) OF 'T'
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
8 consistent gets
0 physical reads
0 redo size
622 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed
SQL>
SQL>
SQL> create index indx1_t on t(id_type,cd_type,ts_cr);
Index created.
SQL> create index indx2_t on t(id_user,ts_cr);
Index created.
SQL> exec dbms_stats.gather_table_stats(user,'T',cascade=>true);
PL/SQL procedure successfully completed.
SQL> select * from t where id_type = 1 and cd_type = 'C' and ts_cr between
2 to_date('01/01/2006','MM/DD/YYYY') AND to_date('12/31/2006','MM/DD/YYYY');
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=16)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=2 Card=1 Bytes=
16)
2 1 INDEX (RANGE SCAN) OF 'INDX1_T' (NON-UNIQUE) (Cost=1 Car
d=1)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
3 consistent gets
0 physical reads
0 redo size
581 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
SQL> select * from t where id_user = 1 and ts_cr between
2 to_date('01/01/2006','MM/DD/YYYY') AND to_date('12/31/2006','MM/DD/YYYY');
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=3 Bytes=48)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=2 Card=3 Bytes=
48)
2 1 INDEX (RANGE SCAN) OF 'INDX2_T' (NON-UNIQUE) (Cost=1 Car
d=3)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
4 consistent gets
0 physical reads
0 redo size
622 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed
SQL>
Followup March 23, 2006 - 3pm Central time zone:
like I said - are you planning on using indexes to retrieve your data (big table, but you need a
relatively small piece of it)
or are you planning on full scanning, because it is a big table and you need LOTS of it.
if you index scan - partitioning must be applied in a way to NOT degrade query performance.
if you full scan - perhaps we can partition to IMPROVE query performance
Re: Should I Partition?
March 23, 2006 - 5pm Central time zone
Reviewer: RG
I am planning on using indexes to retrieve small pieces of data. So based on what you are saying it
looks like partitioning is not going to help me a whole lot (to make the reports run faster). Am I
correct?
Thanks!!
Followup March 23, 2006 - 6pm Central time zone:
correct - it would be HIGHLY unlikely partitioning would do anything positive for your query
retrieval.
Re: Should I Partition
March 23, 2006 - 6pm Central time zone
Reviewer: RG
Thanks Tom!!
Partition Exchange to partitioned tabled
March 24, 2006 - 2am Central time zone
Reviewer: Murali from Singapore
Tom:
a)
we have 20 million records non partition table. just assume T1 . we would like to add partition to
non partitiion table how can we do this?
Can we use ALTER Table to add partition?
2) We have one partition table just assume table_1 with 4 range partitions on Date column. we want
to move partion1 and partition2 of table1 to second table called table2 which is already partitiond
on same date column as table1
FYI,table2 structure,datatypes is same as table1
How can we do this?
pls help
Followup March 24, 2006 - 9am Central time zone:
a) and then 2) ;)
you have to create a partitioned table, you cannot take an existing table and make it suddenly be a
partitioned table
You can
a) create a new partitioned table and then EXCHANGE this existing table for a partition in the
newly created table - hence converting the table T1 into a partition of a partitioned table.
b) use dbms_redefinition to online redefine the data into a partitioned table. Search this site
for dbms_redefinition - examples are there showing that.
for the second question - you would alter table T1 and exchange partition 1 with an empty table
(turn partition 1 into a table), then you can exchange that table with the other partitioned table.
Repeat for partition 2.
Mohamed From France
March 24, 2006 - 10am Central time zone
Reviewer: A reader
Dear Tom,
Exchanging partition is not as easy as you wrote. I have tested it. This is the method I've
followed.
I have a master table T1 as follows
IDE NUMBER
DAT_PART DATE
.../...
range partitioned on dat_partition (3 months). This master table has several other child table like
T2
Master_IDE
Master_Dat_part
also range partitioned by dat_partition
1.set constraints of master and slave tables to no validate (disable constraints).
2.create empty clone non partitioned tables with relevant indexes in place.
3.set constraints of clone tables to a non validate state (disable constraints)
4. exchange each child partition with empty clone table including indexes but without validation
5.exchange parent partition with corresponding empty clone table.
-- disable constraints of partitioned tables
alter table CHILD_TABLE disable constraint Child_Fk;
alter table PARENT_TABLE disable constraint Parent_Pk;
-- create empty clone non partitioned tables
create table PARENT_TABLE_CLONE as select * from PARENT_TABLE where 1 = 2;
create table CHILD_TABLE_CLONE as select * from CHILD_TABLE where 1 = 2;
-- create indexes and constraints of empty clone non partitioned tables
ALTER TABLE CHILD_TABLE_CLONE ADD (
CONSTRAINT CHILD_FK_01_clone FOREIGN KEY (mast_ide, mast_dat_part)
REFERENCES PARENT_TABLE_CLONE (IDE,DAT_PART));
CREATE UNIQUE INDEX PK_MAST_CLONE ON PARENT_TABLE_CLONE
(IDE, DAT_PART);
ALTER TABLE PARENT_TABLE_CLONE ADD (
CONSTRAINT PK_MAST_CLONE PRIMARY KEY (IDE, DAT_PART)
);
-- disable constraints of empty clone non partitioned tables
ALTER TABLE CHILD_TABLE_CLONE disable constraint CHILD_FK_01_clone;
ALTER TABLE PARENT_TABLE_CLONE disable constraint PK_MAST_CLONE;
-- Exchange the partition with the empty clone tables starting by child ones
ALTER TABLE CHILD_TABLE
EXCHANGE PARTITION P_CHLD_20050331
WITH TABLE CHILD_TABLE_CLONE
INCLUDING INDEXES
WITHOUT VALIDATION
-- Exchange the partition of the master table
ALTER TABLE PARENT_TABLE
EXCHANGE PARTITION P_MAST_20050331
WITH TABLE PARENT_TABLE_CLONE
INCLUDING INDEXES
WITHOUT VALIDATION
Followup March 24, 2006 - 10am Central time zone:
what was terribly difficult.
It was asked "how to", this is "how to" (one way anyway)
Mohamed from France
March 24, 2006 - 11am Central time zone
Reviewer: A reader
Because if you do not follow the indicated order you will get the following error
ORA-14097: column type or size mismatch in ALTER TABLE EXCHANGE PARTITION
Even if your tables to be exchanged are exactly the same.
Kind Regards
Followup March 24, 2006 - 3pm Central time zone:
I'm still not getting it, you have to execute commands in the correct order? so?

March 29, 2006 - 1am Central time zone
Reviewer: Murali from singapore
Tom:
If my current table is not partitioned but My table T1 have 20 millions.
What is the best method to transfer data from NON_PARTITIONED TABLE TO PARTITIONED TABLE . ( both
partitioned and nonpartitioned tables have same columns names and same datatypes)
CAN I use Alter EXCHANGE PARTITION to transfer my data into partitioned table?
or insert /*+ append */ into partitioined_table select * from non_partitioned table?
Which one is the best
Followup March 29, 2006 - 7am Central time zone:
"best" always depends on "need"
If there were a BEST way in all cases, we would not have wasted our developers cycles giving you
more than one.
IF downtime is permitted
then
stop modifications to non-partitioned
create table partitioned as select from non-partitioned
index partitioned
grant on it
constrain it
drop non-partitioned
rename partitioned
else
dbms_redefinition with the same steps as before except for the last two
dbms_redefinition finish redef to replace last two
end if
would be the two primary paths to consider.

March 29, 2006 - 9am Central time zone
Reviewer: Mohamed from France from FRANCE
This is just an example showing how to exchange partitions
-- This is the master table to be archived
CREATE TABLE MHO_ARC_MAST
(IDE NUMBER,
DAT_PART DATE,
NMFI_IDE NUMBER,
PTFL_IDE NUMBER)
PARTITION BY RANGE (DAT_PART)
(PARTITION P_20050331 VALUES LESS THAN (TO_DATE(' 2005-04-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS',
'NLS_CALENDAR=GREGORIAN'))
)
-- Index
CREATE UNIQUE INDEX PK_MAST ON MHO_ARC_MAST
(IDE, DAT_PART)
-- Primary Key
ALTER TABLE MHO_ARC_MAST ADD (
CONSTRAINT PK_MAST PRIMARY KEY (IDE, DAT_PART)
);
-- with the following data in it
11:54:02 > select * from MHO_ARC_MAST;
IDE DAT_PART NMFI_IDE PTFL_IDE
-------- --------- --------- ---------
1 01-JAN-05 56 89
2 13-JAN-05 55 89
3 15-JAN-05 44 55
-- This is the slave table to be archived
CREATE TABLE MHO_ARC_CHILD1
(MAST_IDE NUMBER,
MAST_DAT_PART DATE,
DAT_TYP VARCHAR2(6),
VLU DATE)
PARTITION BY RANGE (MAST_DAT_PART)
(PARTITION P_CHLD_20050331 VALUES LESS THAN (TO_DATE(' 2005-04-01 00:00:00','SYYYY-MM-DD
HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'))
)
-- Index
CREATE UNIQUE INDEX PK_CHILD1 ON MHO_ARC_CHILD1
(MAST_IDE, MAST_DAT_PART);
-- Primary Key
ALTER TABLE MHO_ARC_CHILD1 ADD (
CONSTRAINT PK_CHILD1 PRIMARY KEY (MAST_IDE, MAST_DAT_PART);
-- Foreign Key
ALTER TABLE MHO_ARC_CHILD1 ADD (
CONSTRAINT CHILD_FK_01 FOREIGN KEY (MAST_IDE, MAST_DAT_PART)
REFERENCES MHO_ARC_MAST (IDE,DAT_PART));
-- with the data in it
11:54:26 > select * from MHO_ARC_CHILD1;
MAST_IDE MAST_DAT_ DAT_TY VLU
--------- --------- ------ ---------
1 01-JAN-05 000010 29-NOV-05
2 13-JAN-05 000010 29-NOV-05
--Now I would like to archive those two tables using exchange partition (oracle 8.1.7).
-- first create the clone tables
-- create master clone non partitioned table MHO_ARC_MAST_CLONE ans its constraints
13:09:17 > CREATE TABLE MHO_ARC_MAST_CLONE
13:09:40 2 ( IDE NUMBER,
13:09:40 3 DAT_PART DATE,
13:09:40 4 NMFI_IDE NUMBER,
13:09:40 5 PTFL_IDE NUMBER);
Table created.
13:10:13 > CREATE UNIQUE INDEX PK_MAST_CLONE ON MHO_ARC_MAST_CLONE
13:10:13 2 (IDE, DAT_PART);
Index created.
13:10:13 > ALTER TABLE MHO_ARC_MAST_CLONE ADD (
13:10:34 2 CONSTRAINT PK_MAST_CLONE PRIMARY KEY (IDE, DAT_PART)
13:10:34 3 );
Table altered.
-- create slave clone non partitioned table mho_arc_child1_Clone and its constraints
13:10:34 > create table mho_arc_child1_Clone (Mast_ide number,Mast_Dat_Part Date
13:11:33 2 ,dat_typ VARCHAR2(6), VLU DATE);
Table created.
create unique index PK_CHILD1_CLONE ON MHO_ARC_CHILD1_CLONE(mast_ide, mast_dat_part);
Index created.
13:12:11 > alter table mho_arc_child1_Clone add constraint PK_CHILD1_Clone
13:15:04 2 primary key (mast_ide, mast_dat_part);
Table altered.
13:16:10 > ALTER TABLE mho_arc_child1_Clone ADD (
13:16:24 2 CONSTRAINT CHILD_FK_01_clone FOREIGN KEY (mast_ide, mast_dat_part)
13:16:24 3 REFERENCES mho_arc_mast_Clone (IDE,DAT_PART));
Table altered.
-- Now I will start exchanging partitions with all constraints on. I will also start exchanging the
master table
13:28:12 > ALTER TABLE MHO_ARC_MAST
13:28:13 2 EXCHANGE PARTITION P_20050331
13:28:13 3 WITH TABLE MHO_ARC_MAST_CLONE
13:28:13 4 WITHOUT VALIDATION;
ALTER TABLE MHO_ARC_MAST
*
ERROR at line 1:
ORA-02266: unique/primary keys in table referenced by enabled foreign keys
-- Then I will try to exchange the child one first
13:28:16 > ALTER TABLE MHO_ARC_CHILD1
13:28:52 2 EXCHANGE PARTITION P_CHLD_20050331
13:28:52 3 WITH TABLE MHO_ARC_CHILD1_CLONE
13:28:52 4 WITHOUT VALIDATION;
ALTER TABLE MHO_ARC_CHILD1
*
ERROR at line 1:
ORA-14128: FOREIGN KEY constraint mismatch in ALTER TABLE EXCHANGE PARTITION
-- Then I start disabling all the constraints
13:30:12 > alter table MHO_ARC_CHILD1_CLONE disable constraint pk_child1_clone;
Table altered.
13:30:42 > alter table MHO_ARC_CHILD1_CLONE disable constraint child_fk_01_clone
13:30:52 2 ;
Table altered.
13:30:55 > alter table MHO_ARC_CHILD1 disable constraint child_fk_01;
Table altered.
13:31:41 > alter table MHO_ARC_CHILD1 disable constraint pk_child1;
Table altered.
13:32:01 > alter table MHO_ARC_MAST_CLONE disable constraint pk_mast_clone;
Table altered.
13:33:07 > alter table MHO_ARC_MAST disable constraint pk_mast;
Table altered.
-- And now I will exchange the partition starting by child one
13:33:32 GSPD> ALTER TABLE MHO_ARC_MAST
13:34:28 2 EXCHANGE PARTITION P_20050331
13:34:28 3 WITH TABLE MHO_ARC_MAST_CLONE
13:34:28 4 WITHOUT VALIDATION
13:34:28 5 ;
Table altered. ===> tables have been exchanged
-- Lets go the master tables
13:34:30 GSPD> ALTER TABLE MHO_ARC_CHILD1
13:35:17 2 EXCHANGE PARTITION P_CHLD_20050331
13:35:17 3 WITH TABLE MHO_ARC_CHILD1_CLONE
13:35:17 4 WITHOUT VALIDATION
13:35:17 5 ;
Table altered. ===> tables have been exchanged
-- At this step partioned tables have been put into non partitioned ones
-- By executing the same Exchange partition the inverse process is done
16:11:27 > ALTER TABLE MHO_ARC_CHILD1
16:11:27 2 EXCHANGE PARTITION P_CHLD_20050331
16:11:27 3 WITH TABLE MHO_ARC_CHILD1_CLONE
16:11:27 4 WITHOUT VALIDATION;
Table altered.
16:11:28 > ALTER TABLE MHO_ARC_CHILD1
16:11:28 2 EXCHANGE PARTITION P_CHLD_20050331
16:11:28 3 WITH TABLE MHO_ARC_CHILD1_CLONE
16:11:28 4 WITHOUT VALIDATION;
Table altered.
Best Regards

March 30, 2006 - 5am Central time zone
Reviewer: Murali from singapore
Tom:
Thanks for your examples.
I would like to regarding performance of "index partitioned" when i insert 0.5 millions of records
into table T1 . which one works better i.e partition index or non partition index which one is
faster when i am inserting records. option 'a' or option 'b'
a) if my table T1 have 20 millions of records is done 8 partitions and my index' is is not
partitioned
b) My table T1 have 8 partitions and my index's also have 8 partition's
Thanks
Followup March 31, 2006 - 11am Central time zone:
answer is:
it depends.
say you have thousands of concurrent users inserting these 0.5 million rows (very small amount of
data, not too much to worry about). Then having a partitioned index could spread the points of
contention (index splits, hot right hand sides and such) over many physical structures. it could
increase performance.
say you have a single process inserting all of this data in bulk. Then, figuring out WHICH
partition to put each entry into will incurr some runtime overhead that would not otherwise be
there.

April 2, 2006 - 11pm Central time zone
Reviewer: Murali from Singapore
Tom:
Thanks for your respone.
Lets say we have a single process inserting all of this data row by row i.e in loop , not bulk
insert
is it Ok to use partition index's than normal global index?
Why i am stressing on "use of partition index is"
we have a batch process that insert 0.5 mill records into T1 and select records from same table T1
with different where conditions. we have currently table T1 have 5 partition's. but index is 4
global index on T1 i.e index is not partitioned.
If i used partition index than normal index , is it improve my performance
Followup April 3, 2006 - 8am Central time zone:
it would do nothing to increase performance, it would likely make the process a little slower - but
not nearly as slow as the slow by slow processing you are doing is (it would be hard to make it
slower than you already have!)
I need your opinion
April 19, 2006 - 5pm Central time zone
Reviewer: Mike from Chicago, IL
Tom,
I have a table with few columns. Two indexes in this table and one of them is primary key index.
I will add new column in this table and create range partitions using new column as a partitioning
key.
All queries where this table appears will contain new column in their WHERE clause. Every time new
column value will point only on one partition.
I will make all existing indexes to be a local.
Is this a good idea?
From my point of view, since each query will select the records only from one partition, local
indexes can produce only benefit.
Thanks,
Followup April 19, 2006 - 6pm Central time zone:
"I will make all existing indexes to be a local."
well, except for the index on the primary key of course.... how will you get that index to be used
for enforcement of uniqueness to be local? Unless you partition by the primary key - that won't
happen.
ops$tkyte@ORA10GR2> CREATE TABLE t
2 (
3 dt date,
4 x int,
5 y varchar2(25)
6 )
7 PARTITION BY RANGE (dt)
8 (
9 PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
10 PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
11 PARTITION junk VALUES LESS THAN (MAXVALUE)
12 )
13 /
Table created.
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> alter table t add constraint t_pk
2 primary key(x)
3 using index ( create unique index t_pk on t(x) local );
alter table t add constraint t_pk
*
ERROR at line 1:
ORA-14039: partitioning columns must form a subset of key columns of a UNIQUE
index
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> alter table t add constraint t_pk
2 primary key(x)
3 using index ( create index t_pk on t(x) local );
alter table t add constraint t_pk
*
ERROR at line 1:
ORA-14196: Specified index cannot be used to enforce the constraint.
the rest - fine, if the queries all include that column and we can partition eliminate down to a
single index partition - that is good.

April 19, 2006 - 7pm Central time zone
Reviewer: althaf from India
Is there a way I can test that the records are going to the correct in partitions(ramge)?

April 19, 2006 - 7pm Central time zone
Reviewer: Althaf from India
Is there a way I can test that the records are going to the correct partitions?
Followup April 19, 2006 - 7pm Central time zone:
they will go into the correct partition - I assume you mean "the partition I expected them to" :)
you can
select from table_name partition(partition_name) where key = value;
it'll only query that partition for that record.
Partitioning Existing table
June 13, 2006 - 2am Central time zone
Reviewer: ravs from India
Hi Tom,
Thanks alot for this greate site!
I have a table with no partitions on which snapshots (version 8i) are defined. These snapshots are
used by other interfaces. Now i want to partition this table with out full refreshing the snapshots
used by other interfaces.
if i drop the original table then the snapshots logs on the table will become unusable and
requires a full refresh(correct me if i am wrong)
is there any way to partition this table without dropping ?
- Thanks in advance
ravi
Followup June 13, 2006 - 12pm Central time zone:
you will have to full refresh the materialized views, you will be creating an entirely new object
there.
help with partitioning strategy
June 14, 2006 - 11am Central time zone
Reviewer: Ryan from virginia
We are developing a multi-TB bulk load system(minimal OLTP queries, no oltp DML). We will use BCVs
to copy data to a reporting system and transform. It is in development. I have some bulk data, but
not all of it. Right now I am working on a partitioning strategy to optimize bulk data loads. I
have not worked extensively with partitioning.
We have an an ID column that is a key to many of our tables. We want to partition on this column so
that when we do joins we can take advantage of partition wise joins.
We get large data loads. We have discussed splitting these up into small pieces(we have a seperate
ETL tool on a seperate server that can do that). So we can open say 5-10 threads at once each
processing part of the data.
We talked about doing the following partitioning strategy:
mod(id,100) part_key
This way we have 10 partitions. We use list partitions of the form:
PARTITION p1 VALUES (0, 10, 20, 30, 40, 50, 60, 70, 80, 90)
PARTITION p2 VALUES (1, 11, 21, 31, 41, 51, 61, 71, 81, 91)
Here is the process:
1. the loading database recieves 10 files each seperated by the partition scheme above
2. each thread uses an external table to load the data into a global temp table(each thread will
share the same global temp table DDL
3. we then have to do joins/transformation to tables in the system and then mainly bulk inserts.
Most of this can be done with SQL. We need some pl/sql
We did 2 tests, where we checked the plans and ran 10046 traces and tkprofs on some of the queriers
with non-partitioned tables and some with partitioned tables. The non-partitioned tables were
significantly better in everything from response time to logical IO. When we analyzed the plans
here is what we found were the differences in the plans:
Here is a sample query:
insert into <table>
select col1,col2,col3
from globalTempTable a, applicationTable b
where a.ID = b.ID
and b.part_key in (0, 10, 20, 30, 40, 50, 60, 70, 80, 90);
take out b.part_key for the non partitioned table. We also tried using the 'partitioning keyword'.
Here are the plans:
Non-partitioned table:
INSERT STATEMENT
NESTED LOOPS
INDEX FAST FULL SCAN | globalTempTable
TABLE ACCESS BY INDEX ROWID | applicationTable
INDEX UNIQUE SCAN |applicationTable.ID
On partitioned Table
INSERT STATEMENT
TABLE ACCESS BY LOCAL INDEX ROWID | applicationTable
NESTED LOOPS
INDEX FAST FULL SCAN | globalTempTable
INLIST ITERATOR
PARTITION LIST ITERATOR | KEY(I) |KEY(I)
INDEX RANGE SCAN | uses local index
Looks like the difference is between the unique vs. range scan. We tested this on 5000 co-located
rows(we inserted rows to the application table and the global temp table ordered by ID)
1. any way to get the partitioned table to use a unique scan.
2. I don't quite understand the partitioned plan. I think it's saying that it is using partition
elimination? I don't understand why we have 'INLIST ITERATOR' and why we we have the clause
'PARTITION LIST ITERATOR'
3. Any suggestions? We want to break it up in part to reduce waits on the same blocks.
Primary Key
July 14, 2006 - 1am Central time zone
Reviewer: Hitesh Bajaj from INDIA
Hi Tom,
We in the warehouse has a table which has more than 1 billion rows.
The table structure is something like this:
CREATE TABLE P_Data
(
P_tag VARCHAR2(10),
P_timestamp TIMESTAMP,
Insert_date Date
)
PARTITION BY RANGE (Insert_date)
(
PARTITION part1 VALUES LESS THAN (to_date('13-mar-2006','dd-mon-yyyy')) ,
PARTITION part2 VALUES LESS THAN (to_date('14-mar-2006','dd-mon-yyyy')) ,
and so on...
)
/
This insert_date is the default sysdate when the record is inserted in the table. And the
P_Timestamp is the timestamp when the record was generated by the user community.
& we have a Primary Key based on P_tag which is huge and when ever we do play with some kind of
partition maintenance the PK becomes unusable, and it becomes very difficult to build the PK also,
it takes huge 12-13 hrs to build the PK. We even did try to use UPDATE GLOBAL INDEXES when doing
the partition maintenance but that takes even more longer to build the index.
Now we are thinking to partition the table by the P_Timestamp column and make a local PK index of
P_Timestamp + P_tag, since we feel that any file if trying to get loaded twice by the SQL*LOADER
will fail since the P_timestamp and P_tag will always be unique through out the application in
years to come.
What are your thoughts on this? This will also save us considerable down time in the warehouse.
Also suggest if there is any way to convert a table partitioning from one column to another OR it
is just to create another table and reload the data from the old table to the new constructed
table.
Appreciate a reply. Thanks
Followup July 14, 2006 - 8am Central time zone:
but so what if it takes longer with "update global indexes" - the goal of that command is simply:
"TAKE LONGER, but don't preclude access. Let people continue to work"
so - the question is "so"? It is supposed to take longer - but your end users don't know or care.
You have achieved your goal of "save us considerable down time in the warehouse" with update global
indexes - I'm confused?
Isn't the approach of having local indexes beneficial
July 14, 2006 - 11am Central time zone
Reviewer: Hitesh from INDIA
Hi Tom,
Agreed with you, but isn't the local PK index is beneficial in all cases where in the maintenance
time is negligible.
More importantly with the index becoming so huge the wait events for 'db file sequential read'
while inserting the records via sqlldr are increasing day by day.
Is that the idea of local PK index beneficail.
Thanks
Followup July 14, 2006 - 12pm Central time zone:
But having a local primary key index implies you partitioned by the primary key which means the
mainanence operations you want to do by time - well CANNOT BE DONE since the data isn't partitioned
by what you need it partitioned by !!!!
the primary key index need not be a single monster one either - it can be range partitioned (and in
10g hash partitioned) globally by itself.
The idea of a local index in support of your primary key only makes sense IF YOU CAN PARTITION BY
YOUR TABLE BY PRIMARY KEY - but you cannot, you want to partition by TIME.
Local PK Index
July 14, 2006 - 1pm Central time zone
Reviewer: Hitesh from INDIA
Hi Tom,
But having the PK index on the prefixed column (on which the partition has been done) + P_tag will
solve the problem.
What happens is the data files comes to us which needs to get loaded in the warehouse using sqlldr.
In these data files we have fixed format of a tag and the timestamp when that tag got generated.
These files needs to get loaded and the tag will come to us once thru' out the application
existence. Thus it makes sense to have a local PK index so that the same file should not get
reloaded twice by the different file name.
That what the validation we are looking forward.
Thanks
Followup July 14, 2006 - 1pm Central time zone:
but - you have changed your partitioning scheme, why did you partition in the first place?
what was the LOGIC behind your original partitioning scheme? Because you just just blown that
logic out of the water (one would presume "purging")
Local indexes
July 14, 2006 - 2pm Central time zone
Reviewer: hitesh from INDIA
Hi Tom,
Earlier logic was based on the purging of old data based on Insert_date which is the system date
and now the logic would shift to purging of partitions based on the dates for which the files were
created, or you may say when the users placed their request.
What used to happen was that sqlldr used to lag behind by 2-3 days so we will see non-uniform
patterns of partition data since we used to load all the data for the past two days in a single day
by removing the PK of the table.
Thus the partitons would become something like this, for a given day it will have 0 rows and for
the other day it will have 6 million rows.
Now with this new partitioning method the data will always be uniformly scattered across all
partititions.
Since , we would be implementing this approach just wanted to confirm from one of the greats in
oracle whether I am going to do the right thing
Appreciate for the answers
Followup July 14, 2006 - 2pm Central time zone:
what is the new partitioning scheme then?
and how will it facilitate purging?
or are you never going to purge again.
Purging Mechanism
July 14, 2006 - 3pm Central time zone
Reviewer: Hitesh from INDIA
New Partitioning Scheme would be based on column
P_Timestamp TIMESTAMP, which is when user placed the request
instead of Insert_date when the data got loaded, which will be always behind the P_Timestamp.
Earlier purging used to happen based on the insert_date now it will happen based on P_timestamp
column with no maintenance of maintaining a big PK key for nearly 1 billion rows.
Now to trace the file if it has been loaded twice it will fail since the P_timestamp along with the
P_tag will always be UNIQUE through out the application.
Hence there we are clear that the combination of P__timestamp + P_tag can give us the advantage of
Primary Key local indexed.
Thanks
Followup July 14, 2006 - 3pm Central time zone:
so basically you are entirely changing everything - the primary key, the partitioning scheme,
everything.
Hmm.
that is fine.
will it be "uniform" - depends, only you can answer that (you know the partitioning scheme, you
know your data, you should be able to answer that bit?)
Local Indexes
July 15, 2006 - 11am Central time zone
Reviewer: hitesh from INDIA
Hi Tom,
Yes the data will be quite uniform across partitions, we definitely get about 30 million rows
everday, but may or may not be able to load them. This results in loading sometimes backlog data
for 2-3 days in a given day making the partitioning scheme not so uniform.
One partition may have data 0 rows and the other will have the data pertaining to 6 million rows
since the data was loaded on that particular day, because the partitioning on Insert_date is by
default the sysdate.
Do we have a method to quickly change the partitioning scheme and the rows shifts accordingly
depending on the new partitioning scheme of things, OR we have to create a new table with new
define partitioning architecture and then insert the rows from the old table to this new table.
What should be the fastest approach!
Thanks
Followup July 15, 2006 - 3pm Central time zone:
you have to create a brand new table and load it. Every single row goes "somewhere else" now. The
primary key field - disappears. This is a new implementation, a re-architecture, a do-over.
what kind partition for this table
July 20, 2006 - 10pm Central time zone
Reviewer: zb from GA
I have the table PA_BUDGET_LINES for OLTP and batch job mix database. The table is over 50 miliion
records and size of 14GB.
I consider to partition this table and not sure what kind partition to be used.
There 3 indexes on the table now:
1) uqueue
on columns RESOURCE_ASSIGNMENT_ID, START_DATE
2) on
START_DATE, END_DATE
3) RESOURCE_ASSIGNMENT_ID, PERIOD_NAME
Based on the current indexes, what kind partition would you recommend?
it's 8.1.7.4 and we are soon to upgrade it 9.2.0.7
Thanks,
Zb
Followup July 22, 2006 - 5pm Central time zone:
you very much have to answer this question first and foremost:
What is it you wish to achieve by partitioning, what business problem are you trying to solve.
From there, we can start talking about "how". (if you have access to my book "Expert Oracle
Database Architecture" or "Effective oracle by Design", read the partitioning chapters so you can
fully understand what I'm trying to say here)
Partitioning is a tool.
Before we use a tool - we need to have something to "fix".
So, what is broken here that we are trying to fix.
what kind of partition method would be?
July 22, 2006 - 9pm Central time zone
Reviewer: ZB
<qoute
What is it you wish to achieve by partitioning, what business problem are you
trying to solve.
>
the table grows at the rate of more than 4 million records monthly even under the daily purge job
to maintain the size of the table.
Having that said, the table grows like it is now and we cannot do much more on controling the rate
growth unless the business rule altered -- that is very unlikely .
I am concerned the performance impact of the size of non-partitioned talbe, which eventually will
hit us. we have not noticed the performace issue yet for the current size of 50 million
records(14GB). We might be hit by performace once the size as 100 million resords or more.
I wonder if the partition-elimination will help on the performance if the partition implementated
on the table.
The applications queies use the table as the indexes intended:
There 3 indexes on the table now:
1) unqeue
on columns RESOURCE_ASSIGNMENT_ID, START_DATE
2) on columns
START_DATE, END_DATE
3) on columns
RESOURCE_ASSIGNMENT_ID, PERIOD_NAME
Any methods of partition could be helpful for the situation? Or do you think we will be OK on
performance perspective for non-partitioned table grows like it does, even in the future it would
be more than 100 million records next year?
TIA
ZB
Followup July 23, 2006 - 8am Central time zone:
why? why are you worried. If you retrieve say by primary key (or any fixed number of rows via an
index) then it is true that the time to retrieve data from a 5 row table can be about the same as
it takes to retrieve from a 5,000,000 row table or as a 5,000,000,000 row table.
So, do you plan on purging data?
What is the goal you wish to achieve via partitioning - given that in a transactional system - it
is more likely you negatively impact performance by applying partitioning if you are not very
careful, rather than positively affect transactional performance. It can be very useful to ease
administrative things, but in general is not a "performance" tool in OLTP.
ORA-14097 Error
September 12, 2006 - 4pm Central time zone
Reviewer: Vivek
I think what Mohamed (Mohamed from France March 24, 2006 )was trying to tell was that we need to
actually disable the constraints on the main table and the clone table using
alter table <main_table>
disable constraint <constraint_name>
alter table <clone_table>
disable constraint <constraint_name>
before we can execute the following:
alter table <main_table>
exchange partition <main_table_partition>
with table <clone_table>
including indexes without validation
command.
We cannot just do the "alter table exchange partition ... without validation" without physically
disabling the constraints on the main table and the clone table.
Otherwise we will get the "ORA-14097: column type or size mismatch in ALTER TABLE EXCHANGE
PARTITION" error.
I experienced this first hand.
Partition View
October 11, 2006 - 4am Central time zone
Reviewer: pradikan from India
Hi Tom,
one quick question. Is there any way to check what are all the columns included in a table
partition.
thanks in advance
sasikanth
Followup October 11, 2006 - 8am Central time zone:
all of them?
or do you mean "what is the partition key"?
user_part_key_columns
user_subpart_key_columns
reader
October 29, 2006 - 4pm Central time zone
Reviewer: A reader
exchanging partition is very good and time effective
Is there a similar method by which data can be appended to partitions. That is, can data in a
table be appended to the partition quickly besides utilizing , insert /*+ append */ method
Followup October 29, 2006 - 5pm Central time zone:
ok, I'm curious, why isn't the documented way to append with sql good enough?
there are direct path loads (sqlldr)
there are direct path loads (insert /*+ APPEND */ )
there are direct path loads (OCI - custom code)
Exchange partition
November 25, 2006 - 12pm Central time zone
Reviewer: Mohamed from France
Dear Tom,
I've just get your new book Expert Oracle Database Architecture. I jumped directly to Chapter 13
(Partitioning) particularly in its 612 page (Auditing and Segment Space Compression). After reading
this part of this Chapter, I have to ask you a question.
In our application we implemented a range partition on a master table say table T_master(ide,
dat_partition, ...)
where
pk : ide + dat_partition
partition key : dat_partition
locally partitioned index on : ide + dat_partition
partition : three months for each partition
To this master table are attached several partitioned child tables having ide +dat_partition as
Primary and Foreign key and are partitioned the same way as the master table
We are in Oracle 8.1.7.4.
We would like to archive 2 years of data (24 partitions by table)
Our goal is
: reduce disk space (because it cost)
: increase performance as far as tables will contain less data.
We have created 24 non partitioned tables For each partitionned table.
We intend to use
alter table T_master exchane partition P_T_master1 with
non_partitioned_Table_that_corresponds_To_P_T_master1;
Then we can drop P_T_master1 partition.
However it seems for me that this will certainly reduce the volume of data in the partitioned table
but will not gain disk space.
In addition we have developed screens on these partitioned tables and when we archive 24
partitions, corresponding data will not be available for end users.
And I have no idea how to bring them back so that the same screen will show the corresponding data
when partitions are back.
My questions are:
1. How can we imagine this archive in order to gain disk space ?
2. How can we put back archived partitions when the need to have them online directly from
existing screens will raise up?
3. How can we use read write and read only table space
Thanks in advance for your precious advises
Followup November 25, 2006 - 1pm Central time zone:
1) unless you actually remove data, how can you "gain disk space"? you have just changed the name
of the segment, you haven't taken it out.
2) you alter the table and put the partition back and exchange it in.
3) when the data is read only - make the tablespace read only? not sure what you mean.
Exchange partition
February 23, 2007 - 4pm Central time zone
Reviewer: CT VELU from MI
Hi Tom
I am following this site and your books for one of requirement in doing exchange partition. I am getting a error message while I try do exchange partition between a partitioned table with primary key (and global index) and a non partitioned table (no index). I tried few different options with the help of your book and this site but not quite successful ..........(either I am getting a column mis match or index mismatch). If I drop the constraint everything works fine.
Here is the script which generates error message.......
drop table my_partitioned;
create table my_partitioned
(run_nbr number(10),
cst_nbr number(10),
itm_nbr number(10),
yr number(4),
mth number(2),
sales number(4))
partition by list (run_nbr)
(
partition part_1 VALUES (1001),
partition part_2 VALUES (1002)
);
create index my_part_indx
on my_partitioned(run_nbr,cst_nbr,itm_nbr,yr,mth)
GLOBAL;
drop table dummy;
create table dummy
as
select *
from my_partitioned
where 1=2;
--create index dummy_index on dummy(run_nbr,cst_nbr,itm_nbr,yr,mth);
insert into my_partitioned (run_nbr,cst_nbr,itm_nbr,yr,mth,sales)
values(1001,5501,3456,2007,01,35);
insert into my_partitioned (run_nbr,cst_nbr,itm_nbr,yr,mth,sales)
values(1001,5501,3457,2007,01,35);
insert into my_partitioned (run_nbr,cst_nbr,itm_nbr,yr,mth,sales)
values(1001,5502,3456,2007,01,35);
insert into my_partitioned (run_nbr,cst_nbr,itm_nbr,yr,mth,sales)
values(1001,5503,3456,2007,01,35);
insert into my_partitioned (run_nbr,cst_nbr,itm_nbr,yr,mth,sales)
values(1001,5504,3456,2007,01,35);
commit;
select count(*)
from my_partitioned
where run_nbr = 1001;
select count(*)
from my_partitioned
where run_nbr = 1002;
select count(*)
from dummy;
ALTER TABLE my_partitioned
EXCHANGE PARTITION
PART_1 WITH TABLE dummy
including indexes
without validation
UPDATE GLOBAL INDEXES;
alter table my_partitioned add constraint
my_partitioned_pk
primary key(run_nbr,cst_nbr,itm_nbr,yr,mth);
-- Error pops up here either way you do the exchange-------
ALTER TABLE my_partitioned
EXCHANGE PARTITION PART_1 WITH TABLE dummy
including indexes
without validation
UPDATE GLOBAL INDEXES;
alter table my_partitioned drop constraint
my_partitioned_pk;
ALTER TABLE my_partitioned
EXCHANGE PARTITION PART_1 WITH TABLE dummy
including indexes
without validation
UPDATE GLOBAL INDEXES;
I tried creating a local unique index and a primary key on the table it works fine, but In the production table I would like to have local bitmap indexes on itm_nbr,yr, mth ( result set will be 2.5 to 5 million rows and 1000 distinct itm_nbr) and global primary key on cst_nbr(this will become a header number),itm_nbr,yr,mth.
Thank you very much for your time.
CT
Create a partitioned table using a select statement from another partitioned table
May 17, 2007 - 2pm Central time zone
Reviewer: Ryan from QC, CANADA
Tom,
is it possible to create a partitioned table using a select statement from another partitioned table.
Example:
create table2 --this the table I want to be partitioned
as
select * from table1 --this my partitioned table
thx
Followup May 18, 2007 - 3pm Central time zone:
yes, it is.
ops$tkyte%ORA10GR2> CREATE TABLE t
2 ( dt , x , y )
3 PARTITION BY RANGE (dt)
4 (
5 PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
6 PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
7 PARTITION junk VALUES LESS THAN (MAXVALUE)
8 )
9 as
10 select created dt, user_id x, username y from all_users
11 /
Table created.
Partitioned table question
June 7, 2007 - 5pm Central time zone
Reviewer: Menon
Hi Tom
Do you know off hand if in Oracle different partitions of a table can be stored in the same block?
Followup June 9, 2007 - 10am Central time zone:
that would be clustered tables, not partitions.
and would obviate the usefulness of partitioning in the first place.
Thanx!
June 11, 2007 - 10am Central time zone
Reviewer: Menon
I knew about clustered tables - in case of clustered tables it is different tables altogether sharing the same block. But I have not seen any doc/discussions on the partitions of the same table sharing a block. It is not very obvious to me why sharing the same block would completely defeat the purpose of paritioning.
Followup June 11, 2007 - 10am Central time zone:
tell me why you partition?
to do things like
a) partition elimination to speed a full scan. but wait, if you comingle the rows from many partitions on the same block - that is defeated
b) administer large data sets better - like 'truncate partition', but wait, you cannot truncate a table in a cluster, you can only truncate the cluster (which defeats all of the partitioning administration features)
c) higher availability via partition elimination, but you defeated that
d) colocation of related data - but you again defeat that by comingling it with other data
what possible advantage, real world use case - could you envision?
Thanx Tom!
June 14, 2007 - 11am Central time zone
Reviewer: Menon
Sometimes the simplest things come to you after a long time:) Let me ask you one further question which would clarify for me the whole issue. Is it true that for a normal unpartitioned heap table (not a cluster), there can not be sharing of block? That is, can one block contain rows from two different normal, unparitioned heap tables? I suspect the answer is no and that would explain why I was confused about partitioned tables.
Followup June 14, 2007 - 4pm Central time zone:
a cluster is the segment type that allows a block to hold rows from more than one table.
if the segment type is not cluster (eg: it is table) then the blocks in that segment will have rows from only one table.
Perfect, thanx!
June 14, 2007 - 4pm Central time zone
Reviewer: Menon
exchanging partitions
June 20, 2007 - 1am Central time zone
Reviewer: martina from vienna, austria
hello tom,
i have a package which handles our partitioned tables. it adds partitions, splits partitions and gets old partitions ready to be exported. to do this last part there is a procedure to exchange partitions with tables and this has given me some trouble.
it works in short like this:
1. create table new_table as select * from part_table where 1=2;
2. copy indices
3. copy not nulls
4. copy foreign keys
5. exchange old partition with new_table
6. drop old partition
i quickly found that i must not copy lob-indices. 2 days ago i realized i must not copy mv-function based indices. i know that i cannot handle function based indices and iots (so far). i know that a primary key or unique constraint must have the same name as the underlying index. (we still have 2 v8 databases and to put the cursor and loop into an anonymous block executed immediately did not work with v10).
given the cursor which indices to copy, what else am i missing:
procedure cpy_inx (p_fromtable varchar2,p_totable varchar2) is
begin
declare
w_cols varchar2(1000);
w_cname varchar2(30);
w_itsp varchar2(30);
w_inxname varchar2(27);
cursor c_inx is select i.index_name
,decode(i.uniqueness,'UNIQUE','UNIQUE',' ') uniqueness
,nvl(c.constraint_type,'X') constraint_type
,substr(i.index_name,length(i.index_name)-2) postfix
from user_indexes i,user_constraints c
where i.index_name = c.constraint_name (+)
and i.table_name = upper(p_fromtable)
and i.table_name = c.table_name (+)
and i.index_type not like '%LOB%'
and instr(i.index_name,'$') = 0 -- Against indices we don't want to copy
(fb-Index on stat_fil_mv)
and not exists (select 1
from user_indexes
where index_name =
upper(p_totable||substr(i.index_name,length(i.index_name)-2))
);
begin
select av_reorg.get_itsp(t.tablespace_name)
into w_itsp
from user_tables t
where t.table_name = upper(p_totable)
;
for r_inx in c_inx loop
begin
av_reorg.pput_line(r_inx.index_name);
declare cursor c_icols is
select n.column_name
from user_ind_columns i,user_tab_columns o,user_tab_columns n
where i.index_name = r_inx.index_name
and o.table_name = upper(p_fromtable)
and o.column_name = i.column_name
and n.table_name = upper(p_totable)
and n.column_id = o.column_id
order by i.column_position;
begin
w_cols := ' ';
for r_icols in c_icols loop
w_cols := w_cols || r_icols.column_name ||',';
av_reorg.pput_line(r_icols.column_name);
end loop;
w_cols := substr(w_cols,1,length(w_cols) - 1);
av_reorg.pput_line(w_cols);
end; select 'I'||av_reorg_seq.nextval
into w_inxname
from dual;
if r_inx.constraint_type = 'P'
then
bml_views.cre_pk(p_totable,w_inxname||r_inx.postfix,w_itsp,w_cols,r_inx.uniqueness);
else
bml_views.creinx(p_totable,w_inxname||r_inx.postfix,w_itsp,w_cols,r_inx.uniqueness);
end if;
exception when others then
av_reorg.pput_line('wasis');
av_reorg.handle_exception;
end;
end loop;
end;
end cpy_inx;
thank you in advance!
martina
Index rebuilding on table partition
January 15, 2009 - 1am Central time zone
Reviewer: Raja from India
Hi,
With respect to table partitioning, i have doubt on index rebuilding.
I have a Oracle 10gR1 database used for datawarehousing.
There are around 200 tables with partition made already
(partitions/subpartitions made for every month...) and some 200 tables without partitions.
Now i have to create/extend the partition. ( For the tables that have been partitioned already,
some have data in max partition and some dont have data in max partition ).
to create/extend the partition/max partition, before partitioning we have to disable the
constraints & drop the indexes and after partitioning we have to enable the constraints & re-create
those indexes.
Instead of dropping and re-creating the indexes again, i though of rebuilding the indexes. i felt
that this would also save the time.
so to rebuild the indexes, on which partition was done ( now the indexes should be in UNUSABLE
state ), i felt that i should check whether the following scenarios would work out properly (i.e, i
wanted to be sure that rebuilding the indexes should not be a problem under the scenarios that i
have mentioned you ).
a. first i want to know, how to check whether there is data or not in the max partition ?
b. second, how to find whether an index is a local index or a global index ?
I would like to tell the scenarios :
1. rebuilding indexes on a max partition with data
2. rebuilding indexes on a max partition with data on local indexes
3. rebuilding indexes on a max partition with data on global indexes
4. rebuilding indexes on a max partition without data
5. rebuilding indexes on a max partition without data on local indexes
6. rebuilding indexes on a max partition without data on global indexes
Table partition details are as follows :
-------------------------------------------------
partitioning_type subpartitioning_type
list none
range hash
Index partition details are as follows :
-------------------------------------------------
partitioning_type subpartitioning_type
range hash
index type details are as follows :
-------------------------------------------------
NORMAL
IOT - TOP
For example : i have the following create script of a sample table :
------------------------------------------------------------
CREATE TABLE ID_ALL
(
REP_ENT_COD_SK_NB NUMBER(12) NOT NULL,
COST_CENTER_COD_SK_NB NUMBER(12) NOT NULL,
POST_ACCN_MAJOR_COD_SK_NB NUMBER(12) NOT NULL,
PRDMKT_COD_SK_NB NUMBER(12) NOT NULL,
SCENARIO_COD_SK_NB NUMBER(12) NOT NULL,
DATA_TYPE_COD_SK_NB NUMBER(12) NOT NULL,
ACCN_PERIOD_COD_SK_NB NUMBER(12) NOT NULL,
CURRENCY_FUNC_COD_SK_NB NUMBER(12) NOT NULL,
OP_AMT_FAA_NB NUMBER,
USD_AMT_FAA_NB NUMBER,
SESSION_INST_COD_VC VARCHAR2(30 BYTE) NOT NULL
)
PARTITION BY RANGE (SESSION_INST_COD_VC)
SUBPARTITION BY HASH (COST_CENTER_COD_SK_NB)
SUBPARTITION TEMPLATE
(SUBPARTITION SP_01 TABLESPACE ID_DW_FCT_0128_DATA,
SUBPARTITION SP_02 TABLESPACE ID_DW_FCT_0128_DATA,
SUBPARTITION SP_03 TABLESPACE ID_DW_FCT_0128_DATA,
SUBPARTITION SP_04 TABLESPACE ID_DW_FCT_0128_DATA
)
(
PARTITION ID_P_20070800A VALUES LESS THAN ('20070800A-000')
( SUBPARTITION ID_P_20070800A_SP_01,
SUBPARTITION ID_P_20070800A_SP_02,
SUBPARTITION ID_P_20070800A_SP_03,
SUBPARTITION ID_P_20070800A_SP_04 ),
PARTITION ID_P_20070900A VALUES LESS THAN ('20070900A-000')
( SUBPARTITION ID_P_20070900A_SP_01,
SUBPARTITION ID_P_20070900A_SP_02,
SUBPARTITION ID_P_20070900A_SP_03,
SUBPARTITION ID_P_20070900A_SP_04 ),
.
.
PARTITION ID_P_20090400A VALUES LESS THAN ('20090400A-000')
( SUBPARTITION ID_P_20090400A_SP_01,
SUBPARTITION ID_P_20090400A_SP_02,
SUBPARTITION ID_P_20090400A_SP_03,
SUBPARTITION ID_P_20090400A_SP_04 ),
PARTITION ID_P_MAX VALUES LESS THAN (MAXVALUE)
( SUBPARTITION ID_P_MAX_SP_85,
SUBPARTITION ID_P_MAX_SP_88,
SUBPARTITION ID_P_MAX_SP_89,
SUBPARTITION ID_P_MAX_SP_90 )
)
ENABLE ROW MOVEMENT;
CREATE UNIQUE INDEX ID_ALL_PK ON ID_ALL
(COST_CENTER_COD_SK_NB, POST_ACCN_MAJOR_COD_SK_NB, PRDMKT_COD_SK_NB,
ACCN_PERIOD_COD_SK_NB, DATA_TYPE_COD_SK_NB,
SCENARIO_COD_SK_NB, REP_ENT_COD_SK_NB, CURRENCY_FUNC_COD_SK_NB);
ALTER TABLE ID_ALL ADD (
CONSTRAINT ID_ALL_PK
PRIMARY KEY
(COST_CENTER_COD_SK_NB, POST_ACCN_MAJOR_COD_SK_NB, PRDMKT_COD_SK_NB,
ACCN_PERIOD_COD_SK_NB, DATA_TYPE_COD_SK_NB, SCENARIO_COD_SK_NB,
REP_ENT_COD_SK_NB, CURRENCY_FUNC_COD_SK_NB));
the alter script for the table will be as follows :
------------------------------------------------------------
ALTER TABLE ID_ALL
SPLIT PARTITION ID_P_MAX AT ('20090400A-000')
INTO (
PARTITION ID_P_20090400A
NOLOGGING
TABLESPACE ID_DW_FCT_0128_DATA
PCTFREE 10
INITRANS 1
MAXTRANS 255
STORAGE (
BUFFER_POOL DEFAULT
)
( SUBPARTITION ID_P_20090400A_SP_01 TABLESPACE ID_DW_FCT_0128_DATA,
SUBPARTITION ID_P_20090400A_SP_02 TABLESPACE ID_DW_FCT_0128_DATA,
SUBPARTITION ID_P_20090400A_SP_03 TABLESPACE ID_DW_FCT_0128_DATA,
SUBPARTITION ID_P_20090400A_SP_04 TABLESPACE ID_DW_FCT_0128_DATA),
PARTITION ID_P_MAX
NOLOGGING
TABLESPACE ID_DW_FCT_0128_DATA
PCTFREE 10
INITRANS 1
MAXTRANS 255
STORAGE (
BUFFER_POOL DEFAULT
)
( SUBPARTITION ID_P_MAX_SP_85 TABLESPACE ID_DW_FCT_0128_DATA,
SUBPARTITION ID_P_MAX_SP_88 TABLESPACE ID_DW_FCT_0128_DATA,
SUBPARTITION ID_P_MAX_SP_89 TABLESPACE ID_DW_FCT_0128_DATA,
SUBPARTITION ID_P_MAX_SP_90 TABLESPACE ID_DW_FCT_0128_DATA)
);
Since i dont know much about partitioning concepts, also please explain me the above scripts.
Please explain me the above doubts on index rebuilding... in detail.
Thanks in Advance.
With Regards,
Raja.
Followup January 16, 2009 - 5pm Central time zone:
... to create/extend the partition/max partition, before partitioning we have to
disable the constraints & drop the indexes and after partitioning we have to
enable the constraints & re-create those indexes.
...
you lost me on that. Not sure at all what you are trying to say. First, I don't know what it means to "extend" a partition. Second, to add a partition (create a partition) you don't need to do what you say.
a) select 'yes' from table partition(max_partition_name) where rownum = 1;
if that returns 'yes', then yes, there is data in that max partition.
b) query user_part_indexes, locality will tell you global or local.
... Since i dont know much about partitioning concepts, also please explain me the
above scripts.
...
you best has rectify that before you start plodding along - there is a LOT of information out there on partitioning, start with the concepts guide, move onto the warehousing guide - BEFORE you start doing anything. I agree that you don't have the concepts down yet - you need to do that FIRST.
default attributes
February 11, 2009 - 6am Central time zone
Reviewer: martina from vienna, austria
Hi Tom,
I am periodically creating new partitions. For some tables I have to split - they are no problem.
But with some tables I have to add and I take min(tablespace_name) from user_tab_partitions. that
is not very good but max() would be as bad. In which DD view can I find the default attributes ?
thank you!
martina
Followup February 11, 2009 - 10am Central time zone:
I don't know what you mean - why would you take "min(tablespace_name)"???? I'm not following your logic here at all - I have no idea what your goal is.
find a tablespace
February 16, 2009 - 5am Central time zone
Reviewer: martina from vienna, austria
Hi Tom,
i took min(tablespace_name) because i did not know what else to take ... did not know that
dba_part_tables is dba_part_tables - i was searching (the database) for something like '%PART%DEF%'
...
sorry for bothering you!
regards, martina
Followup February 16, 2009 - 12pm Central time zone:
you didn't bother me, but you still haven't told us what you are actually *trying* to accomplish.
If you did, we might be able to add something useful.
partition is out of balance
February 18, 2009 - 12am Central time zone
Reviewer: bhaskar from India
Hi Tom,
I have a hash partition in one of my histry table.There are 16 partitions intaht table.All have hash partitions.
My problem is out of these 16 partitions one partition is getting populated heavily.
i have checked dba_tab_partition view fro initial_extent,min_extent,max_extent,next_extent colums of all the partitons .I found same for all.
But num_rows column of that heavily populated column of the partiton (say p16)is 3.5 times greater than the next largly populated partition(say p15).
And blocks column 0f p16 is 3.80 times grater tahn p15.
Now whatever data p15 contains is normal but is that p16 partition which abnormally containing huge data is of that bolcks column value difference?
Hope i expaline dmy problem clearly.
Thanks in advance.
Followup February 18, 2009 - 8am Central time zone:
you do not describe the table AT ALL. You do not tell us what you partitioned on, nor what this columns domain of values is.
So, I'll guess. You partitioned on something that has low distinct cardinality. Say this column has 32 distinct values. You will probably (almost certainly) NOT get 2 values per partition (I'd be really surprised if you did). You would probably get some partitions that had 4 or 5 of the values, some with zero, some with 1, 2, or 3. You would be really "unbalanced"
when hashing, you want (need, have to) pick something that has a TON of distinct values - else - you will not, cannot, get an even distribution.
consider the following - load one time, hash on something with 32 values, load another time, hash on something unique - very different distributions:
ops$tkyte%ORA10GR2> CREATE TABLE t
2 (
3 dt date,
4 x int,
5 y varchar2(30)
6 )
7 PARTITION BY HASH (x)
8 ( partition p01, partition p02, partition p03, partition p04,
9 partition p05, partition p06, partition p07, partition p08,
10 partition p09, partition p10, partition p11, partition p12,
11 partition p13, partition p14, partition p15, partition p16
12 )
13 /
Table created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into t
2 select sysdate, mod(rownum,32), 'x' from all_objects;
49965 rows created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> variable x refcursor
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> declare
2 l_sql long := 'select ''p01'' pname, count(*) cnt from t partition(p01)';
3 begin
4 for i in 2..16
5 loop
6 l_sql := l_sql || ' union all select ''p'||to_char(i,'fm00')||''', count(*)
from t partition(p'||to_char(i,'fm00')||')';
7 end loop;
8
9 l_sql := 'select pname, cnt, substr( rpad(''*'',30*round(
cnt/max(cnt)over(),2),''*''),1,30) from (' || l_sql || ') order by pname';
10 open :x for l_sql;
11 end;
12 /
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> print x
PNA CNT SUBSTR(RPAD('*',30*ROUND(CNT/M
--- ---------- ------------------------------
p01 0
p02 1561 ******
p03 3123 ************
p04 3123 ************
p05 3123 ************
p06 4684 ******************
p07 4684 ******************
p08 7806 ******************************
p09 4685 ******************
p10 4684 ******************
p11 1561 ******
p12 1562 ******
p13 0
p14 4684 ******************
p15 3123 ************
p16 1562 ******
16 rows selected.
ops$tkyte%ORA10GR2> truncate table t;
Table truncated.
ops$tkyte%ORA10GR2> insert into t select sysdate, object_id, 'x' from all_objects;
49965 rows created.
ops$tkyte%ORA10GR2> declare
2 l_sql long := 'select ''p01'' pname, count(*) cnt from t partition(p01)';
3 begin
4 for i in 2..16
5 loop
6 l_sql := l_sql || ' union all select ''p'||to_char(i,'fm00')||''', count(*)
from t partition(p'||to_char(i,'fm00')||')';
7 end loop;
8
9 l_sql := 'select pname, cnt, substr( rpad(''*'',30*round(
cnt/max(cnt)over(),2),''*''),1,30) from (' || l_sql || ') order by pname';
10 open :x for l_sql;
11 end;
12 /
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> print x
PNA CNT SUBSTR(RPAD('*',30*ROUND(CNT/M
--- ---------- ------------------------------
p01 3113 ****************************
p02 3163 *****************************
p03 3159 *****************************
p04 3062 ****************************
p05 3164 *****************************
p06 3114 ****************************
p07 3099 ****************************
p08 3142 *****************************
p09 3128 ****************************
p10 3115 ****************************
p11 3160 *****************************
p12 3115 ****************************
p13 3051 ****************************
p14 3060 ****************************
p15 3245 ******************************
p16 3075 ****************************
16 rows selected.
partition is out of balance
February 19, 2009 - 2am Central time zone
Reviewer: Bhaskar from India
Hi Tom,
First many thanks for such a solution.Sorry for providing lack of informations.
My table has near 1000 million data as it is a history table.
Partition_key (col3) has distinct 926 values.
I'm providing a dummy table structure.
create table tab1(col1 varchar2(5),col2 varchar2(4),col3 number (5));
partion p1 tablespace t1
...
partion p16 tablespace t1
Now say partiton p16 has higest amount of data which is nearly 3.8 time greater tahn the next
partiotiuon (say p15)containing highest amount.
Now i understood as the cardinality is such low data is not evenly distributed.As this table is a
history table in near future this cardinality will increase then i hope to see even data:)
Now another problem is one value (say col3 =1) of col3 from the 926 distinct values only goes to
p16 partition but surprisingly this is not going to any other partitons.We have no other objects on
this table to control the flow of data between partitions.
Followup February 19, 2009 - 8am Central time zone:
... has distinct 926 values.
...
insufficient for 1 billion rows (as is 32 partitions probably)
If this is a history table with 1 billion rows already - that 926 distinct cardinality number doesn't see to be rising fast enough to make a difference.
... Now another problem is one value (say col3 =1) of col3 from the 926 distinct
values only goes to p16 partition but surprisingly this is not going to any
other partitons. ...
really, that is surprising? That when col3=1, all rows go into a single partition? Think about it, how else could it work?
if the data, when col3=1, was to go into different partitions - if the HASH(1) was not deterministic, we'd never be able to find your data again.
What to Partition on
April 16, 2009 - 11pm Central time zone
Reviewer: A Reader from India
Hi Tom,
We have Type-II Dimensions where in we maintain two partitions (currrent Rec and historical Rec)
based on a flag "Is_Current" (values being 1 for current records and null for historical records)
Today's data will be lying in Current Partition and last 13 months of data will be lying in the
historical partition. One such dimension has 2 billion rows in historical rec and 2.5 billion in
the curent rec.
This further has been divided across 64 hash sub partitons based on the SearchTerm_Key.
However most of the queries in the warehouse execute with the WHERE clause to find the
SearchTerm_Keys which are ACTIVE.
effective_start_date <= to_date('04/16/2009','MM/DD/YYYY') and
(effective_end_date > to_date('04/16/2009','MM/DD/YYYY') or effective_end_date is NULL)
effective_start_date is the date when the new SearchTerm gets created.
effective_end_date is the date when the search term gets de-activated in the system.Being null
means that the searchTerm Key is still ACTIVE term.
However when we scan it takes around 57 mins to scan the table and filter out the rows since the
whole table is scanned.
What partition strategy would you recommend for this approach, so that we can take advantage of
partition pruning.
Also every month we need to archive the oldest data before purging it off.
Thanks
Followup April 17, 2009 - 9am Central time zone:
if the question is:
... What partition strategy would you recommend for this approach, so that we can
take advantage of
partition pruning.
...
and the predicate is:
effective_start_date <= to_date('04/16/2009','MM/DD/YYYY') and
(effective_end_date > to_date('04/16/2009','MM/DD/YYYY') or
effective_end_date is NULL)
You would either partition by start date or end date
which one is in general 'more selective'? Which would result in less data being scanned - in general?
And if you do end_date - bear in mind, you'll need to enable row movement and be expecting rows to move as you update the end date.
one more clarification
April 20, 2009 - 12am Central time zone
Reviewer: A Reader from INDIA
<code>Hi Tom,
Thanks for your reply.
Indeed if you see the WHERE clause, the effective_start_date is being scanned for the whole table as it is <= (Current_date) so doing partition by Range on Effective Start date is meaningless.
However if we do it by effective end date the rows will move across partitions as soon as we update them based on Type -II dimension.
how much overhead do your forsee in this approach.
I have another question if you can answer, Do we need bitmap indexes on Colums for dimension tables.
My answer to this is NO, since the fact Keys will need them for reducing the set during bitmap transformation in a single logical operation.
However I had a debate with DBA and he says yes they are needed on Dimension tables. i agree that the Bitmap or B*tree indexes are required on the Dimension tables for which we specify the WHERE clause during star transformation but essentially not on the Ids.
desc DIM_COM_CAMPAIGN
Name Null Type
------------------------------ -------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
HISTORICAL_CMPGN_SID NOT NULL NUMBER
CMPGN_SID NUMBER
CMPGN_KEY NUMBER(19)
ACCOUNT_KEY VARCHAR2(300)
CMPGN_NAME VARCHAR2(50 CHAR)
CMPGN_STATUS NUMBER
CMPGN_START_DATE DATE
CMPGN_END_DATE DATE
CMPGN_GOAL VARCHAR2(4000)
CMPGN_SEARCH_ENABLED NUMBER(1)
CMPGN_ADVANCE_ENABLED NUMBER(1)
CMPGN_CONTENT_ENABLED NUMBER(1)
CMPGN_WATCH_FLAG NUMBER(1)
CMPGN_INACTIVE_DATE DATE
CMPGN_LOOKUP_VAL VARCHAR2(300)
CMPGN_SOURCE VARCHAR2(30)
CMPGN_MONTHLY_SPEND_AMT NUMBER(14,4)
CMPGN_OPT_ENABLED NUMBER(1)
CMPGN_DEL_TMSTMP DATE
CMPGN_CM_MAX_BID NUMBER(14,4)
CMPGN_CM_MAX_BID_DATE DATE
CMPGN_SS_MAX_BID NUMBER(14,4)
CMPGN_SS_MAX_BID_DATE DATE
CREATED_DATE DATE
LAST_UPDATE_DATE DATE
GEOSET_SID NUMBER
IS_CURRENT_FLAG NUMBER(1)
EFFECTIVE_START_DATE DATE
EFFECTIVE_END_DATE DATE
POW_ACCOUNT_KEY
Followup April 21, 2009 - 12pm Central time zone:
... Indeed if you see the WHERE clause, the effective_start_date is being scanned
for the whole table as it is <= (Current_date) so doing partition by Range on
Effective Start date is meaningless.
....
I told YOU to evaluate "which would be in general more selective, partition on that". You were to evaluate that - I cannot tell if the ONLY predicate you ever do is the one you quoted above. I cannot tell that "..table as it is <= (Current_date ..." is the rule - the only way you do it.
... how much overhead do your forsee in this approach. ...
only YOU can evaluate that, how often do you update it? how many rows at a time do you update? When you benchmark it, how much does it seem to impact you? The physics are easy - the update will be done internally as a DELETE+INSERT. What impact will that have on you? It totally depends. If you have a process that updates 1,000,000 records right now in one update, it might be a show stopper. On the other hand, if you update 100,000 rows constantly during the day (about 1 or 2 per second), you probably won't notice it. It totally depends
One word: benchmark
... Do we need bitmap indexes on Colums
for dimension tables.
...
of course the answer to that is "no", you never 'need' anything, everything is a choice.
... My answer to this is NO, since the fact Keys will need them for reducing the
set during bitmap transformation in a single logical operation.
...
that I did not understand or follow at all, not sure what you are trying to say
... i agree that the Bitmap or B*tree indexes are required on the Dimension
tables for which we specify the WHERE clause during star transformation but
essentially not on the Ids.
....
but now you bring star transformation into play, if you want star transformation then you WILL have bitmap indexes.
http://download.oracle.com/docs/cd/B19306_01/server.102/b14223/schemas.htm#sthref1591
but it'll be on the fact table...
Queries on partition
April 23, 2009 - 7am Central time zone
Reviewer: A
Hello,
In Oracle 9i when I update a STATUS column (table is partitioned on status column) then row will be
moved from one partition (say p1) to another (say p5), will it delete from p1 and insert in p5? If
this is the case, will it fragement the partition (p1)?
What is the impact of query running full table scan on a partitioned table? As we don't want to end
up in a situation where the full table scan on partitioned table takes more time than the
non-partitioned table?
Thanks in advance
Followup April 27, 2009 - 10am Central time zone:
deletes do no "fragment" anything. The delete will free up space in partition p1 that will be available to be reused by some future insert into p1 (or an update might use it)
... What is the impact of query running full table scan on a partitioned table? ...
pretty much the same as a regular table - with the exception of course that if we can eliminate partitions - the partitioned table full scan might be a lot faster.
Partitioning Question
September 3, 2009 - 3am Central time zone
Reviewer: Prats
I have a table with initial size of 100M records. An insertion of 2M records would happen every
month.
I am going to use partitioning on the basis of date.
Is it beneficial to have each partition stored in a seperate tablespaces [say p1 in t1, p2 in
t2...]. Will there be any benefit in terms of performance, dropping partition maintainence etc
Please help
Followup September 3, 2009 - 7am Central time zone:
tablespaces are a book-keeping device you use to organize data in a fashion that makes you feel better about yourself, that improves your life somehow, that makes you sleep better at night.
They are not about performance really - if two tablespaces reside on the same single disk that is not striped or anything - would it make any difference "performance wise"?
It is all about spreading the IO out. Stripe the file system (use ASM for example) and spread the IO out. Tablespaces don't do it.
Think Bitmap Join Indexes
September 3, 2009 - 3pm Central time zone
Reviewer: Rick from St. Louis, MO
A Reader from India: <<I have another question if you can answer, Do we need bitmap indexes on
Colums for dimension
tables. My answer to this is NO, since the fact Keys will need them for reducing the set during
bitmap
transformation in a single logical operation.
However I had a debate with DBA and he says yes they are needed on Dimension tables. i agree that
the Bitmap or B*tree indexes are required on the Dimension tables for which we specify the WHERE
clause during star transformation but essentially not on the Ids.>>
Look into Bitmap Join Indexes -- they basically pre-join dimensions to fact tables given you turn
on star transform, don't do refresh on commit, etc.
How can I find the unpartitioned constraints?
October 29, 2009 - 1pm Central time zone
Reviewer: Teresa from Toronto, Ontario
Hi Tom,
I'm trying to extract a list of all primary key constraints of partitioned tables that are not
partitioned. Can you please help me out?
Followup October 29, 2009 - 1pm Central time zone:
assumption: indexes are in same schema as table. fix to use dba_ views with owner column if not true:
ops$tkyte%ORA10GR2> CREATE TABLE t1
2 (
3 x int,
4 dt date,
5 y varchar2(30)
6 )
7 PARTITION BY RANGE (x)
8 (
9 PARTITION part1 VALUES LESS THAN (10),
10 PARTITION part2 VALUES LESS THAN (20)
11 )
12 /
Table created.
ops$tkyte%ORA10GR2> create unique index t1_pk_idx on t1(x) global;
Index created.
ops$tkyte%ORA10GR2> alter table t1 add constraint t1_pk primary key(x);
Table altered.
ops$tkyte%ORA10GR2> CREATE TABLE t2
2 (
3 x int,
4 dt date,
5 y varchar2(30)
6 )
7 PARTITION BY RANGE (x)
8 (
9 PARTITION part1 VALUES LESS THAN (10),
10 PARTITION part2 VALUES LESS THAN (20)
11 )
12 /
Table created.
ops$tkyte%ORA10GR2> create unique index t2_pk_idx on t2(x) local;
Index created.
ops$tkyte%ORA10GR2> alter table t2 add constraint t2_pk primary key(x);
Table altered.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select uc.constraint_name, uc.table_name, uc.index_name,
2 nvl( up.partitioning_type, 'not partitioned' ) ptype
3 from user_constraints uc, user_part_indexes up
4 where uc.constraint_type = 'P'
5 and uc.table_name in ( 'T1', 'T2' )
6 and uc.index_name = up.index_name (+)
7 /
CONST TABLE INDEX_NAME PTYPE
----- ----- ------------------------------ ---------------
T1_PK T1 T1_PK_IDX not partitioned
T2_PK T2 T2_PK_IDX RANGE

November 6, 2009 - 1am Central time zone
Reviewer: Helena Marková from Bratislava, Slovakia
|