Question and Answer

Tom Kyte

Thanks for the question, seshu.

Asked: September 09, 2000 - 11:33 pm UTC

Last updated: August 17, 2012 - 11:48 am UTC

Version: 8.0.5.1

Viewed 10K+ times!

You Asked

Hi,

I've a table: "TABLE2000"
with partitions:
TAB_1999_H1 (tablespace TAB_DATA_99)
TAB_1999_H2 (tablespace TAB_DATA_99)
TAB_2000_H1 (tablespace TAB_DATA_00)
TAB_2000_H2 (tablespace TAB_DATA_00)
Now, I've modified the table storage parameters for the 1999
partitions and
created a NEW table, "TABLE2000_NEW",
with partitions:
NEWTAB_1999_H1 (tablespace NEWTAB_DATA_99)
NEWTAB_1999_H2 (tablespace NEWTAB_DATA_99)

I also inserted the 1999 data into the new partitions.

Now I want to drop the old (1999) partitions and
attach the newly created partitions to the OLD table, alongside the 2000 data.

My final table should have a partition structure like this:

NEWTAB_1999_H1 (tablespace NEWTAB_DATA_99)
NEWTAB_1999_H2 (tablespace NEWTAB_DATA_99)
TAB_2000_H1 (tablespace TAB_DATA_00)
TAB_2000_H2 (tablespace TAB_DATA_00)


Could you please suggest a procedure, with an example, that
can save my valuable time? (I need to finish this before 8am tomorrow, i.e. 10th Sep.)

Thanks in advance, Regards

and Tom said...

What we will do is create a table that looks like your partitions and play the old "swap" routine. We can exchange partitions with tables -- turning a partition into a table, or a table into a partition, or both.... Here is how it might look:

ops$tkyte@DEV816> CREATE TABLE table2000
2 ( x int,
3 y int,
4 z DATE
5 )
6 PARTITION BY RANGE (z)
7 (
8 PARTITION tab_1999_h1 VALUES LESS
9 THAN(to_date('30-jun-1999','dd-mon-yyyy')),
10 PARTITION tab_1999_h2 VALUES LESS
11 THAN(to_date('31-dec-1999','dd-mon-yyyy')),
12 PARTITION tab_2000_h1 VALUES LESS
13 THAN(to_date('30-jun-2000','dd-mon-yyyy')),
14 PARTITION tab_2000_h2 VALUES LESS
15 THAN(to_date('31-dec-2000','dd-mon-yyyy'))
16 )
17 /

Table created.

insert into table2000 values ( 1, 1, to_date('15-jun-1999','dd-mon-yyyy') );
insert into table2000 values ( 2, 2, to_date('15-dec-1999','dd-mon-yyyy') );
insert into table2000 values ( 3, 3, to_date('15-jun-2000','dd-mon-yyyy') );
insert into table2000 values ( 4, 4, to_date('15-dec-2000','dd-mon-yyyy') );


ops$tkyte@DEV816> CREATE TABLE new_table2000
2 ( x int,
3 y int,
4 z DATE
5 )
6 PARTITION BY RANGE (z)
7 (
8 PARTITION newtab_1999_h1 VALUES LESS
9 THAN(to_date('30-jun-1999','dd-mon-yyyy')),
10 PARTITION newtab_1999_h2 VALUES LESS
11 THAN(to_date('31-dec-1999','dd-mon-yyyy'))
12 )
13 /

Table created.

insert into NEW_table2000 values ( -1, -1, to_date('15-jun-1999','dd-mon-yyyy') );
insert into NEW_table2000 values ( -2, -2, to_date('15-dec-1999','dd-mon-yyyy') );


So, how do we make the NEW partitions replace the old? We will make a new partition become a table and swap that table for an OLD partition. It'll look like this:

ops$tkyte@DEV816> create table temp ( x int, y int, z date );
Table created.

ops$tkyte@DEV816> alter table new_table2000
2 exchange partition newtab_1999_h1
3 with table temp;

Table altered.

ops$tkyte@DEV816>
ops$tkyte@DEV816> alter table table2000
2 exchange partition tab_1999_h1
3 with table temp;

Table altered.

ops$tkyte@DEV816>
ops$tkyte@DEV816> rename temp to table2000_tab_1999_h1;

Table renamed.

ops$tkyte@DEV816>
ops$tkyte@DEV816> select * from table2000;

X Y Z
---------- ---------- ---------
-1 -1 15-JUN-99
2 2 15-DEC-99
3 3 15-JUN-00
4 4 15-DEC-00

ops$tkyte@DEV816> select * from table2000_tab_1999_h1;

X Y Z
---------- ---------- ---------
1 1 15-JUN-99

ops$tkyte@DEV816> select * from new_table2000;

X Y Z
---------- ---------- ---------
-2 -2 15-DEC-99


So, that shows that table2000 has the new partition swapped in. We've saved its old data in a table, and new_table2000 no longer has that data (it has an empty partition now). We simply repeat the process:

ops$tkyte@DEV816>
ops$tkyte@DEV816> create table temp ( x int, y int, z date );

Table created.

ops$tkyte@DEV816>
ops$tkyte@DEV816> alter table new_table2000
2 exchange partition newtab_1999_h2
3 with table temp;

Table altered.

ops$tkyte@DEV816>
ops$tkyte@DEV816> alter table table2000
2 exchange partition tab_1999_h2
3 with table temp;

Table altered.

ops$tkyte@DEV816>
ops$tkyte@DEV816>
ops$tkyte@DEV816> rename temp to table2000_tab_1999_h2;

Table renamed.

ops$tkyte@DEV816>
ops$tkyte@DEV816>
ops$tkyte@DEV816> select * from table2000;

X Y Z
---------- ---------- ---------
-1 -1 15-JUN-99
-2 -2 15-DEC-99
3 3 15-JUN-00
4 4 15-DEC-00

ops$tkyte@DEV816> select * from table2000_tab_1999_h2;

X Y Z
---------- ---------- ---------
2 2 15-DEC-99

ops$tkyte@DEV816> select * from new_table2000;

no rows selected

ops$tkyte@DEV816>

and the partitions are swapped.


Comments

When should a table be partitioned? Version 8.1.7.0.0

Drew, April 29, 2002 - 3:51 pm UTC

Hi Tom,
We have some tables that have a few million records and keep growing. We may need to partition these tables. My question is: based on your knowledge, when do we need to think of partitioning a table, e.g., at about how many records? How can we tell if a table needs partitioning, or whether performance problems are caused by not partitioning the table? Thank you very much!

Tom Kyte
April 29, 2002 - 8:59 pm UTC

It is not necessarily a number of records or even a size in gigabytes -- it is whether the things you do with the table would be more easily done with partitioning.

do you need to age data off? classic partitioning example...

could you benefit from partition elimination? eg: all or most of your queries always use some date range, or some criteria. Rather than adding that to each index, you could partition by it.

are your tables getting larger than you feel comfortable managing? eg: it might take longer to restore a 100gig tablespace than a 1-10 gig tablespace (and the other 90gig of data is online whilst doing this)

and so on. It is when the benefits of partitioning become tangible to you -- that is when partitioning should be used.
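To make the "age data off" case concrete, here is a minimal sketch (the table and partition names are made up for illustration): with a range-partitioned table, purging a month of data becomes a fast dictionary operation instead of a large delete.

-- hypothetical monthly range-partitioned table
CREATE TABLE sales
( sale_date DATE,
  amount    NUMBER
)
PARTITION BY RANGE (sale_date)
( PARTITION p_2002_01 VALUES LESS THAN (to_date('01-feb-2002','dd-mon-yyyy')),
  PARTITION p_2002_02 VALUES LESS THAN (to_date('01-mar-2002','dd-mon-yyyy'))
);

-- aging off the oldest month: one dictionary operation,
-- not "delete from sales where sale_date < ..."
ALTER TABLE sales DROP PARTITION p_2002_01;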

A reader, February 19, 2003 - 1:13 pm UTC

Hi Tom,

I am reading the Data Warehousing Guide as per your suggestion and it's helping me a lot. Thanks, Tom. I have some doubts.

http://otn.oracle.com/docs/products/oracle9i/doc_library/release2/server.920/a96520/transfor.htm#13138

As per docs

<quote>

Parallel Load Scenario
This section presents a case study illustrating how to create, load, index, and analyze a large data warehouse fact table with partitions in a typical star schema. This example uses SQL*Loader to explicitly stripe data over 30 disks.

The example 120 GB table is named facts.
The system is a 10-CPU shared memory computer with more than 100 disk drives.
Thirty disks (4 GB each) are used for base table data, 10 disks for indexes, and 30 disks for temporary space. Additional disks are needed for rollback segments, control files, log files, possible staging area for loader flat files, and so on.
The facts table is partitioned by month into 12 partitions. To facilitate backup and recovery, each partition is stored in its own tablespace.
Each partition is spread evenly over 10 disks, so a scan accessing few partitions or a single partition can proceed with full parallelism. Thus there can be intra-partition parallelism when queries restrict data access by partition pruning.
Each disk has been further subdivided using an operating system utility into 4 operating system files with names like /dev/D1.1, /dev/D1.2, ... , /dev/D30.4.
Four tablespaces are allocated on each group of 10 disks. To better balance I/O and parallelize table space creation (because Oracle writes each block in a datafile when it is added to a tablespace), it is best if each of the four tablespaces on each group of 10 disks has its first datafile on a different disk. Thus the first tablespace has /dev/D1.1 as its first datafile, the second tablespace has /dev/D4.2 as its first datafile, and so on, as illustrated in Figure 13-5.


Step 1: Create the Tablespaces and Add Datafiles in Parallel
The following is the command to create a tablespace named Tsfacts1. Other tablespaces are created with analogous commands. On a 10-CPU machine, it should be possible to run all 12 CREATE TABLESPACE statements together. Alternatively, it might be better to run them in two batches of 6 (two from each of the three groups of disks).

CREATE TABLESPACE TSfacts1
DATAFILE '/dev/D1.1' SIZE 1024MB REUSE,
DATAFILE '/dev/D2.1' SIZE 1024MB REUSE,
DATAFILE '/dev/D3.1' SIZE 1024MB REUSE,
DATAFILE '/dev/D4.1' SIZE 1024MB REUSE,
DATAFILE '/dev/D5.1' SIZE 1024MB REUSE,
DATAFILE '/dev/D6.1' SIZE 1024MB REUSE,
DATAFILE '/dev/D7.1' SIZE 1024MB REUSE,
DATAFILE '/dev/D8.1' SIZE 1024MB REUSE,
DATAFILE '/dev/D9.1' SIZE 1024MB REUSE,
DATAFILE '/dev/D10.1' SIZE 1024MB REUSE,
DEFAULT STORAGE (INITIAL 100MB NEXT 100MB PCTINCREASE 0);
...

CREATE TABLESPACE TSfacts2
DATAFILE '/dev/D4.2' SIZE 1024MB REUSE,
DATAFILE '/dev/D5.2' SIZE 1024MB REUSE,
DATAFILE '/dev/D6.2' SIZE 1024MB REUSE,
DATAFILE '/dev/D7.2' SIZE 1024MB REUSE,
DATAFILE '/dev/D8.2' SIZE 1024MB REUSE,
DATAFILE '/dev/D9.2' SIZE 1024MB REUSE,
DATAFILE '/dev/D10.2' SIZE 1024MB REUSE,
DATAFILE '/dev/D1.2' SIZE 1024MB REUSE,
DATAFILE '/dev/D2.2' SIZE 1024MB REUSE,
DATAFILE '/dev/D3.2' SIZE 1024MB REUSE,
DEFAULT STORAGE (INITIAL 100MB NEXT 100MB PCTINCREASE 0);
...
CREATE TABLESPACE TSfacts4
DATAFILE '/dev/D10.4' SIZE 1024MB REUSE,
DATAFILE '/dev/D1.4' SIZE 1024MB REUSE,
DATAFILE '/dev/D2.4' SIZE 1024MB REUSE,
DATAFILE '/dev/D3.4' SIZE 1024MB REUSE,
DATAFILE '/dev/D4.4' SIZE 1024MB REUSE,
DATAFILE '/dev/D5.4' SIZE 1024MB REUSE,
DATAFILE '/dev/D6.4' SIZE 1024MB REUSE,
DATAFILE '/dev/D7.4' SIZE 1024MB REUSE,
DATAFILE '/dev/D8.4' SIZE 1024MB REUSE,
DATAFILE '/dev/D9.4' SIZE 1024MB REUSE,
DEFAULT STORAGE (INITIAL 100MB NEXT 100MB PCTINCREASE 0);
...
CREATE TABLESPACE TSfacts12
DATAFILE '/dev/D30.4' SIZE 1024MB REUSE,
DATAFILE '/dev/D21.4' SIZE 1024MB REUSE,
DATAFILE '/dev/D22.4' SIZE 1024MB REUSE,
DATAFILE '/dev/D23.4' SIZE 1024MB REUSE,
DATAFILE '/dev/D24.4' SIZE 1024MB REUSE,
DATAFILE '/dev/D25.4' SIZE 1024MB REUSE,
DATAFILE '/dev/D26.4' SIZE 1024MB REUSE,
DATAFILE '/dev/D27.4' SIZE 1024MB REUSE,
DATAFILE '/dev/D28.4' SIZE 1024MB REUSE,
DATAFILE '/dev/D29.4' SIZE 1024MB REUSE,
DEFAULT STORAGE (INITIAL 100MB NEXT 100MB PCTINCREASE 0);

</quote>


My question is: shouldn't the second tablespace's first datafile be on /dev/D2.2?
I am not getting the math here -- how come /dev/D4.2?

Thanks for all your help.

Tom Kyte
February 19, 2003 - 3:37 pm UTC

kidding right?

if it were me, there would be one big device and someone else would worry about striping it for me.

The example is a little botched as well -- a 100mb extent size in a datafile of 1024mb ? hmm, 24mb of wastage.

But anyway, it is just an example - I'm not really going to dig into their theoretical disk layout if you don't mind ;)



A reader, February 20, 2003 - 9:54 am UTC

Tom,

Thanks for the reply. Can you please explain what you mean by one big device?



Tom Kyte
February 20, 2003 - 10:15 am UTC

just give me a big mount point -- /tom/put/your/data/here -- with infinite storage.

don't give me

/tom/put/it/here
/tom/or/put/it/here
/tom/or/maybe/over/here
..........
/tom/now/your/really/confused

;)


big mount point with infinite storage, and the hardware guys in the background moving disks or whatever they use these days to get all of the little lights on the disk array flashing equally.

I'll put all of my data there. I'll still think about redo and arch and control files (don't ever want to lose them and need them to be really super optimized in many cases -- no contention desired)....

A reader, February 20, 2003 - 10:34 am UTC

Thanks Tom. I like your style of answering questions ;)

? re: Table Partition

VJ, April 21, 2003 - 7:08 pm UTC

Hi Tom,

I want to create a table which has two fields, id number and flag char(1), partitioned on flag. The values in flag will be 'R' and 'T', so the data would look like:

Id Flag
-----------
1 R
2 R
3 R
4 R
5 R
6 R
7 R
8 R
9 R
10 R
11 T
12 T
13 T
14 T
15 T
.............

So I want two partitions, one for flag='R' and one for flag='T'. Can you give me an example of how to achieve this?

Thanks a million in advance
VJ

Tom Kyte
April 21, 2003 - 10:53 pm UTC

ops$tkyte@DEV816> CREATE TABLE table2000
  2  ( x int,
  3    y int,
  4    z DATE
  5  )
  6  PARTITION BY RANGE (z)
  7  (
  8  PARTITION p1 VALUES LESS
  9  THAN('S'),
 10  PARTITION p2 VALUES LESS
 11  THAN('U')
 12  )
 13  /

Mixed datatype in partition key

A reader, April 22, 2003 - 2:11 am UTC

This table creation in 8.1.7.4 gives an ORA-01858: a non-numeric character was found where a numeric was expected.

Tom Kyte
April 22, 2003 - 7:42 am UTC

so use a little initiative, take the obvious cut and paste, fix the datatype of Z to be appropriate for you.

My point here was -- the question was already answered above, i had an example of a very very simple partition clause -- you need to read about partitioning.  

I can see what is going to happen here -- you will apply partitioning, not understand much about it, system performance will go down the tubes, you will blame partitioning and another myth starts (partitioning is 'complex', partitioning is 'slow').

This is a trivial syntax question ...


ops$tkyte@ORA920LAP> CREATE TABLE table2000
  2  ( x int,
  3    y int,
  4    z varchar2(5)
  5  )
  6  PARTITION BY RANGE (z)
  7  (
  8  PARTITION p1 VALUES LESS
  9  THAN('S'),
 10  PARTITION p2 VALUES LESS
 11  THAN('U')
 12  )
 13  /

Table created.


suggest you read a ton more before going down this path. 

sara, June 16, 2003 - 4:25 am UTC

Hi

I am a beginner to Oracle. As I am trying to partition a table, it says:

ORA-00439: feature not enabled: Partitioning

How do I enable the partitioning feature?

sorry for the basic question.




Tom Kyte
June 16, 2003 - 8:10 am UTC

you need to

a) be on the enterprise edition
b) have purchased it

if you have a & b, then using OUI (the Oracle Universal Installer) is the way to enable it.

if either of a or b is not true, you don't have it.
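A quick way to check whether the option is installed (a sketch; v$option is the standard view listing installed options):

-- returns TRUE if the partitioning option is installed, FALSE otherwise
select value from v$option where parameter = 'Partitioning';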

indexes while partitioning

Vipin from New york, June 30, 2003 - 1:37 pm UTC

Hi Tom,

Could you please tell me when one should consider global indexes and when local indexes while partitioning their tables.

Tom Kyte
June 30, 2003 - 2:12 pm UTC

well, think about how you use them. but search for

global local performance

to read some articles about them.

Partitioning Strategy

A reader, September 21, 2003 - 3:13 am UTC

Hi Tom,

Say if I have two very large tables transactions and transaction items, which I want to partition.

txns(txnid, txndate, txntype)
txnitems (txnid, amount, accountid)

Most queries are for the sum of amount within certain date range.

I can range partition the transactions table by transaction date, but how should I partition the transaction items table? Is it possible to partition it using the txndate in the txns table? Or do you think it is helpful to partition the txns table by range (txndate) with hash subpartitions (txnid), and hash partition (txnid) the txnitems table?






Tom Kyte
September 21, 2003 - 9:50 am UTC

"Most queries are for the sum of amount within certain date range."

sounds like a screaming call for a teeny tiny bit of denormalization doesn't it?

think about the amount of work you could save every day if you made it so that most queries consume 50% or less resources!!! You might carry that date down into the details.

I do not blindly recommend partitioning schemes. I wrote about this in my last book "Effective Oracle By Design". Partitioning tables, without knowledge of how they are used (for real, not just a single sentence), is a recipe for certain disaster!


given as it stands, range partitioning txns by date SEEMS to make sense and it would follow that hash partitioning TXNITEMS by TXNID MIGHT make sense -- but one cannot say for sure.

How to get the archived off data back to a partitioned table

AD, September 24, 2003 - 6:13 pm UTC

Tom,

I have a partitioned table with the partition key on the year_month.

e.g. 200301 - for jan 2003's data
200302 - for feb 2003's data etc.

Partitioning is implemented in such a way that anything older than 2 years of data is archived off to a table.

e.g. A_arch_200106 - archived data for jun 2001
A_arch_200107 - archived data for jul 2001 etc.

Where A is the table name.

Now, I have a requirement to get the archived-off data back into the partitioned table A, so that A_arch_200106 will represent data for partition 200106, A_arch_200107 for partition 200107, etc., the same way they were in the table before the archiving took place.

What should be my approach to handle this?

Thanks & Regards

Tom Kyte
September 25, 2003 - 5:07 am UTC

to archive the data you did an alter table exchange partition right -- so you swapped a totally empty table with the full partition -- ending up with a full table and an empty partition.

Now, you could have either:

a) left the empty partition -- in which case -- you just alter exchange back and you are done

b) dropped the empty partition -- in which case you must now SPLIT the last partition in two, resulting in an empty and full partition and then exchange the new empty slice for the full table you've restored

c) merged the empty partition into a "mega" empty partition, which can be split very fast since it is empty -- repeat step b) above.
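For option (b), a minimal sketch (partition names and boundary values are made up; year_month is assumed to be the NUMBER partition key, and p_low is assumed to be the lowest remaining partition, which now covers the dropped range):

-- carve an empty slice out of the lowest remaining partition
ALTER TABLE a SPLIT PARTITION p_low AT (200107)
  INTO ( PARTITION p_200106, PARTITION p_low );

-- swap the restored archive table into the empty slice
ALTER TABLE a EXCHANGE PARTITION p_200106 WITH TABLE a_arch_200106;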

Partition Granule

A reader, October 23, 2003 - 7:14 pm UTC

Hi Tom,

Our database stores about 100 years of transaction history. The typical queries are reports searching data within a range of 1 year, 5 years, 10 years, or 20 years, starting from any calendar year.

I am considering partitioning the table by transaction date.

What factors would you consider in deciding how big the partition granule should be?

I am considering partitioning the data by 1 year or by 5 years. Which one do you recommend?

Tom Kyte
October 23, 2003 - 8:14 pm UTC

maybe by the month.

else many queries might have to hit 24 months of data -- instead of just 12.

Partitioned table and Replication

Alvin, October 27, 2003 - 4:52 am UTC

I have both your books, and I just implemented partitioning on one big table. My problem is that when partitioning my table I didn't define any columns as unique or as a primary key.

My problem now is i want to replicate that table on another machine for read only purposes and oracle 8i docs states that

"Considerations for Replicated Tables
The following sections discuss considerations for tables you plan to use in a replication environment.

Primary Keys
If possible, each replicated table should have a primary key. Where a primary key is not possible, each replicated table must have a set of columns that can be used as a unique identifier for each row of the table. If any of the tables that you plan to use in your replication environment do not have a primary key or a set of unique columns, alter these tables accordingly. In addition, if you plan to create any primary key snapshots based on a master table, that master table must have a primary key.
"

Do I need a PK defined? Or does the Oracle documentation state this just to ease conflict resolution?

Tom Kyte
October 27, 2003 - 7:46 am UTC



why do you have a table with no primary or unique constraints???????

Partitioned table, Primary Keys and Replication

Alvin, October 27, 2003 - 9:13 pm UTC

Sorry, I got confused. I re-read p. 643 ("Local Indexes and Uniqueness") of Expert One-on-One: PK and UK constraints CAN be enforced, but only per partition (never across), and the partitioning key must be included.

Suppose i have a table

Create table t
( bill_num number <- PK
, bill_date date )
partition by range ( bill_date)
....

and my query uses bill_num a lot in its predicates, I can create a non-prefixed composite PK constraint like the one below:

alter table t add constraint t_pk primary key ( bill_num, bill_date);

By doing so I can ensure uniqueness at the partition level, and I can 'age out' some data without needing to rebuild any indexes? Right?



Tom Kyte
October 28, 2003 - 7:31 am UTC



If you want a unique constraint (primary key, whatever), you will either:

o use that column(s) in the partitioning scheme so that a local index can be used

o not use the column(s) in the partitioning scheme and then use a global index on that column(s)


those are your choices. If you use a global index, you can still perform partition-level operations in 9i using the UPDATE GLOBAL INDEXES clause. It will not cause a rebuild; it will maintain the index as you perform the partition-level operations.
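A minimal sketch of that clause (table and partition names are hypothetical):

-- age out a partition; the global index stays valid, no rebuild required
ALTER TABLE t DROP PARTITION p_old UPDATE GLOBAL INDEXES;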

Partitioning

A Reader, August 07, 2004 - 1:34 am UTC

I have a large table with a column ID, ......
I am planning to partition it on the ID column; I can either go with range or list.
Can you please let me know the benefit of one over the other (range vs. list)?



Tom Kyte
August 07, 2004 - 10:14 am UTC

depends on how you plan on using partitioning totally.

depends on how you admin it
depends on how you age it
depends on how you query it

depends entirely on how you use it. Look at the options

a) range. where x < N part1, where N <= x < M part2, where M <= x < O part3 and so on.

what if X was a state code. Might want Alaska, Rhode Island and Wisconsin in one partition (few people), California gets its own partition (composite actually, hashed underneath maybe) -- California is big.

range would not be very good for that -- well, actually, range would not work unless we encoded the state codes as something sortable to facilitate that.

what if X was a date, and we purged data older than 3 years? Ranging on X would be ideal.


b) list. where x in ( a,q,r) part1, where x in (z) part2, where x in (l,m,n,o) part3 -- see a)



not knowing what your goals are, what ID represents, the only thing I can suggest is you understand the physics of the data (read the data warehousing guide, they have good info on partitioning, in each of my books I went into the physics of partitioning as well)....

partition

A Reader, August 07, 2004 - 7:49 pm UTC

I apologize, I forgot to mention that the ID will be a numeric value from 1 to 6.
ID = 1 has, say, 10 million records
ID = 2 has 8 million
..
...
..
ID = 6 has 5 million recs.
In this case I can have the range partitioning created with ID as the partition key,
and also, using the new feature of 9.2, I can use LIST partitioning.

If I consider performance, which route should I go --
range vs. list?
Which has more advantages performance-wise?

Many users select the data based on a specific id (ID = 1 or 2 or ... 6) ---> partition elimination on ID.
Only a few users access the entire data
(ID in (1,2,3,4,5,6)) --> scanning all the partitions.

I would really appreciate it if you could give some insight into the performance of range vs. list in my scenario.


Tom Kyte
August 08, 2004 - 9:44 am UTC

still insufficient data -- "performance"

well, unless you are full scanning this table *frequently* by ID = somevalue, ID < somevalue, ID in (some list) -- partitioning is probably going to be something you have to use with extreme care so as to not *negatively* impact performance!


You see, you really need to understand how you access your data and then apply "common sense" to that.

Say you index range scan this data via an index. How would chopping it into 6 partitions help? Sure, *maybe* it might reduce the height of an individual index by 1 -- so on a range scan of 20 rows that right now takes 15 LIO's, it might take 14 instead.


If you FULL SCAN this table frequently, AND your users use ID=value or ID in (values) range and list will perform equivalently -- both can eliminate partitions.
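For reference, a list-partitioned version of this table might look like the following (a sketch only; the column names are made up):

CREATE TABLE t
( id   NUMBER,
  data VARCHAR2(30)
)
PARTITION BY LIST (id)
( PARTITION p1 VALUES (1),
  PARTITION p2 VALUES (2),
  PARTITION p3 VALUES (3),
  PARTITION p4 VALUES (4),
  PARTITION p5 VALUES (5),
  PARTITION p6 VALUES (6)
);

With "where id = :x" or "where id in (:x, :y)", both this and an equivalent range scheme prune to the same partitions.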

Partitioning

Hans, August 08, 2004 - 3:53 pm UTC

I have a similar situation.
I have a table, say customers(deptid, customerid, ...).
deptid is numeric with values from 1 to 7; each deptid has millions of records.
The table is partitioned by range on deptid and has local indexes on customerid and some date columns.
The users access the data using deptid, where partition elimination takes place; local indexes are used for accessing the data.
Based on the earlier message, if you say that the performance of range and LIST is the same, then I would like to change my table to LIST partitioning using the new Oracle feature in 9i.
Your advice is appreciated.



Tom Kyte
August 08, 2004 - 6:08 pm UTC

as stated, as long as you are using "where partition_key = :x" or "where partition_key in ( :x, :y, :z)", range and list will perform equivalently.

it is when you get into ranges (between, <, >) that range has more opportunities than list.

HOWEVER, seems you have a table
IT is already populated
the work is done
you would gain nothing other than "prettier syntax" from a reorg.

So, unless you can come up with some benefit of going to list, i would "stay" right where I was.

Partitioning hundreds of millions of rows

Vince Mallet, August 09, 2004 - 6:02 pm UTC

Tom,

I have the following situation (names simplified):

drop table props;
drop table items;

create table items(item_id number, run_id number, ref_id number);
alter table items add constraint items_fk primary key (item_id);

create table props(prop_id number, item_id number, type_id number, value number);
alter table props add constraint props_pk primary key (prop_id);
alter table props add constraint props_fk foreign key (item_id) references items;

(other constraints to run and reference tables omitted)

It's a batch-oriented system. We perform 'runs' 2 to 3 times a month. Each run generates around 10 million items, each item having between 5 and 15 properties of different types. That means every time we insert data into these tables, it is
- about 10 million rows in items
- about 50-150 million rows in props
A run never has more than 15 distinct property types.

When we query we usually focus on one or two runs at a time and one or two property types at a time.
insert into other_table (...)
select i.ref_id, max(p.value)
from items i, props p
where i.item_id = p.item_id
and i.run_id = :x
and p.type_id = :y
group by i.ref_id;

or
(export all properties for all items of run x)


At this point I am trying to reduce querying time. It appears to me that this problem would be a good candidate for partitioning but after reading the chapters in your two books and all the answers here, I figured I'd ask a question or two first.

What I had in mind:
- list partition the items table on run_id, one partition per run,
- denormalize the props table and add a run_id column to it,
- list partition the props table on run_id, one partition per run,
- every time we add a new run, create a new partition for it (the application would take care of that)

and possibly,
- list-subpartition every props partition on type_id

That would give me one partition per type per run which could end up being 1,500 partitions and subpartitions after three years if the rate of 3 runs per month stays the same.

1. Is that even realistic? With or without subpartitioning?

2. Do you have enough information to answer the question? :)

3. Would it be following the "Good Oracle Practices"?


Any advice would be very valuable. I'm no Oracle expert and still have a great deal to read before I can pretend 'knowing' about all the Oracle concepts, let alone 'understand' them :-)

Thanks,

Vince.



Tom Kyte
August 09, 2004 - 8:25 pm UTC

I don't see how partitioning will help:


select i.ref_id, max(p.value)
from items i, props p
where i.item_id = p.item_id
and i.run_id = :x
and p.type_id = :y
group by i.ref_id;

that looks like (to me, just guessing) a query plan of something like either:

sort/aggregate
use index on items(run_id,item_id,ref_id) to
nested loops join to index on props(item_id,type_id,value)


or

sort/aggregate
use index on props(type_id,value,item_id) to
nested loops join to index in items(items_id,run_id,ref_id)


partitioning *might* decrease an index height by 1 (might) but really -- given no full scans, big joins, data warehouse type of stuff -- how could it dramatically affect anything?


How would you envision partitioning helping here in query performance?



Re: Partitioning hundreds of millions of rows

Vince Mallet, August 10, 2004 - 12:08 pm UTC

Thanks for such fast feedback.

I am seeing a lot of full scans on PROPS which lead me to investigate the partition option. Because the most often used queries consider runs in their entirety (for one or two types), the full scans didn't seem out of place and removing a large part of the work by pruning partitions looked appealing.

For example for the query mentioned previously (select ref_id, max(value) ... ):
240510 rows selected.
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=121052 Card=265860 Bytes=6646500)
1 0 SORT (GROUP BY) (Cost=121052 Card=265860 Bytes=6646500)
2 1 HASH JOIN (Cost=116630 Card=1085931 Bytes=27148275)
3 2 TABLE ACCESS (BY INDEX ROWID) OF 'ITEMS' (Cost=2381 Card=755336 Bytes=10574704)
4 3 INDEX (RANGE SCAN) OF 'ITEMS_FK2' (NON-UNIQUE) (Cost=905 Card=755336)
5 2 TABLE ACCESS (FULL) OF 'PROPS' (Cost=91226 Card=19318815 Bytes=212506965)
(items_fk2 is index on run_id)


From your answer I understand that the queries shouldn't even touch the tables and that the indexes should be enough to answer the questions. Is that right?

In that case, I need to seriously revisit the indexes. Right now, the only indexes defined are:
items: PK, index on run_id, index on ref_id
props: PK, index on item_id, index on type_id (planned for removal)
There is no composite index.

With an average row length of 23 bytes on PROPS and most rows needing to be accessed through large ranges of item_id's, my feeling was that indexes would hurt more than help compared to reducing the set of data that queries have to work with in the first place.

By denormalizing and adding run_id to PROPS for the partition key, I was also able to not have to join against ITEMS anymore to answer a few questions and I figured this would be a plus, too.

I'm in the process of getting rid of 'feelings' and setting up some tests that will provide some real numbers. Your feedback already helped look at things differently.

Thanks,

Vince.


Tom Kyte
August 10, 2004 - 3:45 pm UTC

look at the selectivity of the proposed indexes -- are they selective?

Re: Partitioning hundreds of millions of rows

Vince Mallet, August 12, 2004 - 3:51 pm UTC

You got me thinking on that one. For some reason I believe that Oracle will be faster at reading all partition blocks sequentially than scattered-reading almost as many index blocks.

I believe the indexes are selective given that they are unique. (item_id is unique in items; item_id+type_id is unique in props).

I now have four indexes:
exp_items_u1 on items(run_id, item_id, ref_id)
exp_items_u2 on items(item_id, run_id, ref_id)
exp_props_u1 on props(item_id, type_id, value)
exp_props_u2 on props(type_id, value, item_id)

I have created and populated a new table props_pt which is a copy of props partitioned on run_id (one partition per run_id) with no index and no subpartition.
create table props_pt(prop_id, run_id, item_id, ref_id, type_id, value) ....

I have analyzed 'for table for all indexes for all indexed columns' all three tables (items, props, props_pt);


Now here's the same query against items+props first and props_pt second (this is tested on 8174):


select count(*) from props_pt;

COUNT(*)
----------
100000000

select count(*) from props_pt where run_id = 44;

COUNT(*)
----------
9620400

select count(distinct ref_id) from props_pt where run_id = 44;

COUNT(DISTINCTREF_ID)
----------------------
240510

select type_id, count(*) from props_pt where run_id = 44 group by type_id;

TYPE_ID COUNT(*)
---------- ----------
41 2405100
42 2405100
43 2405100
47 2405100


select i.ref_id, max(p.value) from items i, props p where i.item_id = p.item_id and run_id = 44 and type_id = 42 group by i.ref_id;

240510 rows selected.

Elapsed: 00:14:57.90

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=47219 Card=285341 Bytes=5992161)
1 0 SORT (GROUP BY) (Cost=47219 Card=285341 Bytes=5992161)
2 1 MERGE JOIN (Cost=34799 Card=3413847 Bytes=71690787)
3 2 INDEX (RANGE SCAN) OF 'EXP_ITEMS_U1' (UNIQUE) (Cost=3963 Card=2405100 Bytes=26456100)
4 2 SORT (JOIN) (Cost=30836 Card=8115300 Bytes=81153000)
5 4 INDEX (RANGE SCAN) OF 'EXP_PROPS_U2' (UNIQUE) (Cost=12599 Card=8115300 Bytes=81153000)

Statistics
----------------------------------------------------------
0 recursive calls
234 db block gets
18086 consistent gets
69227 physical reads
2 sorts (memory)
2 sorts (disk)
240510 rows processed


select ref_id, max(value) from props_pt where run_id = 44 and type_id = 42 group by ref_id;

240510 rows selected.

Elapsed: 00:00:45.07

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=11551 Card=248903 Bytes=2986836)
1 0 SORT (GROUP BY) (Cost=11551 Card=248903 Bytes=2986836)
2 1 TABLE ACCESS (FULL) OF 'PROPS_PT' (Cost=5368 Card=2405100 Bytes=28861200)

Statistics
----------------------------------------------------------
28 db block gets
22413 consistent gets
23472 physical reads
1 sorts (memory)
1 sorts (disk)
240510 rows processed

A few more consistent gets, but a third of the physical reads and 20 times faster by the wall clock!

Just some details about the data:
each run, performed against 1 million ref_ids, generates 10 million items (10 items per ref_id), each item coming with k properties (k fixed per run).
1 run -> n ref_ids -> n * 10 items -> n * 10 * k properties (4 <= k <= 15, k fixed per run).

If you could explain how you see indexes helping me out better in this situation it would be great. I can't seem to see why.

Now apart from performance I can see how partitioning could help:
- backing up: older partitions should rarely change
- deleting a run: drop related partitions as opposed to massive deletes
- loading in a new run: might be faster because of fewer indexes to update. And potentially, if my indexes are local then I could build partition-indexes after all the data is loaded.

Your comments as always are greatly appreciated.

Thanks,

Vince.


Tom Kyte
August 12, 2004 - 4:20 pm UTC

yes, with those cardinalities -- partitioning would help, indexes not so good (would have hoped for a fast full scan maybe)

normally when joining and looking for a max, there are "relatively few" rows involved. there were not in this case. there were "lots"

so yes, partition elimination would be beneficial (very much so) for the full scans.

Re: Partitioning hundreds of millions of rows

Vince Mallet, August 12, 2004 - 5:05 pm UTC

Thanks for your feedback.

I am still wondering about those aspects:
1. Is programmatically creating a new partition for every new run at load time an acceptable solution in terms of administration and/or practices?

2. Should I consider subpartitioning on types or not? (or 'test and see'?)

Thanks again,

Vince.


Tom Kyte
August 13, 2004 - 8:49 am UTC

1) absolutely -- done all of the time. end of month, end of day -- people do the "partition slide" using automated procedures all of the time.

2) if it can result in further pruning, sure -- if you use hash subpartitions, "TYPE" needs to have *lots* of values (to hash evenly) and use a power of 2 for the number of subpartitions (2,4,8,16,32,..)
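A sketch of the composite range-hash shape being discussed (names and boundary values made up; only two run partitions shown). Per the caveat above, hashing on type_id only pays off if it has enough distinct values to spread evenly:

CREATE TABLE props
( run_id  NUMBER,
  item_id NUMBER,
  type_id NUMBER,
  value   NUMBER
)
PARTITION BY RANGE (run_id)
SUBPARTITION BY HASH (type_id) SUBPARTITIONS 4
( PARTITION p_run_1 VALUES LESS THAN (2),
  PARTITION p_run_2 VALUES LESS THAN (3)
);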

A reader, August 13, 2004 - 11:33 am UTC

Tom,

I want to create a table t with 31 range partitions, one for each day. Can I do it, or is there some catch to it?

Thanks.

Tom Kyte
August 13, 2004 - 5:51 pm UTC

no catches?

you need a column that has the "day" in it
you partition on that column
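A minimal sketch (the dates are placeholders; only 3 of the 31 daily partitions shown):

CREATE TABLE t
( dt   DATE,
  data VARCHAR2(30)
)
PARTITION BY RANGE (dt)
( PARTITION p01 VALUES LESS THAN (to_date('02-jan-2004','dd-mon-yyyy')),
  PARTITION p02 VALUES LESS THAN (to_date('03-jan-2004','dd-mon-yyyy')),
  PARTITION p03 VALUES LESS THAN (to_date('04-jan-2004','dd-mon-yyyy'))
);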

Partitioned select

Jim Shave, August 23, 2004 - 6:59 pm UTC

I'm selecting from a table with 2.7 million rows, partitioned by mdl_year; partitions: PART_2005 ( values < 2005 ) and PART_MAXVALUE ( values < MAXVALUE )

Instead of 2.7 million rows returned, I get 5.4 million, and the select is only on this one table. It seems that it's reading both partitions. How can I limit my select to reading just one set of data?

Select count(*) returns 2.7 million, and checking the results shows duplication of every single row.
Thanks.

Tom Kyte
August 23, 2004 - 7:48 pm UTC

er? not following you here, can you be more explicit?

Is "multiple table" partitioning possible?

Oleksandr Alesinskyy, August 24, 2004 - 8:30 am UTC

Hi Tom,

I want to ask a question that is very similar to one already asked ("Partitioning strategy"), but more general. I have a master table that can be partitioned on date, and a lot (tens) of detail tables that relate to the master only by master ID (no dates in the details). Is it possible to ensure that records for details go to the same partition as their master record? I need this not so much for performance as for simplicity of maintenance of aged data. And denormalization is clearly not an option.

Tom Kyte
August 24, 2004 - 8:58 am UTC

why is denormalization "clearly not an option" -- seems very very fuzzy to me that it isn't?

How to findout whether a query is using a particular partition or not?

RB, August 24, 2004 - 12:58 pm UTC

Tom:
Assume that I have N range partitions (ex: 12 partitions, 1 per month). Now when I do a select where I have specified a range, it should only search the partitions whose data falls within that range.

ex: select * from tableA where dt between 1-Jan-2004 and 30-Mar-2004.

How do I verify that the system is only searching the first three partitions - Jan, Feb and Mar?

I am not planning to send the partition key in the where clause.


Tom Kyte
August 24, 2004 - 3:24 pm UTC

delete from plan_table;
explain plan for your_query;
@utlxpls

you'll see "pstart" and "pstop" for partition start/stop keys.


i don't know what you mean by "I am not planning to send the partition key in the where clause."

why is denormalization "clearly not an option"?

Oleksandr Alesinskyy, August 29, 2004 - 6:46 am UTC

Denormalization is clearly not an option because, as I have mentioned, several records in each of dozens of detail tables hang off a single record of the master table, and some details have details of their own. As a result, in the case of denormalization, the database size would go into the terabyte range (it is now dozens of GBs). One possible solution is to propagate the date information to the details, details of details, and so on (really, I have to admit, this is some type of denormalization anyway). But even this solution is not very viable, as the date in the master may change, and details may migrate from one master to another at any level (data by the nature of this system arrives in a not very clean state, so a lot of corrections may be done), so this propagation may be cumbersome.

Tom Kyte
August 29, 2004 - 11:59 am UTC

do the math, get back to me. people way overestimate.

that -- and you didn't really present anything above that would make anything "obvious" to "anyone".

adding a little tiny date column would not "obviously" make a multiple gigabyte sized database grow into terabytes. so, make it obvious for me.


If you want to partition a table
then
The column you want to partition on must be in the table


So if you want to partition by this date column from a master table in the child tables

YOU WILL propagate that value down.

Otherwise, you will not be partitioning on that value.



that the propagation is "cumbersome" is not really meaningful IF your goal is to partition by this value. You write an "update this date" stored procedure once that makes it "trivial" to accomplish (the programmer calls "update_this_date( pk => ?, new_date => ? );") -- it is less than cumbersome.
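A purely hypothetical sketch of such a procedure (all table and column names are made up; the partitioned tables would also need ENABLE ROW MOVEMENT so updated rows can migrate between partitions):

create or replace procedure update_this_date
( p_pk       in number,
  p_new_date in date )
as
begin
    -- change the date on the master and push it down to every
    -- detail table that carries the propagated partition key
    update master_t  set dat = p_new_date where id = p_pk;
    update detail_t  set dat = p_new_date where master_id = p_pk;
    update detail2_t set dat = p_new_date where master_id = p_pk;
    -- ... one update per detail table
end;
/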

my suggestion

riyaz, April 20, 2005 - 3:44 am UTC

My suggestion for the above query. Of course, the text below is based on Tom's book (+ my own experience).

a - The header table (single row per transaction) has a date field (primary key doc_no)
b - The detail table does NOT have a date field (multiple rows per transaction) - primary key doc_no, prod_code

For the above:

Solution - 1
1 - Range partition the header table on the date column.
2 - Detail table: HASH partition by the primary key column of the header table, i.e. doc_no.
3 - Create a view joining the header and detail tables, which will then include the DATE column (of the header) as well.
4 - Generate all reports (which require detail information based on a date range) using the view.

Solution - 2
De-normalize the table by adding date column in the DETAIL table.

Note: On top of it, I tried both methods on a test server and benchmarked the performance (using trace/tkprof), then implemented the 1st method in production, since it suits my application.

Tom, please correct me if anything wrong.

please correct

riyaz, April 22, 2005 - 5:01 am UTC

Hi Tom,please confirm my understanding. (above)

Tom Kyte
April 22, 2005 - 10:34 am UTC

I frankly lost track of the thread here (1/2 year gap and all)

so I'm not really sure which part of this page you were addressing.

Smita Acharya, November 22, 2005 - 4:28 pm UTC

Hi Tom,

We have a data warehouse of about 50GB which is expected to grow to about 150GB
by next year.
The storage being used is Hitachi Data Systems Lightning 9980V connected over a
Fibre Channel SAN. We are using RAID 5.
My question here is:
Do we need to create different tablespaces (each having files on different mount
points) for indexes and partitions, considering the above storage?
OR
Can we keep them all in one tablespace, or separate them into different tablespaces taking only database
backup and recovery into account, without considering performance?

I think putting them into a single tablespace will not be a performance problem (it can be a recovery problem though).
Please correct me if I am wrong, and please explain why.


Tom Kyte
November 23, 2005 - 9:00 am UTC

50-150gb is relatively small.

raid-5 isn't the best choice for warehousing (lots of direct path writes and reads in general, not buffered IO).

But given this size, it'll likely be "ok".


Use tablespaces to make your life easier - what makes your life "better". You have a valid point on the recovery issue - you could just rebuild indexes rather than back them up if something goes wrong.

Since a tablespace may contain files from many mountpoints....
And your mountpoints are already striped file systems (raid-5)....

You would not really be using tablespaces for distributing IO - use tablespaces to make your administrative jobs easier for you.

Tablespaces in data warehouse

Smita Acharya, November 23, 2005 - 10:48 am UTC

Hi Tom,

Thanks a lot for the answers. They were really useful. I will use a different tablespace for indexes, mainly considering the recovery part.
I do not have much idea about RAID systems. Could you please point me to some easy to understand documentation which describes RAID in detail. I wish everyone could explain things as simply and easily as you do.

Tom Kyte
November 23, 2005 - 11:02 am UTC

http://www.acnc.com/04_01_00.html

is a pretty good overview.

Chris, November 26, 2005 - 1:27 pm UTC

We have a partitioned table (range partitioned by week; each partition has around 70 million rows; 3 years of data).
All the partitions except the most recent are compressed. Now we have a situation wherein we need to add some
fields to the table. Since it's not possible to add fields to a compressed table, and we cannot uncompress the table because
of space limitations, is there any way to get around this problem?
I was thinking along the lines of creating a new table (with the additional fields), starting to populate it going forward,
and somehow exchanging the partitions from the old table into the new table -- but the old and new tables don't have the same
structures... any ideas?
thanks



Tom Kyte
November 26, 2005 - 4:29 pm UTC

excellent reason for considering a new release ;)

ops$tkyte@ORA10GR1> CREATE TABLE t
  2  (
  3    dt  date,
  4    x   int,
  5    y   varchar2(25)
  6  )
  7  compress
  8  PARTITION BY RANGE (dt)
  9  (
 10    PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
 11    PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
 12    PARTITION junk VALUES LESS THAN (MAXVALUE)
 13  )
 14  /

Table created.

ops$tkyte@ORA10GR1>
ops$tkyte@ORA10GR1> insert /*+ append */ into t select to_date( '12-mar-2003', 'dd-mon-yyyy') + mod(rownum,5), rownum, rownum from all_users;

28 rows created.

ops$tkyte@ORA10GR1>
ops$tkyte@ORA10GR1> alter table t add z int;

Table altered.


Else, you'll be looking at doing this a partition at a time: create a new partitioned (empty) table with the column.

insert /*+ APPEND */ into partition 1 of the new table the old data from old partition 1.  Truncate old partition 1.  Repeat.
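A sketch of one round of that loop (table, column and partition names hypothetical; t_new is the new compressed, partitioned table with the extra column):

insert /*+ APPEND */ into t_new partition (part1)
select dt, x, y, null /* new column */ from t partition (part1);

commit;

alter table t truncate partition part1;
-- repeat for part2, part3, ...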

 

Murali, March 16, 2006 - 7:14 am UTC

Tom:

I have a table with 4 million rows and the table is hash partitioned. Now I want to find out how many records are in each partition.

When I give "select count(*) from table_name <partition_name>", the record count is always the same with the partition clause and without the partition clause.
Why?


Tom Kyte
March 16, 2006 - 2:23 pm UTC

prove it?

cut and paste an example, unless you hashed on something with very very low distinct values - one would expect a spread of data.

Or perhaps you literally issued:


select count(*) from table_name partion_1;

(same as "select count(*) from table_name XXXX") which is very very different from

select count(*) from table_name PARTITION(partition_1);



A reader, March 17, 2006 - 4:20 am UTC

Tom:

You are great.
It's my mistake, sorry. The correct one is

SQL> select COUNT(1) from table_name PARTITION(partition_name);

For example, the emp table is done as 10 HASH partitions, and
I give a select statement without the PARTITION clause,
 i.e. select emp_no from emp where emp_no=134;
How does Oracle scan the emp table? Does it search all partitions until it finds emp_no=134 -- i.e., does it always start scanning in the first partition and, if the value is not found, search the next partition, and so on until the result is found? How does Oracle scan the table if I don't give a partition clause in the select statement?

 

 

Tom Kyte
March 17, 2006 - 5:20 pm UTC

do not do 10 partitions, do

2, 4, 8, 16, 32, 64, 128, .... (*2)

10 will not distribute the data very well.

If you hashed the table on emp_no - then "where emp_no=134" will take the 134, hash it, figure out the only partition it could be in and look there.
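A sketch of the hash scheme described here (names made up; note the power-of-two partition count):

CREATE TABLE emp
( emp_no NUMBER,
  ename  VARCHAR2(30)
)
PARTITION BY HASH (emp_no)
PARTITIONS 8;

-- "where emp_no = 134" hashes 134 and probes only the one
-- partition that value could be in
select * from emp where emp_no = 134;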

Should I Partition?

RG, March 21, 2006 - 6:51 pm UTC

Hi:
I am trying to find out if I should partition a table and if so how?
This table will be used to track who accesses certain tables.

Below are the details:

drop table t;

create table t
(
id_type number
,cd_type char(1)
,ts_cr date
,id_user number
);

insert into t values (1,'C',SYSDATE,1);
insert into t values (1,'B',SYSDATE,1);
insert into t values (1,'P',SYSDATE,1);
commit;

id_type identifies the unique id of the row being accessed in a different table.
cd_type identifies the table being accessed by the user and it can only have values ('B','C' or 'P').
ts_cr the time the table was accessed
id_user id of the user

A report will be generated to determine if a particular row in a table has been accessed (based on id_type, cd_type and ts_cr).

select * from t where id_type = 1 and cd_type = 'C' and ts_cr between to_date('01/01/2006','MM/DD/YYYY') AND to_date('12/31/2006','MM/DD/YYYY');

ID_TYPE C TS_CR ID_USER
---------- - --------- ----------
1 C 21-MAR-06 1

or based on id_user and ts_cr

select * from t where id_user = 1 and ts_cr between to_date('01/01/2006','MM/DD/YYYY') AND to_date('12/31/2006','MM/DD/YYYY');

ID_TYPE C TS_CR ID_USER
-------- - --------- ----------
1 C 21-MAR-06 1
1 B 21-MAR-06 1
1 P 21-MAR-06 1

The table will be populated every week with 26,000 rows (about 1.2 million rows every year) as follows:
CD_TYPE = 'C' ~ 15,000 rows
CD_TYPE = 'B' ~ 2,000 rows
CD_TYPE = 'P' ~ 9,000 rows

Also, data from this table will be archived (no details on how often; any suggestions?)

create table t_archive
(
id_type number
,cd_type char(1)
,ts_cr date
,id_user number
);

Thanks for your help!!

Tom Kyte
March 22, 2006 - 3:43 pm UTC

why do you want to partition - what do you NEED that partitioning can PROVIDE.

It can make purging of data easier. (ease administration)
It can make the data more available in some cases.
It can make certain specific classes of queries potentially faster.
It can help increase concurrency in a high insert environment sometimes.

But most of the above may be mutually exclusive. What is your number one important goal?

Re: Should I Partition?

RG, March 23, 2006 - 12:27 pm UTC

My number one important goal is to make the queries for the reports I had mentioned (below) fast (if it is possible through partitions).

.....

A report will be generated to determine if a particular row in a table has been
accessed (based on id_type, cd_type and ts_cr).

select * from t where id_type = 1 and cd_type = 'C' and ts_cr between
to_date('01/01/2006','MM/DD/YYYY') AND to_date('12/31/2006','MM/DD/YYYY');

ID_TYPE C TS_CR ID_USER
---------- - --------- ----------
1 C 21-MAR-06 1

or based on id_user and ts_cr

select * from t where id_user = 1 and ts_cr between
to_date('01/01/2006','MM/DD/YYYY') AND to_date('12/31/2006','MM/DD/YYYY');

Tom Kyte
March 23, 2006 - 1:51 pm UTC

do you plan on having your report full scan or index range scan to find these rows.

full scan => partitioning could be employed to help speed up the full scan via partition elimination

index range => partitioning would have to be very carefully employed so as to not make it go slower, the odds of it speeding up the query are slim to none.

Re: Should I Partition?

RG, March 23, 2006 - 3:17 pm UTC

Normally I use indexes to make queries fast. I have never used partitions, and since I know that this table is going to be huge over time, I wanted to find out if partitioning the table right when it is created would be better for performance compared to having indexes on the table. I ran the following SQL as a test:

SQL> create table t
  2  (
  3   id_type number
  4  ,cd_type char(1)
  5  ,ts_cr   date
  6  ,id_user number
  7  );

Table created.

SQL> 
SQL> insert into t values (1,'C',SYSDATE,1);

1 row created.

SQL> insert into t values (1,'B',SYSDATE,1);

1 row created.

SQL> insert into t values (1,'P',SYSDATE,1);

1 row created.

SQL> commit;

Commit complete.

SQL> set autotrace traceonly;
SQL> select * from t where id_type = 1 and cd_type = 'C' and ts_cr between 
  2  to_date('01/01/2006','MM/DD/YYYY') AND to_date('12/31/2006','MM/DD/YYYY');


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (FULL) OF 'T'




Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          8  consistent gets
          0  physical reads
          0  redo size
        581  bytes sent via SQL*Net to client
        499  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 
SQL> select * from t where id_user = 1 and ts_cr between 
  2  to_date('01/01/2006','MM/DD/YYYY') AND to_date('12/31/2006','MM/DD/YYYY'); 


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (FULL) OF 'T'




Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          8  consistent gets
          0  physical reads
          0  redo size
        622  bytes sent via SQL*Net to client
        499  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          3  rows processed

SQL> 
SQL> 

SQL> create index indx1_t on t(id_type,cd_type,ts_cr);

Index created.

SQL> create index indx2_t on t(id_user,ts_cr);

Index created.

SQL> exec dbms_stats.gather_table_stats(user,'T',cascade=>true);

PL/SQL procedure successfully completed.

SQL> select * from t where id_type = 1 and cd_type = 'C' and ts_cr between 
  2  to_date('01/01/2006','MM/DD/YYYY') AND to_date('12/31/2006','MM/DD/YYYY');


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=16)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=2 Card=1 Bytes=
          16)

   2    1     INDEX (RANGE SCAN) OF 'INDX1_T' (NON-UNIQUE) (Cost=1 Car
          d=1)





Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
        581  bytes sent via SQL*Net to client
        499  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 
SQL> select * from t where id_user = 1 and ts_cr between 
  2  to_date('01/01/2006','MM/DD/YYYY') AND to_date('12/31/2006','MM/DD/YYYY'); 


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=3 Bytes=48)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=2 Card=3 Bytes=
          48)

   2    1     INDEX (RANGE SCAN) OF 'INDX2_T' (NON-UNIQUE) (Cost=1 Car
          d=3)





Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          4  consistent gets
          0  physical reads
          0  redo size
        622  bytes sent via SQL*Net to client
        499  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          3  rows processed

SQL>  

Tom Kyte
March 23, 2006 - 3:53 pm UTC

like I said - are you planning on using indexes to retrieve your data (big table, but you need a relatively small piece of it)

or are you planning on full scanning, because it is a big table and you need LOTS of it.

if you index scan - partitioning must be applied in a way to NOT degrade query performance.

if you full scan - perhaps we can partition to IMPROVE query performance

Re: Should I Partition?

RG, March 23, 2006 - 5:04 pm UTC

I am planning on using indexes to retrieve small pieces of data. So based on what you are saying it looks like partitioning is not going to help me a whole lot (to make the reports run faster). Am I correct?

Thanks!!

Tom Kyte
March 23, 2006 - 6:12 pm UTC

correct - it would be HIGHLY unlikely partitioning would do anything positive for your query retrieval.


Re: Should I Partition

RG, March 23, 2006 - 6:27 pm UTC

Thanks Tom!!

Partition Exchange to partitioned tabled

Murali, March 24, 2006 - 2:11 am UTC

Tom:

a)
We have a 20-million-record non-partitioned table, just assume T1. We would like to add partitioning to this non-partitioned table; how can we do this?

Can we use ALTER TABLE to add partitioning?


2) We have one partitioned table, just assume table1, with 4 range partitions on a date column. We want to move partition1 and partition2 of table1 to a second table called table2, which is already partitioned on the same date column as table1.

FYI, table2's structure and datatypes are the same as table1's.
How can we do this?

pls help



Tom Kyte
March 24, 2006 - 9:46 am UTC

a) and then 2) ;)


you have to create a partitioned table, you cannot take an existing table and make it suddenly be a partitioned table

You can

a) create a new partitioned table and then EXCHANGE this existing table for a partition in the newly created table - hence converting the table T1 into a partition of a partitioned table.

b) use dbms_redefinition to online redefine the data into a partitioned table. Search this site for dbms_redefinition - examples are there showing that.



for the second question - you would alter table T1 and exchange partition 1 with an empty table (turn partition 1 into a table), then you can exchange that table with the other partitioned table. Repeat for partition 2.
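A sketch of those three steps (names hypothetical; the target partition in table2 must be empty and cover the same value range):

-- an empty table shaped like the partitions
create table temp_t as select * from table1 where 1=0;

-- partition_1 of table1 becomes the table temp_t
alter table table1 exchange partition partition_1 with table temp_t;

-- temp_t becomes partition_1 of table2
alter table table2 exchange partition partition_1 with table temp_t;

drop table temp_t;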


Mohamed From France

A reader, March 24, 2006 - 10:33 am UTC

Dear Tom,

Exchanging partitions is not as easy as you wrote. I have tested it. This is the method I followed.
I have a master table T1 as follows
IDE NUMBER
DAT_PART DATE
.../...
range partitioned on dat_part (3 months). This master table has several child tables, like T2:
Master_IDE
Master_Dat_part
also range partitioned by dat_part.

1. Set constraints of the master and child tables to novalidate (disable constraints).
2. Create empty clone non-partitioned tables with the relevant indexes in place.
3. Set constraints of the clone tables to a novalidate state (disable constraints).
4. Exchange each child partition with its empty clone table, including indexes but without validation.
5. Exchange the parent partition with the corresponding empty clone table.

-- disable constraints of partitioned tables
alter table CHILD_TABLE disable constraint Child_Fk;
alter table PARENT_TABLE disable constraint Parent_Pk;

-- create empty clone non partitioned tables
create table PARENT_TABLE_CLONE as select * from PARENT_TABLE where 1 = 2;
create table CHILD_TABLE_CLONE as select * from CHILD_TABLE where 1 = 2;

-- create indexes and constraints of empty clone non partitioned tables
ALTER TABLE CHILD_TABLE_CLONE ADD (
CONSTRAINT CHILD_FK_01_clone FOREIGN KEY (mast_ide, mast_dat_part)
REFERENCES PARENT_TABLE_CLONE (IDE,DAT_PART));

CREATE UNIQUE INDEX PK_MAST_CLONE ON PARENT_TABLE_CLONE
(IDE, DAT_PART);

ALTER TABLE PARENT_TABLE_CLONE ADD (
CONSTRAINT PK_MAST_CLONE PRIMARY KEY (IDE, DAT_PART)
);

-- disable constraints of empty clone non partitioned tables
ALTER TABLE CHILD_TABLE_CLONE disable constraint CHILD_FK_01_clone;
ALTER TABLE PARENT_TABLE_CLONE disable constraint PK_MAST_CLONE;


-- Exchange the partitions with the empty clone tables, starting with the child ones
ALTER TABLE CHILD_TABLE
EXCHANGE PARTITION P_CHLD_20050331
WITH TABLE CHILD_TABLE_CLONE
INCLUDING INDEXES
WITHOUT VALIDATION;

-- Exchange the partition of the master table
ALTER TABLE PARENT_TABLE
EXCHANGE PARTITION P_MAST_20050331
WITH TABLE PARENT_TABLE_CLONE
INCLUDING INDEXES
WITHOUT VALIDATION;



Tom Kyte
March 24, 2006 - 10:46 am UTC

what was terribly difficult?

It was asked "how to", this is "how to" (one way anyway)

Mohamed from France

A reader, March 24, 2006 - 11:04 am UTC

Because if you do not follow the indicated order, you will get the following error:

ORA-14097: column type or size mismatch in ALTER TABLE EXCHANGE PARTITION

Even if your tables to be exchanged are exactly the same.

Kind Regards

Tom Kyte
March 24, 2006 - 3:27 pm UTC

I'm still not getting it, you have to execute commands in the correct order? so?

Murali, March 29, 2006 - 1:16 am UTC

Tom:

My current table T1 is not partitioned, but it has 20 million rows.

What is the best method to transfer data from the NON-PARTITIONED TABLE TO a PARTITIONED TABLE? (Both the partitioned and non-partitioned tables have the same column names and datatypes.)

Can I use ALTER TABLE ... EXCHANGE PARTITION to transfer my data into the partitioned table?

or insert /*+ append */ into partitioned_table select * from non_partitioned_table?

Which one is best?

Tom Kyte
March 29, 2006 - 7:12 am UTC

"best" always depends on "need"

If there were a BEST way in all cases, we would not have wasted our developers cycles giving you more than one.


IF downtime is permitted
then
stop modifications to non-partitioned
create table partitioned as select from non-partitioned
index partitioned
grant on it
constrain it
drop non-partitioned
rename partitioned
else
dbms_redefinition with the same steps as before except for the last two
dbms_redefinition finish redef to replace last two
end if


would be the two primary paths to consider.
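
For illustration, a minimal sketch of the online path (10g API; 'T1' and 'T1_INTERIM' are hypothetical names, and T1_INTERIM is pre-created with the desired partitioning):

exec dbms_redefinition.can_redef_table( user, 'T1', dbms_redefinition.cons_use_pk );
exec dbms_redefinition.start_redef_table( user, 'T1', 'T1_INTERIM' );

declare
   l_errors number;
begin
   -- clone indexes, triggers, constraints and grants onto the interim table
   dbms_redefinition.copy_table_dependents
   ( user, 'T1', 'T1_INTERIM',
     dbms_redefinition.cons_orig_params,
     true, true, true, false, l_errors );
end;
/

exec dbms_redefinition.finish_redef_table( user, 'T1', 'T1_INTERIM' );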


Mohamed from France, March 29, 2006 - 9:39 am UTC

This is just an example showing how to exchange partitions
-- This is the master table to be archived
CREATE TABLE MHO_ARC_MAST
(IDE NUMBER,
DAT_PART DATE,
NMFI_IDE NUMBER,
PTFL_IDE NUMBER)
PARTITION BY RANGE (DAT_PART)
(PARTITION P_20050331 VALUES LESS THAN (TO_DATE(' 2005-04-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'))
);

-- Index
CREATE UNIQUE INDEX PK_MAST ON MHO_ARC_MAST
(IDE, DAT_PART);

-- Primary Key
ALTER TABLE MHO_ARC_MAST ADD (
CONSTRAINT PK_MAST PRIMARY KEY (IDE, DAT_PART)
);
-- with the following data in it
11:54:02 > select * from MHO_ARC_MAST;

IDE DAT_PART NMFI_IDE PTFL_IDE
-------- --------- --------- ---------
1 01-JAN-05 56 89
2 13-JAN-05 55 89
3 15-JAN-05 44 55


-- This is the slave table to be archived
CREATE TABLE MHO_ARC_CHILD1
(MAST_IDE NUMBER,
MAST_DAT_PART DATE,
DAT_TYP VARCHAR2(6),
VLU DATE)
PARTITION BY RANGE (MAST_DAT_PART)
(PARTITION P_CHLD_20050331 VALUES LESS THAN (TO_DATE(' 2005-04-01 00:00:00','SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'))
);
-- Index
CREATE UNIQUE INDEX PK_CHILD1 ON MHO_ARC_CHILD1
(MAST_IDE, MAST_DAT_PART);

-- Primary Key
ALTER TABLE MHO_ARC_CHILD1 ADD (
CONSTRAINT PK_CHILD1 PRIMARY KEY (MAST_IDE, MAST_DAT_PART));

-- Foreign Key
ALTER TABLE MHO_ARC_CHILD1 ADD (
CONSTRAINT CHILD_FK_01 FOREIGN KEY (MAST_IDE, MAST_DAT_PART)
REFERENCES MHO_ARC_MAST (IDE,DAT_PART));

-- with the data in it
11:54:26 > select * from MHO_ARC_CHILD1;

MAST_IDE MAST_DAT_ DAT_TY VLU
--------- --------- ------ ---------
1 01-JAN-05 000010 29-NOV-05
2 13-JAN-05 000010 29-NOV-05

--Now I would like to archive those two tables using exchange partition (oracle 8.1.7).

-- first create the clone tables

-- create master clone non partitioned table MHO_ARC_MAST_CLONE and its constraints
13:09:17 > CREATE TABLE MHO_ARC_MAST_CLONE
13:09:40 2 ( IDE NUMBER,
13:09:40 3 DAT_PART DATE,
13:09:40 4 NMFI_IDE NUMBER,
13:09:40 5 PTFL_IDE NUMBER);

Table created.

13:10:13 > CREATE UNIQUE INDEX PK_MAST_CLONE ON MHO_ARC_MAST_CLONE
13:10:13 2 (IDE, DAT_PART);

Index created.

13:10:13 > ALTER TABLE MHO_ARC_MAST_CLONE ADD (
13:10:34 2 CONSTRAINT PK_MAST_CLONE PRIMARY KEY (IDE, DAT_PART)
13:10:34 3 );

Table altered.


-- create slave clone non partitioned table mho_arc_child1_Clone and its constraints
13:10:34 > create table mho_arc_child1_Clone (Mast_ide number,Mast_Dat_Part Date
13:11:33 2 ,dat_typ VARCHAR2(6), VLU DATE);

Table created.

create unique index PK_CHILD1_CLONE ON MHO_ARC_CHILD1_CLONE(mast_ide, mast_dat_part);

Index created.


13:12:11 > alter table mho_arc_child1_Clone add constraint PK_CHILD1_Clone
13:15:04 2 primary key (mast_ide, mast_dat_part);

Table altered.

13:16:10 > ALTER TABLE mho_arc_child1_Clone ADD (
13:16:24 2 CONSTRAINT CHILD_FK_01_clone FOREIGN KEY (mast_ide, mast_dat_part)
13:16:24 3 REFERENCES mho_arc_mast_Clone (IDE,DAT_PART));

Table altered.

-- Now I will start exchanging partitions with all constraints on, starting with the master table
13:28:12 > ALTER TABLE MHO_ARC_MAST
13:28:13 2 EXCHANGE PARTITION P_20050331
13:28:13 3 WITH TABLE MHO_ARC_MAST_CLONE
13:28:13 4 WITHOUT VALIDATION;
ALTER TABLE MHO_ARC_MAST
*
ERROR at line 1:
ORA-02266: unique/primary keys in table referenced by enabled foreign keys

-- Then I will try to exchange the child one first
13:28:16 > ALTER TABLE MHO_ARC_CHILD1
13:28:52 2 EXCHANGE PARTITION P_CHLD_20050331
13:28:52 3 WITH TABLE MHO_ARC_CHILD1_CLONE
13:28:52 4 WITHOUT VALIDATION;
ALTER TABLE MHO_ARC_CHILD1
*
ERROR at line 1:
ORA-14128: FOREIGN KEY constraint mismatch in ALTER TABLE EXCHANGE PARTITION

-- Then I start disabling all the constraints

13:30:12 > alter table MHO_ARC_CHILD1_CLONE disable constraint pk_child1_clone;

Table altered.

13:30:42 > alter table MHO_ARC_CHILD1_CLONE disable constraint child_fk_01_clone
13:30:52 2 ;

Table altered.

13:30:55 > alter table MHO_ARC_CHILD1 disable constraint child_fk_01;

Table altered.

13:31:41 > alter table MHO_ARC_CHILD1 disable constraint pk_child1;

Table altered.

13:32:01 > alter table MHO_ARC_MAST_CLONE disable constraint pk_mast_clone;

Table altered.

13:33:07 > alter table MHO_ARC_MAST disable constraint pk_mast;

Table altered.

-- And now, with the constraints disabled, I will exchange the partitions, starting with the master one

13:33:32 GSPD> ALTER TABLE MHO_ARC_MAST
13:34:28 2 EXCHANGE PARTITION P_20050331
13:34:28 3 WITH TABLE MHO_ARC_MAST_CLONE
13:34:28 4 WITHOUT VALIDATION
13:34:28 5 ;

Table altered. ===> tables have been exchanged

-- Let's go to the child table

13:34:30 GSPD> ALTER TABLE MHO_ARC_CHILD1
13:35:17 2 EXCHANGE PARTITION P_CHLD_20050331
13:35:17 3 WITH TABLE MHO_ARC_CHILD1_CLONE
13:35:17 4 WITHOUT VALIDATION
13:35:17 5 ;

Table altered. ===> tables have been exchanged


-- At this step the partitioned tables' data has been moved into the non-partitioned ones
-- By executing the same EXCHANGE PARTITION statements the inverse process is done


16:11:27 > ALTER TABLE MHO_ARC_CHILD1
16:11:27 2 EXCHANGE PARTITION P_CHLD_20050331
16:11:27 3 WITH TABLE MHO_ARC_CHILD1_CLONE
16:11:27 4 WITHOUT VALIDATION;

Table altered.

16:11:28 > ALTER TABLE MHO_ARC_MAST
16:11:28 2 EXCHANGE PARTITION P_20050331
16:11:28 3 WITH TABLE MHO_ARC_MAST_CLONE
16:11:28 4 WITHOUT VALIDATION;

Table altered.

Best Regards

Murali, March 30, 2006 - 5:17 am UTC

Tom:

Thanks for your examples.

I would like to ask about the performance of partitioned indexes when I insert 0.5 million records into table T1. Which works better, i.e. which is faster when I am inserting records: a partitioned index or a non-partitioned index? Option 'a' or option 'b'?

a) my table T1 has 20 million records in 8 partitions and my index is not partitioned

b) my table T1 has 8 partitions and my indexes also have 8 partitions


Thanks




Tom Kyte
March 31, 2006 - 11:20 am UTC

answer is:

it depends.


say you have thousands of concurrent users inserting these 0.5 million rows (very small amount of data, not too much to worry about). Then having a partitioned index could spread the points of contention (index splits, hot right hand sides and such) over many physical structures. it could increase performance.


say you have a single process inserting all of this data in bulk. Then, figuring out WHICH partition to put each entry into will incur some runtime overhead that would not otherwise be there.




Murali, April 02, 2006 - 11:47 pm UTC

Tom:

Thanks for your response.

Let's say we have a single process inserting all of this data row by row, i.e. in a loop, not a bulk insert.

Is it OK to use partitioned indexes rather than normal global indexes?

The reason I am stressing the use of partitioned indexes is:

we have a batch process that inserts 0.5 million records into T1 and selects records from the same table T1 with different where conditions. Our table T1 currently has 5 partitions, but there are 4 global indexes on T1, i.e. the indexes are not partitioned.

If I used partitioned indexes rather than normal indexes, would it improve my performance?





Tom Kyte
April 03, 2006 - 8:05 am UTC

it would do nothing to increase performance; it would likely make the process a little slower - but not nearly as slow as the slow-by-slow processing you are doing (it would be hard to make it any slower than you already have!)



I need your opinion

Mike, April 19, 2006 - 5:48 pm UTC

Tom,

I have a table with a few columns. There are two indexes on this table, one of which is the primary key index.
I will add a new column to this table and create range partitions using the new column as the partitioning key.
All queries in which this table appears will contain the new column in their WHERE clause. Each value of the new column will point to exactly one partition.

I will make all existing indexes local.

Is this a good idea?

From my point of view, since each query will select records from only one partition, local indexes can produce only benefit.

Thanks,


Tom Kyte
April 19, 2006 - 6:01 pm UTC

"I will make all existing indexes to be a local."

well, except for the index on the primary key of course....  how will you get that index to be used for enforcement of uniqueness to be local?  Unless you partition by the primary key - that won't happen.


ops$tkyte@ORA10GR2> CREATE TABLE t
  2  (
  3    dt  date,
  4    x   int,
  5    y   varchar2(25)
  6  )
  7  PARTITION BY RANGE (dt)
  8  (
  9    PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
 10    PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
 11    PARTITION junk VALUES LESS THAN (MAXVALUE)
 12  )
 13  /

Table created.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> alter table t add constraint t_pk
  2  primary key(x)
  3  using index ( create unique index t_pk on t(x) local );
alter table t add constraint t_pk
*
ERROR at line 1:
ORA-14039: partitioning columns must form a subset of key columns of a UNIQUE
index


ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> alter table t add constraint t_pk
  2  primary key(x)
  3  using index ( create index t_pk on t(x) local );
alter table t add constraint t_pk
*
ERROR at line 1:
ORA-14196: Specified index cannot be used to enforce the constraint.



the rest - fine, if the queries all include that column and we can partition eliminate down to a single index partition - that is good. 

Althaf, April 19, 2006 - 7:06 pm UTC

Is there a way I can test that the records are going to the correct partitions?

Tom Kyte
April 19, 2006 - 7:11 pm UTC

they will go into the correct partition - I assume you mean "the partition I expected them to" :)

you can

select * from table_name partition(partition_name) where key = value;

it'll only query that partition for that record.
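
For example, a quick sanity check against the range-partitioned table T from earlier (values hypothetical):

insert into t (dt, x, y) values ( to_date('12-mar-2003','dd-mon-yyyy'), 1, 'test' );

select * from t partition (part1) where x = 1;   -- returns the row; 12-mar-2003 is below part1's bound
select * from t partition (part2) where x = 1;   -- returns no rows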

Partitioning Existing table

ravs, June 13, 2006 - 2:28 am UTC

Hi Tom,
Thanks a lot for this great site!
I have a table with no partitions, on which snapshots (version 8i) are defined. These snapshots are used by other interfaces. Now I want to partition this table without fully refreshing the snapshots used by the other interfaces.
If I drop the original table, then the snapshot logs on the table will become unusable and a full refresh will be required (correct me if I am wrong).
Is there any way to partition this table without dropping it?
- Thanks in advance
ravi

Tom Kyte
June 13, 2006 - 12:10 pm UTC

you will have to full refresh the materialized views, you will be creating an entirely new object there.

help with partitioning strategy

Ryan, June 14, 2006 - 11:02 am UTC

We are developing a multi-TB bulk load system (minimal OLTP queries, no OLTP DML). We will use BCVs to copy data to a reporting system and transform it there. It is in development. I have some bulk data, but not all of it. Right now I am working on a partitioning strategy to optimize bulk data loads. I have not worked extensively with partitioning.

We have an ID column that is a key to many of our tables. We want to partition on this column so that when we do joins we can take advantage of partition-wise joins.

We get large data loads. We have discussed splitting these up into small pieces (we have a separate ETL tool on a separate server that can do that), so we can open say 5-10 threads at once, each processing part of the data.

We talked about doing the following partitioning strategy:

mod(id,100) part_key

This way we have 10 partitions. We use list partitions of the form:

PARTITION p1 VALUES (0, 10, 20, 30, 40, 50, 60, 70, 80, 90)
PARTITION p2 VALUES (1, 11, 21, 31, 41, 51, 61, 71, 81, 91)

Here is the process:

1. the loading database receives 10 files, each separated by the partitioning scheme above
2. each thread uses an external table to load the data into a global temp table (each thread will share the same global temp table DDL)
3. we then have to do joins/transformations to tables in the system and then mainly bulk inserts. Most of this can be done with SQL. We need some pl/sql

We did 2 tests, where we checked the plans and ran 10046 traces and tkprofs on some of the queries with non-partitioned tables and some with partitioned tables. The non-partitioned tables were significantly better in everything from response time to logical IO. When we analyzed the plans, here is what we found were the differences:

Here is a sample query:

insert into <table>
select col1,col2,col3
from globalTempTable a, applicationTable b
where a.ID = b.ID
and b.part_key in (0, 10, 20, 30, 40, 50, 60, 70, 80, 90);

take out b.part_key for the non partitioned table. We also tried using the 'partitioning keyword'.

Here are the plans:

Non-partitioned table:

INSERT STATEMENT
NESTED LOOPS
INDEX FAST FULL SCAN | globalTempTable
TABLE ACCESS BY INDEX ROWID | applicationTable
INDEX UNIQUE SCAN |applicationTable.ID


On partitioned Table
INSERT STATEMENT
TABLE ACCESS BY LOCAL INDEX ROWID | applicationTable
NESTED LOOPS
INDEX FAST FULL SCAN | globalTempTable
INLIST ITERATOR
PARTITION LIST ITERATOR | KEY(I) |KEY(I)
INDEX RANGE SCAN | uses local index

Looks like the difference is between the unique vs. range scan. We tested this on 5000 co-located rows (we inserted rows into the application table and the global temp table ordered by ID).

1. any way to get the partitioned table to use a unique scan?
2. I don't quite understand the partitioned plan. I think it's saying that it is using partition elimination? I don't understand why we have 'INLIST ITERATOR' and why we have the clause 'PARTITION LIST ITERATOR'
3. Any suggestions? We want to break it up in part to reduce waits on the same blocks.

Primary Key

Hitesh Bajaj, July 14, 2006 - 1:26 am UTC

Hi Tom,

We have a table in our warehouse which has more than 1 billion rows.

The table structure is something like this:

CREATE TABLE P_Data
(
P_tag VARCHAR2(10),
P_timestamp TIMESTAMP,
Insert_date Date
)
PARTITION BY RANGE (Insert_date)
(
PARTITION part1 VALUES LESS THAN (to_date('13-mar-2006','dd-mon-yyyy')) ,
PARTITION part2 VALUES LESS THAN (to_date('14-mar-2006','dd-mon-yyyy')) ,
and so on...
)
/

This insert_date is the default sysdate when the record is inserted in the table. And the P_Timestamp is the timestamp when the record was generated by the user community.

And we have a Primary Key based on P_tag which is huge; whenever we do any kind of partition maintenance the PK becomes unusable, and rebuilding it is very difficult - it takes a huge 12-13 hrs to build the PK. We did try using UPDATE GLOBAL INDEXES when doing the partition maintenance, but that takes even longer.

Now we are thinking of partitioning the table by the P_Timestamp column and making a local PK index on P_Timestamp + P_tag, since we feel that any file trying to get loaded twice by SQL*Loader will fail, as the P_timestamp and P_tag will always be unique throughout the application in years to come.

What are your thoughts on this? This will also save us considerable down time in the warehouse.

Also, please suggest whether there is any way to convert a table's partitioning from one column to another, or whether it is just a matter of creating another table and reloading the data from the old table into the newly constructed one.

Appreciate a reply. Thanks


Tom Kyte
July 14, 2006 - 8:29 am UTC

but so what if it takes longer with "update global indexes" - the goal of that command is simply:

"TAKE LONGER, but don't preclude access. Let people continue to work"

so - the question is "so"? It is supposed to take longer - but your end users don't know or care.

You have achieved your goal of "save us considerable down time in the warehouse" with update global indexes - I'm confused?

Isn't the approach of having local indexes beneficial

Hitesh, July 14, 2006 - 11:09 am UTC

Hi Tom,

Agreed, but isn't a local PK index beneficial in all cases, wherein the maintenance time is negligible?

More importantly, with the index becoming so huge, the wait events for 'db file sequential read' while inserting records via sqlldr are increasing day by day.

Isn't that the idea of a local PK index being beneficial?

Thanks

Tom Kyte
July 14, 2006 - 12:37 pm UTC

But having a local primary key index implies you partitioned by the primary key - which means the maintenance operations you want to do by time CANNOT BE DONE, since the data isn't partitioned by what you need it partitioned by !!!!


the primary key index need not be a single monster one either - it can be range partitioned (and in 10g hash partitioned) globally by itself.


The idea of a local index in support of your primary key only makes sense IF YOU CAN PARTITION YOUR TABLE BY PRIMARY KEY - but you cannot, you want to partition by TIME.
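
For illustration, a minimal sketch of a globally hash-partitioned primary key index (10g syntax; table shape borrowed from the question, partition count hypothetical):

create table p_data
( p_tag       varchar2(10),
  p_timestamp timestamp,
  insert_date date
)
partition by range (insert_date)
( partition part1 values less than (to_date('13-mar-2006','dd-mon-yyyy')),
  partition junk  values less than (maxvalue)
);

-- the PK index is hash partitioned globally, independent of the table's range scheme
create unique index p_data_pk on p_data(p_tag)
global partition by hash (p_tag) partitions 8;

alter table p_data add constraint p_data_pk
primary key (p_tag) using index;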

Local PK Index

Hitesh, July 14, 2006 - 1:29 pm UTC

Hi Tom,

But having the PK index prefixed with the partitioning column, i.e. on P_timestamp + P_tag, will solve the problem.

What happens is that data files come to us which need to be loaded into the warehouse using sqlldr. In these data files we have a fixed format of a tag and the timestamp when that tag was generated.

These files need to be loaded, and a given tag will come to us only once throughout the application's existence. Thus it makes sense to have a local PK index so that the same file cannot be reloaded twice under a different file name.

That is the validation we are looking for.

Thanks


Tom Kyte
July 14, 2006 - 1:55 pm UTC

but - you have changed your partitioning scheme, why did you partition in the first place?

what was the LOGIC behind your original partitioning scheme? Because you have just blown that logic out of the water (one would presume "purging")

Local indexes

hitesh, July 14, 2006 - 2:40 pm UTC

Hi Tom,

The earlier logic was based on purging old data by Insert_date, which is the system date; now the logic will shift to purging partitions based on the dates for which the files were created, or you may say when the users placed their requests.

What used to happen was that sqlldr would lag behind by 2-3 days, so we would see non-uniform patterns of partition data, since we would load all the data for the past two days in a single day after removing the PK of the table.

Thus the partitions would end up something like this: for a given day a partition will have 0 rows, and for another day it will have 6 million rows.

Now with the new partitioning method the data will always be uniformly scattered across all partitions.

Since we will be implementing this approach, I just wanted to confirm with one of the greats in Oracle whether I am going to do the right thing.

I appreciate the answers

Tom Kyte
July 14, 2006 - 2:53 pm UTC

what is the new partitioning scheme then?

and how will it facilitate purging?
or are you never going to purge again?

Purging Mechanism

Hitesh, July 14, 2006 - 3:03 pm UTC

The new partitioning scheme would be based on the column

P_Timestamp TIMESTAMP, which is when the user placed the request,

instead of Insert_date, when the data got loaded, which will always be behind P_Timestamp.

Earlier, purging used to happen based on insert_date; now it will happen based on the P_timestamp column, with none of the overhead of maintaining a big PK index for nearly 1 billion rows.

Now, to catch a file that has been loaded twice: the load will fail, since P_timestamp along with P_tag will always be UNIQUE throughout the application.

Hence we are clear that the combination of P_timestamp + P_tag can give us the advantage of a locally indexed Primary Key.

Thanks



Tom Kyte
July 14, 2006 - 3:18 pm UTC

so basically you are entirely changing everything - the primary key, the partitioning scheme, everything.

Hmm.

that is fine.

will it be "uniform" - depends, only you can answer that (you know the partitioning scheme, you know your data, you should be able to answer that bit?)

Local Indexes

hitesh, July 15, 2006 - 11:20 am UTC

Hi Tom,

Yes, the data will be quite uniform across partitions; we definitely get about 30 million rows every day, but may or may not be able to load them the same day. This results in sometimes loading a backlog of 2-3 days of data in a given day, making the partitioning scheme not so uniform.

One partition may have 0 rows and another will have 6 million rows, since that data was loaded on that particular day, because partitioning on Insert_date is by default on sysdate.

Do we have a method to quickly change the partitioning scheme so that the rows shift accordingly to the new scheme, OR do we have to create a new table with the newly defined partitioning architecture and then insert the rows from the old table into the new table?

What would be the fastest approach?

Thanks



Tom Kyte
July 15, 2006 - 3:21 pm UTC

you have to create a brand new table and load it. Every single row goes "somewhere else" now. The primary key field - disappears. This is a new implementation, a re-architecture, a do-over.

what kind of partition for this table

zb, July 20, 2006 - 10:18 pm UTC

I have the table PA_BUDGET_LINES in a mixed OLTP and batch job database. The table is over 50 million records and 14GB in size.
I am considering partitioning this table and am not sure what kind of partitioning to use.
There are 3 indexes on the table now:
1) unique
on columns RESOURCE_ASSIGNMENT_ID, START_DATE
2) on columns
START_DATE, END_DATE
3) on columns
RESOURCE_ASSIGNMENT_ID, PERIOD_NAME

Based on the current indexes, what kind of partitioning would you recommend?

It's 8.1.7.4 and we are soon to upgrade to 9.2.0.7.

Thanks,

Zb

Tom Kyte
July 22, 2006 - 5:36 pm UTC

you very much have to answer this question first and foremost:

What is it you wish to achieve by partitioning, what business problem are you trying to solve.


From there, we can start talking about "how". (if you have access to my book "Expert Oracle Database Architecture" or "Effective oracle by Design", read the partitioning chapters so you can fully understand what I'm trying to say here)


Partitioning is a tool.
Before we use a tool - we need to have something to "fix".
So, what is broken here that we are trying to fix.

what kind of partition method would it be?

ZB, July 22, 2006 - 9:11 pm UTC

<quote
What is it you wish to achieve by partitioning, what business problem are you
trying to solve.
>
The table grows at a rate of more than 4 million records monthly, even with the daily purge job that maintains the size of the table.
Having said that, the table grows like it does now and we cannot do much more to control the growth rate unless the business rules are altered -- which is very unlikely.

I am concerned about the performance impact of the size of the non-partitioned table, which eventually will hit us. We have not noticed a performance issue yet at the current size of 50 million records (14GB). We might be hit by a performance issue once the size reaches 100 million records or more.
I wonder if partition elimination would help performance if partitioning were implemented on the table.
The application queries use the table as the indexes intend.
There are 3 indexes on the table now:
1) unique
on columns RESOURCE_ASSIGNMENT_ID, START_DATE
2) on columns
START_DATE, END_DATE
3) on columns
RESOURCE_ASSIGNMENT_ID, PERIOD_NAME

Would any method of partitioning be helpful for this situation? Or do you think we will be OK from a performance perspective if the non-partitioned table keeps growing like it does, even to more than 100 million records next year?

TIA
ZB

Tom Kyte
July 23, 2006 - 8:35 am UTC

why? why are you worried. If you retrieve say by primary key (or any fixed number of rows via an index) then it is true that the time to retrieve data from a 5 row table can be about the same as it takes to retrieve from a 5,000,000 row table or a 5,000,000,000 row table.

So, do you plan on purging data?

What is the goal you wish to achieve via partitioning - given that in a transactional system - it is more likely you negatively impact performance by applying partitioning if you are not very careful, rather than positively affect transactional performance. It can be very useful to ease administrative things, but in general is not a "performance" tool in OLTP.



ORA-14097 Error

Vivek, September 12, 2006 - 4:45 pm UTC

I think what Mohamed (Mohamed from France, March 24, 2006) was trying to tell us was that we need to actually disable the constraints on the main table and the clone table using

alter table <main_table>
disable constraint <constraint_name>

alter table <clone_table>
disable constraint <constraint_name>

before we can execute the following:

alter table <main_table>
exchange partition <main_table_partition>
with table <clone_table>
including indexes without validation

command.

We cannot just do the "alter table exchange partition ... without validation" without physically disabling the constraints on the main table and the clone table.

Otherwise we will get the "ORA-14097: column type or size mismatch in ALTER TABLE EXCHANGE PARTITION" error.

I experienced this first hand.



Partition View

pradikan, October 11, 2006 - 4:34 am UTC

Hi Tom,

one quick question. Is there any way to check which columns are included in a table partition?

thanks in advance

sasikanth


Tom Kyte
October 11, 2006 - 8:11 am UTC

all of them?

or do you mean "what is the partition key"?

user_part_key_columns
user_subpart_key_columns
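
For example (assuming a partitioned table named T):

select name, column_name, column_position
  from user_part_key_columns
 where name = 'T';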

reader

A reader, October 29, 2006 - 4:49 pm UTC

exchanging partitions is very good and time effective

Is there a similar method by which data can be appended to partitions? That is, can data in a table be appended to a partition quickly, besides using the insert /*+ append */ method?

Tom Kyte
October 29, 2006 - 5:15 pm UTC

ok, I'm curious, why isn't the documented way to append with sql good enough?

there are direct path loads (sqlldr)
there are direct path loads (insert /*+ APPEND */ )
there are direct path loads (OCI - custom code)
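
For illustration, a minimal direct-path append (table names hypothetical):

insert /*+ append */ into big_table
select * from staging_table;

commit;  -- a direct-path insert must be committed before the same session can query the table again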

Exchange partition

Mohamed, November 25, 2006 - 12:00 pm UTC

Dear Tom,

I've just got your new book Expert Oracle Database Architecture. I jumped directly to Chapter 13 (Partitioning), particularly to page 612 (Auditing and Segment Space Compression). After reading that part of the chapter, I have to ask you a question.

In our application we implemented a range partition on a master table say table T_master(ide, dat_partition, ...)
where

pk : ide + dat_partition
partition key : dat_partition
locally partitioned index on : ide + dat_partition
partition : three months for each partition

To this master table are attached several partitioned child tables having ide + dat_partition as Primary and Foreign key, partitioned the same way as the master table.

We are in Oracle 8.1.7.4.

We would like to archive 2 years of data (24 partitions by table)

Our goals are
: to reduce disk space (because it costs)
: to increase performance, since the tables will contain less data.

We have created 24 non-partitioned tables for each partitioned table.

We intend to use

alter table T_master exchange partition P_T_master1 with non_partitioned_table_that_corresponds_to_P_T_master1;

Then we can drop P_T_master1 partition.

However, it seems to me that this will certainly reduce the volume of data in the partitioned table but will not gain disk space.

In addition, we have developed screens on these partitioned tables, and once we archive the 24 partitions, the corresponding data will not be available to end users.

And I have no idea how to bring them back so that the same screens will show the corresponding data when the partitions are back.

My questions are:

1. How can we design this archive in order to gain disk space?

2. How can we put archived partitions back when the need to have them online, directly from the existing screens, arises?

3. How can we use read-write and read-only tablespaces?

Thanks in advance for your precious advice







Tom Kyte
November 25, 2006 - 1:07 pm UTC

1) unless you actually remove data, how can you "gain disk space"? you have just changed the name of the segment, you haven't taken it out.

2) you alter the table and put the partition back and exchange it in.

3) when the data is read only - make the tablespace read only? not sure what you mean.
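
On (3), a minimal sketch of the usual read-only pattern (tablespace, table and index names hypothetical): move the cold partition into its own tablespace, rebuild its local index partition (the move marks it unusable), then mark the tablespace read only:

alter table t_master move partition p_t_master1 tablespace arch_2004;

alter index t_master_idx rebuild partition p_t_master1 tablespace arch_2004;

alter tablespace arch_2004 read only;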

Exchange partition

CT VELU, February 23, 2007 - 4:56 pm UTC

Hi Tom
I am following this site and your books for a requirement involving exchange partition. I am getting an error message when I try to do an exchange partition between a partitioned table with a primary key (and global index) and a non-partitioned table (no index). I tried a few different options with the help of your book and this site but was not quite successful (either I am getting a column mismatch or an index mismatch). If I drop the constraint, everything works fine.
Here is the script which generates the error message:

drop table my_partitioned;

create table my_partitioned
(run_nbr number(10),
cst_nbr number(10),
itm_nbr number(10),
yr number(4),
mth number(2),
sales number(4))
partition by list (run_nbr)
(
partition part_1 VALUES (1001),
partition part_2 VALUES (1002)
);



create index my_part_indx
on my_partitioned(run_nbr,cst_nbr,itm_nbr,yr,mth)
GLOBAL;


drop table dummy;

create table dummy
as
select *
from my_partitioned
where 1=2;

--create index dummy_index on dummy(run_nbr,cst_nbr,itm_nbr,yr,mth);

insert into my_partitioned (run_nbr,cst_nbr,itm_nbr,yr,mth,sales)
values(1001,5501,3456,2007,01,35);

insert into my_partitioned (run_nbr,cst_nbr,itm_nbr,yr,mth,sales)
values(1001,5501,3457,2007,01,35);

insert into my_partitioned (run_nbr,cst_nbr,itm_nbr,yr,mth,sales)
values(1001,5502,3456,2007,01,35);

insert into my_partitioned (run_nbr,cst_nbr,itm_nbr,yr,mth,sales)
values(1001,5503,3456,2007,01,35);

insert into my_partitioned (run_nbr,cst_nbr,itm_nbr,yr,mth,sales)
values(1001,5504,3456,2007,01,35);

commit;

select count(*)
from my_partitioned
where run_nbr = 1001;

select count(*)
from my_partitioned
where run_nbr = 1002;

select count(*)
from dummy;

ALTER TABLE my_partitioned
EXCHANGE PARTITION
PART_1 WITH TABLE dummy
including indexes
without validation
UPDATE GLOBAL INDEXES;


alter table my_partitioned add constraint
my_partitioned_pk
primary key(run_nbr,cst_nbr,itm_nbr,yr,mth);

-- Error pops up here either way you do the exchange-------
ALTER TABLE my_partitioned
EXCHANGE PARTITION PART_1 WITH TABLE dummy
including indexes
without validation
UPDATE GLOBAL INDEXES;

alter table my_partitioned drop constraint
my_partitioned_pk;

ALTER TABLE my_partitioned
EXCHANGE PARTITION PART_1 WITH TABLE dummy
including indexes
without validation
UPDATE GLOBAL INDEXES;

I tried creating a local unique index and a primary key on the table and it works fine, but in the production table I would like to have local bitmap indexes on itm_nbr, yr, mth (the result set will be 2.5 to 5 million rows with 1000 distinct itm_nbr values) and a global primary key on cst_nbr (this will become a header number), itm_nbr, yr, mth.

Thank you very much for your time.

CT

Create a partitioned table using a select statement from another partitioned table

Ryan, May 17, 2007 - 2:31 pm UTC

Tom,

is it possible to create a partitioned table using a select statement from another partitioned table.

Example:
create table2 --this is the table I want to be partitioned
as
select * from table1 --this is my partitioned table

thx
Tom Kyte
May 18, 2007 - 3:50 pm UTC

yes, it is.

ops$tkyte%ORA10GR2> CREATE TABLE t
  2  ( dt  , x   , y   )
  3  PARTITION BY RANGE (dt)
  4  (
  5    PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
  6    PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
  7    PARTITION junk VALUES LESS THAN (MAXVALUE)
  8  )
  9  as
 10  select created dt, user_id x, username y from all_users
 11  /

Table created.

Partitioned table question

Menon, June 07, 2007 - 5:12 pm UTC

Hi Tom
Do you know off hand if in Oracle different partitions of a table can be stored in the same block?
Tom Kyte
June 09, 2007 - 10:16 am UTC

that would be clustered tables, not partitions.

and would obviate the usefulness of partitioning in the first place.

Thanx!

Menon, June 11, 2007 - 10:10 am UTC

I knew about clustered tables - in the case of clustered tables it is different tables altogether sharing the same block. But I have not seen any doc/discussions on partitions of the same table sharing a block. It is not very obvious to me why sharing the same block would completely defeat the purpose of partitioning.


Tom Kyte
June 11, 2007 - 10:46 am UTC

tell me why you partition?

to do things like

a) partition elimination to speed a full scan. but wait, if you comingle the rows from many partitions on the same block - that is defeated

b) administer large data sets better - like 'truncate partition', but wait, you cannot truncate a table in a cluster, you can only truncate the cluster (which defeats all of the partitioning administration features)

c) higher availability via partition elimination, but you defeated that

d) colocation of related data - but you again defeat that by comingling it with other data


what possible advantage, real world use case - could you envision?

Thanx Tom!

Menon, June 14, 2007 - 11:42 am UTC

Sometimes the simplest things come to you after a long time :) Let me ask you one further question which would clarify the whole issue for me. Is it true that for a normal unpartitioned heap table (not a cluster), there can be no sharing of blocks? That is, can one block contain rows from two different normal, unpartitioned heap tables? I suspect the answer is no, and that would explain why I was confused about partitioned tables.


Tom Kyte
June 14, 2007 - 4:06 pm UTC

a cluster is the segment type that allows a block to hold rows from more than one table.

if the segment type is not cluster (eg: it is table) then the blocks in that segment will have rows from only one table.
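
For illustration, a minimal sketch of the cluster case (names hypothetical) - the only segment type where rows from two tables share blocks:

create cluster c1 ( key number ) size 1024;
create index c1_idx on cluster c1;

-- both tables store their rows in the cluster's blocks, co-located by key
create table t1 ( key number, data varchar2(30) ) cluster c1(key);
create table t2 ( key number, data varchar2(30) ) cluster c1(key);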

Perfect, thanx!

Menon, June 14, 2007 - 4:24 pm UTC


exchanging partitions

martina, June 20, 2007 - 1:36 am UTC

hello tom,

i have a package which handles our partitioned tables. it adds partitions, splits partitions and gets old partitions ready to be exported. to do this last part there is a procedure to exchange partitions with tables, and this has given me some trouble.
it works, in short, like this:
1. create table new_table as select * from part_table where 1=2;
2. copy indices
3. copy not nulls
4. copy foreign keys
5. exchange old partition with new_table
6. drop old partition

i quickly found that i must not copy lob indices. 2 days ago i realized i must not copy mv function-based indices. i know that i cannot handle function-based indices and iots (so far). i know that a primary key or unique constraint must have the same name as the underlying index. (we still have 2 v8 databases, and putting the cursor and loop into an anonymous block executed immediately did not work with v10).

given the cursor below, which determines the indices to copy - what else am i missing:

procedure cpy_inx (p_fromtable varchar2,p_totable varchar2)  is
begin
declare
       w_cols varchar2(1000);
       w_cname varchar2(30);
       w_itsp varchar2(30);
       w_inxname varchar2(27);
       cursor c_inx is select i.index_name
               ,decode(i.uniqueness,'UNIQUE','UNIQUE',' ') uniqueness
               ,nvl(c.constraint_type,'X') constraint_type
                      ,substr(i.index_name,length(i.index_name)-2) postfix
                from user_indexes i,user_constraints c
                where i.index_name = c.constraint_name (+)
                  and i.table_name = upper(p_fromtable)
                  and i.table_name = c.table_name (+)
                  and i.index_type not like '%LOB%'
                  and instr(i.index_name,'$') = 0   -- Against indices we don't want to copy (fb-Index on stat_fil_mv)
                  and not exists (select 1
                                  from user_indexes
                                  where index_name = upper(p_totable||substr(i.index_name,length(i.index_name)-2))
                                 );
       begin
          select av_reorg.get_itsp(t.tablespace_name)
            into w_itsp
            from user_tables t
            where t.table_name = upper(p_totable)
              ;
          for r_inx in c_inx loop
          begin
                 av_reorg.pput_line(r_inx.index_name);
                 declare cursor c_icols is
                 select n.column_name
                   from user_ind_columns i,user_tab_columns o,user_tab_columns n
                 where i.index_name = r_inx.index_name
                   and o.table_name = upper(p_fromtable)
                   and o.column_name = i.column_name
                   and n.table_name = upper(p_totable)
                   and n.column_id = o.column_id
                 order by i.column_position;
              begin
                  w_cols := ' ';
                  for r_icols in c_icols loop
                     w_cols := w_cols || r_icols.column_name ||',';
                     av_reorg.pput_line(r_icols.column_name);
                  end loop;
                  w_cols := substr(w_cols,1,length(w_cols) - 1);
                  av_reorg.pput_line(w_cols);
              end;
              select 'I'||av_reorg_seq.nextval
                into w_inxname
                from dual;
              if r_inx.constraint_type = 'P'
                 then bml_views.cre_pk(p_totable,w_inxname||r_inx.postfix,w_itsp,w_cols,r_inx.uniqueness);
                 else bml_views.creinx(p_totable,w_inxname||r_inx.postfix,w_itsp,w_cols,r_inx.uniqueness);
              end if;
         exception when others then
           av_reorg.pput_line('wasis');
           av_reorg.handle_exception;
         end;
          end loop;
      end;
end cpy_inx;


thank you in advance!

martina

Index rebuilding on table partition

Raja, January 15, 2009 - 1:19 am UTC

Hi,

With respect to table partitioning, I have a doubt about index rebuilding.

I have an Oracle 10gR1 database used for data warehousing.
There are around 200 tables already partitioned
(partitions/subpartitions made for every month...) and some 200 tables without partitions.

Now I have to create/extend partitions. (For the tables that have been partitioned already, some have data in the max partition and some don't.)
To create/extend the partition/max partition, before partitioning we have to disable the constraints & drop the indexes, and after partitioning we have to enable the constraints & re-create those indexes.

Instead of dropping and re-creating the indexes, I thought of rebuilding them; I felt that this would also save time.
So before rebuilding the indexes on the partitioned tables (the indexes should now be in an UNUSABLE state), I wanted to be sure that rebuilding them would not be a problem under the scenarios I list below.

a. First I want to know: how do I check whether or not there is data in the max partition?
b. Second, how do I find whether an index is a local index or a global index?

I would like to list the scenarios:
1. rebuilding indexes on a max partition with data
2. rebuilding indexes on a max partition with data on local indexes
3. rebuilding indexes on a max partition with data on global indexes
4. rebuilding indexes on a max partition without data
5. rebuilding indexes on a max partition without data on local indexes
6. rebuilding indexes on a max partition without data on global indexes

Table partition details are as follows :
-------------------------------------------------
partitioning_type subpartitioning_type
list none
range hash

Index partition details are as follows :
-------------------------------------------------
partitioning_type subpartitioning_type
range hash

index type details are as follows :
-------------------------------------------------
NORMAL
IOT - TOP

For example : i have the following create script of a sample table :
------------------------------------------------------------
CREATE TABLE ID_ALL
(
REP_ENT_COD_SK_NB NUMBER(12) NOT NULL,
COST_CENTER_COD_SK_NB NUMBER(12) NOT NULL,
POST_ACCN_MAJOR_COD_SK_NB NUMBER(12) NOT NULL,
PRDMKT_COD_SK_NB NUMBER(12) NOT NULL,
SCENARIO_COD_SK_NB NUMBER(12) NOT NULL,
DATA_TYPE_COD_SK_NB NUMBER(12) NOT NULL,
ACCN_PERIOD_COD_SK_NB NUMBER(12) NOT NULL,
CURRENCY_FUNC_COD_SK_NB NUMBER(12) NOT NULL,
OP_AMT_FAA_NB NUMBER,
USD_AMT_FAA_NB NUMBER,
SESSION_INST_COD_VC VARCHAR2(30 BYTE) NOT NULL
)
PARTITION BY RANGE (SESSION_INST_COD_VC)
SUBPARTITION BY HASH (COST_CENTER_COD_SK_NB)
SUBPARTITION TEMPLATE
(SUBPARTITION SP_01 TABLESPACE ID_DW_FCT_0128_DATA,
SUBPARTITION SP_02 TABLESPACE ID_DW_FCT_0128_DATA,
SUBPARTITION SP_03 TABLESPACE ID_DW_FCT_0128_DATA,
SUBPARTITION SP_04 TABLESPACE ID_DW_FCT_0128_DATA
)
(
PARTITION ID_P_20070800A VALUES LESS THAN ('20070800A-000')
( SUBPARTITION ID_P_20070800A_SP_01,
SUBPARTITION ID_P_20070800A_SP_02,
SUBPARTITION ID_P_20070800A_SP_03,
SUBPARTITION ID_P_20070800A_SP_04 ),
PARTITION ID_P_20070900A VALUES LESS THAN ('20070900A-000')
( SUBPARTITION ID_P_20070900A_SP_01,
SUBPARTITION ID_P_20070900A_SP_02,
SUBPARTITION ID_P_20070900A_SP_03,
SUBPARTITION ID_P_20070900A_SP_04 ),
.
.
PARTITION ID_P_20090400A VALUES LESS THAN ('20090400A-000')
( SUBPARTITION ID_P_20090400A_SP_01,
SUBPARTITION ID_P_20090400A_SP_02,
SUBPARTITION ID_P_20090400A_SP_03,
SUBPARTITION ID_P_20090400A_SP_04 ),
PARTITION ID_P_MAX VALUES LESS THAN (MAXVALUE)
( SUBPARTITION ID_P_MAX_SP_85,
SUBPARTITION ID_P_MAX_SP_88,
SUBPARTITION ID_P_MAX_SP_89,
SUBPARTITION ID_P_MAX_SP_90 )
)
ENABLE ROW MOVEMENT;

CREATE UNIQUE INDEX ID_ALL_PK ON ID_ALL
(COST_CENTER_COD_SK_NB, POST_ACCN_MAJOR_COD_SK_NB, PRDMKT_COD_SK_NB,
ACCN_PERIOD_COD_SK_NB, DATA_TYPE_COD_SK_NB,
SCENARIO_COD_SK_NB, REP_ENT_COD_SK_NB, CURRENCY_FUNC_COD_SK_NB);

ALTER TABLE ID_ALL ADD (
CONSTRAINT ID_ALL_PK
PRIMARY KEY
(COST_CENTER_COD_SK_NB, POST_ACCN_MAJOR_COD_SK_NB, PRDMKT_COD_SK_NB,
ACCN_PERIOD_COD_SK_NB, DATA_TYPE_COD_SK_NB, SCENARIO_COD_SK_NB,
REP_ENT_COD_SK_NB, CURRENCY_FUNC_COD_SK_NB));

the alter script for the table will be as follows :
------------------------------------------------------------
ALTER TABLE ID_ALL
SPLIT PARTITION ID_P_MAX AT ('20090400A-000')
INTO (
PARTITION ID_P_20090400A
NOLOGGING
TABLESPACE ID_DW_FCT_0128_DATA
PCTFREE 10
INITRANS 1
MAXTRANS 255
STORAGE (
BUFFER_POOL DEFAULT
)
( SUBPARTITION ID_P_20090400A_SP_01 TABLESPACE ID_DW_FCT_0128_DATA,
SUBPARTITION ID_P_20090400A_SP_02 TABLESPACE ID_DW_FCT_0128_DATA,
SUBPARTITION ID_P_20090400A_SP_03 TABLESPACE ID_DW_FCT_0128_DATA,
SUBPARTITION ID_P_20090400A_SP_04 TABLESPACE ID_DW_FCT_0128_DATA),
PARTITION ID_P_MAX
NOLOGGING
TABLESPACE ID_DW_FCT_0128_DATA
PCTFREE 10
INITRANS 1
MAXTRANS 255
STORAGE (
BUFFER_POOL DEFAULT
)
( SUBPARTITION ID_P_MAX_SP_85 TABLESPACE ID_DW_FCT_0128_DATA,
SUBPARTITION ID_P_MAX_SP_88 TABLESPACE ID_DW_FCT_0128_DATA,
SUBPARTITION ID_P_MAX_SP_89 TABLESPACE ID_DW_FCT_0128_DATA,
SUBPARTITION ID_P_MAX_SP_90 TABLESPACE ID_DW_FCT_0128_DATA)
);

Since I don't know much about partitioning concepts, please also explain the above scripts to me.

Please explain the above doubts on index rebuilding... in detail.

Thanks in Advance.
With Regards,
Raja.
Tom Kyte
January 16, 2009 - 5:10 pm UTC

... to create/extend the partition/max partition, before partitioning we have to
disable the constraints & drop the indexes and after partitioning we have to
enable the constraints & re-create those indexes.
...

you lost me on that. Not sure at all what you are trying to say. First, I don't know what it means to "extend" a partition. Second, to add a partition (create a partition) you don't need to do what you say.


a) select 'yes' from table partition(max_partition_name) where rownum = 1;

if that returns 'yes', then yes, there is data in that max partition.

b) query user_part_indexes, locality will tell you global or local.
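
For example (using the ID_ALL table from above):

select index_name, locality
  from user_part_indexes
 where table_name = 'ID_ALL';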


... Since i dont know much about partitioning concepts, also please explain me the
above scripts.
...

you had best rectify that before you start plodding along - there is a LOT of information out there on partitioning, start with the concepts guide, move onto the warehousing guide - BEFORE you start doing anything. I agree that you don't have the concepts down yet - you need to do that FIRST.


default attributes

martina, February 11, 2009 - 6:06 am UTC

Hi Tom,

I am periodically creating new partitions. For some tables I have to split - those are no problem. But for some tables I have to add, and I take min(tablespace_name) from user_tab_partitions. That is not very good, but max() would be just as bad. In which DD view can I find the default attributes?

thank you!
martina
Tom Kyte
February 11, 2009 - 10:40 am UTC

I don't know what you mean - why would you take "min(tablespace_name)"???? I'm not following your logic here at all - I have no idea what your goal is.


find a tablespace

martina, February 16, 2009 - 5:29 am UTC

Hi Tom,

i took min(tablespace_name) because i did not know what else to take ... i did not know about dba_part_tables - i was searching (the database) for something like '%PART%DEF%' ...

sorry for bothering you!
regards, martina
Tom Kyte
February 16, 2009 - 12:41 pm UTC

you didn't bother me, but you still haven't told us what you are actually *trying* to accomplish.

If you did, we might be able to add something useful.

partition is out of balance

bhaskar, February 18, 2009 - 12:56 am UTC

Hi Tom,

I have hash partitioning on one of my history tables. There are 16 partitions in that table, all hash partitions.

My problem is that out of these 16 partitions, one partition is getting populated heavily.

I have checked the dba_tab_partitions view for the initial_extent, min_extent, max_extent and next_extent columns of all the partitions. I found them the same for all.

But the num_rows column of the heavily populated partition (say p16) is 3.5 times greater than that of the next most heavily populated partition (say p15).

And the blocks column of p16 is 3.8 times greater than that of p15.

Now, whatever data p15 contains is normal; is the difference in the blocks column value because partition p16 abnormally contains huge data?

Hope I explained my problem clearly.

Thanks in advance.
Tom Kyte
February 18, 2009 - 8:05 am UTC

you do not describe the table AT ALL. You do not tell us what you partitioned on, nor what this columns domain of values is.


So, I'll guess. You partitioned on something that has low distinct cardinality. Say this column has 32 distinct values. You will probably (almost certainly) NOT get 2 values per partition (I'd be really surprised if you did). You would probably get some partitions that had 4 or 5 of the values, some with zero, some with 1, 2, or 3. You would be really "unbalanced"

when hashing, you want (need, have to) pick something that has a TON of distinct values - else - you will not, cannot, get an even distribution.

consider the following - load one time, hash on something with 32 values, load another time, hash on something unique - very different distributions:


ops$tkyte%ORA10GR2> CREATE TABLE t
  2  (
  3    dt  date,
  4    x   int,
  5    y   varchar2(30)
  6  )
  7  PARTITION BY HASH (x)
  8  ( partition p01, partition p02, partition p03, partition p04,
  9    partition p05, partition p06, partition p07, partition p08,
 10    partition p09, partition p10, partition p11, partition p12,
 11    partition p13, partition p14, partition p15, partition p16
 12  )
 13  /

Table created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into t
  2  select sysdate, mod(rownum,32), 'x' from all_objects;

49965 rows created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> variable x refcursor
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> declare
  2          l_sql long := 'select ''p01'' pname, count(*) cnt from t partition(p01)';
  3  begin
  4          for i in 2..16
  5          loop
  6                  l_sql := l_sql || ' union all select ''p'||to_char(i,'fm00')||''', count(*) from t partition(p'||to_char(i,'fm00')||')';
  7          end loop;
  8
  9          l_sql := 'select pname, cnt, substr( rpad(''*'',30*round( cnt/max(cnt)over(),2),''*''),1,30)  from (' || l_sql || ') order by pname';
 10          open :x for l_sql;
 11  end;
 12  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> print x

PNA        CNT SUBSTR(RPAD('*',30*ROUND(CNT/M
--- ---------- ------------------------------
p01          0
p02       1561 ******
p03       3123 ************
p04       3123 ************
p05       3123 ************
p06       4684 ******************
p07       4684 ******************
p08       7806 ******************************
p09       4685 ******************
p10       4684 ******************
p11       1561 ******
p12       1562 ******
p13          0
p14       4684 ******************
p15       3123 ************
p16       1562 ******

16 rows selected.

ops$tkyte%ORA10GR2> truncate table t;

Table truncated.

ops$tkyte%ORA10GR2> insert into t select sysdate, object_id, 'x' from all_objects;

49965 rows created.

ops$tkyte%ORA10GR2> declare
  2          l_sql long := 'select ''p01'' pname, count(*) cnt from t partition(p01)';
  3  begin
  4          for i in 2..16
  5          loop
  6                  l_sql := l_sql || ' union all select ''p'||to_char(i,'fm00')||''', count(*) from t partition(p'||to_char(i,'fm00')||')';
  7          end loop;
  8
  9          l_sql := 'select pname, cnt, substr( rpad(''*'',30*round( cnt/max(cnt)over(),2),''*''),1,30)  from (' || l_sql || ') order by pname';
 10          open :x for l_sql;
 11  end;
 12  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> print x

PNA        CNT SUBSTR(RPAD('*',30*ROUND(CNT/M
--- ---------- ------------------------------
p01       3113 ****************************
p02       3163 *****************************
p03       3159 *****************************
p04       3062 ****************************
p05       3164 *****************************
p06       3114 ****************************
p07       3099 ****************************
p08       3142 *****************************
p09       3128 ****************************
p10       3115 ****************************
p11       3160 *****************************
p12       3115 ****************************
p13       3051 ****************************
p14       3060 ****************************
p15       3245 ******************************
p16       3075 ****************************

16 rows selected.


partition is out of balance

Bhaskar, February 19, 2009 - 2:18 am UTC

Hi Tom,

First, many thanks for such a solution. Sorry for providing so little information.

My table has nearly 1000 million rows, as it is a history table.
The partition key (col3) has 926 distinct values.

I'm providing a dummy table structure:


create table tab1(col1 varchar2(5),col2 varchar2(4),col3 number(5));
partition p1 tablespace t1
...
partition p16 tablespace t1

Now, say partition p16 has the highest amount of data, which is nearly 3.8 times greater than the partition with the next highest amount (say p15).

Now I understand that because the cardinality is so low, the data is not evenly distributed. As this table is a history table, the cardinality will increase in the near future, and then I hope to see even data :)

Now another problem: one value of col3 (say col3 = 1) out of the 926 distinct values only ever goes to partition p16, and surprisingly never to any other partition. We have no other objects on this table to control the flow of data between partitions.


Tom Kyte
February 19, 2009 - 8:06 am UTC

... has distinct 926 values.
...

insufficient for 1 billion rows (as is 32 partitions probably)


If this is a history table with 1 billion rows already - that 926 distinct cardinality number doesn't seem to be rising fast enough to make a difference.


... Now another problem is one value (say col3 =1) of col3 from the 926 distinct
values only goes to p16 partition but surprisingly this is not going to any
other partitons. ...

really, that is surprising? That when col3=1, all rows go into a single partition? Think about it, how else could it work?

if the data, when col3=1, was to go into different partitions - if the HASH(1) was not deterministic, we'd never be able to find your data again.

What to Partition on

A Reader, April 16, 2009 - 11:59 pm UTC

Hi Tom,

We have Type-II dimensions wherein we maintain two partitions (current rec and historical rec)
based on a flag "Is_Current" (values being 1 for current records and null for historical records).

Today's data will be lying in the current partition and the last 13 months of data will be lying in the
historical partition. One such dimension has 2 billion rows in the historical rec and 2.5 billion in
the current rec.

This is further divided across 64 hash subpartitions based on the SearchTerm_Key.

However, most of the queries in the warehouse execute with a WHERE clause to find the
SearchTerm_Keys which are ACTIVE:

effective_start_date <= to_date('04/16/2009','MM/DD/YYYY') and
(effective_end_date > to_date('04/16/2009','MM/DD/YYYY') or effective_end_date is NULL)

effective_start_date is the date when the new SearchTerm gets created.
effective_end_date is the date when the search term gets de-activated in the system. Being null
means that the SearchTerm_Key is still an ACTIVE term.

However, it takes around 57 mins to scan the table and filter out the rows, since the
whole table is scanned.

What partitioning strategy would you recommend for this approach, so that we can take advantage of
partition pruning?

Also, every month we need to archive the oldest data before purging it off.

Thanks

Tom Kyte
April 17, 2009 - 9:44 am UTC

if the question is:

... What partition strategy would you recommend for this approach, so that we can
take advantage of
partition pruning.
...

and the predicate is:

effective_start_date <= to_date('04/16/2009','MM/DD/YYYY') and
(effective_end_date > to_date('04/16/2009','MM/DD/YYYY') or
effective_end_date is NULL)


You would either partition by start date or end date

which one is in general 'more selective'? Which would result in less data being scanned - in general?

And if you do end_date - bear in mind, you'll need to enable row movement and be expecting rows to move as you update the end date.
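
As a rough illustration of that last point, a sketch of partitioning on effective_end_date (all names here are invented): NULL partition keys map into the MAXVALUE partition, which conveniently groups all still-ACTIVE rows together, and ENABLE ROW MOVEMENT is required because closing a record updates the partition key.

create table dim_demo
( searchterm_key       number,
  effective_start_date date,
  effective_end_date   date
)
partition by range (effective_end_date)
( partition closed_2008 values less than (to_date('01-jan-2009','dd-mon-yyyy')),
  partition closed_2009 values less than (to_date('01-jan-2010','dd-mon-yyyy')),
  partition active      values less than (maxvalue)
)
enable row movement;

A predicate like "effective_end_date > :dt or effective_end_date is null" could then prune down to the trailing partitions instead of scanning everything.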

one more clarification

A Reader, April 20, 2009 - 12:11 am UTC

Hi Tom,

Thanks for your reply.

Indeed, if you look at the WHERE clause, effective_start_date covers nearly the whole table, as it is <= (the current date), so range partitioning on effective_start_date is meaningless.

However, if we partition by effective_end_date, rows will move across partitions as soon as we update them per the Type-II dimension logic.

How much overhead do you foresee in this approach?

I have another question, if you can answer it: do we need bitmap indexes on the columns of dimension tables?

My answer to this is NO, since the fact keys will need them for reducing the set during bitmap transformation in a single logical operation.

However, I had a debate with a DBA and he says yes, they are needed on dimension tables. I agree that bitmap or B*tree indexes are required on the dimension table columns we reference in the WHERE clause during star transformation, but essentially not on the IDs.

desc DIM_COM_CAMPAIGN
Name                           Null     Type
------------------------------ -------- ------------------
HISTORICAL_CMPGN_SID NOT NULL NUMBER
CMPGN_SID NUMBER
CMPGN_KEY NUMBER(19)
ACCOUNT_KEY VARCHAR2(300)
CMPGN_NAME VARCHAR2(50 CHAR)
CMPGN_STATUS NUMBER
CMPGN_START_DATE DATE
CMPGN_END_DATE DATE
CMPGN_GOAL VARCHAR2(4000)
CMPGN_SEARCH_ENABLED NUMBER(1)
CMPGN_ADVANCE_ENABLED NUMBER(1)
CMPGN_CONTENT_ENABLED NUMBER(1)
CMPGN_WATCH_FLAG NUMBER(1)
CMPGN_INACTIVE_DATE DATE
CMPGN_LOOKUP_VAL VARCHAR2(300)
CMPGN_SOURCE VARCHAR2(30)
CMPGN_MONTHLY_SPEND_AMT NUMBER(14,4)
CMPGN_OPT_ENABLED NUMBER(1)
CMPGN_DEL_TMSTMP DATE
CMPGN_CM_MAX_BID NUMBER(14,4)
CMPGN_CM_MAX_BID_DATE DATE
CMPGN_SS_MAX_BID NUMBER(14,4)
CMPGN_SS_MAX_BID_DATE DATE
CREATED_DATE DATE
LAST_UPDATE_DATE DATE
GEOSET_SID NUMBER
IS_CURRENT_FLAG NUMBER(1)
EFFECTIVE_START_DATE DATE
EFFECTIVE_END_DATE DATE
POW_ACCOUNT_KEY NUMBER
CMPGN_THROTTLE_RATE NUMBER
CMPGN_PREV_THROTTLE_RATE NUMBER
CMPGN_FORECAST_SPEND NUMBER
CMPGN_LAST_CLICK_DATE DATE
CMPGN_FIRST_CLICK_DATE DATE
MRKT_SID                                NUMBER

DIM_COM_CAMPAIGN DIM_COM_CAMPAIGN_BIDX1 BITMAP CMPGN_KEY
DIM_COM_CAMPAIGN DIM_COM_CAMPAIGN_BIDX3 BITMAP ACCOUNT_KEY
DIM_COM_CAMPAIGN DIM_COM_CAMPAIGN_BIDX4 BITMAP HISTORICAL_CMPGN_SID
DIM_COM_CAMPAIGN DIM_COM_CAMPAIGN_BIDX5 BITMAP CMPGN_SID
DIM_COM_CAMPAIGN DIM_COM_CAMPAIGN_BIDX6 BITMAP POW_ACCOUNT_KEY
DIM_COM_CAMPAIGN DIM_COM_CAMPAIGN_IDX1 NORMAL CMPGN_SID
DIM_COM_CAMPAIGN DIM_COM_CAMPAIGN_IDX1 NORMAL HISTORICAL_CMPGN_SID
DIM_COM_CAMPAIGN DIM_COM_CAMPAIGN_IDX1 NORMAL EFFECTIVE_START_DATE
DIM_COM_CAMPAIGN DIM_COM_CAMPAIGN_IDX1 NORMAL EFFECTIVE_END_DATE

These are the indexes built on the dimension. I think we won't need any of them once the correct partitioning approach is in place.

If it were up to me, I would create a primary key, plus B*tree indexes on CMPGN_NAME and CMPGN_GOAL if these are used in queries to filter dimension rows.

Can you please guide us.

Thanks


Tom Kyte
April 21, 2009 - 12:41 pm UTC

... Indeed if you see the WHERE clause, the effective_start_date is being scanned
for the whole table as it is <= (Current_date) so doing partition by Range on
Effective Start date is meaningless.
....

I told YOU to evaluate "which would be in general more selective, partition on that". You were to evaluate that - I cannot tell if the ONLY predicate you ever do is the one you quoted above. I cannot tell that "..table as it is <= (Current_date ..." is the rule - the only way you do it.

... how much overhead do your forsee in this approach. ...

only YOU can evaluate that, how often do you update it? how many rows at a time do you update? When you benchmark it, how much does it seem to impact you? The physics are easy - the update will be done internally as a DELETE+INSERT. What impact will that have on you? It totally depends. If you have a process that updates 1,000,000 records right now in one update, it might be a show stopper. On the other hand, if you update 100,000 rows constantly during the day (about 1 or 2 per second), you probably won't notice it. It totally depends

One word: benchmark


... Do we need bitmap indexes on Colums
for dimension tables.
...

of course the answer to that is "no", you never 'need' anything, everything is a choice.


... My answer to this is NO, since the fact Keys will need them for reducing the
set during bitmap transformation in a single logical operation.
...

that I did not understand or follow at all, not sure what you are trying to say


... i agree that the Bitmap or B*tree indexes are required on the Dimension
tables for which we specify the WHERE clause during star transformation but
essentially not on the Ids.

....

but now you bring star transformation into play, if you want star transformation then you WILL have bitmap indexes.

http://docs.oracle.com/docs/cd/B19306_01/server.102/b14223/schemas.htm#sthref1591

but it'll be on the fact table...

Queries on partition

A, April 23, 2009 - 7:14 am UTC

Hello,
In Oracle 9i, when I update a STATUS column (the table is partitioned on the status column), the row will be moved from one partition (say p1) to another (say p5). Will it delete from p1 and insert into p5? If so, will it fragment partition p1?

What is the impact of a query running a full table scan on a partitioned table? We don't want to end up in a situation where a full scan of the partitioned table takes more time than of the non-partitioned table.

Thanks in advance

Tom Kyte
April 27, 2009 - 10:43 am UTC

deletes do not "fragment" anything. The delete will free up space in partition p1 that will be available to be reused by some future insert into p1 (or an update might use it)

... What is the impact of query running full table scan on a partitioned table? ...

pretty much the same as a regular table - with the exception of course that if we can eliminate partitions - the partitioned table full scan might be a lot faster.
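
A small demonstration of that delete+insert behavior (a made-up table): the rowid changes when the update moves the row, and ENABLE ROW MOVEMENT must be on or the update raises ORA-14402.

create table status_demo ( id number, status varchar2(10) )
partition by list (status)
( partition p1 values ('NEW'),
  partition p5 values ('DONE') )
enable row movement;

insert into status_demo values ( 1, 'NEW' );
commit;

select rowid from status_demo where id = 1;   -- note the rowid
update status_demo set status = 'DONE' where id = 1;
select rowid from status_demo where id = 1;   -- a different rowid: the row moved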

Partitioning Question

Prats, September 03, 2009 - 3:36 am UTC

I have a table with an initial size of 100M records. An insert of 2M records will happen every month.
I am going to use partitioning on the basis of date.

Is it beneficial to have each partition stored in a separate tablespace [say p1 in t1, p2 in t2, ...]? Will there be any benefit in terms of performance, dropping partitions, maintenance, etc.?

Please help
Tom Kyte
September 03, 2009 - 7:49 am UTC

tablespaces are a book-keeping device you use to organize data in a fashion that makes you feel better about yourself, that improves your life somehow, that makes you sleep better at night.

They are not about performance really - if two tablespaces reside on the same single disk that is not striped or anything - would it make any difference "performance wise"?

It is all about spreading the IO out. Stripe the file system (use ASM for example) and spread the IO out. Tablespaces don't do it.

Think Bitmap Join Indexes

Rick, September 03, 2009 - 3:25 pm UTC

A Reader from India: <<I have another question if you can answer, Do we need bitmap indexes on Colums for dimension
tables. My answer to this is NO, since the fact Keys will need them for reducing the set during bitmap
transformation in a single logical operation.
However I had a debate with DBA and he says yes they are needed on Dimension tables. i agree that
the Bitmap or B*tree indexes are required on the Dimension tables for which we specify the WHERE
clause during star transformation but essentially not on the Ids.>>

Look into bitmap join indexes -- they basically pre-join dimensions to fact tables, provided you turn on star transformation, don't do refresh on commit, etc.
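
A hedged sketch of the syntax (all names invented): the bitmap join index lives on the fact table but is keyed on a dimension attribute, so the join is done once at index build/maintenance time. The dimension join column must be unique (here it is the primary key).

create table dim  ( dim_key number primary key, region varchar2(20) );
create table fact ( fact_id number, dim_key number references dim );

create bitmap index fact_region_bjx
on fact ( dim.region )
from fact, dim
where fact.dim_key = dim.dim_key;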


How can I find the unpartitioned constraints?

Teresa, October 29, 2009 - 1:19 pm UTC

Hi Tom,

I'm trying to extract a list of all primary key constraints on partitioned tables where the supporting index is not partitioned. Can you please help me out?


Tom Kyte
October 29, 2009 - 1:59 pm UTC

assumption: indexes are in same schema as table. fix to use dba_ views with owner column if not true:

ops$tkyte%ORA10GR2> CREATE TABLE t1
  2  (
  3    x   int,
  4    dt  date,
  5    y   varchar2(30)
  6  )
  7  PARTITION BY RANGE (x)
  8  (
  9    PARTITION part1 VALUES LESS THAN (10),
 10    PARTITION part2 VALUES LESS THAN (20)
 11  )
 12  /

Table created.

ops$tkyte%ORA10GR2> create unique index t1_pk_idx on t1(x) global;

Index created.

ops$tkyte%ORA10GR2> alter table t1 add constraint t1_pk primary key(x);

Table altered.

ops$tkyte%ORA10GR2> CREATE TABLE t2
  2  (
  3    x   int,
  4    dt  date,
  5    y   varchar2(30)
  6  )
  7  PARTITION BY RANGE (x)
  8  (
  9    PARTITION part1 VALUES LESS THAN (10),
 10    PARTITION part2 VALUES LESS THAN (20)
 11  )
 12  /

Table created.

ops$tkyte%ORA10GR2> create unique index t2_pk_idx on t2(x) local;

Index created.

ops$tkyte%ORA10GR2> alter table t2 add constraint t2_pk primary key(x);

Table altered.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select uc.constraint_name, uc.table_name, uc.index_name,
  2         nvl( up.partitioning_type, 'not partitioned' ) ptype
  3    from user_constraints uc, user_part_indexes up
  4   where uc.constraint_type = 'P'
  5     and uc.table_name in ( 'T1', 'T2' )
  6     and uc.index_name = up.index_name (+)
  7  /

CONST TABLE INDEX_NAME                     PTYPE
----- ----- ------------------------------ ---------------
T1_PK T1    T1_PK_IDX                      not partitioned
T2_PK T2    T2_PK_IDX                      RANGE


Helena Marková, November 06, 2009 - 1:32 am UTC


Excellent

mohan, May 20, 2010 - 9:30 pm UTC

I find this extremely helpful in my development. There are scenarios which I find difficult to type into a Google search, but I have the answers for those on Ask Tom. Tom is truly the best.

Warm Regards,
Mohan

Partitioning Audit Table

Bharat K, September 16, 2010 - 11:05 am UTC

Hi Tom,
Like most of your posts, the above discussion is very useful. I would like to explain my case, which I think is related to this topic.

We have a unique and complex situation wherein:
1. We have an audit table of 50+ GB and 280+ million rows, with no primary key. The audit table doesn't have any unique key, so we could not partition it. The sync job for this table (OLTP -> OLAP) is taking a long time, as the only way to refresh is a complete refresh (since there is no primary key). Our primary goal is to reduce the time it takes to sync.
Apart from adding a sequence to this table, is there any other way to handle it?

2. On top of this, we are supposed to implement an archive/purge policy for this audit table. Data < 180 days old remains in OLTP and data < 3 yrs old should stay in OLAP.
Can you suggest the best method (advanced replication through updatable materialized views, or Oracle Streams) for this case?

Thanks,
Bharat
Tom Kyte
September 16, 2010 - 11:39 am UTC

.. The Audit table doesn't have any unique key, so we could not partition it ...

those two things are unrelated, a table does not need a primary key to partition it.


materialized views can use rowid based replication as a surrogate key under certain circumstances.


if you range partitioned this data by DATE, you'd be able to achieve your purging goal easily as well as reduce the full scan cost of the 'refresh' (since you only have to full scan a fraction of the data).

why not partition daily, refresh OLAP by transporting the last day's partition, and purge as you see fit?
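
If you are on 11g, interval partitioning makes the daily scheme nearly maintenance-free - a sketch (names invented): partitions are created automatically as audit rows arrive, and a day is purged with a metadata-only operation instead of a huge DELETE.

create table audit_hist
( audit_date date,
  payload    varchar2(100)
)
partition by range (audit_date)
interval ( numtodsinterval(1,'DAY') )
( partition p0 values less than (to_date('01-jan-2010','dd-mon-yyyy')) );

-- purging the oldest day is then simply:
-- alter table audit_hist drop partition <that day's partition>;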

Bharat K, September 16, 2010 - 12:22 pm UTC

Thanks Tom, your suggestion is very helpful and it corrected my assumption. I think it's time for me to revisit my basics :).
PS: Your reply is lightning fast, I appreciate it.

Partitioning Audit Table

Bharat K, September 16, 2010 - 1:43 pm UTC

Hi Tom,
I'm sorry, I totally forgot to inform you: I tried using rowid a long time ago, but the sync process used to fail frequently. Is it advisable not to use rowid for huge tables, say 250+ million records? If not, what might have caused the sync process to fail? I did not encounter any problem with other tables holding less data.
Tom Kyte
September 16, 2010 - 2:37 pm UTC

I wouldn't use rowid unless I really had to - there are many drawbacks to it - but the size of the table isn't relevant.

I cannot comment on what might have caused it to fail - no more than you could on a statement like:

"my car won't start"


Neither of us has any information to begin a diagnosis.

Update bulk data

sulochana, September 27, 2010 - 10:52 pm UTC

Hi Tom,
My name is Sulochana; I want to update table data.
E.g., I have one table with a date column, and I have to update the date from 1-1-99 to 1-12-99. That means I need to change the day and month only, for multiple rows; no need to change the year.

Please suggest me..


Thanks & Regards,
Sulochana
Tom Kyte
September 28, 2010 - 8:03 am UTC

you change an attribute, not a piece of an attribute.

you will simply:

update t set your_column = <a_new_meaningless_value>
where your_column = <your_old_meaningless_value>;



performence problem

PRAKASH, November 17, 2010 - 12:22 am UTC

A select query is hanging whenever the DISTINCT clause is mentioned. Can you please tell me what other options there are for selecting distinct records? I have used GROUP BY also.


Tom Kyte
November 17, 2010 - 7:49 am UTC

my car won't start.

why not?

we have the same level of detail.



Your query is not hanging. Rather - it can no longer return records immediately and to you - waiting for the first record to come back - it seems like a long time. It is NOT HANGING however.

consider, think about this:

select * from really_really_big_table;


that query will start showing you the answer right away, immediately, it "appears to run fast" for you (but it doesn't really, it'll take a really long time to complete)


select distinct * from really_really_big_table;


Now, think about what has to happen there in all probability. That query would have to read the entire table and "distinct it" (regardless of the syntax used - it would have to DISTINCT it). You cannot see the first row (in general) until the LAST ROW has been produced and stored in temp (probably on disk since the table is really really big).


In short:

the query is NOT HANGING

your expectation of a fast, immediate response needs to be addressed - rather than the query itself in this case.


Partition problem

Parikshit Paul, February 24, 2011 - 5:14 am UTC

Hi Tom, 
Thank you for the answers above. They were really helpful.
I am trying to do a performance comparison between a non-partitioned table and a copy of the same table with partitions.
I created the copy table in the following way:

CREATE TABLE WORKORDER_partitioned
(
  ROWSTAMP            VARCHAR2(40 BYTE)         NOT NULL,
  WONUM               VARCHAR2(16 BYTE)         NOT NULL,
  .
  .
  .
 )
partition by range(reportdate)
(
partition data_uptodec_2009 values less than(to_date('01/01/2010','DD/MM/YYYY')),
partition data_Q1_2010 values less than(to_date('01/04/2010','DD/MM/YYYY')),
partition data_Q2_2010 values less than(to_date('01/07/2010','DD/MM/YYYY')),
partition data_Q3_2010 values less than(to_date('01/10/2010','DD/MM/YYYY')),
partition data_Q4_2010 values less than(to_date('01/01/2011','DD/MM/YYYY')),
partition data_Q1_2011 values less than(to_date('01/04/2011','DD/MM/YYYY')),
partition data_Q2_2011 values less than(to_date('01/07/2011','DD/MM/YYYY')),
partition data_Q3_2011 values less than(to_date('01/10/2011','DD/MM/YYYY')),
partition data_Q4_2011 values less than(to_date('01/01/2012','DD/MM/YYYY')),
partition data_Q1_2012 values less than(to_date('01/04/2012','DD/MM/YYYY')),
partition data_Q2_2012 values less than(to_date('01/07/2012','DD/MM/YYYY')),
partition data_Q3_2012 values less than(to_date('01/10/2012','DD/MM/YYYY')),
partition data_Q4_2012 values less than(to_date('01/01/2013','DD/MM/YYYY')),
partition data_Q1_2013 values less than(to_date('01/04/2013','DD/MM/YYYY')),
partition data_Q2_2013 values less than(to_date('01/07/2013','DD/MM/YYYY')),
partition data_Q3_2013 values less than(to_date('01/10/2013','DD/MM/YYYY')),
partition data_Q4_2013 values less than(to_date('01/01/2014','DD/MM/YYYY')),
partition data_Q1_2014 values less than(to_date('01/04/2014','DD/MM/YYYY')),
partition data_Q2_2014 values less than(to_date('01/07/2014','DD/MM/YYYY')),
partition data_Q3_2014 values less than(to_date('01/10/2014','DD/MM/YYYY')),
partition data_Q4_2014 values less than(to_date('01/01/2015','DD/MM/YYYY')),
partition data_Q1_2015 values less than(to_date('01/04/2015','DD/MM/YYYY')),
partition data_Q2_2015 values less than(to_date('01/07/2015','DD/MM/YYYY')),
partition data_Q3_2015 values less than(to_date('01/10/2015','DD/MM/YYYY')),
partition data_Q4_2015 values less than(to_date('01/01/2016','DD/MM/YYYY'))
);

Now when I try to insert data into this table from the original one, I get the following error:

SQL> insert into workorder_partitioned select * from workorder;
insert into workorder_partitioned select * from workorder
            *
ERROR at line 1:
ORA-14400: inserted partition key does not map to any partition

Obviously there must be some data which cannot be put into any of these partitions. But how can this be possible? I created partitions up to the end of 2015.

Tom Kyte
February 24, 2011 - 12:46 pm UTC

select * from workorder where date_column >= to_date('01/01/2016','DD/MM/YYYY') or date_column is null;

what does that return

You are genius

Parikshit Paul, February 25, 2011 - 1:26 am UTC

Got it!!
214 rows with null values.
You really are a genius :)
Tom Kyte
February 25, 2011 - 8:43 am UTC

not really, it is a "what is the simplest answer" case.

If the error message says


ORA-14400: inserted partition key does not map to any partition


then look for a key that doesn't map to any partition - some key that is not specified or is specified and is out of range :)
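
One hedged way out of the ORA-14400 here (sketched against the WORKORDER_PARTITIONED table above): add a catch-all partition - both out-of-range dates and NULL partition keys map into a MAXVALUE partition.

alter table workorder_partitioned
add partition data_overflow values less than (maxvalue);

Whether NULL reportdate rows belong in the table at all is, of course, a data-quality question worth asking first.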

Sud, June 03, 2011 - 7:12 am UTC

Hi Tom,
I don't know how to frame this...
I have a requirement from the ETL people. They need to know the partition into which a particular row will go, using the data dictionary tables.
E.g., suppose an EMP table is partitioned by DEPTID. They will pass a row to me. Using the data dictionary tables, I have to find which column is the partition key column (which happens to be DEPTID in this case), and then to which partition the row belongs. I know that we have a HIGH_VALUE (LONG) column in all_tab_partitions, but at run time it is very difficult to match the data since the data types of the columns may differ.
Can you throw some light on how to start with this?
Sorry... I don't have any code to make things easier for you.
Tom Kyte
June 06, 2011 - 8:58 am UTC

this is very specific to a table that is range partitioned by a number

ops$tkyte%ORA11GR2> CREATE TABLE t
  2  (
  3    dt  date,
  4    x   int,
  5    y   varchar2(30)
  6  )
  7  PARTITION BY RANGE (x)
  8  -- subpartition by hash(x)
  9  (
 10    PARTITION lt_10 VALUES LESS THAN (10),
 11    PARTITION lt_20 VALUES LESS THAN (20),
 12    PARTITION lt_30 VALUES LESS THAN (30),
 13    PARTITION lt_40 VALUES LESS THAN (40),
 14    PARTITION lt_50 VALUES LESS THAN (50),
 15    PARTITION lt_mv VALUES LESS THAN (MAXVALUE)
 16  )
 17  /

Table created.

Elapsed: 00:00:00.02
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create or replace function which_part( p_value in number ) return varchar2
  2  as
  3          l_part_name user_tab_partitions.partition_name%type;
  4  begin
  5          for x in ( select partition_name, high_value
  6                       from user_tab_partitions
  7                              where table_name = 'T'
  8                              order by PARTITION_POSITION )
  9          loop
 10                  if ( ( x.high_value = 'MAXVALUE' )
 11                       or
 12                       ( p_value < to_number( x.high_value ) )
 13                     )
 14                  then
 15                          l_part_name := x.partition_name;
 16                          exit;
 17                  end if;
 18          end loop;
 19          return l_part_name;
 20  end;
 21  /

Function created.

Elapsed: 00:00:00.01
ops$tkyte%ORA11GR2> column part format a10
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select rownum*2 val, which_part( rownum * 2 ) part
  2    from all_users
  3   where rownum <= 26
  4  /

       VAL PART
---------- ----------
         2 LT_10
         4 LT_10
         6 LT_10
         8 LT_10
        10 LT_20
        12 LT_20
        14 LT_20
        16 LT_20
        18 LT_20
        20 LT_30
        22 LT_30
        24 LT_30
        26 LT_30
        28 LT_30
        30 LT_40
        32 LT_40
        34 LT_40
        36 LT_40
        38 LT_40
        40 LT_50
        42 LT_50
        44 LT_50
        46 LT_50
        48 LT_50
        50 LT_MV
        52 LT_MV

26 rows selected.

A reader, August 07, 2011 - 1:11 am UTC

Let's say we have a partitioned table with 5 partitions. Do we need to collect statistics again after we drop/truncate a partition? My DBA told me that we do - otherwise it will impact query performance. Is this true? Please explain.
Tom Kyte
August 13, 2011 - 3:10 pm UTC

it depends. If you are using incremental statistics in 11g - no, not really. If you only gather local statistics in 10g (and before), no, not really.

If you explicitly gather global table statistics - yes, you would.

It depends on how you gather in the first place and what version you are using.
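
For the 11g incremental case, a hedged sketch (the table name is invented): with the INCREMENTAL preference set, global statistics are derived from per-partition synopses, so partition-level maintenance does not force a full re-gather of the table.

begin
   dbms_stats.set_table_prefs( ownname => user,
                               tabname => 'T',
                               pname   => 'INCREMENTAL',
                               pvalue  => 'TRUE' );
   dbms_stats.gather_table_stats( ownname => user, tabname => 'T' );
end;
/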

"They need to know the partition into which a particular row might go ..."

David Aldridge, August 17, 2011 - 9:59 am UTC

I'm scratching my head to work out why ETL people would want to know this, but in some circumstances you could use the DBMS_MView.PMarker() function. This function returns the data object id for the partition that a row is in.

If a row with the same partitioning-key value as the new data already exists in the table, apply DBMS_MView.PMarker() to its rowid and look up the corresponding data_object_id. This works for all partition types.

create table PMARKER_TEST(COL1 number)
partition by hash (COL1)
partitions 8;

select TABLE_NAME,PARTITION_NAME
from USER_TAB_PARTITIONS
where TABLE_NAME = 'PMARKER_TEST';

insert into PMARKER_TEST
select MOD(rownum,123)
from DUAL
connect by level <= 10000;

commit;

select DATA_OBJECT_ID,
OBJECT_NAME,
SUBOBJECT_NAME
from USER_OBJECTS
where DATA_OBJECT_ID = (
select DBMS_MVIEW.PMARKER(rowid)
from PMARKER_TEST
where COL1 = 7 and
rownum = 1);

There is an undocumented method using the function TBL$OR$IDX$PART$NUM, which is how Oracle works out which partition a row will go in.

Partitioning and Performance Improvement

Manu Batham, March 13, 2012 - 12:29 pm UTC

Hi Tom,

I have a list-partitioned table with 4 local and 2 global indexes.

Will moving data from partitions to backup tables (emptying the partitions) help make select query performance better?

Note: Select query is using locally partitioned indexes.

Thanks,
Manu
Tom Kyte
March 13, 2012 - 5:02 pm UTC

Will moving data from partitions to backup tables (emptying the partitions) help
make select query performance better?


If you are using partition elimination - no.

Only if you are doing queries whereby the optimizer cannot figure out it doesn't need to do a scan on some partition would it make sense.

Manu Batham, March 14, 2012 - 2:21 pm UTC

Hi Tom,

Many thanks for your quick response.

As you said-

If you are using partition elimination - no.

I didn't get the term "partition elimination". Does it mean "partition pruning"? If so, as far as I know this is done by the optimizer automatically [correct me if I am wrong or if you want to add something].

Also, what is the effect of naming the partition in an insert statement, like below?

insert into <table> partition (partition1) select ...

versus

insert into <table> select ...

Which would be the more efficient statement when I am inserting data accepted by partition1 only (I mean, when I already know the data and know exactly that it will go to a certain partition)?

Thank you,
Manu
Tom Kyte
March 14, 2012 - 3:47 pm UTC

partition pruning = partition elimination, yes

It is done by the optimizer if it CAN BE DONE. It depends entirely on the nature of your queries and the structure (partitioning scheme) of your tables.


You said you had all local indexes.

So, suppose you have an EMP table you hash partition into 8 partitions by EMPNO.

You have an index on last name.

You have a query "where last_name = 'KYTE'"

that will perform 8 index range scans because we can only do partition elimination in this case if the partition key is also referenced in the where clause. Here, eliminating partitions would "improve" the performance of the query as there would be fewer partitions to range scan - however it would also likely change the answer ;)



You do not want to use the partition extended name in general since that cannot be bound (would require unique sql for every partition) and you shouldn't have to know the partition to insert into anyway (why do that in the code when the partition scheme might change at some point, why waste CPU in the application to do something that is rapidly done, more efficiently than you can do in your code, in the server)

A use case for using the partition extended name is in a data warehouse where you want to do a large direct path load using insert /*+ APPEND */ . There - if you wanted to direct path load many partitions at the same time, you would use the partition extended name in order to have each insert lock only a single partition instead of the entire table (note: normal inserts do not lock the table - only a direct path insert would)
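
That warehouse case might look like this (a sketch, all names invented) - each session direct path loads its own partition concurrently, locking only that partition:

insert /*+ append */ into big_fact partition (p_2012_03)
select * from stage_2012_03;

commit;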

Manu Batham, March 15, 2012 - 2:04 pm UTC

Hi Tom,

I was aware of how the optimizer does partition pruning/elimination. Thanks for letting me know the difference between insert and insert /*+ append */ with partitions.

However, while exchanging partitions, I am getting an error if I execute the exchange partition statement twice. Ideally, on the first exchange, the data of a partition should be swapped with the non-partitioned table, and if I execute the same exchange statement once more, the data should be back in its original form.

Please consider the scenario below:

drop table manu_tst_hashed_tbl_org;

drop table manu_tst_hashed_tbl_bkp;

drop table manu_tst_smpl_tbl;

CREATE TABLE manu_tst_hashed_tbl_org
(
  nmbr  number  primary key,
  chr varchar2(20)
)
partition by hash (nmbr)
  partitions 10;

CREATE TABLE manu_tst_hashed_tbl_bkp
(
  nmbr number  primary key,
  chr varchar2(20)
)
partition by hash (nmbr)
  partitions 10;

CREATE TABLE manu_tst_smpl_tbl
(
  nmbr  number,
  chr varchar2(20)
);

insert into manu_tst_hashed_tbl_org
(select rownum rn, 'a' from dual connect by level <= 100);

commit;

select * from manu_tst_hashed_tbl_org order by nmbr;

create index idx_org_tbl on manu_tst_hashed_tbl_org(nmbr,chr) local;

create index idx_bkp_tbl on manu_tst_hashed_tbl_bkp(nmbr,chr) local;

create index idx_smpl_tbl on manu_tst_smpl_tbl(nmbr,chr);

select distinct nmbr, chr
from  manu_tst_hashed_tbl_org partition (sys_p771) order by 1; --55,74,81,97
--select * from user_ind_partitions where partition_name='SYS_P771';

select distinct nmbr, chr
from  manu_tst_hashed_tbl_org partition (sys_p772) order by 1; --28,32,52,90,92

select distinct nmbr, chr
from  manu_tst_hashed_tbl_org partition (sys_p773) order by 1; --5,21,24,36,44,48,51,56,65,72,77,89,93

select distinct nmbr, chr
from  manu_tst_hashed_tbl_org partition (sys_p774) order by 1; --3,4,16,35,37,47,49,62,64,87

select distinct nmbr, chr
from  manu_tst_hashed_tbl_org partition (sys_p775) order by 1; --13,27,40,50,53,61,76,79,80

select distinct nmbr, chr
from  manu_tst_hashed_tbl_org partition (sys_p776) order by 1; --9,12,17,19,22,39,42,43,58,66,83,95,98

select distinct nmbr, chr
from  manu_tst_hashed_tbl_org partition (sys_p777) order by 1; --2,8,18,20,23,33,41,68,73,78,85,100

select distinct nmbr, chr
from  manu_tst_hashed_tbl_org partition (sys_p778) order by 1; --1,7,14,15,25,29,38,45,57,59,60,63,71,75,82,84,96

select distinct nmbr, chr
from  manu_tst_hashed_tbl_org partition (sys_p779) order by 1; --6,11,30,34,46,54,88

select distinct nmbr, chr
from  manu_tst_hashed_tbl_org partition (sys_p780) order by 1; --10,26,31,67,69,70,86,91,94,99

alter table manu_tst_hashed_tbl_org
  exchange partition sys_p771 
  WITH TABLE manu_tst_smpl_tbl
  INCLUDING INDEXES
  without validation
  UPDATE GLOBAL INDEXES;



If I execute all statements serially, executing the last statement twice, I get the below error:

SQL Error: ORA-14097: column type or size mismatch in ALTER TABLE EXCHANGE PARTITION
14097. 00000 -  "column type or size mismatch in ALTER TABLE EXCHANGE PARTITION"
*Cause:    The corresponding columns in the tables specified in the
           ALTER TABLE EXCHANGE PARTITION are of different type or size
*Action:   Ensure that the two tables have the same number of columns
           with the same type and size.


I do not get why I cannot exchange the partition back after exchanging it once before. Am I doing something wrong here?

Regards,
Manu
Tom Kyte
March 16, 2012 - 7:59 am UTC

first - and this is important - never hash partition with 10 partitions. Use 2, 4, 8, 16, 32, 64, powers of 2 only - never something like 10.

second, when providing examples, please make it reproducible. sys_p771 doesn't work for me, do things like:

ops$tkyte%ORA11GR2> CREATE TABLE manu_tst_hashed_tbl_org
  2  (
  3    nmbr  number  primary key,
  4    chr varchar2(20)
  5  )
  6  partition by hash (nmbr)
  7  ( partition p1,  partition p2, partition p3, partition p4, partition p5,
  8    partition p6,  partition p7, partition p8, partition p9, partition p10);

Table created.



lastly - don't confuse the issue by including non-relevant bits, make things as small as humanly possible. the table manu_tst_hashed_tbl_bkp was a 'red herring', it does nothing but make the example harder to follow...


this should shed light on it:

...

ops$tkyte%ORA11GR2> create index idx_org_tbl on manu_tst_hashed_tbl_org(nmbr,chr) local;

Index created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create index idx_smpl_tbl on manu_tst_smpl_tbl(nmbr,chr);

Index created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> <b>
ops$tkyte%ORA11GR2> insert into manu_tst_smpl_tbl select * from manu_tst_hashed_tbl_org partition(p1);

4 rows created.
</b>

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> alter table manu_tst_hashed_tbl_org
  2    exchange partition p1
  3    WITH TABLE manu_tst_smpl_tbl
  4    INCLUDING INDEXES
  5    without validation
  6    UPDATE GLOBAL INDEXES;
alter table manu_tst_hashed_tbl_org
*
ERROR at line 1:
ORA-14097: column type or size mismatch in ALTER TABLE EXCHANGE PARTITION



if the table has any data in it - then the first switch doesn't work. It is the fact the segment exists and is populated with data that makes it fail right out of the gate.


the column mismatch is due to the NULL/NOT NULLness of nmbr. If you make nmbr NOT NULL, you get a unique constraint mismatch. If you make nmbr unique you'll get an index mismatch...


so.... you can:

ops$tkyte%ORA11GR2> CREATE TABLE manu_tst_hashed_tbl_org
  2  (
  3    nmbr  number,
  4    chr varchar2(20)
  5  )
  6  partition by hash (nmbr)
  7  ( partition p1,  partition p2, partition p3, partition p4, partition p5,
  8    partition p6,  partition p7, partition p8, partition p9, partition p10);

Table created.

<b>
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create unique index manu_tst_hashed_tbl_org_pk on manu_tst_hashed_tbl_org(nmbr) local;

Index created.

ops$tkyte%ORA11GR2> alter table manu_tst_hashed_tbl_org add constraint manu_tst_hashed_tbl_org_pk primary key(nmbr);

Table altered.
</b>


ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> CREATE TABLE manu_tst_smpl_tbl
  2  (
  3    nmbr  number primary key,
  4    chr varchar2(20)
  5  );

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> insert into manu_tst_hashed_tbl_org
  2  (select rownum rn, 'a' from dual connect by level <= 100);

100 rows created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> commit;

Commit complete.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create index idx_org_tbl on manu_tst_hashed_tbl_org(nmbr,chr) local;

Index created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create index idx_smpl_tbl on manu_tst_smpl_tbl(nmbr,chr);

Index created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> insert into manu_tst_smpl_tbl select * from manu_tst_hashed_tbl_org partition(p1);

4 rows created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> alter table manu_tst_hashed_tbl_org
  2    exchange partition p1
  3    WITH TABLE manu_tst_smpl_tbl
  4    INCLUDING INDEXES
  5    without validation
  6    UPDATE GLOBAL INDEXES;

Table altered.

ops$tkyte%ORA11GR2> /

Table altered.

ops$tkyte%ORA11GR2> /

Table altered.

ops$tkyte%ORA11GR2> /

Table altered.

Move partitioned table to smaller blocksize,

A reader, March 20, 2012 - 2:14 pm UTC

Hello,

We have a partitioned table that is involved in a high volume of transactions by multiple users. I see very high buffer busy waits and also enq: index contention. In order to address this problem, I decided to move this table and its primary key to a smaller block size tablespace (8k is the default, and now I am trying to move to a 2k blocksize tablespace).

I was able to move the primary key index (recreated as a global hash-partitioned index) but am not able to move the partitioned table to the smaller blocksize tablespace.

What is the workaround for this problem?

Thanks,

Tom Kyte
March 20, 2012 - 3:40 pm UTC

smaller blocksize is not the way to go.

cutting down on the number of concurrent connections is. I would suggest you have a connection pool sized WAY WAY too high.

http://www.youtube.com/watch?v=xNDnVOCdvQ0



you didn't mention a single error message :(
you don't give a single test case :(

I am categorically opposed to using multiple block size databases for something like this.

The root cause is too many concurrently active sessions in the database - you want to reduce transaction response time and increase transactions per second?

Try reducing your connection pool.

follow up,

A reader, March 21, 2012 - 11:54 am UTC

The number of concurrent sessions is not through a connection pool. When an application runs from a web server, the team has a parameter to specify how many parallel sessions can run. They typically want 100.

The flow is:
1. The application selects millions of rows (200 million to be specific) from database A, splits the data pretty equally (they use a mod function to split), and gives it to each of those 100 sessions
2. the 100 sessions run against database B
3. Each session extracts one row at a time, processes it, generates 5 different DML statements, and continues to the next record. This is how the application has been designed (not my plan)
4. Each session performs DML on the same set of tables in database B, causing buffer busy waits and enq: index waits

One of the suggestions Oracle Support recommended is to have a 2k blocksize for those two tables and their indexes, which could reduce contention. Another thought I have is to create a global hash-partitioned index for the primary key.

Thanks,

Tom Kyte
March 21, 2012 - 10:47 pm UTC

They typically want 100.

they are untypically foolish then. Think about it: how many CPUs do you have? Please - do the basic, simple math.


watch that video, then LOCK THE DEVELOPERS IN A ROOM WITH THAT VIDEO and make them watch it about - oh, say 100 times - you know, to match their 100.


You do not have enough CPUs to do this - please, use common sense, use math, use science, use thinking.



this is a connection pool problem - your developers are using the equivalent of a connection pool (firing off 100 concurrent sessions) to literally KILL YOUR DATABASE.

How many CPU's do you have?

did you watch the video? did it make sense? do you have questions on what it shows you?

follow up,

A reader, March 22, 2012 - 10:11 am UTC

Our box has 16 CPUs, and when all 100 sessions were running on one instance (yes - we deliberately brought down the other 3 in the 4-node cluster to avoid global cache wait events) the CPUs were 80-90% utilized.

I have proposed that they perform bulk DML - one session working on 20 records at a time (for example) - and reduce the number of concurrent sessions. However, it would be a complete rewrite if they had to do that. So it falls to the database to handle performance, and toward that end, a smaller block size, a global hash-partitioned index, and a bigger cache for the sequence are the workarounds.

Thanks,

Tom Kyte
March 22, 2012 - 10:57 am UTC

did you watch the video?

At 80-90% utilization you have run queues

When a machine is:

50% utilized, you have a 1 in 2 chance of getting on the cpu right away (a 1 in 2 chance of WAITING for it)

66% utilized, you have a 1 in 3 (or a 2 in 3 chance of WAITING for it)

80% utilized, you have a 1 in 5 chance, meaning you almost certainly will be waiting on CPU

90% utilized, you have a 1 in 10 chance - meaning YOU ARE going to wait for it


furthermore, at 80/90% utilized, the OS is spending a large amount of time managing you - you have run queues, you are spending much more time in SYS than you would be at a lower cpu utilization.


Crank it *down*, you'll have much less contention (how much of your 80/90% cpu was burnt spinning on latches? *a lot*), you'll have reduced transaction times, you'll wait less for cpu and everything.

In short, you'll DO MORE FASTER WITH LESS.



If your developers have it hard coded at 100 processes and cannot change that number instantly, you need to think about hiring an entirely new team of developers. The number of working processes obviously should be something *easily* reconfigured (it is why we call it "soft"ware, not "hard"ware).


smaller block size, hash partitioning, bigger cache, yadda yadda yadda - isn't going to cut it.


if you want single digit % changes in performance (like ones and twos), keep going the way you are. If you want orders of magnitude differences - crack open that software and change it.

A reader, March 22, 2012 - 2:35 pm UTC


The flow is:

1. The application selects millions of rows (200 million to be specific) from database A, splits
the data pretty equally (they use a mod function to split), and gives it to each of those 100
sessions

>> ok, that's fine

2. the 100 sessions run against database B

>> sent over a dblink to a different database????

3. Each session extracts one row at a time, processes it, generates 5 different DML statements, and
continues to the next record. This is how the application has been designed (not my plan)

>> sounds like slow-by-slow processing; do it better - follow Tom's mantra

4. Each session performs DML on the same set of tables in database B, causing buffer busy
waits and enq: index waits

>> it's obvious - you will wait for each other and do tons of latch spinning




Best idea: scrap the code and start over again

Partitioning of a huge table

A reader, July 13, 2012 - 3:11 pm UTC

Hi Tom,

I have a table which is expected to grow to 1.5 TB in size in a year. We have to retain the data in the database for 5 years, which makes it 7.5 TB of rows.

Number of columns in table: 185
Row Size: 3500

So what would your approach be if you had to design the table? Will partitioning the table help?

Thanks for your time.
Tom Kyte
July 16, 2012 - 3:51 pm UTC

yes, partitioning will help you purge data easily in 5 years. If you are not partitioned - you will never be able to effectively purge this data - ever.

So, yes, I would be looking at designing partitioning into this - definitely

Addon Information

A reader, July 16, 2012 - 1:53 pm UTC

Regarding the question about the approx. 7 TB of data, I have some information which may help you.

The table has a control_number field, which is the primary key for the table. Also, the user can do range searches on the begin and end dates.

These are the three main search criteria for the table: Control_number, Begin_date, End_date.

Please suggest which partitioning scheme may be helpful for better performance and administration.
Tom Kyte
July 16, 2012 - 4:03 pm UTC

If you are using indexes to retrieve data, partitioning is not really going to do very much if anything for performance.

partitioning impacts performance when you can exclude a large segment of data from being full scanned. If you are using an index - partitioning can actually make it slower if you are not careful.


Tell us - by what criteria do you purge data. You haven't told us that - begin date/end date - meaningless to me. I don't know what they mean.

Followup

A reader, July 17, 2012 - 1:36 pm UTC

Hi Tom,

It is a kind of history table.

The main fields of table are :

Employee_ID(PK in main table)
StartDate
EndDate
Rate
Region
CreateDateTime
UpdateDateTime
Control_Number

The table is a kind of history table, so the control_number embeds repeating IDs, e.g. Employee_ID and CreateDateTime:

E0120100101
E0120110101
E0220120201
.....


For purging, we will consider the UpdateDateTime of the row. If the difference between today and the UpdateDateTime of a row is > 5 years, delete that row.

The BeginDate and EndDate are the fields which specify when RATES are applicable.

The user can search on any field, since he is doing selects from SQL Developer. However, EmployeeID, StartDate, EndDate, and CreateDateTime are the most used search criteria at the moment.

What would be the best bet for partitioning the table? Is it range-hash, based on UpdateDateTime and Employee_ID?

Please suggest.
Tom Kyte
July 17, 2012 - 4:40 pm UTC

why do you want to partition - that is the most important thing to answer. pick one reason

if you say "to purge", then you need a range partition on updatedate

but then you have to realize that updates to updatedate will cause the row to potentially move from partition to partition (a delete+insert operation internally).

and you have to realize that when you search using an index, they'll need to be global indexes with their own partitioning (or not) strategy.

and when you do purge, you'll have to decide whether to let the indexes go invalid and rebuild them (offline purge) or if you want to maintain the global indexes (which will make the purge take longer, but hey - your end users are still going)


Followup

A reader, July 17, 2012 - 7:18 pm UTC

Hi Tom,

I am sorry, but I didn't understand when you said updateDate will be updated. That will be a new row. Every time a record is updated in the main table, a row with a new updateDate is inserted into the HISTORY table for audit purposes. The CreateDate remains the same in all the corresponding rows.

Yes, there may be a problem when we have to purge the data after 5 years. To make the system scalable, what do you suggest? On which field should I base my partitioning?

Will that also help a bit with performance, considering I don't have specific search columns beyond those 5-6 most used columns?

If you had your way, what would you have done?
Tom Kyte
July 18, 2012 - 9:34 am UTC

I don't know your application, only you do. I don't have the details of your processing, therefore I can only "guess" at what the meaning of things are.

if you do not update the updateDate, why do you have both a createDate and an updateDate? Seems your data model is botched there if the createDate is supposed to be constant for all records. Sort of like carrying the department name into the EMP table along with deptno.


define scalable for me here, what does scalable mean?


and no, it is not going to help materially in your query performance if you are using index access. partitioning helps full scan performance. as I said above.

If I had my way, I'd be
a) reviewing the design, because it seems wrong
b) determining what my goal with partitioning is
c) designing a model that allows me to achieve (b)


which means we are probably looking at partitioning by updateDate as I said - using global indexes on the other attributes as I said - and deciding whether my purge will be online or offline as I said.
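
Those two purge styles, sketched (the table and index names are invented):

-- online purge: global indexes remain usable, the drop takes longer
alter table hist drop partition p_2007_06 update global indexes;

-- offline purge: fast drop, then rebuild the now-UNUSABLE global indexes
alter table hist drop partition p_2007_06;
alter index hist_gidx rebuild;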


Are more partitions a disadvantage?

Bala, August 05, 2012 - 4:00 am UTC

Hi Tom,

Thank you for your wonderful service to the Oracle community. Is there any disadvantage to having many partitions in Oracle? Our application runs on Oracle 11g, and every day 350 partitions are created. This will keep going for 340 days; after that, the day-1 partitions will be dropped and a new set of partitions created. So 119,000 partitions would exist at one point in time. All the partitions are in the same tablespace. This is an OLTP environment.
My question is: will there be any memory issue, or is there any other disadvantage?
Tom Kyte
August 17, 2012 - 11:48 am UTC

look in the other place you asked this same question.
