Skip to Main Content
  • Questions
  • Multi Column Partition Keys in Hash Partitioning

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question.

Asked: December 24, 2004 - 10:26 am UTC

Last updated: May 21, 2021 - 12:35 pm UTC

Version: 8.1.7

Viewed 10K+ times! This question is

You Asked

Dear Tom,

I am very new to this group. Please let me know is there ant disadvantages of having hash partioning and also Multi column partition keys are possible in Hash partitioning.

thanks

and Tom said...

Hashing is a tool -- nothing more, nothing less.

You have to describe a problem you want to solve before you can assign "goodness" or "badness" to a tool.

Say you have a large OLTP system and your goal is to reduce contention on 'hot segments' or 'hot index root blocks' or 'hot right hand side indexes'. Your OLTP table has nothing to partition by -- nothing meaningful anyway, so you decide "lets hash it"

So, you hash by the primary key into 32 partitions. You now have 32 tables you are inserting into (1/32 the amount of concurrent activity on each). You have 32 local index partitions for your primary key (so, 1/32 the amount of activity on each -- reduced contention). Heck, you might have decreased the height of the index from 4 to 3 in doing so -- so now a primary key lookup does 4/5ths the IO it used to (3 index + 1 table block vs 4 index + 1 table block).. You would then want to globally range partition the remaining indices in all probability (if you want to know why and it is not obvious, I have a write up on this in Effective Oracle by Design where I talk about the physics of partitioning -- you need to partition your oltp indices independent of the base table partitioning scheme if they do not include the hash key -- else you'll kill your system with extra IO)


</code> http://docs.oracle.com/cd/B10501_01/server.920/a96540/statements_73a.htm#2129213 <code>

describes the syntax for hash partitioning,

<quote>
hash_partitioning

Use the hash_partitioning clause to specify that the table is to be partitioned using the hash method.
column

Specify an ordered list of columns
used to determine into which partition a row belongs (the partitioning key).

</quote>

You want your hash key(s) to be fairly UNIQUE (lots of values) and you want the number of partitions to be a power of 2

2, 4, 8, 16, 32, 64, ....



Rating

  (3 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

partition by HASH(column1, column2)

Edgar, February 27, 2020 - 11:14 am UTC

Be aware about partition pruning for multiple hash key columns.
create table t(column1 number, column2 number) partition by hash(column1,column2) partitions 8;

select * from t where column1=:1;
-------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    26 |     2   (0)| 00:00:01 |       |       |
|   1 |  PARTITION HASH ALL|      |     1 |    26 |     2   (0)| 00:00:01 |     1 |     8 |
|*  2 |   TABLE ACCESS FULL| T    |     1 |    26 |     2   (0)| 00:00:01 |     1 |     8 |
-------------------------------------------------------------------------------------------
select * from t where column1=:1 and column2=:2;
----------------------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |     1 |    26 |     2   (0)| 00:00:01 |       |       |
|   1 |  PARTITION HASH SINGLE|      |     1 |    26 |     2   (0)| 00:00:01 |   KEY |   KEY |
|*  2 |   TABLE ACCESS FULL   | T    |     1 |    26 |     2   (0)| 00:00:01 |   KEY |   KEY |
----------------------------------------------------------------------------------------------

So, if You specify HASH(column1,...,columnn) then for the partition pruning to happen, the query must have equality predicate on each column from the list.

Compared with RANGE(column1,...,columnn) it works differently:
create table p(column1 number, column2 number)  partition by range(column1, column2)
(partition p00 values less than (0,0), partition p10  values less than (1,0));

select * from p where column1=:1;
-------------------------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |     1 |    26 |     2   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE ITERATOR|      |     1 |    26 |     2   (0)| 00:00:01 |   KEY |   KEY |
|*  2 |   TABLE ACCESS FULL      | P    |     1 |    26 |     2   (0)| 00:00:01 |   KEY |   KEY |
-------------------------------------------------------------------------------------------------


Chris Saxon
February 27, 2020 - 11:24 am UTC

Good point, thanks for sharing.

Multicolumn list partitioning

lh, February 28, 2020 - 10:45 am UTC

Hi

depending of the values of the partitioning columns it might be possible to utilize multi column list partitioning (available from 12.2) and possibly automatic list partitioning.


lh

Range partition on (year,month) Vs DATE/TIMESTAMP column

Paul, May 20, 2021 - 7:18 am UTC

Greetings!
I got a request from application team to create range partition on (Year, month) columns. They have split the date into year and month columns. This is how it is queried it seems.


Questions:

1) Wouldn't be the correct way to store it as single DATE/TIMESTAMP column and then range partition monthly ?

2) The process involved is, they delete the existing records for the current month and then inserting a new set. I think if its stored as date/timestamp column, it will be much easier/faster to do partition truncate operation.


CREATE TABLE CMP_MONTHLY"
(
"T_ENTY_ID" NUMBER NOT NULL,
"YEAR" NUMBER(4) NOT NULL,
"MONTH" NUMBER(2) NOT NULL,
"CMP_ENTY_ID" NUMBER NOT NULL,
"CMP_TYPE" VARCHAR2(16 BYTE),
"LAST_UPDATED_TS" TIMESTAMP(6) DEFAULT CURRENT_TIMESTAMP
);

Can you provide your thoughts..
Do you see any point of having YEAR/MONTH stored as two separate columns.

Thanks
Chris Saxon
May 21, 2021 - 12:35 pm UTC

1 While storing the full date and using (interval) partitioning is generally better, a couple of thoughts on when separate year/month may be better:

- The incoming data may not include a full date, just a month and year. In which case the day is a guess; storing only the known values makes more sense to me

- Queries searching for a specific month/year are easy. e.g.:
where month = :mth and year = :yr


Though you could also do this using virtual columns.

2 You can do truncate partition with the current setup; it's not necessary to switch the partitioning method to do this.

Also bear in mind truncate commits, whereas delete doesn't. Switching to truncate means there's a brief period where other users see that partition as empty.

As this works with current data, there's a good chance truncate would cause problems.

Loading the data into a staging table then using exchange partition is both fast and avoids the empty table problem.

More to Explore

VLDB

If you are new to partitioning, check out Connor McDonald's introduction series here.

VLDB

Documentation set on VLDB and Partitioning.