Thanks for the question, pradeep.
Asked: January 08, 2014 - 9:09 am UTC
Last updated: January 08, 2014 - 9:38 pm UTC
Version: 11g
Viewed 1000+ times
You Asked
Hi Tom,
We have a table of about 100GB and are planning to convert it to a range-partitioned table (by month) with compression, to improve performance and decrease its size.
We have about 10 writes/sec on this table.
1. Will compression have a major effect on write operations?
2. What is the best approach to do this?
We are planning to use Data Pump for export/import, but we cannot afford a 5-hour outage window on the day of deployment.
Thanks
and Tom said...
well, depending on how you query this table, make sure you don't increase query response time - meaning - you'll probably have to work carefully to make sure that queries do not take much longer after partitioning than they did before.
I say that not knowing anything about your system - but by making some guesses. I'll guess this is OLTP (10 writes/sec) and not a warehouse. I'll guess that most access to the table is via indexes.
If so, when you partition the table, make sure to look at each and every index to decide how or even if it should be partitioned. For example, say this was a table called "orders" and you partitioned by order_date. Say you query frequently "select * from orders where cust_name = :x order by order_date". You probably have an index on (cust_name) and maybe on (cust_name,order_date). When you partition the orders table, say into 32 range partitions (just to pick a number) - you would NOT want a local index on the cust_name attribute. If you used a local index on that attribute, you would end up doing 32 index range scans (instead of one) when you retrieved this information.
What you would want is either a non-partitioned index on cust_name - or a globally partitioned index on customer name (index is partitioned differently than table basically).
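To make that concrete, here is a minimal sketch (the table name, column names, and monthly partition boundaries are all hypothetical):

create table orders_part
( order_id   number,
  cust_name  varchar2(100),
  order_date date
)
partition by range (order_date)
( partition p_2013_01 values less than (to_date('2013-02-01','yyyy-mm-dd')),
  partition p_2013_02 values less than (to_date('2013-03-01','yyyy-mm-dd'))
  /* ... one partition per month ... */
);

-- non-partitioned (global) index: a query on cust_name probes ONE index structure
create index orders_cust_idx on orders_part (cust_name, order_date);

-- a LOCAL index on the same columns would force one range scan per partition
-- for "where cust_name = :x", which is exactly what you want to avoid here:
-- create index orders_cust_local_idx on orders_part (cust_name, order_date) local;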
In that fashion - performance would REMAIN EXACTLY THE SAME.
You won't make index access go faster in general with partitioning. In order for partitioning to have a positive impact on performance, you would have to be doing some FULL SCANS that would benefit from partition elimination.
This is important to fully understand: partitioning only helps performance where there are full scans that can benefit from partition elimination. With OLTP you have to be careful to not NEGATIVELY impact performance!!! But don't expect any "transparent, free, immediate, big" performance boost from it - it won't really happen.
As for the compression:
OLTP compression is designed to not noticeably impact insert performance in most cases. You will of course benchmark your application prior to going production with it to measure this. If you have a strange situation where it does noticeably impact you, you might consider a) not compressing the current month, so there is no impact on inserts, and b) at the end of each month performing a reorg of last month's data to direct path compress it. However, that is probably not necessary. Your load test will show you this....
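If you did end up in that rare case, the month-end reorg is just a partition move - a sketch (table and partition names are made up; the update indexes clause maintains the indexes during the move):

-- the current month's partition stays uncompressed, so normal inserts are unaffected;
-- at month end, direct-path compress the partition that just closed
alter table orders_part move partition p_2013_12 compress update indexes;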
The best approach to partitioning the existing table depends on your needs.
If you can do this offline (application not running during the reorganization), you would want to use a CREATE TABLE new_table AS SELECT or an INSERT /*+ APPEND */ into new_table, where new_table is your partitioned table. You could then drop the old table (or rename it) and rename the new table to the old name. You would then index, grant, etc etc etc on this newly created table. That would not take long with a medium-sized 100GB table. It can be done using parallel, direct path - skipping undo and redo generation.

I would not want to use Data Pump, especially if your plan was to export to a file, drop the table, create the new empty table and import - reading the file again. That is a lot of extra IO you don't need to do.
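A sketch of that offline CREATE TABLE AS SELECT approach (table names, parallel degree, and partition boundaries are illustrative only):

create table orders_new
nologging
compress
partition by range (order_date)
( partition p_2013_01 values less than (to_date('2013-02-01','yyyy-mm-dd')),
  partition p_2013_02 values less than (to_date('2013-03-01','yyyy-mm-dd'))
  /* ... one partition per month ... */
)
parallel 8
as
select * from orders;

-- then swap the names and put everything else back:
alter table orders rename to orders_old;
alter table orders_new rename to orders;
-- recreate indexes, grants, constraints, triggers on the new ORDERS ...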
If you want to do this online, use dbms_redefinition. You would create a new, empty partitioned table. You would start the redef process, then copy the table dependents to get all grants, indexes, constraints, triggers, etc. You would then sync the two objects a few times - and then invoke finish redef, which syncs, locks the tables, syncs again, and then renames them. The "down time" would just be the duration of that last sync (very short - very, very short, since you are only doing 10 inserts/second, so there is not much work to be synced during that last sync).
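A sketch of the online approach (schema, table, and interim table names are placeholders - ORDERS_INTERIM is the new, empty, partitioned and compressed table you create up front):

declare
  l_errors pls_integer;
begin
  -- verify the table can be redefined using its primary key
  dbms_redefinition.can_redef_table('SCOTT', 'ORDERS', dbms_redefinition.cons_use_pk);

  -- start copying the data into the empty partitioned interim table
  dbms_redefinition.start_redef_table('SCOTT', 'ORDERS', 'ORDERS_INTERIM',
                                      options_flag => dbms_redefinition.cons_use_pk);

  -- clone indexes, constraints, triggers and grants onto the interim table
  dbms_redefinition.copy_table_dependents('SCOTT', 'ORDERS', 'ORDERS_INTERIM',
                                          num_errors => l_errors);

  -- apply the changes made while the copy was running (can be repeated)
  dbms_redefinition.sync_interim_table('SCOTT', 'ORDERS', 'ORDERS_INTERIM');

  -- final sync, brief lock and swap - this is the only "down time"
  dbms_redefinition.finish_redef_table('SCOTT', 'ORDERS', 'ORDERS_INTERIM');
end;
/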