A Step-by-Step Introduction for Developers
Partitioning in the database reflects the same way we handle large tasks in the real world. When a task gets too large to tackle in one hit, whether it be trimming a tree, taking a long drive, or washing the dishes, we can split the task up into smaller pieces to make them more manageable. It can be as simple as trying to relocate one's catalog of Oracle technical books!
Partitioning is that same thought applied to data stored in the database. As the demands to store more and more data in the database increase, the performance of operations against those large tables can suffer. And in applying the Pareto principle to the storage of data, typically it is only a subset of the entire dataset that is actively worked upon to satisfy the day to day needs of our businses users. Using the Partitioning option in the Oracle Database, data can be segmented into smaller, more manageable chunks to make maintenance tasks easier for Database Administrators, but also give scope for better application performance via more efficient execution of the queries being issued by Application Developers.
There are different types of partitioning options available to cater for specific business requirements. There might be a requirement to break up SALES data into each calendar year. Or there might be information being gathered on popular SPORTS that will be separated because minimal cross-sport queries will ever be run. Or a table of cell phone CALLS might just be so large that it needs to be evenly scattered across smaller segments to keep them at a manageable size. All such options are possible with the Oracle Partitioning option.
There are other partitioning strategies as well for more esoteric requirements, including partitioning strategies between tables linked by referential integrity, and multi-dimensional forms of partitioning (partitions of partitions).
You get started by simply heading over to a free service from Oracle called livesql.oracle.com. This service lets you run SQL and create database objects with no software required other than your browser. There are also hundreds of sample scripts and tutorials on a multitude of topics to assist you with learning about the Oracle Database
Oracle Partitioning is also available as a fully supported feature of all Oracle Database Cloud Services and on-premise Oracle Database Enterprise edition. Here is a quick primer on LiveSQL by Oracle Vice President Mike Hichwa:
Oracle LiveSQL features:
LiveSQL is completely free for any usage. Sign up is free, quick and easy.
The SQL you write in LiveSQL can be metadata tagged, saved, shared with others, or shared to the entire Oracle community.
LiveSQL contains hundreds of tutorials written by experts both inside and outside Oracle Corporation to fast track your productivity.
LiveSQL runs on the latest version of the Oracle Database, so you can safely test out new features before you upgrade your own systems.
Perhaps the most common example of partitioning in Oracle databases is to divide up a large data into partitions based on a time attribute. The largest tables in your applications are often a time-based record of critical business activities. It could be sales transactions for a retail business, mobile phone calls for a telecommunications business, or deposits and withdrawals for a banking institution. In all of these cases, there are a couple of common elements, namely each transaction (row) in the table has a time stamp of when the transaction occurred, and typically the volume of such transactions is high, making the table large in size in a short period of time. Partitioning is a natural fit for such tables, because queries often want to only peruse time-based subsets of the data, for example, transactions for the current month, or current week. Also, breaking the large table into smaller more manageable sized pieces is useful for adminstrators from the perspective of maintenance. Time is a continuous (analog) measurement, and thus, a large table would be segmented into time-based ranges, hence the term used for this opertaion is range based partitioning. This video walks you through the creation a simple range partitioned table.
The keyword that defines that a table is partitioned is PARTITION BY which follows the normal table column definitions. The clause BY RANGE nominates the type of partitioning scheme that the table will use.
Range partitions are not given lower and upper bounds, only an upper bound. The lower bound is implicitly defined as the upper bound of the previous partition. A partition can contain values up to but not including the value nominated in the VALUES LESS THAN clause.
After the PARTITION BY clause, at least one partition must always be defined.
The data dictionary tracks all of the defined partitions for a table. USER_TAB_PARTITIONS displays one row per defined partition for each partitioned table in your schema.
Even with just a simple range partitioning example, we have enough tools at our disposal to examine the potential performance benefits possible by partitioning a table. When SQL queries consume too much I/O, often the only resolution considered is to create indexes on the table. The premise of indexing is simple: locate data more quickly and avoid scanning data unnecessarily. Partitioning a table takes the same approach via a concept known as "pruning" or "elimination". If a query on a partitioned table contains appropriately phrased predicates that include the partitioning column(s), the optimizer can generate an execution plan that will bypass those partitions that by definition, could not contain data relevant to the query. The following video shows a demonstration of this, including a comparison of the cost of partition pruning versus a conventional indexing strategy.
If the optimizer can eliminate partitions from consideration, query performance can be dramatically improved. In later videos, you will see how you can interpret the optimizer execution plan output to determine if partition elimination will occur for a given SQL query.
You can use the DUAL technique from the video to generate arbitrary test data for any table, partitioned or otherwise. As per the video, keep the number of rows generated from a single DUAL CONNECT BY query to less than tens of thousands, and use cartesian joins if you need to scale beyond that. See Tanel Poder's blog post on how PGA memory is impacted by these queries for the reason why you should not go to extremes.
In some circumstances, partitioning a table allows for existing indexes to be consolidated or dropped, which can reduce overall database size, and improve insert, update and delete performance.
For many tables, a single column accurately defines the ranges via which you might want to partition your data. But range partitioning in the Oracle Database is not limited to that.
You can nominate more than one column to be the partitioning "key", ie, the columns that define where data will be stored. The order of the columns is critical as the video demonstration shows:
Secondary and tertiary columns in the partitioning definition are only used as "tie breaker" values. When inserting a row into a multi-column range partitioned table, the first column in the partitioning key is used to determine the partition into which to store the row. If multiple partitions have the same value for the first column, then the second partitioning column is used and so forth. Multiple columns in the partition key does not a "matrix" or "n-dimensional" structure of partitions.
One of the examples in the video uses a NUMBER datatype to store date based information. As the video states, this is generally a poor design idea. See Storing Dates by Richard Foote for more examples of why you might want to rethink such an approach in your databases.
In the same way that USER_TABLES contains various columns to reflect the current optimizer statistics, USER_TAB_PARTITIONS also contains this information at a partition level. Thus you can use NUM_ROWS, BLOCKS et al to get information on the volumne of each partition, the accuracy being limited to the timing/granularity of statistics gathering with DBMS_STATS.
Sometimes partitioning is not about a logical segmentation of a table based on its attributes as is the case for range partitioning. When any database table gets massive in size, it is harder to manage for adminstrators because maintenance on such a table typically involves longer downtime for your business applications. Hash partitioning lets you segment a table in equi-sized chunks based on a hashing function applied to one or more columns of the table.
Creating a hash partitioned table is easy to do, but the number of hash partitions nominated is critical, as the video demonstration shows:
For equi-sized partitions, the number of partitions must be a power of 2.
The hashing algorithm is not documented but the ORA_HASH function is observed to return results consistent with the data segmentation that occurs with hash partitioning.
A partition can be split using the ALTER TABLE SPLIT PARTITION command, which is a task normally done by the Database Administrator. Spliting a partition is a resource heavy activity as potentially an entire partition's worth of data will be moved.
Range partitioning, as the name suggests, is about carving up data that is analog in nature, that is, a continuous range of values. This is why dates are a natural candidate for a range-based partitioning scheme.
But sometimes the column you might want to partition on contains a discrete set of values, which is when LIST partitioning is the best solution. Creating a LIST partitioned table requires nominating the discrete values for each partition, or relying on the new AUTOMATIC clause in 12c Release 2. The explicit value nomination technique is shown in the following video demonstration:
A single partition can contain one or more discrete values.
A "catch all" partition can be defined using the DEFAULT clause in the VALUES statement. Nulls also go into this partition.
A DML statement can explictly nominate a partition in a table to act upon using the PARTITION clause following the table name.
Even once partitioned, a table may still be extremely large. Or the partitioning scheme may result in partitions that are not equally sized on disk. For example, archiving off of older data might mean that historical partitions are far smaller than the current partitions. Equi-sized partitions might be beneficial in particular when it comes to performing operations in paralle on a per-partition basis.
Partitions can be segmented further into subpartitions. Each partition can have it's own subpartitioning scheme, or all partitions can share a common schema. The syntax for subpartitioning is explained in the following video demonstration:
The subpartitioning scheme can be different to the parent partitioning scheme.
Each partition can have its own subpartition definition. It is also valid for a partition to have no subpartitions.
The SUBPARTITION_POSITION column in USER_TAB_SUBPARTITIONS refers to the position of a subpartition within the parent position.
Subpartition templates make it easy to enforce a common scheme for all partitions and reduce the size of DDL scripts.
The existence of a subpartition determines whether a partition is a physical segment on disk, or just a logical collection of physical subpartition segments.
For incoming data, a partition must already exist that the incoming partitioning key would map to. If a partition does not exist, transactions will fail with ORA-14400. In earlier versions of Oracle Database, this meant that administrators had to take extreme care to ensure that partitions for future data were defined or risk application outages. Interval partitioning (introduced in Oracle Database 11g) removes this risk by automatically creating new partitions on a partitioned table as required when new data arrives.
Interval partitioned tables are different from range partition in that logical "gaps" in the partition coverage are permitted. How to define interval partitioned tables is explained in the following video demonstration:
As partitions are dynamically created, system generated names are assigned to the new partitions. They can be renamed to meet existing business standards.
Partitions are fixed in partitition key range size, namely the size of the interval. Unlike range partitions, the upper and lower bounds are defined by the interval, not the adjoining partitions.
If the name of an interval partition is not known, it can be referenced using the FOR ( key-value ) syntax.
Existing range partitioned tables can be converted into interval partition tables with a simple conversion command, requiring no downtime or data migration. The existing partitions in the range partitioned table remained defined as "range partitions" whilst new partitions created dynamically are "interval partitions". Thus a table that was not initially created as an interval partitioned table is a hybrid between the two, containing both types of partitions. Because intervals are defined as an offset from an initial fixed range partition boundary, you cannot drop all of the range partitions in an interval partitioned table. At least one must always remain. In Oracle Database 12c Release 2, the database will automatically convert the oldest interval partition into a range partition if no initial range partition boundary can be found.
How to convert a range partitioned table to an interval partitioned one is explained in the following video demonstration:
Simply specifying the desired interval via ALTER TABLE ... SET INTERVAL converts a range partitioned table to an interval partitioned one.
On a converted table, the INTERVAL column on USER_TAB_PARTITIONS indicates whether each partition is a range or interval partition.
At least one range partition must remain in the table. Before Oracle Database 12c Release 2, re-running the SET INTERVAL command will mark existing interval partitions as range partitions.
You are ready to start taking advantage of the Oracle Partitioning option. Check the resources below to keep your partitioning knowledge growing
Partitioning features have been in the database for over 20 years, since Oracle Database 8.0. Partitioning has its very own manual in the Oracle Documentation, and there are many other sources of information to boost your knowledge.