Question and Answer

Tom Kyte

Thanks for the question, Babloo.

Asked: February 11, 2006 - 5:34 pm UTC

Last updated: November 28, 2009 - 11:12 am UTC

Version: 9.2.0

Viewed 10K+ times

You Asked

Hi Tom,

We have an application which uses range partitions on begin dates and end dates (just 2 fields). Almost always we run a query which has a "between 2 dates" clause in the WHERE part, e.g.

begin_dt > input 1 and end_dt < input 2

The table is partitioned like this:

PARTITION APWS_20031115 VALUES LESS THAN (TO_DATE('20031116', 'YYYYMMDD'), TO_DATE('20031116', 'YYYYMMDD')),

If we created different tables rather than partitioning on the begin and end dates, would it have any impact on performance?
I understand we will have to make code changes to figure out the exact name of the table every time we run it.
Which approach would you like better?


and Tom said...

I would not even consider multiple tables.

Partitions give you the "physics" of multiple tables (physically, there are many segments) without the hassles (like trying to figure out "what table did I mean to query this time", always having to use dynamic SQL, trying to write queries that NEVER span partitions, and so on).
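To make that concrete, here is a minimal sketch of the single-table approach (a simplified, single-column range key, and the payload column is made up - the question's table uses a two-column key):

create table apws
( begin_dt  date,
  end_dt    date,
  payload   varchar2(100)
)
partition by range (begin_dt)
( partition apws_20031115 values less than (to_date('20031116','yyyymmdd')),
  partition apws_20031116 values less than (to_date('20031117','yyyymmdd')),
  partition apws_max      values less than (maxvalue)
);

-- the application always queries ONE table; the optimizer prunes to the
-- relevant partition(s) based on the date predicate - no dynamic table
-- names, no dynamic SQL
select *
  from apws
 where begin_dt > to_date('20031115','yyyymmdd')
   and end_dt   < to_date('20031117','yyyymmdd');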


Comments

multiple tables are useful

A reader, February 13, 2006 - 6:00 am UTC

when you use partition views

Tom Kyte
February 13, 2006 - 8:40 am UTC

well, it might be more correct to say

"if you have chosen to use multiple tables, you might well be interested in an optimizer feature called partitioned views"

It is not that multiple tables are useful when you use partition views.

It is true however that partition view optimizations may well be useful when you yourself opt to use multiple tables.

But what I said above would also still be true.

Partition tables

Amar, February 13, 2006 - 12:58 pm UTC

Hi Tom,

Recently, on the data warehouse I support, one of my colleagues has changed many normal tables into partitioned tables. In what ways will the performance be affected (increase or decrease), and how is that going to happen?

Regards
Amar

Tom Kyte
February 13, 2006 - 3:04 pm UTC

One of three things will happen:

a) things will run faster against some tables/queries
b) things will run slower against some tables/queries
c) there will be no noticeable change in performance either way for some tables/queries


(how many people saw that coming :)

Here is an excerpt from Expert Oracle: Database Architecture on this topic:

<quote>

Partitioning Overview


Partitioning facilitates the management of very large tables and indexes using "divide and conquer" logic. Partitioning introduces the concept of a partition key that is used to segregate data based on a certain range value, a list of specific values, or the value of a hash function. If I were to put the benefits of partitioning in some sort of order, it would be

1. Increases availability of data: This attribute is applicable to all system types, be they OLTP or warehouse systems by nature.

2. Eases administration of large segments by removing them from the database: Performing administrative operations on a 100GB table, such as a reorganization to remove migrated rows or to reclaim "whitespace" left in the table after a purge of old information, would be much more onerous than performing the same operation ten times on individual 10GB table partitions. Additionally, using partitions, we might be able to conduct a purge routine without leaving whitespace behind at all, removing the need for a reorganization entirely!

3. Improves the performance of certain queries: This is mainly beneficial in a large warehouse environment where we can use partitioning to eliminate large ranges of data from consideration, avoiding accessing this data at all. This will not be as applicable in a transactional system, since we are accessing small volumes of data in that system already.

4. May reduce contention on high-volume OLTP systems by spreading out modifications across many separate partitions: If you have a segment experiencing high contention, turning it into many segments could have the side effect of reducing that contention proportionally.

Let's take a look at each of these potential benefits of using partitioning.

....


Enhanced Statement Performance

The last general (potential) benefit of partitioning is in the area of enhanced statement (SELECT, INSERT, UPDATE, DELETE, MERGE) performance. We'll take a look at two classes of statements-those that modify information and those that just read information-and discuss what benefits we might expect from partitioning in each case.


Parallel DML

Statements that modify data in the database may have the potential to perform parallel DML (PDML). During PDML, Oracle uses many threads or processes to perform your INSERT, UPDATE, or DELETE instead of a single serial process. On a multi-CPU machine with plenty of I/O bandwidth, the potential increase in speed may be large for mass DML operations. In releases of Oracle prior to 9i, PDML required partitioning. If your tables were not partitioned, you could not perform these operations in parallel in the earlier releases. If the tables were partitioned, Oracle would assign a maximum degree of parallelism to the object, based on the number of physical partitions it had. This restriction was for the most part relaxed in Oracle9i and later with two notable exceptions. If the table you wish to perform PDML on has a bitmap index in place or a LOB column, then the table must be partitioned in order to have the operation take place in parallel, and the degree of parallelism will be restricted to the number of partitions. In general, you no longer need to partition to use PDML.


Query Performance

In the area of strictly read query performance (SELECT statements), partitioning comes into play with two types of specialized operations:
* Partition elimination: Some partitions of data are not considered in the processing of the query. We have already seen an example of partition elimination.
* Parallel operations: Examples of this are parallel full table scans and parallel index range scans.
However, the benefit you can derive from these depends very much on the type of system you are using.


OLTP Systems

You should not look toward partitions as a way to massively improve query performance in an OLTP system. In fact, in a traditional OLTP system, you must apply partitioning with care so as to not negatively affect runtime performance. In a traditional OLTP system, most queries are expected to return virtually instantaneously, and most of the retrievals from the database are expected to be via very small index range scans. Therefore, the main performance benefits of partitioning listed previously would not come into play. Partition elimination is useful where you have full scans of large objects, because it allows you to avoid full scanning large pieces of an object. However, in an OLTP environment, you are not full scanning large objects (if you are, you have a serious design flaw). Even if you partition your indexes, any increase in performance achieved by scanning a smaller index will be miniscule-if you actually achieve an increase in speed at all. If some of your queries use an index and they cannot eliminate all but one partition from consideration, you may find your queries actually run slower after partitioning since you now have 5, 10, or 20 small indexes to probe, instead of one larger index. We will investigate this in much more detail later when we look at the types of partitioned indexes available to us.

Having said all this, there are opportunities to gain efficiency in an OLTP system with partitions. For example, they may be used to increase concurrency by decreasing contention. They can be used to spread the modifications of a single table out over many physical partitions. Instead of having a single table segment with a single index segment, you might have 20 table partitions and 20 index partitions. It could be like having 20 tables instead of 1, hence contention would be decreased for this shared resource during modifications.
As for parallel operations, as we'll investigate in more detail in the next chapter, you do not want to do a parallel query in an OLTP system. You would reserve your use of parallel operations for the DBA to perform rebuilds, create indexes, gather statistics, and so on. The fact is that in an OLTP system, your queries should already be characterized by very fast index accesses, and partitioning will not speed that up very much, if at all. This does not mean that you should avoid partitioning for OLTP; it means that you shouldn't expect partitioning to offer massive improvements in performance. Most OLTP applications are not able to take advantage of the times where partitioning is able to enhance query performance, but you can still benefit from the other two possible partitioning benefits, administrative ease and higher availability.


Data Warehouse Systems

In a data warehouse/decision-support system, partitioning is not only a great administrative tool, but something that will speed up processing. For example, you may have a large table on which you need to perform an ad hoc query. You always do the ad hoc query by sales quarter, as each sales quarter contains hundreds of thousands of records and you have millions of online records. So, you want to query a relatively small slice of the entire data set, but it is not really feasible to index it based on the sales quarter. This index would point to hundreds of thousands of records, and doing the index range scan in this way would be terrible (refer to Chapter 11 for more details on this). A full table scan is called for to process many of your queries, but you end up having to scan millions of records, most of which won't apply to our query. Using an intelligent partitioning scheme, you can segregate the data by quarter such that when you query the data for any given quarter, you will full scan only that quarter's data. This is the best of all possible solutions.

In addition, in a data warehouse/decision-support system environment, parallel query is used frequently. Here, operations such as parallel index range scans or parallel fast full index scans are not only meaningful, but also beneficial to us. We want to maximize our use of all available resources, and parallel query is a way to do it. So, in this environment, partitioning stands a very good chance of speeding up processing.
</quote>
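To make the excerpt's sales-quarter example concrete, a minimal sketch (table, column, and partition names are all made up):

create table sales
( sale_date date,
  cust_id   number,
  amount    number
)
partition by range (sale_date)
( partition q1_2006 values less than (to_date('20060401','yyyymmdd')),
  partition q2_2006 values less than (to_date('20060701','yyyymmdd')),
  partition q3_2006 values less than (to_date('20061001','yyyymmdd')),
  partition q4_2006 values less than (to_date('20070101','yyyymmdd'))
);

-- an ad hoc query constrained to one quarter full scans only that quarter's
-- partition (partition elimination) instead of the entire table
select cust_id, sum(amount)
  from sales
 where sale_date >= to_date('20060401','yyyymmdd')
   and sale_date <  to_date('20060701','yyyymmdd')
 group by cust_id;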

Archiving Tables vs Partitions

MK, June 01, 2008 - 3:40 pm UTC

Hi Tom,
I wanted to get your suggestion on a design consideration in an OLTP environment. We have a situation where some of the tables are growing at a very fast rate, with about 10 million rows each, and we want to "age out" some of the data that we don't need. I feel that partitions on a date range would be perfect for this and we can follow the "sliding window" principle outlined in Chapter 13 of your book, Expert Oracle Database Architecture. However, I am not so sure if we should use partitioning, as these are constantly growing tables that keep getting written into and queried from (session information, tracking of events that occurred, etc.).

In your book you mentioned segment compression with partition exchanging. I was wondering what the best way to age out this data would be in an OLTP environment. I have also noticed that some partitioning done in the past (HASH) caused the query plans to do full scans on some partitions. The other approach was to move data manually with a PL/SQL archiving program to another archive table that would be compressed.

What would you suggest as the best approach to tackling this problem in an OLTP environment? I am trying to mix some DW concepts into the OLTP to keep these tables lean and constantly archive data we don't require in the runtime environment.

Thanks in advance,
MK
Tom Kyte
June 02, 2008 - 11:12 am UTC

Please - keep reading that chapter... Once you "get" partitions - you'll know what to do with them.

There were no full scans on some partitions with the hashing example - I was just demonstrating there that "by enabling partitioning on a schema object you have not just set FAST=TRUE, you have to DESIGN, PLAN with this stuff"

the example was someone taking their largest OLTP table and hash partitioning it. Period. No other thought was given. So, all of their indexes are local, including their primary key index on the sequence they use to identify rows. Now, instead of a single index range scan, table access by index rowid to find a single row by primary key - they had 8/16/32/..../N (where N = number of hash partitions) index range scans plus one table access by index rowid. They didn't think about what they were doing to the data and hence increased dramatically the work needed to retrieve a row.

They needed to globally partition most of the indexes, so that when Oracle would use an index - it could find just the ONE index partition it needed, then access the table. Resulting in either the same or a slightly smaller bit of work (the individual global index partitions MIGHT be one level shorter than the single index was - MIGHT be).


As for aging of data - this can be perfect for that. You would range partition the table by the date field and then carefully global partition the indexes so as to not negatively impact performance.

Now to "age out" old data - you could just drop or truncate the oldest data *and maintain all indexes* (yes, this will take longer, but you want this to be ONLINE - see the partitioning chapter again for a discussion of this...). You can, if needed, rebuild individual global index partitions (online) afterwards if you deem it absolutely necessary.


Or, you can use this to archive data online, keep it in the table, by compressing old partitions that are not accessed anymore - reducing their storage needs, but keeping them online. It would be up to you.
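A minimal sketch of those aging options (the table, partition, and index names here are just placeholders):

-- age out the oldest partition and maintain the global indexes as part of
-- the operation (slower, but the indexes stay usable - it is "online")
alter table t drop partition p_2007_01 update global indexes;

-- or keep the old data online but compressed, reducing its storage
alter table t move partition p_2007_02 compress;

-- local index partitions on a moved partition become unusable and must be
-- rebuilt afterwards
alter index t_local_idx rebuild partition p_2007_02;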

Partitioning vs New Table

MK, June 03, 2008 - 7:13 am UTC

Thanks for the helpful advice. In regards to your response: the full scans on the partitions were noticed by me on our system. The query was trying to select data over a range and, instead of doing an index range scan, it went ahead and did a full scan of the partition. The rows returned were less than 4% of the total rows in that partition. IMHO, HASH partitioning doesn't present that great an advantage unless, like you mention in the book, we try to reduce disk contention by placing each partition in its own tablespace spread out on separate disks.
OK, I like the idea of aging the data by compressing the older range partitions (by date or timestamp).

1. Do I have to exchange older partitions with tables to age them (following the book's example)? Is this the best way of archiving data?

2. Would it be better to take some of the aged partitions offline so the run time environment doesn't bother with them?

3. Would you suggest global partitioned indexes on these tables for faster access? Right now we have global non-partitioned indexes on the table. I just want to hear what your experience is and I will not take it as a "FAST=true" switch! :)


Thanks,
MK
Tom Kyte
June 03, 2008 - 11:58 am UTC

that had nothing to do with hash partitioning - I'll show you queries where the optimizer decided to stop using an index for less than 2% of the data (rightly so) and the table wasn't partitioned at all. The choice to use an index (or not) has to do with estimated cardinalities and how the data is laid out on DISK (clustering factors and the like)

hash partitioning could have dramatically affected the clustering factor of some indexes, however - but it was not, and is not, that "hash partitioning makes the optimizer not use indexes". See below, at the bottom, for an example


HASH partitioning is good for the following:

o higher availability (if you have 128 hash partitions and query "where key = ?" a lot, and a partition is offline - you have only a 1 in 128 chance of hitting an error, many people can continue to function)

o higher concurrency: if you have a table inserted into frequently, and use a sequence to populate an artificial key, you'll have a really hot right-hand side of the index - lots of waits on that right-hand side to keep inserting, decreased ability to scale. So, you hash partition the table by primary key into 128 partitions and locally partition the primary key index. Now you have 128 "lukewarm" right-hand sides of indexes - greater ability to scale. The remaining indexes, the ones inserted into randomly (on last_name and the like), can either be NOT partitioned or globally partitioned in any way you like - just do not locally partition them, of course, as you would have 128 index range scans for a query like "where last_name = ?". (A sketch of this setup follows this list.)

o ease of administration - if for whatever reason you wanted to reorganize the table, you could do so in bite-sized pieces, doing 1/128th of the work at a time (continuing to use 128 from above; 128 is not a magic number, just a number I picked to converse with here...)
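A sketch of the setup described in the concurrency bullet above (table, column, and index names are made up; 128 is again just the number used in the discussion):

create table orders
( order_id   number,
  cust_name  varchar2(30),
  order_data varchar2(100)
)
partition by hash (order_id)
partitions 128;

-- locally partition the primary key index: sequence-driven inserts now hit
-- 128 "lukewarm" right-hand sides instead of one hot one
create unique index orders_pk on orders(order_id) local;
alter table orders add constraint orders_pk
  primary key (order_id) using index orders_pk;

-- indexes searched by other columns are left non-partitioned (or globally
-- partitioned) so a lookup probes one index, not 128
create index orders_cust_name_idx on orders(cust_name);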


disk contention is something I'd rather fix with ------- disks. RAID or ASM (which gives us the striping of RAID)

1) you can a) exchange the old partition with an empty table, so as to ARCHIVE the data before dropping it, b) truncate the old partition - just nuke the data, or c) drop the partition - just nuke the data. What you do is up to you. (A sketch of the exchange approach follows item 3 below.)

2) why would the runtime environment "bother" with them at all? The runtime environment would hopefully have something in its predicate so that it ONLY VISITS DATA OF INTEREST. This is the point that always confuses ME. A big table is as efficient as a small table - the fact that there is 77 years of data matters not a bit to me - because I only access data of interest. The runtime system should never (by design) visit these partitions if there is no data of interest there. We use predicates on dates to eliminate partitions from consideration, we use predicates on other attributes to eliminate index partitions from consideration, and so on.


3) global partitioned indexes are not likely to decrease access times in general. You *might* decrease an index's height from 4 to 3 - which would decrease the number of IOs we do to get to a key - which would be good. In short, you would have to analyze your data to see if there would be any benefit to partitioning the indexes.
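A sketch of option 1a above - exchanging the oldest partition with an empty table so the data can be archived before the partition is dropped (names are placeholders):

-- empty table with the same shape as the partitioned table t
create table t_archive as select * from t where 1=0;

-- the old partition's rows swap into t_archive (a dictionary operation);
-- update global indexes keeps the global indexes usable during the exchange
alter table t exchange partition p_2007_01 with table t_archive
  update global indexes;

-- archive or compress t_archive as needed, then drop the now-empty partition
alter table t drop partition p_2007_01;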


Now, looking at clustering factors:


ops$tkyte%ORA10GR2> create table organized
  2  as
  3  select x.*
  4    from (select * from stage order by object_name) x
  5  /
Table created.

ops$tkyte%ORA10GR2> create table disorganized
  2  as
  3  select x.*
  4    from (select * from stage order by dbms_random.random) x
  5  /
Table created.

ops$tkyte%ORA10GR2> create index organized_idx on organized(object_name);
Index created.

ops$tkyte%ORA10GR2> create index disorganized_idx on disorganized(object_name);
Index created.

ops$tkyte%ORA10GR2> begin
  2  dbms_stats.gather_table_stats
  3  ( user, 'ORGANIZED',
  4    estimate_percent => 100,
  5    method_opt=>'for all indexed columns size 254'
  6  );
  7  dbms_stats.gather_table_stats
  8  ( user, 'DISORGANIZED',
  9    estimate_percent => 100,
 10    method_opt=>'for all indexed columns size 254'
 11  );
 12  end;
 13  /
PL/SQL procedure successfully completed.

<b>those two tables contain the same bits and bytes - just in different orders on disk....</b>

ops$tkyte%ORA10GR2> select table_name, blocks, num_rows, 0.05*num_rows, 0.10*num_rows from user_tables
  2  where table_name like '%ORGANIZED' order by 1;

TABLE_NAME                         BLOCKS   NUM_ROWS 0.05*NUM_ROWS 0.10*NUM_ROWS
------------------------------ ---------- ---------- ------------- -------------
DISORGANIZED                          707      49926        2496.3        4992.6
ORGANIZED                             707      49926        2496.3        4992.6

ops$tkyte%ORA10GR2> select table_name, index_name, clustering_factor from user_indexes
  2  where table_name like '%ORGANIZED' order by 1;

TABLE_NAME                     INDEX_NAME                     CLUSTERING_FACTOR
------------------------------ ------------------------------ -----------------
DISORGANIZED                   DISORGANIZED_IDX                           49858
ORGANIZED                      ORGANIZED_IDX                                687

<b>same number of rows - 5% of the data would be 2,500 records.  2.5% would be about 1,250 rows...

note the clustering factors.  for the disorganized table - the clustering factor is near the number of rows in the table (hence the table is NOT sorted on disk in the same way the index keys are in the index).  For the organized table, the clustering factor is near the number of blocks - hence the table data is sorted rather nicely by the indexed key values on disk - makes sense, we loaded the table sorted by the index key</b>


ops$tkyte%ORA10GR2> set autotrace traceonly explain
ops$tkyte%ORA10GR2> select * from organized where object_name like 'X%';

Execution Plan
----------------------------------------------------------
Plan hash value: 1925627673

-------------------------------------------------------------------------------
| Id  | Operation                   | Name          | Rows  | Bytes | Cost (%CP
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |               |    68 |  6324 |     3   (
|   1 |  TABLE ACCESS BY INDEX ROWID| ORGANIZED     |    68 |  6324 |     3   (
|*  2 |   INDEX RANGE SCAN          | ORGANIZED_IDX |    68 |       |     2   (
-------------------------------------------------------------------------------


ops$tkyte%ORA10GR2> select * from disorganized where object_name like 'X%';

Execution Plan
----------------------------------------------------------
Plan hash value: 3767053355

-------------------------------------------------------------------------------
| Id  | Operation                   | Name             | Rows  | Bytes | Cost (
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                  |    68 |  6324 |    71
|   1 |  TABLE ACCESS BY INDEX ROWID| DISORGANIZED     |    68 |  6324 |    71
|*  2 |   INDEX RANGE SCAN          | DISORGANIZED_IDX |    68 |       |     2
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_NAME" LIKE 'X%')
       filter("OBJECT_NAME" LIKE 'X%')

<b>when selecting 68 rows (0.13%) both tables use the index - but look at the cost - the cost is a function of the number of blocks Oracle expects to visit (in addition to other things, but the number of blocks is a big part of it)</b>

ops$tkyte%ORA10GR2> select * from organized where object_name like 'A%';

Execution Plan
----------------------------------------------------------
Plan hash value: 1925627673

-------------------------------------------------------------------------------
| Id  | Operation                   | Name          | Rows  | Bytes | Cost (%CP
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |               |  1125 |   102K|    23   (
|   1 |  TABLE ACCESS BY INDEX ROWID| ORGANIZED     |  1125 |   102K|    23   (
|*  2 |   INDEX RANGE SCAN          | ORGANIZED_IDX |  1125 |       |     7   (
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_NAME" LIKE 'A%')
       filter("OBJECT_NAME" LIKE 'A%')

ops$tkyte%ORA10GR2> select * from disorganized where object_name like 'A%';

Execution Plan
----------------------------------------------------------
Plan hash value: 2727546897

-------------------------------------------------------------------------------
| Id  | Operation         | Name         | Rows  | Bytes | Cost (%CPU)| Time
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |              |  1125 |   102K|   224   (3)| 00:00:0
|*  1 |  TABLE ACCESS FULL| DISORGANIZED |  1125 |   102K|   224   (3)| 00:00:0
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OBJECT_NAME" LIKE 'A%')

<b>but when selecting 1,125 rows - a mere 2.25% - we see different plans.  Same data, no partitioning involved, just different layout on disk...</b>



Clustering Factor on Disks

MK, June 05, 2008 - 4:19 pm UTC

Thanks a million for the example. Point noted. I will read more into the actual data distribution and the on-disk clustering factor, and perform some tests too. Is there a good source to read from in regards to the clustering factor?

Now a bunch of questions. :)

1. You are right in noting that the partitioning scheme did not introduce the full table scans: today I decided to copy all the data into a non-partitioned table and noticed that the full scan was still happening. It was weird to me, but I suppose it gets a bit clearer now. I noted that someone was joining two tables, with table A returning about 100 rows and table B returning 10,000 rows, both having a one-to-one relationship between IDs, and this query did not use table A as a driving table to always restrict getting 100 rows from table B. So I was looking into Jonathan Lewis' idea of subquery factoring with a materialize hint (but first I will look into the clustering). Do you think that will mess up the CBO in any way?

2. Can I perform a review on the problematic tables to see what their clustering factors are, and if I do see the problems that you illustrated, what is the way to fix it? Would it mean an entire drop and recreate with the data entered in an orderly fashion to improve the clustering factor?

3. Should I rely on 10g's auto-analyze feature that runs every night (I think) to gather statistics? Or would you recommend writing a script that gathers these statistics using the DBMS_STATS package instead?

4. In an OLTP environment, if you had columns with low cardinality on a table, would you consider using a bitmap index on it? If not... why not?

Cheers,
MK

Clustering Factor on Disks

MK, June 11, 2008 - 10:47 am UTC

Hi,
I have been reading up on Jonathan Lewis' CBO book, and one of the chapters goes into quite some detail about the clustering factor. I was wondering if there are ways of improving the clustering factor other than having to manipulate the statistics of the table to "trick" the CBO?
Waiting for a response.

Cheers,
MK
Tom Kyte
June 11, 2008 - 11:18 am UTC

manipulating the statistics does not "improve" the clustering factor. It changes an input into the optimizer and causes the optimizer to do something different.

In order to improve the actual computed clustering factor - the moving of BITS and BYTES is necessary.

You need to have the rows in the table sequenced in order of the index keys.

create table emp
( empno    number primary key,
  hiredate date,
  ename    varchar2(30)
)
/

-- the primary key constraint already creates an index on empno,
-- so we only add the other two
create index hiredate_idx on emp(hiredate);
create index ename_idx on emp(ename);

create sequence empno_seq;

inserts into this table are of the form:

insert into emp (empno, hiredate, ename)
values (empno_seq.nextval, sysdate, :ename);


In all likelihood:

the index on empno will be nicely clustered (cluster factor = number of blocks in table). This is because we TEND to append to the end of the table in most cases (assuming no deletes, or few deletes - emp information has to stick around forever...). So the rows on disk are accidentally ordered by empno.

ditto for hiredate - sysdate is always increasing.


but - because we don't hire people whose names start with A and then with B and then with C - but rather we hire people with randomly distributed names - the clustering factor of the ename idx will be poor - near the number of rows - and the only way to "improve it" would be to reorder the rows on disk (create table as select order by ename, dbms_redefinition - which can reorder rows as well in 10g)

But remember, if we improve the clustering factor of the ename_idx, we'll destroy the clustering factor of the other two

But that might not be an issue - if you "where empno = ?", "where hiredate = ?", the clustering factor won't be a big issue (small range scans). The clustering factor comes into play with LARGE range scans - and if you "where ename like 'A%'" then having the data on disk sorted....
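One way to actually move the bits and bytes for the ename case - a sketch only (in practice, dbms_redefinition can do the same thing online in 10g, and the constraints and any other indexes would have to be recreated as well):

create table emp_reorg as select * from emp order by ename;

drop table emp;
rename emp_reorg to emp;

-- ename_idx now gets a clustering factor near the number of blocks; the
-- empno and hiredate indexes drift toward the number of rows instead
create index ename_idx on emp(ename);
create index hiredate_idx on emp(hiredate);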

Partition basics

Ross, September 10, 2008 - 10:34 pm UTC

Can you please advise if there are any limitations to the number of partitions / subpartitions on a table?

I have a large table which is approx 40GB and increases by 20 million rows/day. The data is only kept for 3 weeks for aggregation. As such, I'm wondering if I can set hourly partitions....
Tom Kyte
September 16, 2008 - 12:00 pm UTC

Physics Database

MK, November 25, 2009 - 3:28 pm UTC

Hi,
I am working on a Physics project at the moment that needs to look at some very large mathematical data sets for which I wish to use an Oracle implementation. Maybe explaining with Pseudo code would work best.

1. Reduce the space by eliminating lot of points with a query or function.
2. From the reduced space, generate more points using a genetic algorithm that relate to this reduced space.
3. Keep repeating steps 1 and 2 till you reach a certain threshold value and then stop!

Now, I am thinking of how best to do this, as I want to restrict my queries to scan only the so-called interesting data set and prune the rest. I don't want to have dynamic SQL which generates temporary tables into which I can store my subset data and then get rid of it when I am done. Also note that at a given time I might have to keep a few of these "reduced" data sets. Also, step 1 can be parallelised too. So I was thinking along the lines of partitioning and making sure that my queries did partition elimination all the time and looked at only the interesting subsets.
The problem is that this kind of data will shrink/grow depending on the step in the algorithm. What's the best way forward?

Thanks,
MK

Physics Database

MK, November 25, 2009 - 3:30 pm UTC

Sorry to add to the question: do you think using materialized views might assist in solving the problem, and can they be partitioned with local indexes?
Tom Kyte
November 28, 2009 - 11:12 am UTC

MK

sorry, I cannot really advise on this one - it is too vague


...
1. Reduce the space by eliminating lot of points with a query or function.
reduce the space - how? what is this query or function? what is the set of inputs? what is the volume of data? What is the process to "reduce"?

2. From the reduced space, generate more points using a genetic algorithm that relate to this
reduced space.
same questions - all centered about "how"?

3. Keep repeating steps 1 and 2 till you reach a certain threshold value and then stop!
ditto...
....


sounds recursive and a bit procedural - you might look at the model clause

http://docs.oracle.com/docs/cd/B19306_01/server.102/b14223/sqlmodel.htm
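As a rough illustration of the "iterate until a threshold" style of processing the MODEL clause supports (the values and the rule here are entirely made up):

-- each iteration triples the measure; iteration stops as soon as the
-- threshold in the UNTIL clause is crossed
select x, val
  from (select 1 as x, 1 as val from dual)
 model
   dimension by (x)
   measures (val)
   rules iterate (100) until (val[1] > 1000)
   ( val[1] = val[1] * 3 );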
