User-defined functions with arbitrary #s of arguments; partition sizes/granularity

Tom Kyte

Thanks for the question, Duke.

Asked: October 06, 2006 - 4:59 pm UTC

Last updated: October 09, 2006 - 8:54 am UTC

Version: 9.2.0

Viewed 1000+ times

You Asked

1. Some Oracle-supplied functions, like GREATEST, LEAST, COALESCE, allow an arbitrary number of parameters to be passed in. Is there some convenient means of creating user-defined functions with an arbitrary number of arguments? (If I put the function in a package, I know I could overload the function with an increasing number of arguments, but that's rather verbose.)

2. When partitioning a table, is there a minimum recommended # of records per partition? I just finished reading
https://asktom.oracle.com/Misc/oramag/on-rownum-and-limiting-results.html
and it doesn't seem to address the granularity of partitions. In your original book Expert One-on-One, you say that "Partitioning is designed to facilitate the management of very large tables and indexes" and cite a "100 GB table for example, [if] it is partitioned into fifty 2 GB partitions".

What is the threshold of "very large"?

For example, we have a partitioned table with 2.8 million records, about 2 GB in size. It's partitioned by month: the largest partition has just 17,000 records. I've never seen any query invoke the partition key, so I see no obvious performance benefit (generally the queries use a global index, then get the data). I can't imagine partitioning improves manageability either. I'd be inclined to unpartition such tables until some far future time.

As I recall, a DBA at a previous company of mine experimented, and reported that one million records was the minimum size of a partition that contributed positively to performance. The fact table he experimented upon was only a couple of million records in size (like the one I mentioned). He said that for full table scans across all the data, which were the most common query, partitions slowed the scans. He de-partitioned the table.

Thank you!



and Tom said...

1) well, they all have limits too...

ops$tkyte%ORA10GR2> declare
2 l_cursor sys_refcursor;
3 l_query long := 'select greatest(1';
4 l_output number;
5 begin
6 for i in 1 .. 254
7 loop
8 l_query := l_query || ','||i;
9 end loop;
10 l_query := l_query || ') x from dual';
11
12 execute immediate l_query into l_output;
13 dbms_output.put_line( l_output );
14 end;
15 /
254

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> declare
2 l_cursor sys_refcursor;
3 l_query long := 'select greatest(1';
4 l_output number;
5 begin
6 for i in 1 .. 2000
7 loop
8 l_query := l_query || ','||i;
9 end loop;
10 l_query := l_query || ') x from dual';
11
12 execute immediate l_query into l_output;
13 dbms_output.put_line( l_output );
14 end;
15 /
declare
*
ERROR at line 1:
ORA-00939: too many arguments for function
ORA-06512: at line 12

all of them do, so you can simply declare a fixed number of arguments (as many as you like) and just default them. The following function could be called as:

f()
f(1)
f(1,2)


ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create or replace function f( p_arg1 in number default null,
2 p_arg2 in number default null )
3 return number
4 as
5 begin
6 return null;
7 end;
8 /

Function created.
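
Just to make that concrete (a sketch, not from the original session - and since f as defined returns null, the output is unexciting), all three call styles listed above are now legal, even straight from SQL:

select f(), f(1), f(1,2) from dual;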

Using a collection, you can do it even "easier" if you ask me - an array is a good way to pass in as many things as you like:

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create or replace type varargs as table of varchar2(4000)
2 /

Type created.

ops$tkyte%ORA10GR2> create or replace function f( p_args in varargs default varargs() ) return number
2 as
3 begin
4 for i in 1 .. p_args.count
5 loop
6 dbms_output.put_line( 'varargs('||i||') = ' || p_args(i) );
7 end loop;
8 return null;
9 end;
10 /

Function created.

ops$tkyte%ORA10GR2> variable x number
ops$tkyte%ORA10GR2> exec :x := f( varargs(1,2,3) );
varargs(1) = 1
varargs(2) = 2
varargs(3) = 3

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> exec :x := f( varargs(1,2,3,4,5,6) );
varargs(1) = 1
varargs(2) = 2
varargs(3) = 3
varargs(4) = 4
varargs(5) = 5
varargs(6) = 6

PL/SQL procedure successfully completed.



2) absolutely... NOT

partitioning isn't necessarily about size (although if my segments were nearing, say, the 10 gig size, I'd think about using partitioning for EASE OF ADMINISTRATION).

Very large is very subjective. What is your threshold for pain?


the number of records - not relevant.

Your goal for partitioning (ease of admin OR partition elimination in a warehouse OR increased availability - pick one, and typically only one - and then use partitioning to achieve it) - priceless.


I'm laughing at that one million number. sorry, but that is just funny.

First, partitioning is not entirely about performance. Many times you have to be really careful not to NEGATIVELY impact performance (that is, it can be done so that performance is not affected at all - BUT the table is easier to administer or more available).


I doubt he tested very well. For a table of any size, the partitions are just not going to affect full scan performance if you have to full scan all of the data. It is like the "single extent myth".


I do not concur with the findings at all.


But - you do need a reason to use partitioning, it is NOT fast=true, it is a tool, nothing more, nothing less.
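
To make the "partition elimination" goal concrete, here is a minimal sketch (not part of the original answer; the table and column names are made up):

create table sales
( sale_dt  date,
  amount   number
)
partition by range (sale_dt)
( partition p2006q1 values less than (to_date('01-apr-2006','dd-mon-yyyy')),
  partition p2006q2 values less than (to_date('01-jul-2006','dd-mon-yyyy')),
  partition pmax    values less than (maxvalue)
);

-- a predicate on the partition key lets the optimizer skip every
-- partition except p2006q1:
select sum(amount)
  from sales
 where sale_dt >= to_date('01-jan-2006','dd-mon-yyyy')
   and sale_dt <  to_date('01-apr-2006','dd-mon-yyyy');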


Comments

David Aldridge, October 07, 2006 - 12:20 am UTC

The only mechanism that I can think of whereby "excessive" partitioning might impact full scan performance would be where each partition's data occupied space smaller than block_size*DFMBRC (db_file_multiblock_read_count), in which case the number of i/o's would be unnecessarily large and hence inefficient. It'd have to be a huge number of partitions to be significant though, I'd think.
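
If you wanted to hunt for partitions that small, something along these lines against the dictionary would do (a sketch, assuming an 8K block size and db_file_multiblock_read_count = 16 - substitute your own values):

select segment_name, partition_name, bytes
  from user_segments
 where segment_type = 'TABLE PARTITION'
   and bytes < 8192 * 16;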

Arbitrary

Duke Ganote, October 07, 2006 - 9:28 am UTC

1. Thanks! Of course, a "large" number of defaulted-value parameters is the obvious solution (well, obvious NOW :). I didn't know if the Oracle-supplied functions used some extra bit of recursive parameter "magic" that just didn't immediately occur to me.

2. So for my world, where:
* table record counts are x,000,000
* we've never removed any historical data
* the partition key isn't used in any known queries
then the utility of partitioning is not obvious. The DIS-utility for us seems to be just extra work when constructing the DDL. My half-baked experiments didn't seem to show any performance impact, which surprised me given the previous results someone else reported.

The only other disutility appears to be complaints from a DBA about the extra effort in restoring the data (say, if a table was accidentally truncated) from multiple tablespaces. The partitions are in unique tablespaces.

I'm inferring there's little utility in de-partitioning existing tables, but also little gain in partitioning any future tables.

I'm glad I could ask, thanks for the answers!

Tom Kyte
October 07, 2006 - 11:01 am UTC

...
then the utility of partitioning is not obvious.
......

that is very well said.


I've said over and over - you need to have a GOAL in mind before partitioning.

It can be "ease of administration", we have really big things and want them more manageable, or we need to PURGE data....

It can be "higher availability", we would like to hash partition by customer ID in our OLTP database, so that if we take a disk hit, only 1/32nd of the customer data is unavailable when we restore that tablespace...

It can be "possible performance boost", mostly in a warehouse (but maybe in a transactional system as it tends to naturally cluster related data...) due to partition elimination.

but you need to have a goal, before you start using it, else it is just "extra stuff"
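
The availability case would look roughly like this (a sketch under assumptions - the 1/32nd figure above maps to 32 hash partitions; the table and column names are hypothetical):

create table customers
( customer_id  number primary key,
  cust_name    varchar2(100)
)
partition by hash (customer_id)
partitions 32;

-- spread the partitions over separate tablespaces; losing one tablespace
-- then takes out only 1/32nd of the customers while you restore it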

Anything worth doing is worth overdoing...

Duke Ganote, October 09, 2006 - 8:22 am UTC

Your and David Aldridge's comments led me to begin experimenting with "excessive" partitioning (and made me reflect back on this question:
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:26039880025641
)
I didn't find the trade-off point, but partitioning can be a performance-inhibitor.  I built 2 tables, one with 10,000 partitions, one with none, and compared performance under a few extreme conditions:

<b>0. THE TEST VERSION & PLATFORM:</b>

Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options

running on my XP laptop with 1GB RAM.

<b>1. META-QUERY TO BUILD LOTSA PARTITIONS:</b>

select 'partition y'||level
     ||' values less than ('||level
     ||') tablespace users,' 
  from dual 
connect by level < 10000

<b>2. TABLE CREATIONS:</b>

create table parti_test (
    trans_id    number,
    trans_dt    date,
    product_code    number,
    store_id    number,
    trans_amount    number(12,2)
)
partition by range (product_code)
(
partition y1 values less than (1) tablespace users,
partition y2 values less than (2) tablespace users,
partition y3 values less than (3) tablespace users,
partition y4 values less than (4) tablespace users,
...
partition y9998 values less than (9998) tablespace users,
partition y9999 values less than (9999) tablespace users,
partition pmax values less than (maxvalue) tablespace users
)
/

Table created.

Elapsed: 00:17:36.02
SQL> alter table parti_test add primary key ( product_code );

Table altered.

Elapsed: 00:03:04.21

SQL> ed
Wrote file afiedt.buf

  1  create table noparti_test (
  2     trans_id        number,
  3     trans_dt        date,
  4     product_code    number,
  5     store_id        number,
  6     trans_amount    number(12,2),
  7          constraint parti_test_pk primary key ( product_code)
  8* )
  9  /

Table created.

Elapsed: 00:00:01.97

17 MINUTES vs 2 SECONDS: LOTS QUICKER TO CREATE THE UNPARTITIONED TABLE.

<b>3. HOW DOES INSERT PERFORMANCE COMPARE?</b>

SQL> ed
Wrote file afiedt.buf

  1  insert into noparti_test
  2  select level, sysdate, level, level, level/2
  3    from dual
  4* connect by level < 10000
SQL> /

9999 rows created.

Elapsed: 00:00:00.30

Execution Plan
----------------------------------------------------------
Plan hash value: 1236776825

-----------------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | INSERT STATEMENT             |      |     1 |     2   (0)| 00:00:01 |
|   1 |  CONNECT BY WITHOUT FILTERING|      |       |            |          |
|   2 |   FAST DUAL                  |      |     1 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------


Statistics
----------------------------------------------------------
        418  recursive calls
       1092  db block gets
        274  consistent gets
          0  physical reads
     902148  redo size
        929  bytes sent via SQL*Net to client
       1026  bytes received via SQL*Net from client
          6  SQL*Net roundtrips to/from client
          3  sorts (memory)
          0  sorts (disk)
       9999  rows processed

SQL> ed
Wrote file afiedt.buf

  1  insert into parti_test
  2  select level, sysdate, level, level, level/2
  3    from dual
  4* connect by level < 10000
SQL> /

9999 rows created.

Elapsed: 00:03:17.39

Execution Plan
----------------------------------------------------------
Plan hash value: 1236776825

-----------------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | INSERT STATEMENT             |      |     1 |     2   (0)| 00:00:01 |
|   1 |  CONNECT BY WITHOUT FILTERING|      |       |            |          |
|   2 |   FAST DUAL                  |      |     1 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------


Statistics
----------------------------------------------------------
        861  recursive calls
     210854  db block gets
      30451  consistent gets
      40240  physical reads
   13049900  redo size
        929  bytes sent via SQL*Net to client
       1024  bytes received via SQL*Net from client
          6  SQL*Net roundtrips to/from client
         12  sorts (memory)
          0  sorts (disk)
       9999  rows processed

UGH! 200,000 LIOs vs 1,000 LIOs.

<b>4. HOW ABOUT SOME SIMPLE SELECT COUNTS?</b>

SQL> select count(*) from parti_test;

            COUNT(*)
--------------------
               9,999

Elapsed: 00:00:01.30

Execution Plan
----------------------------------------------------------
Plan hash value: 3285618050

-----------------------------------------------------------------------------
| Id  | Operation             | Name        | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |             |     1 |     8   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE       |             |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| SYS_C007839 |  7812 |     8   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement


Statistics
----------------------------------------------------------
        668  recursive calls
          0  db block gets
        231  consistent gets
        100  physical reads
        116  redo size
        412  bytes sent via SQL*Net to client
        385  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> select count(*) from noparti_test;

            COUNT(*)
--------------------
               9,999

Elapsed: 00:00:00.18

Execution Plan
----------------------------------------------------------
Plan hash value: 1046195881

-------------------------------------------------------------------------------
| Id  | Operation             | Name          | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |               |     1 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE       |               |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| PARTI_TEST_PK |  9999 |     6   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement


Statistics
----------------------------------------------------------
         92  recursive calls
          0  db block gets
         88  consistent gets
         62  physical reads
        116  redo size
        412  bytes sent via SQL*Net to client
        385  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          3  sorts (memory)
          0  sorts (disk)
          1  rows processed

<b>5. THE TRUNCATES ARE NOT EVEN CLOSE:</b>

SQL> truncate table parti_test;

Table truncated.

Elapsed: 00:35:46.93

SQL> truncate table noparti_test;

Table truncated.

Elapsed: 00:00:03.72
 

Tom Kyte
October 09, 2006 - 8:54 am UTC

but, we'll come back to this:

what was your goal with partitioning in this example? why did you do it? What was the point? did you achieve it?

How often do you create a table?
How often do you truncate the entire thing?
Why didn't you test removing all of the product_code = 42 values and its effect on undo and redo...
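
That last comparison might look something like this (a sketch; under the range DDL above, partition y43 is the one holding product_code = 42):

-- partitioned: near-zero undo, minimal redo
alter table parti_test truncate partition y43 update global indexes;

-- unpartitioned: a conventional delete, full undo and redo for every row
delete from noparti_test where product_code = 42;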


it comes back to - this is a tool, you can use it to solve certain issues, do certain things.

it is not fast=true.
it is neither all goodness nor all evil
