Skip to Main Content
  • Questions
  • Merge syntax to update selected partitions only

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Prasad.

Asked: November 10, 2011 - 2:39 am UTC

Answered by: Tom Kyte - Last updated: July 30, 2012 - 9:33 am UTC

Category: Developer - Version: 11g

Viewed 10K+ times! This question is

You Asked

HI Tom,

I have a table with multiple partitions and each partition has 5-10 millions. I want to update one column from source (source has 20 millions and spread in target table for multiple partitions) using merge statement for few selected partitions. Could you provide me the correct syntax?.

is the below syntax is correct

merge into table a partition (partition_name)
using source table b
when matched then update set a.col1=b.col2.

can i use any parallel hints if so do i need to use in update clause or in merge clause.


In fact i want to run this in loop for selected partitions selecting from one table where all selected partitions are listed and every time replacing the partition name.

Could help
Thx
Prasad

and we said...

Yes, you can use the partition extended name in the merge:

ops$tkyte%ORA11GR2> CREATE TABLE t
  2  (
  3    dt  date,
  4    x   int,
  5    y   varchar2(30)
  6  )
  7  PARTITION BY RANGE (dt)
  8  (
  9    PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
 10    PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
 11    PARTITION junk VALUES LESS THAN (MAXVALUE)
 12  )
 13  /

Table created.

ops$tkyte%ORA11GR2> insert into t
  2  select to_date( '12-mar-2003', 'dd-mon-yyyy' ) + mod(rownum,3), 0, null
  3    from all_users;

50 rows created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create table t2 ( x int, y varchar2(30) );

Table created.

ops$tkyte%ORA11GR2> insert into t2 values ( 0, 'hello' );

1 row created.

ops$tkyte%ORA11GR2> commit;

Commit complete.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> merge into t partition(part1)
  2  using t2
  3  on (t.x = t2.x)
  4  when matched then update set t.y = t2.y;

16 rows merged.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select dt, x, y from t;

DT                  X Y
---------- ---------- ------------------------------
03/12/2003          0 hello
03/12/2003          0 hello
03/12/2003          0 hello
03/12/2003          0 hello
03/12/2003          0 hello
03/12/2003          0 hello
03/12/2003          0 hello
03/12/2003          0 hello
03/12/2003          0 hello
03/12/2003          0 hello
03/12/2003          0 hello
03/12/2003          0 hello
03/12/2003          0 hello
03/12/2003          0 hello
03/12/2003          0 hello
03/12/2003          0 hello
03/13/2003          0
03/13/2003          0
03/13/2003          0
03/13/2003          0
03/13/2003          0
03/13/2003          0
03/13/2003          0
03/13/2003          0
03/13/2003          0
03/13/2003          0
03/13/2003          0
03/13/2003          0
03/13/2003          0
03/13/2003          0
03/13/2003          0
03/13/2003          0
03/13/2003          0
03/14/2003          0
03/14/2003          0
03/14/2003          0
03/14/2003          0
03/14/2003          0
03/14/2003          0
03/14/2003          0
03/14/2003          0
03/14/2003          0
03/14/2003          0
03/14/2003          0
03/14/2003          0
03/14/2003          0
03/14/2003          0
03/14/2003          0
03/14/2003          0
03/14/2003          0

50 rows selected.



to do this in a loop, you'll need to use dynamic sql in plsql since you cannot bind an identifier and the partition name is an identifier (so you'll build the merge in a string and then use execute immediate to execute it)


to understand the implications, implementation and how to invoke parallel DML, please read:

http://docs.oracle.com/cd/E11882_01/server.112/e25523/parallel003.htm#VLDBG1431

from start to finish (seriously, read the implications, the implementation - not just the "here is the magic to invoke this" part please)

and you rated our response

  (24 ratings)

Is this answer out of date? If it is, please let us know via a Review

Reviews

November 15, 2011 - 11:48 pm UTC

Reviewer: A reader

Tom,

Thanks for your quick response and i have few more questions on this.I am trying to update around 1 billion rows which are spread across different partitions and planning one in each partition.

1) Can i use parallel hint in Merge statement like
merge /*+ parallel ( tablename1 8) (tablename2 8) */ tablename1 using tablename2
on
( tablename1.X=tablename2.y
and tablename1.p=tablename2.q)
when matched then set A=b;

2) in second scenario the join columns from 2 tablename1 and tablename2 like X,Y,p and q doesn't have any indexes. what is the best way to write a merge statement

3) In second scenario how merge perform if we have indexes on join columns on 2 tables.


Could you help.
Thx
Prasad


Tom Kyte

Followup  

November 16, 2011 - 9:47 am UTC

please read (or re-read) my previous comment right above, that stands.

1) yes.

however if you are going to update 1,000,000,000 rows - I would suggest a CREATE TABLE AS SELECT rather than an update.

that can be parallel
it will be direct path
it can bypass redo
it will bypass undo
it will avoid migrated rows
it will probably be the best approach in this case.


2) if you are doing a large operation, indexes would be the worst thing to use. You want full scans and hash joins.

http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:6749454952894

3) poorly if you forced us to use indexes, we won't though unless you make us.

merge update and insert.

November 19, 2011 - 1:11 pm UTC

Reviewer: Amir Riaz

Tom,

In one of the thread at ask tom, you said in case of /*+append*/ hint merge statement append the inserts above high water mark and keep on built index on that data, when operation is over, merge command merge the real index and append index. you called it a bulk operation.

So in case of update, merge statement should do the same, instead of going to each row and updating, merge state build the complete row and insert it above high water mark. it remove the rowid from the update block.

That is why update statement and merge state which has only update clause has different performance which run in parallel.

update statement even when run parallel goes to block, find rowid, goes to row, update the column and logs redo. merge statement gets cursor row, update the column in cursor and insert it above high water mark, add it to small index. remove the rowid from block.

What is you opinion. I am trying for a test but dont know how to prove this theory.
Tom Kyte

Followup  

November 21, 2011 - 1:46 pm UTC

So in case of update, merge statement should do the same,

no it shouldn't, in order to do that, merge would have to DELETE the original row and then RE-INSERT it.

Which would be a lot more work than just updating it. DELETE generates the most UNDO in general (all of the row has to be written to UNDO), DELETE generates the most REDO in general. DELETE is in general the most expensive of DML statements. You wouldn't want to do that.

That is why update statement and merge state which has only update clause has
different performance which run in parallel.


I was unable to parse and understand what you meant by that.



Merge should NOT delete and then insert - that would be a lot more resource intensive than just updating.



If you have a MASSIVE amount of data to merge - you should be considering a CREATE TABLE AS SELECT - instead of using DML - so you can skip undo and redo, index maintenance, etc altogether

January 11, 2012 - 2:49 am UTC

Reviewer: A reader

Very useful

January 11, 2012 - 2:56 am UTC

Reviewer: A reader

Tom

For performance tuning one of our DBA suggested to add parallel hints parallel (tablename, 4) etc in all queries to get better performance.

is that a good idea adding hints in all queries even for small and big tables? is there a best way to handle this.

My doubt was if we keep on using hints like this in all queries my understanding is that later at one point of time queries waiting for resources(parallel slots) instead start immediately. Could you explain how this affect our system going forward.
Tom Kyte

Followup  

January 11, 2012 - 9:32 am UTC

do you have access to my book "Expert Oracle Database Architecture" - the chapter on parallel.

It is way too simplistic to say "add parallel hints" everywhere.

January 11, 2012 - 11:16 am UTC

Reviewer: mfz from 08831


In SQL Server , I can do the following with MERGE...

If the record does not exist in the source table , I can delete that record in the destination table .
( See the line 19 in the code below)

I dont see that option in Oracle's implementation of MERGE . How can I achieve that in Oracle's MERGE?


1> CREATE TABLE t_source
2> (
3> col1 INTEGER NOT NULL PRIMARY KEY ,
4> col2 CHAR(30) NOT NULL ,
5> col3 CHAR(1) NOT NULL
6> ) ;
7> go

1> CREATE TABLE t_dest
2> (
3> col1 INTEGER NOT NULL PRIMARY KEY ,
4> col2 CHAR(30) NULL ,
5> col3 CHAR(1) NULL
6> ) ;
7> go





1> Merge t_dest d USING
2> ( SELECT col1 scol1 , col2 scol2 , col3 scol3 FROM t_source
3> ) s ON ( s.scol1 = d.col1 )
4> WHEN NOT matched THEN
5> INSERT
6> (
7> col1 ,
8> col2 ,
9> col3
10> )
11> VALUES
12> (
13> s.scol1 ,
14> s.scol2 ,
15> s.scol3
16> )
17> WHEN matched AND col3 = 'N' THEN
18> UPDATE SET col2 = s.scol2 , col3 = s.scol3

19> WHEN NOT matched BY source THEN
20> DELETE ;


Tom Kyte

Followup  

January 11, 2012 - 3:24 pm UTC

our merge does not do that, we would have to make the source be a full outer join of source+dest

and then in the 'when matched' use the delete clause on the matched clause where the source record key was null.


or just delete from dest where (key) not in (select key from source)


At that point, since EVERY record in DEST is getting modified - every record will either be updated or deleted and new records inserted - I would opt for a CREATE TABLE AS SELECT from source and dest and just drop the old dest table.

No redo
No undo
No fragmentation of anything
Easily compressed
Faster than maintain every single index in real time


In short, I wouldn't want to use DML at all for something like that, I'd want to use bulk direct path DDL plain and simple

Updating Million Rows

January 11, 2012 - 11:21 am UTC

Reviewer: PP from New York

Agreed, instead of updating million rows create a new table would help.
Here is your answer:
however if you are going to update 1,000,000,000 rows - I would suggest a CREATE TABLE AS SELECT rather than an update.

that can be parallel
it will be direct path
it can bypass redo
it will bypass undo
it will avoid migrated rows
it will probably be the best approach in this case.

Clearly understood what you mean and the performance. But, there are hundreds of objects depend on this table and instead of updating the table, creating new table and renaming the actual table and renaming the new table back to the actual table would bring the entire system down and make it un-usable until all the depend objects are compiled back to valid status. What is your answer, for this?
Tom Kyte

Followup  

January 11, 2012 - 3:30 pm UTC

create your table as a partitioned table - with one partition.

create your new table as select from old (this new table is just a normal table). index this new table.

alter table partitioned and exchange its only partition with this new table including indexes without validation.

nothing goes invalid, grants, triggers, whatever all stay in place.

You can go the otehr way, too!

January 12, 2012 - 12:58 pm UTC

Reviewer: Mike from Richfield, OH USA

You said:
> create your table as a partitioned table - with one partition.
>
>create your new table as select from old (this new table is just a normal table). index this new table.
>
>alter table partitioned and exchange its only partition with this new table including indexes without validation.
>

If your 'permanent' table is already in place, and it is not partitioned, you can:
- Create (CTAS) your new table as a partitioned table with one partition
- ALTER TABLE your_CTAS_table EXCHANGE PARTITION its_only_partition WITH your_permanent_table INCLUDING INDEXES WITHOUT VALIDATION;
Tom Kyte

Followup  

January 13, 2012 - 9:15 am UTC

indeed, it can go either way

ORA-14763

January 16, 2012 - 9:26 am UTC

Reviewer: Sokrates

in your first answer on this thread, you say

to do this in a loop, you'll need to use dynamic sql in plsql since you cannot bind an identifier and the partition name is an identifier

this seems to be correct, however, the reason you give seems not to be correct - it sounds like you would need the partition name to do a merge into a specific partition, but

merge into t partition for (date'2003-03-12')
using t2
on (t.x = t2.x)
when matched then update set t.y = t2.y;

also works (without using the partition name), even

merge into t partition for (to_date('2003-03-12', 'yyyy-mm-dd'))
using t2
on (t.x = t2.x)
when matched then update set t.y = t2.y;

works.

But
variable s varchar2(10)
exec :s := '2003-03-12'
merge into t partition for (to_date(:s, 'yyyy-mm-dd'))
using t2
on (t.x = t2.x)
when matched then update set t.y = t2.y;

fails with

ORA-14763: Unable to resolve FOR VALUES clause to a partition number


why this implementation restriction ?
Seems to me that the partition must be resolvable during compile time, is this true ?

Tom Kyte

Followup  

January 17, 2012 - 2:35 pm UTC

Seems to me that the partition must be resolvable during compile time, is this true ?


yes, you cannot bind identifiers - the plan (the row source generated object, the byte code that is the stuff the SQL VM processes - runs) would necessarily be different if you used different segments.

Identifiers must be known at compile time to build the program that is your sql statement...

In the above, you are not binding anything (and when you do - bamm - it fails to compile). they are static, they cannot change no matter how many times you run that statement.

If you use a to_date that is not deterministic, it won't work either:



ops$tkyte%ORA11GR2> merge into t partition for (to_date('03-03-12', 'rr-mm-dd'))
  2  using t2
  3  on (t.x = t2.x)
  4  when matched then update set t.y = t2.y;
using t2
      *
ERROR at line 2:
ORA-14763: Unable to resolve FOR VALUES clause to a partition number




necessarily ?

January 18, 2012 - 2:05 am UTC

Reviewer: Sokrates

the plan (the row source generated object, the byte code that is the stuff the SQL VM processes - runs) would necessarily be different if you used different segments.

I don't see the the "necessarily" part.

See:
the plan for
select * from t where dt=to_date(:s, 'yyyy-mm-dd')
is
-----------------------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |     2 |    78 |     8   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE SINGLE|      |     2 |    78 |     8   (0)| 00:00:01 |   KEY |   KEY |
|*  2 |   TABLE ACCESS FULL    | T    |     2 |    78 |     8   (0)| 00:00:01 |   KEY |   KEY |
-----------------------------------------------------------------------------------------------


(partition pruning at runtime, at runtime different segments can be used, can't they ? - no necessity here to use different plans for different segments )

the plan for
merge into t partition for (to_date('2003-03-12', 'yyyy-mm-dd'))
using t2
on (t.x = t2.x)
when matched then update set t.y = t2.y;

is
--------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
|   0 | MERGE STATEMENT           |      |    37 |  1258 |     7  (15)| 00:00:01 |       |       |
|   1 |  MERGE                    | T    |       |       |            |          |       |       |
|   2 |   VIEW                    |      |       |       |            |          |       |       |
|*  3 |    HASH JOIN              |      |    37 |  2997 |     7  (15)| 00:00:01 |       |       |
|   4 |     TABLE ACCESS FULL     | T2   |     1 |    30 |     2   (0)| 00:00:01 |       |       |
|   5 |     PARTITION RANGE SINGLE|      |    37 |  1887 |     4   (0)| 00:00:01 |     1 |     1 |
|   6 |      TABLE ACCESS FULL    | T    |    37 |  1887 |     4   (0)| 00:00:01 |     1 |     1 |
--------------------------------------------------------------------------------------------------

also shows a
TABLE ACCESS FULL on T, here with Pstart=1, Pstop=1

The plan for
merge into t partition for (to_date(:s, 'yyyy-mm-dd'))
using t2
on (t.x = t2.x)
when matched then update set t.y = t2.y;


could be the same one with
Pstart=KEY, Pstop=KEY,

couldn't it ?

( on my wishlist for features which could be implemented: that the partition for-clause would support binds )

Tom Kyte

Followup  

January 18, 2012 - 7:31 am UTC

I'm not talking about a full scan versus a full scan. I'm talking about the generated byte code, the thing we store in the shared pool.


The BYTE code - not the plan, but the thing that the sql vm executes - would be necessarly different DEPENDING ON THE INPUTS.

The segment(s) to be accessed are part of this byte code, part of this program. They are not inputs to it, they are an integral part of it (goes back to identifiers cannot be bound, they are burned into this byte code, this program)


I don't care about the plan - I care about the generated byte code, the program, the executable that is generated. It depends on KNOWING what it is accessing.


The merge with the "for" clause is using an identifier, it is burnt into the program that is generated, that is what that syntax is for. If you want key/key - use a bind variable and a where clause. If you want a single partition to be identified - use the 'for' clause.

plan vs. byte code

January 18, 2012 - 12:26 pm UTC

Reviewer: Sokrates

so what is the difference between the [SQL execution] plan and the [SQL] byte code ?
Seems that I have a false understanding because I assumed (until now) they were the same (in the sense that a plan is an algorithm and the code a runnable implementation of this algorithm).

The full scan was only an example.

What I wanted to say:
take any case where partition pruning at runtime occurs.

There, the segment(s) involved at runtime are *not* burnt into the byte code, rather they are computed at runtime.
I still don't see why for a
"partition for"
clause this cannot also hold.


Tom Kyte

Followup  

January 18, 2012 - 1:09 pm UTC

an executing plan is for us human beings, it is just a text report of what that program will be doing. it doesn't include things like "is sql trace going to happen or not" - but the 'program' will (that is why if you turn on sql trace after running a query and run it again, it will hard parse - we need an entirely new program for it)


Seems that I have a false understanding because I assumed (until now) they
were the same (in the sense that a plan is an algorithm and the code a runnable
implementation of this algorithm).


no, that is an EXCELLENT analogy. the plan is an algorithm, generic, reusable, not fixed on a single 'thing' - an approach. And we take that algorithm and turn it into a specific program for a specific segment with a specific set of extra stuff (like sql_trace=true) - to actually implement that algorithm.




I still don't see why for a
"partition for"
clause this cannot also hold.


because partition for is, was designed specifically as a different way to specify the identifier:

"table_name partition(partition_name)"


partition for is an identifier, specified in a different - perhaps more flexible manner - than using the partition extended name. But it is an identifier none the less - not a where clause specification - and hence like all identifiers must be known at compile time.

If you want the flexibility of the "partition for" with the dynamic features of the where clause, we'll have to incorporate the data dictionary in your query so we can get the high/low values of the partition key and go from there.

so, what about ...

January 18, 2012 - 2:24 pm UTC

Reviewer: Sokrates

SQL> CREATE TABLE t
  2      (
  3        dt  date,
  4        x int
  5      )
  6      PARTITION BY RANGE (dt)
  7      (
  8        PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
  9        PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
 10        PARTITION junk VALUES LESS THAN (MAXVALUE)
 11      )
 12  enable row movement;

Table created.

SQL> insert into t values(date'2003-03-12', 1);

1 row created.

SQL> insert into t values(date'2003-03-13', 2);

1 row created.

SQL> select count(*) from t partition for (date'2003-03-12');

  COUNT(*)
----------
         1

SQL> select count(*) from t partition for (date'2003-03-13');

  COUNT(*)
----------
         1


Now, we merge into the "burned into the bytecode" partition for (date'2003-03-12') - and will end in another one !:

SQL> merge into t partition for (date'2003-03-12')
  2  using
  3  (
  4     select 1 x, date'2003-03-13' d
  5     from dual
  6  ) t2
  7  on (t.x = t2.x)
  8  when matched then update set t.dt = t2.d;

1 row merged.



we have merge into partition for (date'2003-03-12') (an identifier, the byte code of our merge statement does know nothing about other partitions of t, correct ?) - and where did we end ?

SQL> select count(*) from t partition for (date'2003-03-12');

  COUNT(*)
----------
         0

SQL> select count(*) from t partition for (date'2003-03-13');

  COUNT(*)
----------
         2



In another one !

So, how could this happen ?
How could our merge statement move one row from one partition into another one which was not burned into it's compiled code ?
Tom Kyte

Followup  

January 18, 2012 - 3:12 pm UTC

Now, we merge into the "burned into the bytecode" partition for (date'2003-03-12') - and will end in another one !:


so what? the program knows

a) it will be reading from only a given partition
b) but it could be updating into any partition

the src (the thing we'll join to, the part of T we read) is a given partition

the merge is for the table in this case.


The into clause told it what to read.

ops$tkyte%ORA11GR2> merge into t partition for (date'2003-03-12')
  2  using
  3  (
  4     select 1 x, date'2003-03-13' d
  5     from dual
  6  ) t2
  7  on (t.x = t2.x)
  8  when matched then update set t.dt = t2.d;

1 row merged.


Execution Plan
----------------------------------------------------------
Plan hash value: 432980191

--------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
|   0 | MERGE STATEMENT           |      |     1 |    28 |    16   (0)| 00:00:01 |       |       |
|   1 |  MERGE                    | T    |       |       |            |          |       |       |
|   2 |   VIEW                    |      |       |       |            |          |       |       |
|   3 |    NESTED LOOPS           |      |     1 |    34 |    16   (0)| 00:00:01 |       |       |
|   4 |     FAST DUAL             |      |     1 |       |     2   (0)| 00:00:01 |       |       |
|   5 |     PARTITION RANGE SINGLE|      |     1 |    34 |    14   (0)| 00:00:01 |     1 |     1 |
|*  6 |      TABLE ACCESS FULL    | T    |     1 |    34 |    14   (0)| 00:00:01 |     1 |     1 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - filter("T"."X"=1)



see the pstart/pstop - it isn't key/key. the merge for bit only said what to read.


and that shows that it could easily be burnt into the code - the partition was well known AT COMPILE TIME.


It is interesting to see what happens if you run the merge again with a when not matched:

ops$tkyte%ORA11GR2> merge into t partition for (date'2003-03-12')
  2  using
  3  (
  4     select 1 x, date'2003-03-13' d
  5     from dual
  6  ) t2
  7  on (t.x = t2.x)
  8  when matched then update set t.dt = t2.d
  9  when not matched then insert (dt,x) values (t2.d, t2.x);
merge into t partition for (date'2003-03-12')
           *
ERROR at line 1:
ORA-14401: inserted partition key is outside specified partition




makes me wonder if the update behavior is a bug.... I'll file one - it doesn't look entirely right to me.


But still - that further shows the data we READ for the merge is burnt into the program (the when not matched clause was fired, but we know that x=1 is in there, just in another partition)




but in reality, all of this discussion really isn't relevant.

There are identifiers -

tablenames
tablename partition(partition-name)
tablename for ....


identifiers CANNOT BE BOUND, period. They must be, by definition, known at compile time. The binary plan, the program, depends on knowing them all.

ORA-14401

January 19, 2012 - 1:27 am UTC

Reviewer: Sokrates

actually, I expected kind of such exception for the update only as well ( and agree with you that it should be raised also in this case )

identifiers CANNOT BE BOUND, period. They must be, by definition, known at compile time.

which definition are you talking about ?
The index of SQL Language Reference
http://docs.oracle.com/cd/E11882_01/server.112/e26088/index.htm
does not contain identifier

So, where can I find this definition ?
Tom Kyte

Followup  

January 19, 2012 - 5:25 am UTC

it is part of the language as defined by ANSI SQL.

have you *ever* been able to bind any identifier, anywhere, in any circumstance?




I'm flip flopping as to whether this is a bug or not. The MERGE actually seems to be working ok.

It will match a row in the specified partition, and if it does, it updates it. If the existing row, which is in the specified partition, needs to move, it will move.

If it does not match a row in the specified partition (even if that row exists elsewhere in the table), it inserts it. If the insert isn't into that partition, it fails.



but again, you cannot bind identifiers, it is just part of the language definition. The sql language reference manual does not contain the entire language (it doesn't really even mention bind variables, that is left to the programming guides)

ok

January 19, 2012 - 5:42 am UTC

Reviewer: Sokrates

ok, thanks, I don't have the ANSI SQL specification available nor am I willing to read it

have you *ever* been able to bind any identifier, anywhere, in any circumstance?

what I thought of, is, that I can use a
table(some_func(:s))

(think of the famous in_list - function
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:210612357425 )
for example in SQL, so there we are not talking about identifiers then ?

I just think, it would be nice to have it available for the partition for - clause, but when it is forbidden by ANSI at the moment, I have to live without it.

Bug or not: probably a matter of taste ?
When the behaviour is not specified explicitly ( in the DOCs ), it may work anyway.

Thank you anyway for sharing your thoughts and insights

Tom Kyte

Followup  

January 19, 2012 - 7:22 am UTC

table(some_func(:s))


the identifier there is some_func - can you 'bind' the function you want to call?


the famous in-list function

can you 'bind' in the name of the famous function? Or - just the inputs to it?



select f(:x), substr(:x,:y,:z) from dual;


F is an identifier
substr is an identifier
dual is an identifier

none of them can be 'bound'


You cannot bind an identifier - in all of these, the identifier is the name of the function - you can't bind that.



got it

January 19, 2012 - 7:23 am UTC

Reviewer: Sokrates


running update concuently for multiple partions at once

January 29, 2012 - 8:50 pm UTC

Reviewer: A reader

Hi Tom,

I have a target table for update and has 100 partitions. I have a source table and has 20 millions records spread in target table for 10 partitions.

I want to run a fire an update concurrently for these 10 partitions or some times may be 5( want to parametrize this number). How do i achieve this through a oracle procedure call. I want to capture the updated record counts as well.

i have seen some suggestions using dbms_job.submit. Could you help.
Tom Kyte

Followup  

January 31, 2012 - 5:48 pm UTC

why would you not use parallel DML - we already do all of this and we take care of the locking.


if you want to do your own "parallel sort of stuff", you could use dbms_job or dbms_scheduler to run as many copies of a stored procedure you write yourself as you want concurrently (up to 1000 of them anyway).


In 11g, you can use dbms_parallel_execute to automate this.

but tell me, why not just parallel dml?


which functions are allowed within identifiers ?

March 13, 2012 - 4:09 am UTC

Reviewer: Sokrates

... you cannot bind an identifier and the partition name is an identifier ...

ok, but some functions are obiously allowed within identifiers , to_date and to_number for example :

sokrates@11.2 > create table ti ( i int ) partition by range(i) ( partition pi values less than (maxvalue) );

Table created.

sokrates@11.2 > select * from ti partition for ( to_number('0') );

no rows selected

sokrates@11.2 > create table td ( d date ) partition by range(d) ( partition pd values less than (maxvalue) );

Table created.

sokrates@11.2 > select * from td partition for ( to_date('13.03.2012', 'dd.mm.yyyy') );

no rows selected


, so they must be available to the SQL-Compiler.

Exceptions when evaluating them by the SQL-Compiler seem to be raised as ORA-14763's:
sokrates@11.2 > select * from td partition for ( to_date('13.03.2012', 'abc') );
select * from td partition for ( to_date('13.03.2012', 'abc') )
                                                       *
ERROR at line 1:
ORA-14763: Unable to resolve FOR VALUES clause to a partition number

sokrates@11.2 > select * from ti partition for ( to_number('a') );
select * from ti partition for ( to_number('a') )
                                           *
ERROR at line 1:
ORA-14763: Unable to resolve FOR VALUES clause to a partition number



to_timestamp, for example, seems to be unknown to the SQL-Compiler:

sokrates@11.2 > create table tt ( t timestamp ) partition by range(t) ( partition pt values less than (maxvalue) );

Table created.

sokrates@11.2 > select * from tt partition for ( to_timestamp ( '13-Mar-12', 'DD-Mon-RR' ));
select * from tt partition for ( to_timestamp ( '13-Mar-12', 'DD-Mon-RR' ))
              *
ERROR at line 1:
ORA-14763: Unable to resolve FOR VALUES clause to a partition number



also, to_date seems to underly some restrictions:
sokrates@11.2 > select * from td partition for ( to_date ( '13-03-1212', 'DD-MM-RRRR' ));

no rows selected

sokrates@11.2 > select * from td partition for ( to_date ( '13-Mar-12', 'DD-Mon-RR' ));
select * from td partition for ( to_date ( '13-Mar-12', 'DD-Mon-RR' ))
              *
ERROR at line 1:
ORA-14763: Unable to resolve FOR VALUES clause to a partition number

sokrates@11.2 > REM, though
sokrates@11.2 > select to_date ( '13-Mar-12', 'DD-Mon-RR' ) from dual;

TO_DATE('13
-----------
13-mar-2012



and, for example:

sokrates@11.2 > select * from td partition for ( to_date('12-Maj-2012', 'DD-Mon-RRRR', 'nls_date_language=''swedish''') );
select * from td partition for ( to_date('12-Maj-2012', 'DD-Mon-RRRR', 'nls_date_language=''swedish''') )
                                                        *
ERROR at line 1:
ORA-14763: Unable to resolve FOR VALUES clause to a partition number

sokrates@11.2 > REM BUT:
sokrates@11.2 > select * from td partition for ( to_date('12-May-2012', 'DD-Mon-RRRR', 'nls_date_language=''spanish''') );

no rows selected

sokrates@11.2 > REM though
sokrates@11.2 > select
  2     to_date('12-Maj-2012', 'DD-Mon-RRRR', 'nls_date_language=''swedish'''),
  3     to_date('12-May-2012', 'DD-Mon-RRRR', 'nls_date_language=''spanish''')
  4  from dual;

TO_DATE('12 TO_DATE('12
----------- -----------
12-may-2012 12-may-2012


Questions:
- are to_date and to_number the only supported functions ? do you have a complete list of supported functions ?
- this behaviour of to_date and to_number seems to be undocumented ?
http://docs.oracle.com/cd/E11882_01/server.112/e26088/sql_elements009.htm#i165979
for example does not mention it
- which nls_date_languages are supported by the SQL-Compiler ? ( funny that Spanish is but Swedish isn't ! )
Tom Kyte

Followup  

March 13, 2012 - 7:49 am UTC

they have to be deterministic. They have to be constants.

to_date ( '13-Mar-12', 'DD-Mon-RR' )

is not deterministic. It could be 1912, 2012, 2112 and so on - depending on what year you run it in.

You would need a function that is deterministic. It isn't a restriction in as much as the values clause must evaluate to a literal, a constant. and RR doesn't do that.

sokrates@11.2 > select
  2     to_date('12-Maj-2012', 'DD-Mon-RRRR', 'nls_date_language=''swedish'''),
  3     to_date('12-May-2012', 'DD-Mon-RRRR', 'nls_date_language=''spanish''')
  4  from dual;

TO_DATE('12 TO_DATE('12
----------- -----------
12-may-2012 12-may-2012



look carefully at your output...

Swedish vs. Spanish: haha

March 13, 2012 - 6:17 am UTC

Reviewer: Sokrates

seems that nls_date_language is ignored and only american is supported:

sokrates@11.2 > select
  2    to_char(date'2012-01-01', 'DD-Mon-RRRR', 'nls_date_language=''spanish'''),
  3    to_char(date'2012-01-01', 'DD-Mon-RRRR', 'nls_date_language=''swedish''')
  4  from dual
  5  /

TO_CHAR(DAT TO_CHAR(DAT
----------- -----------
01-Ene-2012 01-Jan-2012

1 row selected.

sokrates@11.2 > select * from td partition for ( to_date('01-Jan-2012', 'DD-Mon-RRRR', 'nls_date_language=''swedish''') );

no rows selected

sokrates@11.2 > select * from td partition for ( to_date('01-Ene-2012', 'DD-Mon-RRRR', 'nls_date_language=''spanish''') );
select * from td partition for ( to_date('01-Ene-2012', 'DD-Mon-RRRR', 'nls_date_language=''spanish''') )
                                                        *
ERROR at line 1:
ORA-14763: Unable to resolve FOR VALUES clause to a partition number


I don't understand

March 13, 2012 - 11:37 am UTC

Reviewer: Sokrates

..look carefully at your output

I don't understand.

my point is:
to_date('12-Maj-2012', 'DD-Mon-RRRR', 'nls_date_language=''swedish''')

is the same as
to_date('12-May-2012', 'DD-Mon-RRRR', 'nls_date_language=''spanish''')
:
sokrates@11.2 > select count(*) from dual
  2  where to_date('12-Maj-2012', 'DD-Mon-RRRR', 'nls_date_language=''swedish''') =
  3        to_date('12-May-2012', 'DD-Mon-RRRR', 'nls_date_language=''spanish''');

  COUNT(*)
----------
         1

1 row selected.


Both are deterministic.
However, only one of them works in the partition-for-clause:
sokrates@11.2 > exec execute immediate q'|select * from td partition for ( to_date('12-May-2012', 'DD-Mon-RRRR', 'nls_date_language=''spanish''') )|'

PL/SQL procedure successfully completed.

sokrates@11.2 > exec execute immediate q'|select * from td partition for ( to_date('12-May-2012', 'DD-Mon-RRRR', 'nls_date_language=''swedish''') )|'
BEGIN execute immediate q'|select * from td partition for ( to_date('12-May-2012', 'DD-Mon-RRRR', 'nls_date_language=''swedish''') )|'; END;

*
ERROR at line 1:
ORA-14763: Unable to resolve FOR VALUES clause to a partition number
ORA-06512: at line 1



So, I suspect the function
to_date
the SQL-Parser uses in order to resolve the identifier specified by a partition-for-clause is another one as the
to_date
-function we normally use in SQL and which is documented on http://docs.oracle.com/cd/E11882_01/server.112/e26088/functions203.htm#SQLRF06132

(for example, the 'nlsparam' argument seems to be ignored) ?
Tom Kyte

Followup  

March 13, 2012 - 4:57 pm UTC

my discussion on deterministic was limited to your use of RR, that was non-deterministic



on that other one, i was pointing out they both came out in english.

you came to that conclusion


seems that nls_date_language is ignored and only american is supported:



it can be made to work in 'swedish', as long as it isn't in swedish


ops$tkyte%ORA11GR2> select * from td partition for ( to_date('12-05-2012', 'DD-MM-RRRR', 'nls_date_language=''swedish''') );

no rows selected

<b>or at the session level</b>

ops$tkyte%ORA11GR2> alter session set nls_date_language='swedish';

Session altered.

ops$tkyte%ORA11GR2> select * from td partition for ( to_date('12-Maj-2012', 'DD-Mon-RRRR' ) );

no rows selected




there is a conversion (probably to make it the long format) happening somewhere and the swedish isn't being applied.

It could be argued that this is a bug I suspect.

thanks

March 14, 2012 - 11:31 am UTC

Reviewer: Sokrates


alternative solution ?

March 14, 2012 - 11:42 am UTC

Reviewer: Sokrates

to return to the original question and to your very first answer to it, I have another solution and would like to know your opinion about it.

Setup (taken from your first answer)
drop table t purge
/

CREATE TABLE t
(
  dt  date,
  x   int,
  y   varchar2(30)
)
PARTITION BY RANGE (dt)
(
   PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
   PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
   PARTITION junk VALUES LESS THAN (MAXVALUE)
)
/

insert into t
select to_date( '12-mar-2003', 'dd-mon-yyyy' ) + mod(rownum,3), 0, null
from all_users;

drop table t2 purge
/

create table t2 ( x int, y varchar2(30) );

insert into t2 values ( 0, 'hello' );
commit;



Then, your example continued with
merge into t partition(part1)
using t2
on (t.x = t2.x)
when matched then update set t.y = t2.y;


Here is the alternative I am thinking about:
merge into
  (select * from t
   where dt < date'2003-03-13'
  ) t
using t2
on (t.x = t2.x)
when matched then update set t.y = t2.y
/


autotrace shows me the same execution plan as for your statement


to do this in a loop, you'll need to use dynamic sql in plsql since you cannot bind an identifier and the partition name is an identifier (so you'll build the merge in a string and then use execute immediate to execute it)


dynamic sql wouldn't be necessary with the alternative, we could use a where clause with parameters, setting these parameter appropriate would result into a loop over all partitions

What do you think about it ?


Tom Kyte

Followup  

March 14, 2012 - 3:42 pm UTC

but you used a date literal? You don't bind those.

and you'd need a between to hit a single partition.

March 15, 2012 - 3:51 am UTC

Reviewer: Sokrates

but you used a date literal? You don't bind those.

so, what ?
it doesn't matter in principle, it only is a matter of partition pruning at compile time vs. partition pruning at runtime.

Of course the plan generated with date literals is different than that one generated with parameters, since in the first case, the optimizer is clever enough to see which partition is is affected ( assuming that the where-clause of the inline-view we merge into specifies only one partition, resulting in a PARTITION RANGE SINGLE PARTITION Row Source Operation )
and in the latter case at compile time it is unknown that only one partition might be affected, resulting in a KEY/KEY PARTITION RANGE ITERATOR PARTITION Operation, which at runtime affects only one partition

drop table t purge
/

CREATE TABLE t
(
  dt  date,
  x   int,
  y   varchar2(30)
)
PARTITION BY RANGE (dt)
(
   PARTITION part1 VALUES LESS THAN (date'2003-03-13'),
   PARTITION part2 VALUES LESS THAN (date'2003-03-14'),
   PARTITION part3 VALUES LESS THAN (date'2003-03-15'),
   PARTITION part4 VALUES LESS THAN (date'2003-03-16'),
   PARTITION part5 VALUES LESS THAN (date'2003-03-17'),
   PARTITION part6 VALUES LESS THAN (date'2003-03-18'),
   PARTITION part7 VALUES LESS THAN (date'2003-03-19'),
   PARTITION part8 VALUES LESS THAN (date'2003-03-20'),
   PARTITION part9 VALUES LESS THAN (date'2003-03-21'),
   PARTITION part10 VALUES LESS THAN (date'2003-03-22'),
   PARTITION junk VALUES LESS THAN (MAXVALUE)
)
/

insert into t select date'2003-03-10'+level,level,null from dual connect by level<=13;
commit;


create table ranges(pos int primary key, highval date);
insert into ranges values(1, date'2003-03-13');
insert into ranges values(2, date'2003-03-14');
insert into ranges values(3, date'2003-03-15');
insert into ranges values(4, date'2003-03-16');
insert into ranges values(5, date'2003-03-17');
insert into ranges values(6, date'2003-03-18');
insert into ranges values(7, date'2003-03-19');
insert into ranges values(8, date'2003-03-20');
insert into ranges values(9, date'2003-03-21');
insert into ranges values(10, date'2003-03-22');
insert into ranges values(11, null);
commit;



create or replace procedure merge_in_a_loop
as
  cursor r is
     select pos, highval, lag(highval) over(order by pos) lhighval from ranges;
begin
  for rr in r loop
    if rr.lhighval is null then
       merge into
         (select * from t
          where dt < rr.highval
         ) t
       using t2
       on (t.x = t2.x)
       when matched then update set t.y = t2.y;
    elsif rr.highval is null then
       merge into
         (select * from t
          where dt >= rr.lhighval
         ) t
       using t2
       on (t.x = t2.x)
       when matched then update set t.y = t2.y;
    else
       merge into
         (select * from t
          where dt >= rr.lhighval and dt < rr.highval
         ) t
       using t2
       on (t.x = t2.x)
       when matched then update set t.y = t2.y;
    end if;
  end loop;
end merge_in_a_loop;
/


exec DBMS_MONITOR.SESSION_TRACE_ENABLE(null, null, waits => false, binds => true)

merge into
  (select * from t
   where dt >= date'2003-03-14' and dt < date'2003-03-15'
  ) t
using t2
on (t.x = t2.x)
when matched then update set t.y = t2.y;

exec merge_in_a_loop

exec DBMS_MONITOR.SESSION_TRACE_DISABLE(null, null)




tkprof shows as execution plan for

merge into
  (select * from t
   where dt >= date'2003-03-14' and dt < date'2003-03-15'
  ) t
using t2
on (t.x = t2.x)
when matched then update set t.y = t2.y;


:

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  MERGE  T (cr=29 pr=0 pw=0 time=1383 us)
         0          0          0   VIEW  (cr=29 pr=0 pw=0 time=1372 us)
         0          0          0    HASH JOIN  (cr=29 pr=0 pw=0 time=1371 us cost=7 size=81 card=1)
         1          1          1     TABLE ACCESS FULL T2 (cr=7 pr=0 pw=0 time=628 us cost=2 size=30 card=1)
         1          1          1     PARTITION RANGE SINGLE PARTITION: 3 3 (cr=22 pr=0 pw=0 time=463 us cost=4 size=51 card=1)
         1          1          1      TABLE ACCESS FULL T PARTITION: 3 3 (cr=22 pr=0 pw=0 time=458 us cost=4 size=51 card=1)


and for the parametrized merge
Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  MERGE  T (cr=29 pr=0 pw=0 time=1111 us)
         0          0          0   VIEW  (cr=29 pr=0 pw=0 time=1103 us)
         0          0          0    FILTER  (cr=29 pr=0 pw=0 time=1102 us)
         0          0          0     HASH JOIN  (cr=29 pr=0 pw=0 time=1097 us cost=7 size=81 card=1)
         1          1          1      TABLE ACCESS FULL T2 (cr=7 pr=0 pw=0 time=575 us cost=2 size=30 card=1)
         1          1          1      PARTITION RANGE ITERATOR PARTITION: KEY KEY (cr=22 pr=0 pw=0 time=326 us cost=4 size=51 card=1)
         1          1          1       TABLE ACCESS FULL T PARTITION: KEY KEY (cr=22 pr=0 pw=0 time=320 us cost=4 size=51 card=1)



'table scan rows gotten' stat showed me further that partition pruning at runtime took place, only one partition was affected by each of the parametrized merges.
Tom Kyte

Followup  

March 15, 2012 - 7:36 am UTC

it doesn't matter in principle, it only is a matter of partition pruning at compile time vs. partition pruning at runtime.


but it would not remove the need for dynamic sql. You wrote:

dynamic sql wouldn't be necessary with the alternative

please provide example that doesn't need dynamic sql.

that is what I thought we were talking about.

example

March 15, 2012 - 7:43 am UTC

Reviewer: Sokrates

please provide example that doesn't need dynamic sql.

I did ?
procedure merge_in_a_loop doesn't need dynamic sql
Tom Kyte

Followup  

March 15, 2012 - 7:51 am UTC

and does not use a date literal.

Look at your original posting please.

Here is the alternative I am thinking about:
merge into
(select * from t
where dt < date'2003-03-13'
) t
using t2
on (t.x = t2.x)
when matched then update set t.y = t2.y
/


I said in response to that:

but you used a date literal? You don't bind those.

and you'd need a between to hit a single partition.


you proposed a date literal. I said "no, that won't work"


you then wrote exactly what I said you'd have to write, something that you can bind and that uses betwee - but since this would be run so infrequently - it screams for "just use a tiny bit of dynamic sql"

no need for dynamic sql

July 25, 2012 - 7:06 am UTC

Reviewer: Sokrates

there is no need for dynamic sql, we also can use DATAOBJ_TO_PARTITION, see
http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:953139476145#5142342900346545075
or
http://laurentschneider.com/wordpress/2007/09/oracle-database-11g-the-top-features-for-dbas-and-developers.html
( <CRTL-F> for DATAOBJ_TO_PARTITION ):

drop table t purge
/

drop table t2 purge
/

CREATE TABLE t
(
  dt  date,
  x   int,
  y   varchar2(30)
)
PARTITION BY RANGE (dt)
(
   PARTITION part1 VALUES LESS THAN (date'2003-03-13'),
   PARTITION part2 VALUES LESS THAN (date'2003-03-14'),
   PARTITION part3 VALUES LESS THAN (date'2003-03-15'),
   PARTITION part4 VALUES LESS THAN (date'2003-03-16'),
   PARTITION part5 VALUES LESS THAN (date'2003-03-17'),
   PARTITION part6 VALUES LESS THAN (date'2003-03-18'),
   PARTITION part7 VALUES LESS THAN (date'2003-03-19'),
   PARTITION part8 VALUES LESS THAN (date'2003-03-20'),
   PARTITION part9 VALUES LESS THAN (date'2003-03-21'),
   PARTITION part10 VALUES LESS THAN (date'2003-03-22'),
   PARTITION junk VALUES LESS THAN (MAXVALUE)
)
/

insert into t select date'2003-03-10'+level,level,null from dual connect by level<=13;
commit;



create table t2 ( x int, y varchar2(30) );

insert into t2 select level, 't2: ' || level from dual connect by level <= 13;
commit;



begin
   for p in (
      select *
      from user_objects 
      where object_name='T' 
      and object_type='TABLE PARTITION'
   ) loop
      merge into t partition (DATAOBJ_TO_PARTITION(t, p.DATA_OBJECT_ID))
      using t2 
      on (t.x = t2.x) 
      when matched then update set t.y = t2.y;
   end loop;
end;
/


You wrote

... You cannot bind an identifier ...


Seems to be disproved ?

what about

... identifiers CANNOT BE BOUND, period. They must be, by definition, known at compile time. The binary plan, the program, depends on knowing them all.
...


?
Tom Kyte

Followup  

July 30, 2012 - 9:33 am UTC

that is interesting - it is just as if you had a predicate on there to hit the values in a single partition - it add a key/key to the plan.

meaning, it evaluated the dataobj_to_partition before the query is executed and just used regular old partition elimination to do this.

nice. I knew about dataobj_to_partition - but forgot about it.

memory

July 31, 2012 - 1:19 am UTC

Reviewer: Sokrates

I knew about dataobj_to_partition - but forgot about it.

probably triggered by the documentation http://docs.oracle.com/cd/E11882_01/server.112/e26088/functions047.htm#SQLRF20028 which suggests that this function is only for very special cases? :

...
Purpose

DATAOBJ_TO_PARTITION is useful only to Data Cartridge developers who are performing data maintenance or query operations on system-partitioned tables that are used to store domain index data. ...


I tried to add a Reader Comment to that with the proposal that they should review that and remove this restriction, but unfortunately received


We're sorry, the topic you requested was not found.

We have recorded this error (404) to help us fix the problem.


I doubt they will fix it.

More to Explore

VLDB

If you are new to partitioning, check out Connor McDonald's introduction series here.