load a pl/sql table with lookup table and use it in my program for look up values

Question and Answer

Tom Kyte

Thanks for the question.

Asked: September 15, 2002 - 9:38 pm UTC

Last updated: April 09, 2008 - 2:54 pm UTC

Version: 8.1.7

Viewed 10K+ times

You Asked

Tom

Recently I was told that instead of repeatedly querying a lookup table for lookup values (like querying a description table for a description using description_id), it is much more efficient to load up a pl/sql table and do the lookup on the pl/sql table.

1. Can you show me how we can load up a pl/sql table?

2. How do we do a lookup using a pl/sql table?

3. If I want to load a pl/sql table with values from more than one column, can we do it? Aren't pl/sql tables single dimensional?

Thanks

Tom

and Tom said...

Why do you believe all that you hear? Did they give you a test case to "prove" this? Hmmmmm.

Did you know that databases were born to be queried? It is one of the things they do best.


I would ask this person who gave you this advice for their test case that proves it. In there you would find out

1) how to load the plsql table (eg: look up BULK COLLECT in the plsql manuals)
2) how to look up using plsql table (eg: they either wrote their own binary search -- which we happen to have in the database already, or they are using a simple, yet non-scalable linear search, start at index 1 and look till you find a match)
3) in 8i, yes they are -- you can bulk collect into a RECORD of ARRAYS however.

But, the scientific proof offered by this person will answer all of your questions.
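In the meantime, for illustration only, here is a minimal sketch of points 1 and 2 against a hypothetical lookup table T(CODE, DATA). In 8i the collections can only be indexed by BINARY_INTEGER, so a string key needs a linear scan (or your own binary search):

declare
    type vcArray is table of varchar2(40) index by binary_integer;

    l_codes  vcArray;
    l_values vcArray;
    l_value  varchar2(40);
begin
    -- 1) load the plsql tables in a single call
    select code, data bulk collect into l_codes, l_values from t;

    -- 2) look up a code -- simple, but a non-scalable linear search
    for i in 1 .. l_codes.count
    loop
        if l_codes(i) = 'AK'
        then
            l_value := l_values(i);
            exit;
        end if;
    end loop;
end;
/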


Comments

I think this was published in Oracle Professional

Scott Watson, September 16, 2002 - 2:46 pm UTC

I believe I read this in Oracle Professional, written by Steve Feuerstein. Here is a link to the article. Basically it is for 9iR2 and uses associative arrays.

http://www.oracleprofessionalnewsletter.com/op/OPmag.nsf/0/F2E754B65828DEF985256C1A0056DF3F

I can rationalize this article by saying that the PL/SQL engine does not have to perform a context switch to the SQL layer. However, using this method would require that each connection maintain its own lookup table, which may require more memory if the number of users using this approach is large. Like any good solution, it requires testing in your environment etc. etc.

HTH
Scott.

Tom Kyte
September 16, 2002 - 8:13 pm UTC

In 9iR2 (not 817 and not before) this may be true with the new associative arrays which are b-tree indexed RAM data structures (very efficient for searching).  This is "brand new" at this point.  A quick and dirty test will show that they are in fact faster and probably more scalable but at the cost of RAM (tradeoffs as usual).

This will NOT work in 817 (note to the original questioner, this will not work for you)

ops$tkyte@ORA920.US.ORACLE.COM> create table t ( code varchar2(5) constraint t_pk primary key, data varchar2(40) )
  2  organization index
  3  /

Table created.

ops$tkyte@ORA920.US.ORACLE.COM> exec gen_data( 'T', 5000 );

PL/SQL procedure successfully completed.

ops$tkyte@ORA920.US.ORACLE.COM> select count(*) from t;

  COUNT(*)
----------
      5000

ops$tkyte@ORA920.US.ORACLE.COM>
ops$tkyte@ORA920.US.ORACLE.COM> declare
  2      type  lookupTable is table of varchar2(40) index by varchar2(5);
  3      type  indexByTable is table of varchar2(40) index by binary_integer;
  4
  5      l_lookupTable lookupTable;
  6      l_codes       indexByTable;
  7      l_values      indexByTable;
  8      l_value       varchar2(40);
  9
 10      l_start number;
 11      l_run1  number;
 12      l_run2  number;
 13  begin
 14      select code, data BULK COLLECT into l_codes, l_values from t;
 15
 16      insert into run_stats select 'before', stats.* from stats;
 17
 18      l_start := dbms_utility.get_time;
 19      for i in 1 .. l_codes.count
 20      loop
 21          select data into l_value from t where code = l_codes(i);
 22      end loop;
 23      l_run1 := (dbms_utility.get_time-l_start);
 24      dbms_output.put_line( l_run1 || ' hsecs' );
 25
 26      insert into run_stats select 'after 1', stats.* from stats;
 27      l_start := dbms_utility.get_time;
 28
 29
 30      for i in 1 .. l_codes.count
 31      loop
 32          l_lookupTable( l_codes(i) ) := l_values(i);
 33      end loop;
 34      for i in 1 .. l_codes.count
 35      loop
 36          l_value := l_lookupTable( l_codes(i) );
 37      end loop;
 38      l_run2 := (dbms_utility.get_time-l_start);
 39      dbms_output.put_line( l_run2 || ' hsecs' );
 40      dbms_output.put_line
 41      ( 'run 1 ran in ' || round(l_run1/l_run2*100,2) || '% of the time' );
 42
 43      insert into run_stats select 'after 2', stats.* from stats;
 44  end;
 45  /
74 hsecs
3 hsecs
run 1 ran in 2466.67% of the time

PL/SQL procedure successfully completed.

ops$tkyte@ORA920.US.ORACLE.COM> select a.name, b.value-a.value run1, c.value-b.value run2,
  2         ( (c.value-b.value)-(b.value-a.value)) diff
  3    from run_stats a, run_stats b, run_stats c
  4   where a.name = b.name
  5     and b.name = c.name
  6     and a.runid = 'before'
  7     and b.runid = 'after 1'
  8     and c.runid = 'after 2'
  9     and (c.value-a.value) > 0
 10     and (c.value-b.value) <> (b.value-a.value)
 11   order by abs( (c.value-b.value)-(b.value-a.value))
 12  /

NAME                                 RUN1       RUN2       DIFF
------------------------------ ---------- ---------- ----------
STAT...recursive calls               5031         50      -4981
STAT...calls to get snapshot scn: kcmgss   5010    16   -4994
STAT...execute count                 5005          7      -4998
STAT...index fetch by key            5000          1      -4999
LATCH.shared pool                    5127        123      -5004
LATCH.cache buffers chains          10167        291      -9876
STAT...session logical reads        10056        103      -9953
STAT...consistent gets              10012         24      -9988
STAT...consistent gets - examination  10000     2   -9998
LATCH.library cache pin             10105        101     -10004
LATCH.library cache                 10258        248     -10010
STAT...session pga memory               0     476412     476412

62 rows selected.

ops$tkyte@ORA920.US.ORACLE.COM>


All in all, if you got the RAM and you got 9iR2, this makes sense.  Before then -- I question it.
 

Tried and Tested

Adrian Billington, September 16, 2002 - 3:58 pm UTC

Tom

Initialising package global index-bys with data from lookup tables, using the numeric PK as the index (it can be varchar2 now with associative arrays, of course), is a tried and tested technique for returning lookup values much quicker than repeated database access. I took an hour off a one-hour-fifteen-minute INSERT using this method, and I have a couple of demos "proving" them to be quicker (both methods have also been through your run_stats harness, to great effect).

I am off work at the moment so can't get to the demos, but if no-one beats me to it, then I'll post something next week.

The SF article in Oracle Professional this month takes it from an associative array context, but I already have a demo of that as well and again, they are much more efficient than repeated lookup table access.

BUT, the only thing I have not tested against is repeated table access with a BUFFER POOL KEEP / KEEP CACHE, so maybe that's for someone else to try and report on. But hey, SQL can't ALWAYS win, surely ;).

Regards

Adrian

Tom Kyte
September 16, 2002 - 8:17 pm UTC

I betcha using a single INSERT -- removing the procedural code altogether -- would get rid of the rest of the time.


I rarely see codes that are 1, 2, 3, 4, 5, .... N -- making the index-by tables of somewhat limited use. It is usually something like

AK Alaska
VA Virginia
...



What about the other way around

Scott Watson, September 17, 2002 - 8:46 am UTC

Hi Tom,

I was thinking performing lookups in PL/SQL is faster than doing the same lookup in SQL, but what about the other way around? (I don't have 9i so I cannot test this myself.) What would be faster in this case?

Select a.col1, b.col2
from a, b
where a.id = b.a_id;

or

select a.col1, lookup_in_pl_sql(a.id)
from a

Thanks,
Scott.

Tom Kyte
September 17, 2002 - 12:34 pm UTC

Select a.col1, b.col2
from a, b
where a.id = b.a_id;


Try it and see.

That is not what I meant

Scott watson, September 17, 2002 - 1:31 pm UTC

Funny!!!

I know I can do the table lookup, but what about the lookup using a PLSQL function, in terms of the performance we have been discussing? We have already asserted that these types of lookups are faster when executing within PLSQL than SQL, but what about when we access PLSQL from our SQL statement?

select name_en, convert_to_french(name_en) PLSQL_FUNCTION
from tableA

This is what I was referring to when I said I did not have 9i and could not test it.

Tom Kyte
September 17, 2002 - 8:03 pm UTC

well, you could have tested with an INDEX BY BINARY_INTEGER table (eg: number to string lookup)....

but here you go -- sql will be FASTER here:


ops$tkyte@ORA920.LOCALHOST> /*
DOC>drop table t;
DOC>
DOC>create table t ( code varchar2(5) constraint t_pk primary key, data varchar2(40) )
DOC>organization index
DOC>/
DOC>
DOC>exec gen_data( 'T', 5000 );
DOC>
DOC>select count(*) from t;
DOC>
DOC>pause
DOC>*/
ops$tkyte@ORA920.LOCALHOST> 
ops$tkyte@ORA920.LOCALHOST> 
ops$tkyte@ORA920.LOCALHOST> create or replace package demo_pkg
  2  as
  3           function get_value( p_code in varchar2 ) return varchar2;
  4  end;
  5  /

Package created.

ops$tkyte@ORA920.LOCALHOST> create or replace package body demo_pkg
  2  as
  3      type  indexByTable is table of varchar2(40) index by binary_integer;
  4      type  lookupTable is table of varchar2(40) index by varchar2(5);
  5
  6      l_lookupTable lookupTable;
  7
  8          function get_value( p_code in varchar2 ) return varchar2
  9          as
 10          l_codes       indexByTable;
 11          l_values      indexByTable;
 12          begin
 13                  if nvl( l_lookupTable.count,0 ) = 0
 14                  then
 15                          dbms_output.put_line( 'filling table' );
 16                  select code, data BULK COLLECT into l_codes, l_values from t;
 17                  for i in 1 .. l_codes.count
 18                  loop
 19                          l_lookupTable( l_codes(i) ) := l_values(i);
 20                  end loop;
 21                  end if;
 22                  return l_lookupTable( p_code );
 23          exception
 24                  when no_data_found then
 25                          raise program_error;
 26          end;
 27  end;
 28  /

Package body created.

ops$tkyte@ORA920.LOCALHOST>
ops$tkyte@ORA920.LOCALHOST>
ops$tkyte@ORA920.LOCALHOST> declare
  2       l_start number;
  3       l_run1  number;
  4       l_run2  number;
  5  begin
  6       insert into run_stats select 'before', stats.* from stats;
  7
  8       l_start := dbms_utility.get_time;
  9       for x in ( select a.code, b.data from t a, t b where a.code = b.code )
 10       loop
 11           null;
 12       end loop;
 13       l_run1 := (dbms_utility.get_time-l_start);
 14       dbms_output.put_line( l_run1 || ' hsecs' );
 15
 16       insert into run_stats select 'after 1', stats.* from stats;
 17       l_start := dbms_utility.get_time;
 18
 19
 20       for x in ( select a.code, demo_pkg.get_value( a.code ) from t a )
 21       loop
 22           null;
 23       end loop;
 24       l_run2 := (dbms_utility.get_time-l_start);
 25       dbms_output.put_line( l_run2 || ' hsecs' );
 26       dbms_output.put_line
 27       ( 'run 1 ran in ' || round(l_run1/l_run2*100,2) || '% of the time');
 28
 29       insert into run_stats select 'after 2', stats.* from stats;
 30  end;
 31  /
23 hsecs
filling table
65 hsecs
run 1 ran in 35.38% of the time

PL/SQL procedure successfully completed.

ops$tkyte@ORA920.LOCALHOST>
ops$tkyte@ORA920.LOCALHOST>
ops$tkyte@ORA920.LOCALHOST> select a.name, b.value-a.value run1, c.value-b.value run2,
  2         ( (c.value-b.value)-(b.value-a.value)) diff
  3    from run_stats a, run_stats b, run_stats c
  4   where a.name = b.name
  5     and b.name = c.name
  6     and a.runid = 'before'
  7     and b.runid = 'after 1'
  8     and c.runid = 'after 2'
  9     and (c.value-a.value) > 0
 10     and (c.value-b.value) <> (b.value-a.value)
 11   order by abs( (c.value-b.value)-(b.value-a.value))
 12  /

NAME                                 RUN1       RUN2       DIFF
------------------------------ ---------- ---------- ----------
....
LATCH.cache buffers chains          25135      10515     -14620
LATCH.library cache pin                95      30215      30120
LATCH.shared pool                     112      35210      35098
LATCH.library cache                   236      50445      50209
STAT...session uga memory               0     589176     589176
STAT...session pga memory         -131072     589824     720896

70 rows selected.

ops$tkyte@ORA920.LOCALHOST>

Also, I'll REVISE my earlier answer.  

Instead of doing it PROCEDURALLY, let's do the lookup where it belongs!  In SQL in the first place.  EG: I would not be caught reading a table and then doing another query to LOOKUP data in the first place.  Only if I was doing some data load would I do that -- else, I would have JOINED in the first place...

What I mean by that -- is that the basic PREMISE that you should


  for x in ( select * from t1 )
  loop
      select value into l_var from t2 where t2.code = t1.code;
      .....


is flawed in the FIRST place, you should have JOINED (or selected a select)

Now that will outperform using an index-by table (associative array or not)...

Strange behaviour...

Padders, September 18, 2002 - 4:31 pm UTC

Since when did collection.COUNT return NULL for an empty collection? I think you'll find you never even populate your collection.

This raises another issue (and I am testing on 9.2, btw) - if the collection is never populated, RETURN l_lookuptable (p_code) should raise NO_DATA_FOUND, and yet we get no error from the test script, although if we call the function from PL/SQL, we get NO_DATA_FOUND raised as expected, e.g.

SQL*Plus: Release 9.2.0.1.0 - Production on Wed Sep 18 21:12:43 2002

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.


Connected to:
Oracle9i Enterprise Edition Release 9.2.0.1.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.1.0 - Production

SQL> SELECT demo_pkg.get_value ('Rubbish') 
  2  FROM   dual;

DEMO_PKG.GET_VALUE('RUBBISH')
---------------------------------------------------------------------


SQL> BEGIN
  2    DBMS_OUTPUT.PUT_LINE (demo_pkg.get_value ('Rubbish'));
  3  END; 
  4  /
BEGIN
*
ERROR at line 1:
ORA-01403: no data found
ORA-06512: at "SCOTT.DEMO_PKG", line 28
ORA-06512: at line 2

Bizarrely, it has nothing to do with collection, since we can create function which just raises NO_DATA_FOUND and get the same behaviour, e.g.

SQL> CREATE OR REPLACE FUNCTION ndf
  2    RETURN VARCHAR2 
  3  IS
  4  BEGIN
  5    RAISE NO_DATA_FOUND;
  6  END;
  7  /

Function created.

SQL> SELECT ndf 
  2  FROM   dual;

NDF
----------------------------------------------------------------------


SQL> BEGIN
  2    DBMS_OUTPUT.PUT_LINE (ndf);
  3  END; 
  4  /
BEGIN
*
ERROR at line 1:
ORA-01403: no data found
ORA-06512: at "SCOTT.NDF", line 5
ORA-06512: at line 2

Note that the same does not appear to apply to other exceptions, and no I haven't tried them all, but VALUE_ERROR, for example, appears to work as expected, e.g.

SQL> CREATE OR REPLACE FUNCTION ve
  2    RETURN VARCHAR2 
  3  IS
  4  BEGIN
  5    RAISE VALUE_ERROR;
  6  END;
  7  /

Function created.

SQL> SELECT ve 
  2  FROM   dual;
SELECT ve
       *
ERROR at line 1:
ORA-06502: PL/SQL: numeric or value error
ORA-06512: at "SCOTT.VE", line 5


SQL> BEGIN
  2    DBMS_OUTPUT.PUT_LINE (ndf);
  3  END; 
  4  /
BEGIN
*
ERROR at line 1:
ORA-01403: no data found
ORA-06512: at "SCOTT.NDF", line 5
ORA-06512: at line 2

Please confirm - have I lost the plot or is this expected behaviour (not the me losing the plot bit, the other bit)? 

Tom Kyte
September 18, 2002 - 6:53 pm UTC

whoops -- corrected that. surprisingly, the results were not materially affected. see above

Still wondering...

Padders, September 19, 2002 - 4:10 am UTC

Thanks for correcting.

But why is NO_DATA_FOUND exception ignored by SQL, and NULL returned instead?

Tom Kyte
September 19, 2002 - 7:55 am UTC

I guess cause the answer is "unknown"

just like if I


select ( select * from dual where 1=0 ) from dual;


that does not fail, even though the subquery (function) returns NO_DATA_FOUND. or

select * from dual where 1 = ( select 0 from dual where 1=0 )

the subquery returns NULL - well, actually the subquery doesn't return anything, it is "unknown" what it returns. the function is treated in the same way.

Some results...

Adrian Billington, September 19, 2002 - 5:12 am UTC

Tom

I agree with your comment about SQL being faster when the lookup table is available (i.e. just join to it), but I don't think that was ever disputed. We wouldn't write a PL/SQL function to do such simple SQL work in the case of a straight insert. But when you don't have access to that lookup table (i.e. during a data load), the use of an index-by (or extended assoc. array in 9i) beats the hell out of a database-accessing function.

On a simple 10,000 row insert demo, looking up the description from a lookup table for each FK row in the staging table, the results were as follows (consistently - ran about 4 times):-

SQL with join to lookup table........ 0 secs 16 hsecs
PL/SQL function using index-by....... 0 secs 67 hsecs
PL/SQL function using lookup table...10 secs 66 hsecs

Which I guess validates the assertions.

I would have thought that from an OLTP perspective, the use of index-bys for the common lookup tables should be encouraged, to save all those repeated accesses to the numerous lookup tables that OLTP systems seem to have -- which is possibly relevant to the original poster...

Regards

Adrian

Tom Kyte
September 19, 2002 - 7:57 am UTC

Yes, that is exactly what i said tho:

...
Instead of doing it PROCEDURALLY, let's do the lookup where it belongs! In SQL
in the first place. EG: I would not be caught reading a table and then doing
another query to LOOKUP data in the first place. Only if I was doing some data
load would I do that
-- else, I would have JOINED in the first place...
.....


What I see CONSTANTLY, continuously and all of the time is code like

for x in ( select * from t )
loop
select ... into ... from t2 where t2.code = t.code;
......


that should be a join, if you see yourself doing that, stop.

During a data load -- you are 100% correct.


(consider an OLTP application a loader and the premise continues ;)

illustrate

Nag, September 19, 2002 - 8:44 pm UTC

"
What I see CONSTANTLY, continuously and all of the time is code like

for x in ( select * from t )
loop
select ... into ... from t2 where t2.code = t.code;
......
"


Tom, can you give us an example and show us how a loop like the above can be converted into SQL?

Tom Kyte
September 20, 2002 - 7:55 am UTC

It is called a join. join t to t2.
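For example -- a minimal sketch with hypothetical column names -- the loop above collapses into a single statement:

select t.*, t2.value
  from t, t2
 where t2.code = t.code;

One query, one pass over the data, no row-by-row lookups.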

Have a look at this, and tell us what you have to say

Famy, September 26, 2002 - 3:43 pm UTC

this is from quest-pipelines, a very well-respected resource..


July's Tip of the Month


Loading a Lookup Table in Memory to Speed Up Queries

Compliments of Dan Clamage, PL/SQL Pipeline SYSOP (danielj@clamage.com)

Recently, I worked with a developer who faced the challenge of speeding up his Pro*C application. It was taking about 2 hours to run. After examining his application, I discovered that he had one table on which every row returned in the driving loop required a random lookup. Furthermore, he could expect nearly every row in this lookup table to be accessed at some point over the life of the program. I instructed him to preload this moderately sized table (~50k rows) into a persistent PL/SQL table. In order to be persistent, the PL/SQL table had to be declared in the package body, outside of any routine:

CREATE OR REPLACE
PACKAGE BODY lookup
IS
   TYPE typ_val_data IS TABLE OF tbl_lookup.val_data%TYPE
      INDEX BY BINARY_INTEGER;

   tab_val_data typ_val_data;
I explained to him that the numeric primary key could be used as the index offset, and the column value he needed could then be stored at that offset. For example:
Lookup Table

    OID NUMBER(10)    VAL_DATA VARCHAR2(4)
    --------------    --------------------
      3               'GHGL'
     27               'ACBF'
    101               'LEFP'
    157               'KPOI'

PL/SQL Table (also known as index-by table or array)

    Array Offset      Array Value
    ------------      -----------
      3               'GHGL'
        **no data in array elements 4-26**
     27               'ACBF'
        **no data in array elements 28-100**
    101               'LEFP'
        **no data in array elements 102-156**
    157               'KPOI'
        **no data in array elements 158+**
This would result in a sparsely populated array. He populated the array up front, as part of his program initialization, using a packaged procedure. This procedure simply read every row in his lookup table in a cursor FOR loop, and used the primary key as the element offset into the PL/SQL table. For example:
PROCEDURE populate_lookup
IS
   CURSOR get_all_data IS
      SELECT oid, val_data
        FROM tbl_lookup;
BEGIN
   FOR rec IN get_all_data LOOP
      tab_val_data(rec.oid) := rec.val_data;
   END LOOP;
END populate_lookup;
Then he provided a packaged function to interface with this array. He could use this function in PL/SQL or in SQL. For example:
FUNCTION get_val_data(p_oid IN tbl_lookup.oid%TYPE)
   RETURN tbl_lookup.val_data%TYPE
IS
BEGIN
   RETURN(tab_val_data(p_oid));
EXCEPTION
   WHEN NO_DATA_FOUND THEN
      RETURN(NULL);
END get_val_data;
Inside his main loop, he coded something like this:
INSERT INTO tbl_target (val_data, ...)
VALUES (lookup.get_val_data(fk_oid), ...);
He had to fully qualify the function, since it's being called from SQL. He reported that his program now runs in 20 minutes, a 5x speed improvement.

Tom Kyte
September 26, 2002 - 4:40 pm UTC

And he could have sped it up 10x more by simply coding:

insert into tbl_target ( val_data, .... )
select max(t.val_data), :bv1, :bv2, :bv3, :b4, ....
  from tbl_lookup t
 where oid = :bv0;

or 80x faster by using array binds and the above!!!!

Here is the output from my testing (totally consistent, ran it many many times):


ops$tkyte@ORA920.US.ORACLE.COM> !./test

Connected to ORACLE as user: /

Using SQL 2002-09-26 16:38:11.041532
+000000000 00:00:02.851382000
Using plsql 2002-09-26 16:38:13.902126
+000000000 00:00:04.002790000
Using SQL 2002-09-26 16:38:17.915989
+000000000 00:00:00.534158000

ops$tkyte@ORA920.US.ORACLE.COM> select count(*) from t2;

  COUNT(*)
----------
      9000

ops$tkyte@ORA920.US.ORACLE.COM>

so, SQL -- no PLSQL = 2.8 seconds.  
PLSQL = 4 seconds

SQL with array binds of 100 = 0.58 seconds......



Here is my code:

drop table t;

create table t  ( oid primary key, val_data )
organization index
as
select rownum oid, object_name val_data from all_objects;


drop table t2;

create table t2 ( val_data varchar2(30), data varchar2(400) );

create or replace package pkg
as
    function f( n in number ) return varchar2;
end;
/

create or replace package body pkg
as

type lookup is table of varchar2(30) index by binary_integer;

g_data lookup;

function f( n in number ) return varchar2
is
begin
    return g_data(n);
exception
    when no_data_found then return null;
end;

begin
    for x in ( select * from t )
    loop
        g_data(x.oid) := x.val_data;
    end loop;
end;
/




#include <stdio.h>
#include <string.h>

static char *     USERID = "/";

#define SQLCA_INIT
EXEC SQL INCLUDE sqlca;

static void sqlerror_hard( int lineno)
{
    EXEC SQL WHENEVER SQLERROR CONTINUE;

    printf("\nORACLE error detected (%d):", lineno);
    printf("\n% .70s \n", sqlca.sqlerrm.sqlerrmc);

    EXEC SQL ROLLBACK WORK RELEASE;
    exit(1);
}

static void process1( void )
{
EXEC SQL BEGIN DECLARE SECTION;
varchar        data[400];
varchar        time[125];
int            i;
EXEC SQL END DECLARE SECTION;

    EXEC SQL WHENEVER SQLERROR DO sqlerror_hard(__LINE__);

    memset( data.arr,'*',400 );
    strncpy( data.arr, "process1", strlen("process1") );
    data.len = 400;

    EXEC SQL SELECT to_char(systimestamp,'YYYY-mm-dd hh24:mi:ss.ff')
               INTO :time
               FROM DUAL;
    printf( "Using SQL %.*s\n", time.len, time.arr );

    for( i = 0; i < 3000; i++ )
    {
        EXEC SQL insert into t2
        select max(val_data), :data
          from t
         where oid = :i;
    }
    EXEC SQL SELECT SYSTIMESTAMP-to_timestamp(:time,'yyyy-mm-dd hh24:mi:ss.ff') INTO :time FROM DUAL;
    printf( "%.*s\n", time.len, time.arr );
    EXEC SQL COMMIT;
}


static void process2( void )
{
EXEC SQL BEGIN DECLARE SECTION;
varchar        data[400];
varchar        time[125];
int            i;
EXEC SQL END DECLARE SECTION;

    EXEC SQL WHENEVER SQLERROR DO sqlerror_hard(__LINE__);

    memset( data.arr,'*',400 );
    strncpy( data.arr, "process2", strlen("process2") );
    data.len = 400;

    EXEC SQL SELECT to_char(systimestamp,'YYYY-mm-dd hh24:mi:ss.ff')
               INTO :time
               FROM DUAL;
    printf( "Using plsql %.*s\n", time.len, time.arr );

    for( i = 0; i < 3000; i++ )
    {
        EXEC SQL insert into t2
        values ( pkg.f(:i), :data );
    }
    EXEC SQL SELECT SYSTIMESTAMP-to_timestamp(:time,'yyyy-mm-dd hh24:mi:ss.ff') INTO :time FROM DUAL;
    printf( "%.*s\n", time.len, time.arr );
    EXEC SQL COMMIT;
}


static void process3( void )
{
EXEC SQL BEGIN DECLARE SECTION;
varchar        data[100][400];
varchar        time[125];
int            oids[100];
int            i;
EXEC SQL END DECLARE SECTION;

    EXEC SQL WHENEVER SQLERROR DO sqlerror_hard(__LINE__);

    for( i = 0; i < 100; i++ )
    {
        memset( data[i].arr,'*',400 );
        strncpy( data[i].arr, "process3", strlen("process3") );
        data[i].len = 400;
    }

    EXEC SQL SELECT to_char(systimestamp,'YYYY-mm-dd hh24:mi:ss.ff')
               INTO :time
               FROM DUAL;
    printf( "Using SQL with Array binds %.*s\n", time.len, time.arr );

    for( i = 0; i < 3000; i++ )
    {
        oids[ i%100 ] = i;

        if ( i%100 == 99 )
        {
            EXEC SQL insert into t2
            select max(val_data), :data
              from t
             where oid = :oids;
        }
    }
    EXEC SQL SELECT SYSTIMESTAMP-to_timestamp(:time,'yyyy-mm-dd hh24:mi:ss.ff') INTO :time FROM DUAL;
    printf( "%.*s\n", time.len, time.arr );
    EXEC SQL COMMIT;
}

main( argc, argv )
int    argc;
char *    argv[];
{
EXEC SQL BEGIN DECLARE SECTION;
VARCHAR   oracleid[50];
EXEC SQL END DECLARE SECTION;

    strcpy( oracleid.arr, USERID );
    oracleid.len = strlen(oracleid.arr);

    EXEC SQL WHENEVER SQLERROR DO sqlerror_hard(__LINE__);
    EXEC SQL CONNECT :oracleid;
    printf("\nConnected to ORACLE as user: %s\n\n", oracleid.arr);

    process1();
    process2();
    process3();

    exit(0);
}

 

Then why use pl/sql tables?

Kashif, October 04, 2002 - 1:14 pm UTC

Hi Tom,

Great discussion, thanks for the invaluable input. Just a quick couple of questions. Firstly, you mention:

"
What I see CONSTANTLY, continuously and all of the time is code like

for x in ( select * from t )
loop
select ... into ... from t2 where t2.code = t.code;
......
"

and you mention this is not the correct way of doing lookups. Does this apply when there is the need to outer join the driving table in the main query, i.e.

select t.*
from t, t2
where t.code (+) = t2.code

Is it not more efficient to read the main table and then read the lookup table, and return nulls (using exception when no_data_found then return null) when the lookup table doesn't retrieve anything?

Secondly, your response to the original poster seemed to indicate that pl/sql tables aren't always the best solution when it comes to lookup tables. Can you give examples of when they are the best solution? Thanks in advance.

Kashif

Tom Kyte
October 04, 2002 - 6:54 pm UTC

I would outer join or select a select (eg:

select dept.*, (select count(*) from emp where emp.deptno = dept.deptno )
from dept

I would never procedurally loop, running a select based on other selects, if I can avoid it.

A plsql table to do some lookups might make sense when doing a flat file data load and you don't have external tables (eg: pre-9i). In 9i and beyond -- maybe NEVER.
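For reference, a minimal sketch of the 9i external-table alternative (directory path, file name, and column/table names are all hypothetical) -- the flat file becomes directly queryable, so the "lookup" turns back into a join:

create directory load_dir as '/export/load';

create table stage_ext
( code  varchar2(5),
  data  varchar2(40)
)
organization external
( type oracle_loader
  default directory load_dir
  access parameters
  ( records delimited by newline
    fields terminated by ',' )
  location ( 'stage.dat' )
);

-- the load-with-lookup is now a single join, no plsql table needed
insert into real_table ( code, data, descr )
select s.code, s.data, l.descr
  from stage_ext s, lookup_table l
 where l.code = s.code;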

Whoops...

Kashif, October 04, 2002 - 4:32 pm UTC

I meant the outer join should be on the lookup table (t2), i.e. the query would be something like

select t.*
from t, t2
where t.code = t2.code (+)

Wouldn't it be more efficient to simply read the rows needed from the driving table (t) and then do inner selects on the lookup (t2) table as needed? Thanks.

Kashif


Tom Kyte
October 04, 2002 - 7:20 pm UTC

if you can do it in a single query -- do it, it's almost always faster, easier, more maintainable, faster to type, etc etc etc

Decode on select (select)?

Kashif, October 11, 2002 - 12:04 pm UTC

Hi Tom,

Thanks for your tip, it works wonders in my query! Quick question, if I can control when I execute the 'select a select' somehow, for example using a decode, will that be more efficient versus executing the 'select a select' for each row? Example:

select deptno, decode (deptno, null, null, (select deptname from dept where dept.deptno = emp.deptno))
from emp;

versus

select deptno, (select deptname from dept where dept.deptno = emp.deptno)
from emp

assuming the deptno in emp is not always populated. Thanks for your feedback in advance.

Kashif

P.S. Hope you're having fun at the VOUG conference!

Tom Kyte
October 11, 2002 - 8:37 pm UTC

Well, let's see....


DOC>create table emp ( deptno int );
DOC>
DOC>create table dept as select * from scott.dept;
DOC>alter table dept add constraint dept_pk primary key(deptno);
DOC>
DOC>insert into emp
DOC>select decode( mod(rownum,5), 0, null, mod(rownum,5)*10 )
DOC>from all_objects;
DOC>*/
ops$tkyte@ORA920.US.ORACLE.COM>

ops$tkyte@ORA920.US.ORACLE.COM> set autotrace traceonly
ops$tkyte@ORA920.US.ORACLE.COM> select deptno, (select dname from dept where dept.deptno = emp.deptno)
  2    from emp
  3  /

29284 rows selected.


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (FULL) OF 'EMP'




Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       2012  consistent gets
          0  physical reads
          0  redo size
     570620  bytes sent via SQL*Net to client
      21971  bytes received via SQL*Net from client
       1954  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      29284  rows processed

ops$tkyte@ORA920.US.ORACLE.COM> select deptno, decode( deptno, null, null, (select dname from dept where dept.deptno = emp.deptno) )
  2    from emp
  3  /

29284 rows selected.


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (FULL) OF 'EMP'




Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       2012  consistent gets
          0  physical reads
          0  redo size
     570645  bytes sent via SQL*Net to client
      21971  bytes received via SQL*Net from client
       1954  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      29284  rows processed

ops$tkyte@ORA920.US.ORACLE.COM> select deptno from emp
  2  /

29284 rows selected.


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (FULL) OF 'EMP'




Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       2006  consistent gets
          0  physical reads
          0  redo size
     347959  bytes sent via SQL*Net to client
      21971  bytes received via SQL*Net from client
       1954  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      29284  rows processed

ops$tkyte@ORA920.US.ORACLE.COM> set autotrace off
ops$tkyte@ORA920.US.ORACLE.COM>



I threw in that last query to show you that there is "magic" happening with selects of selects. See the number of consistent gets on that last query? Very, very close to the others.


There are optimizations happening here that go beyond what we can see. 

exec gen_data( 'T', 5000 );

A reader, October 12, 2002 - 4:23 am UTC

Hi Tom,
I've seen this procedure executed in the response above;
can you tell us what the procedure is doing? Of course it is populating the table with test data, but how?

thanks in advance.



Tom Kyte
October 12, 2002 - 9:14 am UTC

Comments on this solution?

andrew, October 25, 2002 - 4:14 pm UTC

Tom
The URL below advocates pre-loading a pl/sql table for speed. The solution looks efficient to me and the performance claim sounds convincing. What do you think of it?
http://www.quest-pipelines.com/Pipelines/PLSQL/tips01.htm#july

Tom Kyte
October 26, 2002 - 12:12 pm UTC

it's the same as the other discussions here.

if you CANNOT DO IT in a single sql statement and must loop procedurally (which many people do when they SHOULD NOT), this can help.

If you can do it in a join in the first place, rather than lookup lookup lookup all of the time -- do it that way.



Whoops - nevermind...

Andrew, October 25, 2002 - 4:28 pm UTC

Sorry Tom - I just saw your comments on the link above.

Lookups in SGA?

Clark, July 31, 2003 - 8:22 am UTC

Tom, my first time, so please be gentle! And please excuse (and explain) any misunderstandings I may have of Oracle.

To summarise the whole thread, I think, and a common point you make in all discussions, do it in SQL unless it HAS to be PLSQL - so do a(n sql) JOIN to a lookup table and not an x-ref to a (plsql) memory structure pre-loaded with a lookup. (Tho, exception from 9iR2 noted, but I'm on 817.)

However, I often have the need to perform a lookup within plsql but without a join to another table, e.g. to validate input. Right, now I think I've got this: this is an OLTP system, so it should be considered as a data loader, and therefore keeping these lookups in plsql memory is an ok way of doing it. Yes?

So -- is there not some way of sticking them in some part of the SGA and avoiding duplicating UGA for each session? Something along the lines of a SERIALLY_REUSABLE package that doesn't lose state between calls would do the trick -- it would only ever need a SINGLE copy of the package, globals wouldn't need re-initialising on every call, etc.

Or am I worrying too much about memory/processing usage? Maybe there's so little overhead in doing a genuine sql lookup, from plsql, and letting Oracle store the lookups in the KEEP pool (thus managing SGA usage for me too) that I should just do it that way -- but comments about sql-plsql context switching lead me to believe that this SHOULD be avoided.

I could/can do straight performance tests for speed, but I am not experienced enough to understand and find out the wider implications regarding SGA usage etc. Speed is not always the primary concern in an OLTP system -- an extra 2 millisecs per user query, say, would not be of concern if you can shave 20% from the memory you need.

Many thanks for your time.
Cheers,
Clark.

Tom Kyte
July 31, 2003 - 8:56 am UTC

give me a concrete example whereby plsql would have to perform a validation PRIOR to inserting the data to work with. Me, I would let SQL do the validation itself.

If I needed to lookup some data,


insert into t
select plsql_variable_a, plsql_variable_b, plsql_variable_c, column_x_from_table
from lookup_table
where lookup_table.key = plsql_variable_d;


that would do my "lookup" for me straight away. SQL%rowcount would tell me if the lookup "failed"


If this was data validation -- well, then I would use declarative RI and let the database tell me when the data was garbage.
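A minimal runnable sketch of that insert-as-lookup pattern (table, column and variable names are hypothetical):

declare
    l_a  varchar2(10) := 'a';
    l_b  varchar2(10) := 'b';
    l_c  varchar2(10) := 'c';
    l_d  number       := 42;
begin
    -- the insert performs the lookup in one statement
    insert into t ( c1, c2, c3, c4 )
    select l_a, l_b, l_c, lk.column_x
      from lookup_table lk
     where lk.lookup_key = l_d;

    if sql%rowcount = 0
    then
        -- no matching row: the "lookup" failed
        raise_application_error( -20001, 'no lookup row for key '||l_d );
    end if;
end;
/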

lookup x-ref

Clark, July 31, 2003 - 9:43 am UTC

Sorry, I put it wrongly: I do use RI constraints. It wasn't for validation, it was for x-ref.

The example I have just implemented uses a table of response codes (ids) with associated meanings and error-levels, eg.

create table response_codes
( id       NUMBER
, descr    VARCHAR2(100)
, errlevel NUMBER
);

Crucially, MULTIPLE "detail" response codes are possible, which get stored in a child table to an "overall" response -- the overall being dictated by the worst-case detail, the highest response_codes.errlevel.

So, during validation, I log a detail response when a problem is found and keep a global g_worst_err variable, which reflects the worst response_codes.errlevel that has been found.

I could, of course, do an sql-query each time I have a detail response, but I have implemented the worst-lookup by pre-storing the lookup errlevels in a plsql table indexed by the id-col from response_codes. Here's the key proc:

-- - - - - - - - - - - - - - - - - - - - - - - -
PROCEDURE pr_detail_response
-- - - - - - - - - - - - - - - - - - - - - - - -
-- Acts on a 'detail' response
( pi_detail_response IN s_o_TYPE )
IS
BEGIN
   g_detresps_TAB( pi_detail_response ) := NULL;
   g_worst_err :=
      GREATEST( g_worst_err
              , g_det_errlvls_TAB( pi_detail_response ) );
END;

g_detresps_TAB stores all detail responses that I've had**.

g_det_errlvls_TAB is the pre-stored lookup to the response_codes.errlevel col.

**In case you're wondering, it's not sensible to INSERT these at this point because the overall response-ID is the parent -- but I don't know my overall response until I've found out all my detail responses.

Thanks,
Clark



Loading Data into PL/SQL table

Ketan. Bengali, July 31, 2003 - 4:57 pm UTC

Hi Tom,

My data is like this:
Level 1.1:
Level 2.1 (Parent = Level1.1, Previous Sibling=NULL)
Level 3.1.1(Parent = Level2.1, Prev Sibling=NULL)
Level 3.1.2(Parent = Level2.1, Prev Sibling=Level 3.1.1)
Level 2.2(Parent = Level1.1, Previous Sibling=Level2.1)
Level 3.2.1(Parent = Level2.2, Prev Sibling=NULL)
Level 3.2.2(Parent = Level2.2, Prev Sibling=Level 3.2.1)

Basically, it is hierarchical and ordered data. To display this data to the users, I have created all the records in a pl/sql table and provided a view on top of it.

Package Test_Pack is
-- rec tree_hierarchy_array;
function get_tree return tree_hierarchy_array;
--
end test_pack;
--
View is :
SELECT *
FROM TABLE(CAST(test_pack.get_tree() AS tree_hierarchy_array))

Today, I recreate the pl/sql table whenever a user selects from the view, because the variable which stores the pl/sql table is a local variable declared inside the get_tree function. I could store this variable at the session level by declaring it in the package body.

If I store it at the session level, then the PL/SQL table has to be recreated only when the data is changed in the database. This will improve the performance. How do I find out whether the data has changed since the last run?

Creating a package and displaying it using a view -- is it the right approach?

Thanks,

Ketan.

Tom Kyte
July 31, 2003 - 7:20 pm UTC

to discover that the data has changed is "hard" to do in a scalable fashion, unless you add auditing to the base table.

If you added a trigger like this:

create table last_update( x int primary key ) organization index;
create sequence s;

create trigger t_trigger after insert or update or delete on t
begin
insert into last_update values ( s.nextval );
end;


your routine can now keep a package variable holding the max(x) from last_update and, every time it is called, simply select max(x) from last_update again to see if it changed -- if so, it must requery.


if modifications are really really "light" on this table, you could update a single row instead of inserting a new row each time. that will cause all modifications of the table to serialize by session though.


is creating a package and displaying as view the right approach? depends -- if it CANNOT be done in pure sql, maybe. if it could be done in pure sql, probably not.
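To make the polling side concrete, here is a minimal sketch of a package that rebuilds its cache only when the marker changes (package, type, and table names are hypothetical; the varchar2-indexed array needs 9iR2):

create or replace package lookup_pkg
as
    function get_value( p_code in varchar2 ) return varchar2;
end;
/

create or replace package body lookup_pkg
as
    type vcArray is table of varchar2(40) index by varchar2(5);

    g_cache       vcArray;
    g_last_marker number := -1;  -- marker value the cache was built against

    function get_value( p_code in varchar2 ) return varchar2
    is
        l_marker number;
    begin
        select nvl( max(x), 0 ) into l_marker from last_update;

        if l_marker <> g_last_marker  -- base table changed, rebuild cache
        then
            g_cache.delete;
            for r in ( select code, data from t )
            loop
                g_cache( r.code ) := r.data;
            end loop;
            g_last_marker := l_marker;
        end if;

        return g_cache( p_code );
    end;
end;
/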

The Questions.

Clark, August 01, 2003 - 8:38 am UTC

Hi Tom,

wrt my comments yesterday - (a) sorry for the low review grade, that was mainly because I asked the wrong q.. and (b) the questions I still have are:

1.
Is there anything inherently wrong/ bad practice with using a plsql index-by table to store values from the db this way - would you personally do it with a query each and every time you needed to x-ref the errlevel? If so, what are your reasons?

2.
Is there any way of storing common data (available to plsql) in the SGA to avoid duplicating 'application-static' data across all sessions? Are you aware of any plans to introduce such a concept if not?

Cheers,
Clark.

P.S. Keep up the good work!

Tom Kyte
August 01, 2003 - 9:45 am UTC

1) if I had procedural code that I could not turn into a query (a single query, not a single query executed 500 times, a single query executed once) I would be looking at using lookup tables. Yes, I do that.

2) there have been talks about it -- it is a hard problem to solve, actually. it introduces a new address space, concurrency issues (session 1 updates a variable at the same time session 2 does) and so on.

right now, you would have to do it "per session"

Perfect

Clark, August 01, 2003 - 10:21 am UTC

Thanks Tom, at least I know I'm not missing something.

Point 2 tho - my concept was simply somewhere to store absolutely static data, which you know absolutely will not change; or alternatively, if it does change, every session needs the current values. (Why? To provide for different installations of the same application with, say, different application parameters.)

Thanks again.
Clark.

Tom Kyte
August 01, 2003 - 11:13 am UTC

and we call that a database table ;)

it'll be in the cache if

a) it is small
b) it is used.

insert select for load (best practices)

Joe, September 12, 2003 - 4:00 pm UTC

Hi tom,
best practices question related to this thread

during a load into a normalized schema, I have to determine if the dimension row of the values I am loading already exists in the dim table. If it does, then I use the key assigned to that set of values in my fact table.

did I understand all the posts correctly that I should

insert into t
select pk_from_dimension_table, plsql_var_a, plsql_var_b
  from dimension_table
 where dim_col_a = plsql_variable_d
   and dim_col_b = plsql_variable_e;

and if I get no rows inserted, assume that combo didn't exist and go ahead and insert it into the dim table?

This single-statement approach, potentially followed by an insert, would be better than a lookup query followed by one or two inserts (fact row and dim row).

In this case, after loading starts, I would assume that almost all combinations will exist after a fairly short time.



Tom Kyte
September 12, 2003 - 7:55 pm UTC


how would you populate the dim table? if the dim table is just "number" to "value" and nothing else -- it is not very useful?

give me a real case to work with.

good, but one 'problem'

joachim mayer, September 22, 2003 - 11:05 am UTC

Hi tom,

i spent some time with this article before i started to use lookup tables myself -- which turned out to be very helpful and improved the performance of my applications a lot. But there is still one "problem" -- or rather, an unpleasant thing -- in it.

I will try to describe it as well as I can. Sorry, I am not a native English speaker!

Most of my oracle applications are called by other programs which stay connected to the database and therefore keep their sessions. The first time one of my packages is called, it reads (initializes) the lookup table(s) for the package, and then it keeps going as long as it takes -- sometimes for hours or longer. Very rarely, but it happens, I would like to change one value in the lookup table and want this change to show effect at once. But I can only change the value in the database table where the process got its original data, and of course it won't get reread.
So, my question is: is there a way to inform this package in this session that it should now reread the table and renew the lookup table (function call, signal?)? At the moment we must restart the process which is connected to the DB and uses my packages -- which is sometimes cumbersome.

Thx

hannibal


Tom Kyte
September 22, 2003 - 11:56 am UTC

you would have to poll the database using something like dbms_alert, or AQ or just a simple single row table that contains the date/time of last update to see if you needed to "refresh"
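For example, a minimal dbms_alert sketch (the alert name and lookup table are hypothetical; the signal only becomes visible when the changing session commits):

-- the session that changes the lookup table:
begin
    update lookup_table set descr = 'new value' where id = 1;
    dbms_alert.signal( 'LOOKUP_CHANGED', 'reload' );
    commit;
end;
/

-- the long-running session registers once, then polls cheaply:
declare
    l_message varchar2(1800);
    l_status  integer;
begin
    dbms_alert.register( 'LOOKUP_CHANGED' );

    -- inside the main processing loop, a zero timeout makes this a poll
    dbms_alert.waitone( 'LOOKUP_CHANGED', l_message, l_status, timeout => 0 );

    if l_status = 0  -- 0 = alert received, 1 = timed out
    then
        null;        -- reread the lookup table into the plsql table here
    end if;
end;
/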

Associative arrays

Zoran Martic, December 01, 2003 - 12:10 pm UTC

Tom,

This is very nice discussion.

I want to add one more case to this story, to have the complete picture.
I want to cover this situation:
You have an application (forms, c, c++, ...) and you have all the lookup tables in the Oracle database.
It is commonly known that doing lookups one by one from the client side is going to be slow (or, say, not fast enough) in some cases because of the SQL*Net roundtrip (TCP, IPC or even BEQ). On the newest hardware you can do, say, up to a few thousand lookups from C or C++. If you do this with some kind of in-process-memory structure (such as associative arrays, hash tables, ...) you can get 10 times more lookups. In this case the biggest overhead in doing Oracle lookups is the roundtrips (including context switches), rather than executing the lookup SELECT.
Of course this is the case only if you cannot load the data into Oracle with bulk inserts and after that do the magic in PL/SQL. You need to do validation and other things on the client side, and in this case you perform lookups one by one.
In this scenario you need some kind of client cache.

It looks to me like Oracle filled the gap for Oracle Forms, if Oracle Forms can use associative arrays in its client-side PL/SQL engine.
Associative arrays are much faster because everything is in-process memory, no need for locking, .....

For example, if you use an associative array on the Forms client side (if possible) you can get much better performance than doing lookups on a table.

This is, from my perspective, the biggest gap that I cannot solve with Oracle yet.
What about C, C++, Java, .... Oracle 9iAS?
Is there any way that Oracle is going to address this issue on the client side (as in PL/SQL with associative arrays)?
What can you tell us about the OCI object cache? Does it have anything to do with this story?

Thanks for any response.

Regards,
Zoran


Tom Kyte
December 02, 2003 - 7:55 am UTC

question -- why are you doing lookups on the client.

Why to do lookups on the client?

Zoran Martic, December 02, 2003 - 10:52 am UTC

In some special cases we need (or want) to do lookups on the client because of the needed speed.
The main idea is that it is faster because you do not have the networking overhead (as SQL has in this case); everything is in your process memory (as with associative arrays).

If you need to respond to some "real-time" (if something like that exists) event faster than you are willing to spend on the overhead of communicating between the client and the database (OCI), you need to consider some kind of client cache.
All other cases I covered by doing a bulk insert, later joining with other tables, and processing that.

You saw how much faster an associative array is than a SQL lookup. I tested a simple 2-column IOT lookup with 1M rows versus an associative array lookup based on the same table. The difference was 25 times.
Of course this is limited to just a one-value lookup, and it is a unique lookup.
Also, I tested some so-called in-process or in-memory databases (Berkley Sleepycat) and they are about 5 times faster for a simple 1-column lookup than PL/SQL, but much faster still compared to an OCI call from the client, even on the same server (SQL*Net time).

This is the only case where I cannot find a solution for the problem in Oracle.

From the results I have, associative arrays are faster than the B*Tree implementations in those fastest in-memory databases (they are simple and small, not reliable and robust like Oracle), even 5 times faster, but I cannot use PL/SQL in C or C++.
I want to use Oracle 100% because I am an Oracle DBA/Dev, but I am facing problems being fast enough in these special cases.

The OCI object cache looks like some kind of caching mechanism, but it is hard to apply to normal tables and it is probably not like the associative arrays (B*Tree) in PL/SQL.
Also, 9iAS needs to be fast at dealing with data. They do not want any validation to go to the server.
I found network roundtrips plus type conversion between C or Java and Oracle types very expensive.

I am asking you because I know that you are as curious about this as I am.
I was thinking a few months ago that there is nothing faster than a standard SQL lookup from an IOT (or hash table or normal table). But as you saw with associative arrays this is not true (also shown by similar implementations in some small in-process databases like Sleepycat and TimesTen).
One-task processes are not supported by Oracle anymore; that means there is no way to be as fast as PL/SQL if you are using OCI.

What are you thinking about all of this?
Any new things in 10g with associative arrays or something similar on the OCI client side?

Regards,
Zoran


Tom Kyte
December 02, 2003 - 11:07 am UTC

I would encourage you to post your simple test -- I'll betcha we can change the algorithm to go the OTHER WAY (iot creams associative array ;)

it is all about thinking in sets, if you go slow by slow (oops -- meant row by row), the assoc array might win but if you do set based processing (single sql), the DB will cream you.

I blame the "procedural mindset", not the database here (i prototype this in my latest book effective oracle by design if you are interested).


I don't do lookups on the client side. to me -- the database IS the client side (mod_plsql).

For OCI, you'd be looking at a C subroutine (binary tree, avl tree, something like that).

Java's got hash tables...

for the 10g database -- its all about "the server", wouldn't really do "the client"


Client lookups

Zoran Martic, December 02, 2003 - 11:42 am UTC

Tom,

Thanks for your answer.
I have read all your books.

I agree that one SQL join per 50,000 rows in a set is better than doing 50,000 lookups.
I am using that idea for a long time (as you described it in your books):
1. simple SQL
2. PL/SQL
3. everything else

I tested stupid associative lookups here because I wanted to know the difference between these fastest algorithms (or databases: TimesTen, Sleepycat, ...) and Oracle's pure B*Tree implementation.
I was happy to see that even doing lookups on associative arrays was 5 times faster in my case than using these "fast" databases (algorithms).

You know that most app clients are written in something different than PL/SQL, and because of that (in my case, C) you need to perform some kind of lookups (database, C hash table or whatever).

For you the client is PL/SQL (and for me too, if not for the company I am working for :); for many of us it is something else.

Just the example of these lookups and associative arrays (so you can confirm whether my test is wrong):

SQL> desc x2
 Name                                      Null?    Type
 ----------------------------------------- -------- --------------
 X                                         NOT NULL VARCHAR2(100)
 Y                                                  VARCHAR2(100)

Populated with 1M rows with this script:

declare
   x varchar2(100);
   y varchar2(100);
   i number;
begin
   for j in 1..1000000 loop
    insert into x2 values(to_char(j)||'abcdffgghhghg','abcdffgghhghg');
    if mod(j,10000) = 0 then
        commit;
    end if;
   end loop;
   commit;
end;
/

*************** Test ************************************
declare
      type  lookupTable is table of varchar2(100) index by varchar2(100);
      type  indexByTable is table of varchar2(100) index by binary_integer;

      l_lookupTable lookupTable;
      l_codes       indexByTable;
      l_values      indexByTable;
      l_value       varchar2(100);

      l_start number;
      l_run1  number;
      l_run2  number;
  begin

      dbms_output.put_line(dbms_utility.get_time);

      select x, y BULK COLLECT into l_codes, l_values from x2;

      dbms_output.put_line(dbms_utility.get_time);

      for i in 1 .. l_codes.count
      loop
          l_lookupTable( l_codes(i) ) := l_values(i);
      end loop;

      dbms_output.put_line(l_codes.count);
      dbms_output.put_line(dbms_utility.get_time);

      for i in 1 .. 1000000
      loop
          l_value := l_lookupTable( '501abcdffgghhghg' );
      end loop;

      dbms_output.put_line(dbms_utility.get_time);

      for i in 1 .. 1000000
      loop
           select y into l_value from x2
                   where x = '501abcdffgghhghg';
      end loop;

      dbms_output.put_line(dbms_utility.get_time);

  end;
/

Result:
411879663
411886578
1000000
411898013
411898682
411912955

You can see the difference between the last 3 numbers.
I fixed the string I am searching for because I did not want any possibility of hitting the disk (the table is about 100M I think, and PGA was 180M for this test).
I did it for 100,000 iterations also. It was also much faster.
I did not try more columns or anything else.
I know that it could be way different.

Let me know if my test is wrong.
If not, please give us some example where the table lookup is faster than the associative array (no big JOIN technique please).

I agree again that in PL/SQL it is probably stupid to do lookups when you have joins, but sometimes you need to process event by event and not a bunch of events together (imagine processing a connect request or something similar in the real world; you cannot batch that).

Thanks,
Zoran 

Tom Kyte
December 02, 2003 - 12:31 pm UTC

i would have just joined, once again, that original bulk collect -- 0% chance I would have looked up on the client, 1000% chance I would have looked up on server (even if I had to create a global temporary table and shoot the data over first!)

In PLSQL, just use sql.

In C, -- well -- call PLSQL :)

(or look to that LANGUAGE to provide you a lookup tool -- C has that nifty bsearch routine)

Join

Zoran Martic, December 03, 2003 - 4:41 am UTC

Tom,

If you read carefully, there are not N records. There is no set of rows.
You are dealing with one event you need to respond to as fast as possible.
Imagine the situation where you need to respond to some network event immediately, or even to some database event: you cannot use joins, you need to respond immediately. You cannot batch it.
That is the point, no way to have a batch.
You have a "real-time" event you need to respond to.
In this case this server call (from C) is, from my perspective, very expensive.
In the same way, even in PL/SQL you sometimes do not want to go and look at the table; you will use local variables or associative arrays. That is the point.

My example was to show that looking into an associative array is much faster than looking into an IOT for direct-hit lookups.

You owe me an example where the IOT is faster than the associative array, as you said previously (just for PK matches).

I am pretty confident that even if you have a multicolumn primary key (doing concatenation in the associative array to make the index), the lookup will be faster.

That means if you need to do something faster than your tables can do, you will opt for associative arrays.

In the end, why should Oracle put associative arrays in the product if you can do everything with a join :)
Are you saying to us that associative arrays are not needed?
I am sure that you want to use them in some special cases where you need to do things record by record (or event by event).
Oracle put them in to fulfill PL/SQL's speed needs.
I asked you in the first place: is Oracle going to make something on the client side similar to the OCI object cache, but more advanced?
I want to rely on Oracle; I do not want to use other products.
I saw (tested) that Oracle's B*tree search in associative arrays is faster than the corresponding "fastest" in-process in-memory databases.

Thanks for all your time.

Zoran







Tom Kyte
December 03, 2003 - 7:07 am UTC

then, as I keep saying, you need to look to your LANGUAGE to help you out. be it c, java, whatever. the LANGUAGE will supply this feature

then you have to figure out how to make it so that when the database changes the client knows to refresh its cache.

you would not be using a plsql stored procedure in your case -- there would be no need to have it do a lookup as plsql stored procedures are running in the database and have no UI.

you would not be using forms for a real time interface either (i wouldn't imagine you would)


we put lots of stuff in the database cause people ask for it. it's not always the best. i've demonstrated the best, most performant way.

I've yet to use an associative array in code, for real.



Associative arrays

Zoran Martic, December 03, 2003 - 10:41 am UTC

Tom,

Thanks a lot for your time.
That is what I am doing at the moment.

I am impressed with the speed of associative arrays.
I mean, 1M lookups in 6 seconds against a 1M-row table (simple, two columns, but...) is very good. The server is just a 450MHz Sparc.
That proves that Oracle has a crazy good B*Tree algorithm; it is just that all the code on top of it (optimizer, SQL, locking, ...) adds an initial overhead per SQL.
In the end this is why joins are fast: the overhead is small in comparison to the memory operations performed to read database blocks from the buffer cache (about 5k - 10k LIOs per 100MHz of CPU).

Regards,
Zoran

This was incredible!

Roland G. Bender, December 10, 2003 - 12:25 pm UTC


validation and auditing

dxl, June 30, 2004 - 5:35 am UTC

Tom

We have a requirement to bulk load some data from a legacy system into oracle. The data needs to be validated. The validation is not particularly complicated but we do have to audit any rows failing validation.

So first the data is passed from a java web app via jdbc, using oracle batch updating, into a staging table (possibly a gtt) that just contains a load of varchar2 fields, in order to get the data into oracle.
Then we need to move the data from the staging table into the real table with the correct datatypes, validating the data as we go and logging any rows which don't pass. Sometimes we prevent these rows from getting into the real table; in other cases, where the validation failure is only a warning, we let the row into the table but log it as a warning.

Hence we have something like: (this is a very simplified view of our tables)

create table stage (
forename varchar2(100),
surname varchar2(100),
dateofbirth varchar2(100),
postcode varchar2(100),
codeA varchar2(100));



create table myTable (
forename varchar2(30),
surname varchar2(30),
dateofbirth date,
postcode varchar2(7),
codeA number(2));



My question is what will be the most efficient method of loading into myTable? I would like to use

INSERT INTO myTable select ..validate.. from stage;

because using this in the past has proved to be amazingly quick.

However, a lot of our fields require validation using lookup tables from the database. I believe the validation is bulky and awkward enough not to do it using CASE and decodes etc., plus it also uses some lookup tables, so I was thinking that I could write lots of functions to do the validation; hence we would have:

begin

INSERT INTO myTable
select function1(forename),
function2(surname),
function3(dateofbirth),
function4(codeA),
..
.. etc
..
from stage;

end;

1) would this involve a lot of context switching? How efficient could this be?

2) Doing it this way I can put the auditing of each field into the functions, BUT how do I prevent the rows failing validation from getting into the table without the whole insert statement failing? Should I let them get in and then run another procedure afterwards to delete them??


3) Maybe my whole approach is wrong; should I just go for a cursor loop so that I can validate as I go along and easily audit any errors and warnings? This would give me complete control over which rows get into myTable, but I don't like the thought of row-by-row processing since we may be dealing with a lot of rows.

4) Would loading up plsql lookup tables be quicker instead of using the real db tables and then doing something like


INSERT INTO myTable
select (select a from plsql_tble1 where b = forename),
(select a from plsql_tble2 where b = surname),
(select a from plsql_tble3 where b = dateofbirth),
(select a from plsql_tble4 where b = codeA),
..
.. etc
..
from stage;



I would still have the same problem of logging the errors and warnings and controlling which rows get into myTable.


Any advice here would be greatly appreciated.

Thanks

Tom Kyte
June 30, 2004 - 10:12 am UTC

is this an ongoing thing or one time?
how big is the input data set (rows and megabytes)?
why are you loading from a middle tier like that? seems like external tables would be more than sufficient.


thanks for reply

dxl, June 30, 2004 - 12:27 pm UTC

Forgot to say we are on 8.1.7.4, so no external tables.
This is an ongoing process which will be implemented in a stored procedure.
The data is coming from a java app on a middle tier app server, although we may at some point try to connect directly to the legacy system using HS; but not yet, it is coming from the java app for now.

As for size, it will be done on a batch basis and I'm expecting it to be on average in the order of 10,000 to 50,000 rows. I know this is vague but that's all I know right now; I've been told it "shouldn't" ever be more than 100,000 rows, but that isn't definite either. There are probably about 40 - 50 columns to validate, and the requirement is to make it go as quick as we can!



Tom Kyte
June 30, 2004 - 1:15 pm UTC

sqlldr then.. but anyway.


bulk process the data 100 rows at a time. bulk collect limit 100 into plsql arrays, process the data, forall i insert the "good" data and log the bad data (eg: split the one array you bulk collected into two sets of arrays -- one good and one bad)




please give me some more details

dxl, July 01, 2004 - 5:34 am UTC

I'm not sure that I totally understand the whole process you've suggested. Please can you give me more details with an example?

At the moment i'm thinking that you mean something like:


1) bulk collect the table (of varchar columns) into an array of a scalar object type with a limit of 100
2) then loop through each element of the array and do the validation
3) if any elements pass validation then add to the "good" array else add to the "bad" array
4) load each good and bad array into myTable or the error table using forall


For example:

(please excuse the pseudo code, I'm just interested in the process rather than correcting syntax at the moment, although any comments on whether to use scalar types, array types or rowtypes etc. would also be helpful)


create or replace type myScalarType as object
(forename varchar2(100),
surname varchar2(100),
dateofbirth varchar2(100),
postcode varchar2(100),
codeA varchar2(100))
/


create or replace type myArrayType as table of myScalarType
/





declare
cursor c is select myScalarType( forename, surname, dateofbirth, postcode, codeA ) from stage;

l_data myArrayType;
l_good_data myTable%rowtype;
l_bad_data myTable%rowtype;

type good_Array is table of l_good_data index by binary_integer;
type bad_Array is table of l_bad_data index by binary_integer;
begin

open c;
loop
l_data := myArrayType();
fetch c bulk collect into l_data limit 100;


for i in 1 .. l_data.count loop

-- then should i do the validation here??

-- I need to do things like, for example :


select code into v1 from lookup table where code = l_data(i).codeA;
if sql%rowcount = 1 then
-- pass_validation
else
-- fail_validation
end if;

..
..
.. etc (lots more validation like this)
..
..


-- After the validation decide if it is a good row

if pass validation then
--
l_good_data := l_data(i);
else
l_bad_data := l_data(i);
end if;

..
.. etc
..

-- now build good_array and bad_array??

if good row then
good_array(n):= l_good_data ;

if bad row then
bad_array(m):= l_bad_data ;

n:= n + 1;
m:= m + 1;


end loop;


if ( good_array.count is not null )
then
forall i in 1 .. good_Array.count
insert into myTable values
( good_array(i) );
end if;


if ( bad_array.count is not null )
then
forall i in 1 .. bad_Array.count
insert into myErrorTable values
( bad_array(i) );
end if;


bad_Array := empty;
good_Array := empty;


exit when c%notfound;
end loop;
close c;
end;



Is this the kind of processing that you meant?? Or have I got the wrong end of the stick!?

Tom Kyte
July 01, 2004 - 10:59 am UTC

in this example -- mod() is my function to determine goodness or badness.  

this is 9ir2.

ops$tkyte@ORA9IR2> create table t as select * from big_table.big_table where rownum < 200;
 
Table created.
 
ops$tkyte@ORA9IR2> create table good as select * from t where 1=0;
 
Table created.
 
ops$tkyte@ORA9IR2> create table bad as select * from t where 1=0;
 
Table created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> declare
  2      type array is table of t%rowtype index by binary_integer;
  3      l_source   array;
  4      l_good     array;
  5      l_bad      array;
  6
  7      cursor c is select * from t;
  8  begin
  9      open c;
 10      loop
 11          fetch c bulk collect into l_source limit 100;
 12          l_good.delete;
 13          l_bad.delete;
 14          for i in 1 .. l_source.count
 15          loop
 16              if (mod( l_source(i).id,2 ) = 0 )
 17              then
 18                  l_good(l_good.count+1) := l_source(i);
 19              else
 20                  l_bad(l_bad.count+1) := l_source(i);
 21              end if;
 22          end loop;
 23          forall i in 1 .. l_good.count
 24              insert into good values l_good(i);
 25          forall i in 1 .. l_bad.count
 26              insert into bad values l_bad(i);
 27          exit when c%notfound;
 28      end loop;
 29  end;
 30  /
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select count(*) from t;
 
  COUNT(*)
----------
       199
 
ops$tkyte@ORA9IR2> select count(*) from good;
 
  COUNT(*)
----------
       100
 
ops$tkyte@ORA9IR2> select count(*) from bad;
 
  COUNT(*)
----------
        99
 

excellent

dxl, July 02, 2004 - 4:15 am UTC

Thanks very much for that.

I am on 8.1.7.4, so will your method work, or do I just need to replace the record types with object types??

Tom Kyte
July 02, 2004 - 9:51 am UTC

with an array per column I would suggest. You didn't have record operations in forall in 8i -- you need an array per column.
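
For illustration, a minimal 8i-style sketch of the array-per-column approach against the stage/myTable example above (the length check stands in for the real validation and is an assumption here):

declare
    type vc_array is table of varchar2(100) index by binary_integer;

    l_forename   vc_array;
    l_surname    vc_array;
    l_good_fname vc_array;
    l_good_sname vc_array;

    cursor c is select forename, surname from stage;
begin
    open c;
    loop
        fetch c bulk collect into l_forename, l_surname limit 100;
        l_good_fname.delete;
        l_good_sname.delete;

        for i in 1 .. l_forename.count
        loop
            -- hypothetical validation rule: both fields must fit the
            -- 30-character columns of the real table
            if length(l_forename(i)) <= 30 and length(l_surname(i)) <= 30
            then
                l_good_fname(l_good_fname.count+1) := l_forename(i);
                l_good_sname(l_good_sname.count+1) := l_surname(i);
            end if;
        end loop;

        forall i in 1 .. l_good_fname.count
            insert into myTable ( forename, surname )
            values ( l_good_fname(i), l_good_sname(i) );

        exit when c%notfound;
    end loop;
    close c;
end;
/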

Bulk collecting into an associative array?

A reader, July 23, 2004 - 7:14 pm UTC

How can I bulk collect into an associative array?

I have a lookup table containing code (vc 10) and value (vc2 100).

Can I

type assoc_t is table of varchar2(100) index by varchar2(10);
mytab assoc_t;

select code,value BULK COLLECT into mytab from lookup_table;

How can I do this without resorting to "slow-by-slow" processing?

Thanks

[Sorry, I don't have my database in front of me right now to try this]

Tom Kyte
July 23, 2004 - 8:19 pm UTC

you cannot right now. you bulk collect the "subscripts" into array1, the "subscriptee" into array2, and then you have to build array3 by assigning array2 to array3:

for i in 1 .. array1.count
loop
    associative_array3(array1(i)) := array2(i);
end loop;
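
Put together as a self-contained block against the code/value lookup table described above (a sketch assuming 9iR2 or later, since it needs an index-by-varchar2 array):

declare
    type assoc_t is table of varchar2(100) index by varchar2(10);
    type code_t  is table of varchar2(10)  index by binary_integer;
    type val_t   is table of varchar2(100) index by binary_integer;

    l_codes  code_t;   -- the future subscripts
    l_values val_t;    -- the future elements
    mytab    assoc_t;
begin
    select code, value bulk collect into l_codes, l_values
      from lookup_table;

    for i in 1 .. l_codes.count
    loop
        mytab( l_codes(i) ) := l_values(i);
    end loop;

    -- a lookup is then simply:  mytab( some_code )
end;
/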


Magic?

A reader, August 02, 2004 - 11:14 pm UTC

Refer to one of your earlier followups...

"I threw in that last query to show you that there is "magic" happening with select of selects. See the number of consistent gets on that last query? Very very close to the others. There are optimizations happening here that go beyond what we can see"

Would you please explain what is going on here?

How is it possible for a FTS of EMP to have the same consistent gets as

select deptno, (select dname from dept where
dept.deptno = emp.deptno)
from emp

Basically, the DEPT access is free!! How is this possible?

Thanks

Tom Kyte
August 03, 2004 - 8:09 am UTC

it is not free -- it is however cached.

the scalar subquery -- the (select dname from dept where dept.deptno=emp.deptno) is "stripped out" and turned into:

select dname from dept where dept.deptno = :bv


the server then starts fetching data -- gets the first row from EMP, sees "DEPTNO 10" and runs that scalar subquery.  It then "saves this result in a small cache" -- so in a small cache we have "bv=10, dname=<xxxxx>".

Next it fetches 20, then 30, then 40, then bv gets NULL. Then back to 10, 20, 30, 40, null, and so on (because of the way I loaded the data in this example).

This small cache is able to hold the results of the 10, 20, 30, 40, null results so the scalar subquery is fired off just 5 times -- resulting in few logical IO's.

with scalar subqueries -- the "order of the data" can be very relevant.  consider:



ops$tkyte@ORA9IR2> set echo on
ops$tkyte@ORA9IR2> set linesize 121
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> define nvalues=100
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create or replace function f(x in number) return number
  2  as
  3  begin
  4      dbms_application_info.set_client_info( userenv('client_info')+1 );
  5      return 42;
  6  end;
  7  /
 
Function created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> drop table t;
 
Table dropped.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create table t
  2  as
  3  select mod(rownum,&nvalues) r from all_objects;
old   3: select mod(rownum,&nvalues) r from all_objects
new   3: select mod(rownum,100) r from all_objects
 
Table created.
 
<b>in this table, the data will be fetched 1,2,3,4,...99,0,1,2,3...,99,0,1...

that is, the r values do not repeat from row to row but rather every 100 rows.  Now:</b>


ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_application_info.set_client_info(0);
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> set autotrace traceonly
ops$tkyte@ORA9IR2> select r, (select f(r) from dual) from t;
 
32160 rows selected.
 
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (FULL) OF 'DUAL'
   2    0   TABLE ACCESS (FULL) OF 'T'
 
 
 
 
Statistics
----------------------------------------------------------
         22  recursive calls
          0  db block gets
      25556  consistent gets
         49  physical reads
          0  redo size
     516751  bytes sent via SQL*Net to client
      24072  bytes received via SQL*Net from client
       2145  SQL*Net roundtrips to/from client
          5  sorts (memory)
          0  sorts (disk)
      32160  rows processed
 
ops$tkyte@ORA9IR2> set autotrace off
ops$tkyte@ORA9IR2> select userenv('client_info') from dual;
 
USERENV('CLIENT_INFO')
----------------------------------------------------------------
7783

<b>we can see our function was called 7,783 times there.  Taking the same exact set of data but "having it reside on disk in sorted order":</b>

 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> truncate table t;
 
Table truncated.
 
ops$tkyte@ORA9IR2> insert into t select mod(rownum,&nvalues) r from all_objects order by mod(rownum,&nvalues);
old   1: insert into t select mod(rownum,&nvalues) r from all_objects order by mod(rownum,&nvalues)
new   1: insert into t select mod(rownum,100) r from all_objects order by mod(rownum,100)
 
32160 rows created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_application_info.set_client_info(0);
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> set autotrace traceonly
ops$tkyte@ORA9IR2> select r, (select f(r) from dual) from t;
 
32160 rows selected.
 
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (FULL) OF 'DUAL'
   2    0   TABLE ACCESS (FULL) OF 'T'
 
 
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       2506  consistent gets
          0  physical reads
          0  redo size
     420930  bytes sent via SQL*Net to client
      24072  bytes received via SQL*Net from client
       2145  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      32160  rows processed
 
ops$tkyte@ORA9IR2> set autotrace off
ops$tkyte@ORA9IR2> select userenv('client_info') from dual;
 
USERENV('CLIENT_INFO')
----------------------------------------------------------------
103


<b>only 103 calls (the cache worked really well this time as we basically only needed to call the function pretty much once per bind variable value).

question would then be "does data have to be sorted on disk for this to work?" and the answer is no, an inline view that sorts the data by the values you will be binding into the scalar subquery can do this:</b>


ops$tkyte@ORA9IR2> truncate table t;
 
Table truncated.
 
ops$tkyte@ORA9IR2> insert into t
  2  select mod(rownum,&nvalues) r from all_objects;
old   2: select mod(rownum,&nvalues) r from all_objects
new   2: select mod(rownum,100) r from all_objects
 
32160 rows created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_application_info.set_client_info(0);
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> set autotrace traceonly
ops$tkyte@ORA9IR2> select r, (select f(r) from dual)
  2    from (select r from t order by r);
 
32160 rows selected.
 
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (FULL) OF 'DUAL'
   2    0   VIEW
   3    2     SORT (ORDER BY)
   4    3       TABLE ACCESS (FULL) OF 'T'
 
 
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        353  consistent gets
          0  physical reads
          0  redo size
     420891  bytes sent via SQL*Net to client
      24072  bytes received via SQL*Net from client
       2145  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
      32160  rows processed
 
ops$tkyte@ORA9IR2> set autotrace off
ops$tkyte@ORA9IR2> select userenv('client_info') from dual;
 
USERENV('CLIENT_INFO')
----------------------------------------------------------------
100



<b>and here is the challenge du-jour..... why did the consistent gets go way way down on this last query as opposed to the one right before :)</b>


 

Challenge du-jour!

A reader, August 03, 2004 - 9:32 am UTC

Because the 'select r from t order by r' filled up the cache? Argh, I give up!

This was simply brilliant. You are single-handedly changing the world in how Oracle technologies are effectively used.

Keep up the good work (and do post the solution to your challenge!)

Challenge Du-jour

Bill, August 03, 2004 - 9:51 am UTC

Is it possible the consistent gets go way down because of the order by clause in the subquery? Still trying to get the hang of how this works, but very educational!

Tom Kyte
August 03, 2004 - 10:06 am UTC

Here is a "hint" of sorts:



ops$tkyte@ORA9IR2> create table t as select rownum r from all_objects;
Table created.
 
ops$tkyte@ORA9IR2> set autotrace traceonly statistics;
ops$tkyte@ORA9IR2> set arraysize 15   
<b> /* the default... */</b>

ops$tkyte@ORA9IR2> select * from t;
32174 rows selected.
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets<b>
       2197  consistent gets</b>
         49  physical reads
          0  redo size
     448960  bytes sent via SQL*Net to client
      24083  bytes received via SQL*Net from client
       2146  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      32174  rows processed
 
ops$tkyte@ORA9IR2> select * from t <b>order by r;</b>
 
32174 rows selected.
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets<b>
         52  consistent gets</b>
          0  physical reads
          0  redo size
     448960  bytes sent via SQL*Net to client
      24083  bytes received via SQL*Net from client
       2146  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
      32174  rows processed
 <b>
ops$tkyte@ORA9IR2> set arraysize 5000</b>
ops$tkyte@ORA9IR2> select * from t;
 
32174 rows selected.
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets<b>
         59  consistent gets</b>
          0  physical reads
          0  redo size
     203788  bytes sent via SQL*Net to client
        565  bytes received via SQL*Net from client
          8  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      32174  rows processed
 
ops$tkyte@ORA9IR2> set autotrace off
 

order by

A reader, August 03, 2004 - 10:26 am UTC

So, the 'order by' automagically turns on array processing on the server side? Even so, wouldn't the client (sqlplus) arraysize (15) still limit the throughput?

Still not getting it!

Tom Kyte
August 03, 2004 - 10:32 am UTC

Nope... not quite.

think about the definition of a consistent get -- where we get that data from vs what might happen with "temp data"

Challenge Du-jour - take two

Bill, August 03, 2004 - 10:47 am UTC

Hmmm....further up in this thread, you stated the following:
"I threw in that last query to show you that there is "magic" happening with
select of selects. See the number of consistent gets on that last query? Very
very close to the others".

So, maybe it has to do with the select of the select?


Challenge Du-jour Take Two yet again

Bill, August 03, 2004 - 10:56 am UTC

Oops - just realized that that would not explain the last example you gave. Back to the drawing board...

Challenge Du-jour : I think I am on the right track now!

Bill, August 03, 2004 - 11:05 am UTC

It has to do with the array fetches. They reduce the consistent gets.

Challenge du-jour!

A reader, August 03, 2004 - 1:30 pm UTC

"think about the definition of a consistent get -- were we get that data from vs what might happen with "temp data""

I give up. consistent get is a point-in-time current-mode read. It comes from the buffer cache, it may have to incur a PIO in order to get the block into the cache in the first place.

temp data comes from tempfiles (if you are using them).

I still don't see how adding an order by to the inline view had the same effect as increasing the client's arraysize! Or was that just a red herring?!

Tom Kyte
August 03, 2004 - 5:46 pm UTC

when we fetched 15 rows by 15 rows without sorting.... we would

a) get a block
b) get 15 rows from block
c) give up block
d) goto a)

problem was -- there were about 500 or 600 rows per block here! We would get block "1" 40 times, block "2" 40 times and so on.

setting the arraysize to 5000 meant we would get block 1 once, block 2 once, and so on (in order to get the first 5000 rows). We might have gotten block 10 or 11 twice -- but in essence, each block would be "consistent gotten" once -- instead of 40 or more times.

that is why the difference with array fetch of 15 vs 5000.

So, how did the sort affect that? In order to get the FIRST row, we had to get ALL rows and sort them. So the first row got all of the blocks -- put them into temp and sorted them. We returned the rows from there without having to use consistent gets on each and every fetch call -- since they were copied on the first fetch into temp.

Great!

A reader, August 03, 2004 - 8:46 pm UTC

Hm, great, that was very useful, thanks, but that raises a few more questions

"put them into temp and sorted them. We returned the rows from there without having to use consistent gets on each and every fetch call"

1. So the table had about 59 blocks and they were all pulled into the buffer cache by the sort operation. The "fetch" phase of the query simply returned rows from there instead of doing more consistent gets for them. So the sort essentially "subsidized" the cost of the fetch operation! Nice.

This is a (good) side-effect of all sort-operations, right?
i.e. whenever you sort, you already get all the data you need before starting to output the first row.

"that is why the difference with array fetch of 15 vs 5000"

2. you have shown numerous times the benefit of using higher arraysizes.

If higher arraysizes are always beneficial, why doesn't Oracle make that the default behaviour for all "fetches", sqlplus or otherwise? (well, let's stick to sqlplus for this discussion).

I mean, getting a block, returning 15 rows and "throwing away" the block isn't terribly smart. If the user has set arraysize 15, he doesn't know what is good for him. Just bump it up without telling the user. It's not exactly like "Open the pod bay doors Hal. I'm sorry Dave, I'm afraid I can't do that", is it?

3. Why does the example with the 59 consistent gets have

203788 bytes sent via SQL*Net to client

I can understand that higher arraysizes will reduce the "SQL*Net roundtrips to/from client" number, but the answer is the same, so the "bytes sent via SQL*Net to client" should be the same, right?

Thanks

Tom Kyte
August 04, 2004 - 8:37 am UTC

1) the sort pulled all 50 some blocks from the cache into the "sort area size" allocated in the UGA -- once in the sort area, they never needed another "consistent get", they were already there in the sessions memory. If the sort area exceeded the sort area retained OR the sort area was not large enough -- we would have swapped to disk.

So, caveat emptor - do not run out and throw bogus order by's on queries that do not need them. Also -- a sort performed by an index access (skipping the sort actually) won't have this effect.

2) they are not always beneficial. I find 100 to be sort of a sweet spot. sqlplus is "just a very very simple tool" and has the method to change the default. The real problem is code like VB or Java -- there the default is "i don't know" for VB and 10 for jdbc - 10 is probably too small and the preFetch should change. For PLSQL, in 10g they made a default of 100 for prefetching for implicit cursors (was 1 -- unless you used bulk collect)


3) every round trip incurs protocol layers (metadata so we understand what we just sent to ourselves), less round trips, less protocol. Here because the data was so darn "small" (a simple number) the protocol was as large as the data itself

PGA usage by assoc arrays

A reader, December 13, 2004 - 2:09 pm UTC

Following the thread on this page... Let's say I prove that my system would benefit from the associative array approach, i.e. use package instantiation code to bulk collect 2 arrays, load up the assoc array, and then simply use that instead of making database calls.

What are the memory implications of doing this?

Would all sessions using this package use as much memory as needed by the assoc array? Or do some optimizations kick in?

If I am using dedicated server and pga_aggregate_target, where is all this memory used? In each session's PGA? Would that be on the server or on the client?

Any way to reduce the memory usage?

Thanks


Tom Kyte
December 13, 2004 - 2:40 pm UTC

all sessions have their own data segment, they would each get a copy of the array in their address space.

it would be in the UGA in the PGA under dedicated server and in the UGA in the SGA under shared server and is not subject to the pga_aggregate_target which only really affects dynamic workareas like sort areas, hash areas.

the only way to reduce memory usage is to not use memory ;)

if at all possible, incorporate the "lookup" right in your SQL itself, that'll be faster still and not consume ram in the dedicated servers.
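
For example, rather than each session caching a description array, the lookup can simply ride along in the query itself (table and column names here are hypothetical):

select t.txn_id, t.amount, l.description
  from transactions t, lookup_codes l   -- the lookup is just a join
 where l.code = t.code;

(or an outer join, if some codes may legitimately be missing).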

Sorry I have a large backlog right now, please ask a question later

Tanweer, December 22, 2004 - 6:38 am UTC

Sorry, I am not able to send a fresh question because I got the message (I have a large backlog right now, please ask a question later).

How do I execute packages, procedures & functions at a given system time?
Suppose I want to execute (1) a package at 3:00 A.M., (2) a procedure at 4:00 A.M. & (3) a function at 4:30 A.M.

Can you give me the syntax/code, using my procedure/package names?

I have given the Package & Procedure name for example:
(1)create or replace package lob_client_operations as

function get_lob_size(p_tablename in varchar2,
p_pkcolname in varchar2,
p_textcolname in varchar2,
p_docid in varchar2) return number;

procedure get_lob_chunk ( p_tablename in varchar2,
p_pkcolname in varchar2,
p_textcolname in varchar2,
p_docid in varchar2,
amount IN OUT NUMBER,
offset IN NUMBER,
buff OUT RAW);


procedure insert_new_record ( p_tablename in varchar2,
p_pkcolname in varchar2,
p_textcolname in varchar2,
p_docid in varchar2);

end;
/
-- ---------------------------- ---
-- ---------------------------- ---

(2)
(a)CREATE or REPLACE PROCEDURE SimpleTextMessage IS
(b) create or replace procedure html_email( p_to in varchar2, p_from in varchar2, p_subject in varchar2,
p_text in varchar2 default null) is
-- ---------------------------- ---
-- ---------------------------- ---

(3) create or replace function send(
p_from in varchar2,
p_to in varchar2,
p_subject in varchar2,
p_body in varchar2) return number as

-- ---------------------------- ---
-- ---------------------------- ---

Thanks in Adv


Tom Kyte
December 22, 2004 - 9:53 am UTC

hmm, do you understand the meaning of that message?


search this site for dbms_job.


doesn't matter if it is a package, procedure, function, whatever -- you just need a job scheduler.
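
A minimal dbms_job sketch (the package.procedure name is just a placeholder) scheduling a daily 3:00 A.M. run:

declare
    l_job binary_integer;
begin
    dbms_job.submit( job       => l_job,
                     what      => 'my_pkg.my_procedure;',          -- placeholder call
                     next_date => trunc(sysdate)+1 + 3/24,         -- tomorrow, 3:00 A.M.
                     interval  => 'trunc(sysdate)+1 + 3/24' );     -- and every day thereafter
    commit;
end;
/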

When is it necessary to use PL/SQL tables? Version ORACLE 8i

Stivi, December 23, 2004 - 1:56 pm UTC

Tom, hello!
As I understood it, in procedural code it is more efficient to use SQL statements with joins (because they work faster) than lookups in arrays, but when is it good to use PL/SQL tables instead of SQL? Could you show an example where a PL/SQL table is preferable to a SQL statement?
Thank you.
Stivi


Tom Kyte
December 23, 2004 - 2:43 pm UTC


I don't have any :)

If you feel you MUST code procedurally, that you MUST do it in procedural code -- then a PLSQL lookup table is pretty darn efficient.

my mantra:

if you can do it in a single sql, do it.
if not do it in as little plsql as possible...

QUESTION ABOUT INSERT IN SQL STATEMENT

DAV, December 23, 2004 - 2:55 pm UTC

Thank you Tom,
I have an insert statement that I do in procedural code:
insert into table_try (select 1,cstmr_no,func(1,ctmr_no)day
from customers
where bsnss_no=:param_code
and retuirning_no!=1);
I prefer SQL for the insert and not PL/SQL code because you said it is faster. But I don't want to insert a record if
func(1,ctmr_no) returns null or the select doesn't return anything. How can I do it?
Thank you very much.
DAV


Tom Kyte
December 23, 2004 - 3:48 pm UTC

if select doesn't return anything -- what could possibly be inserted? :) that is OK.

add a where clause to select only those rows you actually want to insert.

Insert_continue

A reader, December 23, 2004 - 3:57 pm UTC

Tom, thank you very much.

in (select 1,cstmr_no,func(1,ctmr_no)day
from customers
where bsnss_no=:param_code
and retuirning_no!=1);

FUNC(1,cstmr_no) accesses another table,
and if func(1,cstmr_no) is null I don't want the row to be inserted. But if I add "func(1,cstmr_no) is not null" to the where clause,
the func will be executed twice -- once in the select and once in the where.
Am I right?
If yes, how do I improve it?
Have a nice day.



Tom Kyte
December 23, 2004 - 4:36 pm UTC

might be, might not be


select 1, cstmr_no, day
  from (select 1,cstmr_no,func(1,ctmr_no)day, ROWNUM
          from customers  
         where bsnss_no=:param_code 
           and retuirning_no!=1)
 where day is not null;


ops$tkyte@ORA9IR2> create or replace function f( x in number, y in number ) return number
  2  as
  3  begin
  4          dbms_application_info.set_client_info(userenv('client_info')+1);
  5      return x;
  6  end;
  7  /
 
Function created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_application_info.set_client_info(0)
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> select empno,f(sal,comm)
  2    from emp
  3   where f(sal,comm) is not null;
 
     EMPNO F(SAL,COMM)
---------- -----------
      7369         800
      7499        1600
      7521        1250
      7566        2975
      7654        1250
      7698        2850
      7782        2450
      7788        3000
      7839        5000
      7844        1500
      7876        1100
      7900         950
      7902        3000
      7934        1300
 
14 rows selected.
 
ops$tkyte@ORA9IR2> select userenv('client_info') from dual;
 
USERENV('CLIENT_INFO')
----------------------------------------------------------------
28
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_application_info.set_client_info(0)
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> select empno, f
  2   from (
  3  select empno,f(sal,comm) f, rownum r
  4    from emp
  5         )
  6   where f is not null;
 
     EMPNO          F
---------- ----------
      7369        800
      7499       1600
      7521       1250
      7566       2975
      7654       1250
      7698       2850
      7782       2450
      7788       3000
      7839       5000
      7844       1500
      7876       1100
      7900        950
      7902       3000
      7934       1300
 
14 rows selected.
 
ops$tkyte@ORA9IR2> select userenv('client_info') from dual;
 
USERENV('CLIENT_INFO')
----------------------------------------------------------------
16
 
ops$tkyte@ORA9IR2>
 

Procedural mindset

Bob, January 07, 2005 - 12:10 pm UTC

Tom,
Very helpful and insightful information. I've been reading your book and I am now trying to break out of the procedural mindset. The one question this thread doesn't answer for me is this: I SQL*Load a table, then I need to validate each record from that table against two other tables to determine if it should be inserted into the main table. If the validation fails, an error needs to be logged in a separate table. The structure of all the tables is different, and it's version 8.1.7. The idea of using a good and bad table covers most of it, but how do you log the error associated with the bad records?


Tom Kyte
January 08, 2005 - 3:41 pm UTC

what is the validation? I always try to let the database do my work for me, why can't it do it here?

Procedural mindset - resolved

Bob, January 08, 2005 - 8:30 am UTC

After further reading and testing it's starting to make sense.

Thanks for the excellent site.


You can't think of one reason?

bob mourning, February 19, 2005 - 3:57 pm UTC

You could not come up with a single reason to ever use a PL/SQL table over SQL?

How about when the table one must read does not belong to the reader and it is poorly indexed? Then it might make sense to do the select one time and efficiently access the data where possible via a PL/SQL table indexed by binary_integer.

There are other solutions of course, but poorly performing SQL run 10M times would not work.

Also, it may be possible to limit the candidate values up front. For example, if my transaction data has only 12 lookup keys and the lookup table has 2M rows, it might make sense to go against 12 rows in a PL/SQL table versus using a non-unique index.

BTW, I think your posts are generally excellent, but sometimes you are just arrogantly dogmatic.

Tom Kyte
February 19, 2005 - 5:26 pm UTC

Hmm, let's see, dogmatic is a recursively defined thing

Main Entry: dogmatic
Pronunciation: dȯg-ˈma-tik, däg-
Variant(s): also dogmatical /-ti-kəl/
Function: adjective
1 : characterized by or given to the use of dogmatism <a dogmatic critic>
2 : of or relating to dogma

so, we must dig deeper to find the meaning:

Main Entry: dogmatism
Pronunciation: 'dog-m&-"ti-z&m, 'dg-
Function: noun
1 : positiveness in assertion of opinion especially when unwarranted or arrogant
2 : a viewpoint or system of ideas based on insufficiently examined premises

If you choose number 1, then your use of arrogantly would be redundant. It cannot be "unwarranted", for I tend to put the proof of what I talk about out there for all to see. It definitely cannot be #2.

but actually, wait, it cannot be #1, because that includes "opinion". All I asked was "where is the proof here". I'm not offering opinion, but rather hard facts -- with numbers, with code.

Anyway, I did give a bunch of them in "Effective Oracle by Design" as well -- spent the better part of a chapter on this topic. The answer came down to "just join" (that would make a nice bumper sticker I think -- it would mean different things to different people but it has a nice ring to it. Maybe "databases were born to join", but then the print would have to be pretty small to fit it all)


but hey -- I even gave a case, an example of using a plsql based lookup table versus querying repeatedly -- I agree, that can be faster, HOWEVER, I also (and will keep doing so over and over) point out that you've only gone a tiny fraction of the way to the right answer which in most cases is "just join darn it, databases were born to join"


Maybe you did not read the entire answer, the entire discussion, the give and take.


So, you can drop the "arrogantly" -- it is sort of implied in the definition of the term you are using; but I do believe the term was used incorrectly, especially if you take into consideration the definition of


Main Entry: dogma
Pronunciation: 'dog-m&, 'dg-
Function: noun
Inflected Form(s): plural dogmas also dogmata /-m&-t&/
Etymology: Latin dogmat-, dogma, from Greek, from dokein to seem -- more at DECENT
1 a : something held as an established opinion; especially : a definite authoritative tenet b : a code of such tenets <pedagogical dogma> c : a point of view or tenet put forth as authoritative without adequate grounds
2 : a doctrine or body of doctrines concerning faith or morals formally stated and authoritatively proclaimed by a church

"established opinion" -- nope.
"without adequate grounds" -- nope
and while database conversations sometimes take on a religious fervor -- i don't think #2 applies either.

Just my 2cents -- and yes, I'll keep saying the same things over and over and over and maybe they'll stick (use binds, use binds, just join, databases are really good at inserting too -- audit trails are good, every other line of code should be DEBUG (trace, instrumentation) code -- all of these things people don't do, i'll just keep plugging away at it)

(yes, this answer is tongue in cheek -- actually, I just read "arrogantly dogmatic" as "consistent and committed to provide the information with proof")

Which would be faster

hawk, July 06, 2005 - 12:03 pm UTC

Dear Tom

Please advise which of the 2 methods would be efficient and faster.

Aim: I need to insert data into a table (table P) on a daily basis, and I also need to insert that data, in a modified form, into another table (t).
I would be using data from table p for certain other purposes


method 1:
---------
For curref in (select a.*, b.* from a, b, c, d, e, f
where a.column = b.column
and some join conditions among all 6 tables)
loop
insert into p values (abv selected)

insert into t values (abv selected some cals)
insert into t values (abv selected but some diff cals)
end loop;

Method 2:
---------
insert into p (select a.*, b.* from a, b, c, d, e, f
where a.column = b.column
and some join conditions among all 6 tables);

for cur_ref in (select * from p where date_time = sysdate)
loop
insert into t values (abv selected)
insert into t values (abv selected but some diff cals)
end loop;


Please advise which of the above 2 methods is efficient and faster.

Thankyou.

Tom Kyte
July 06, 2005 - 12:46 pm UTC

method three, multi-table insert, no code of course.

insert
when (1=1) then into p ( ... ) values ( ... )
when (whatever) then into t ( ... ) values ( ... )
.....
select a.*, b.* from a,b, c,d.......



A note

Hawk, July 06, 2005 - 1:36 pm UTC

Tom,
Great! But one thing: every row in the select needs to get to all the tables, hence there wouldn't be any when clause.

I hope this (your suggested method 3) is still the best way to do it.
I hope still this(ur suggested method 3) is the best way to do it.

Please comment.

Thankyou



Tom Kyte
July 06, 2005 - 2:30 pm UTC

sure is. read about multi-table inserts -- there is an INSERT ALL ..... as well, with no when clauses.
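
A minimal sketch of that unconditional form (tables and columns made up), where every source row goes to all the targets:

insert all
  into p ( c1, c2 ) values ( v1, v2 )          -- every row goes to p
  into t ( c1, c2 ) values ( v1, v2 * 2 )      -- ... and twice to t,
  into t ( c1, c2 ) values ( v1, v2 * -1 )     -- with different calculations
select x.c1 v1, x.c2 v2
  from x;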

Thankyou very much TOM

Hawk, July 06, 2005 - 3:03 pm UTC


very interesting

Muhammad Ibrahim, July 06, 2005 - 10:43 pm UTC

Dear Tom,

I have a situation as below

create table t ( id number, from_date date, to_date date, amount number, expired varchar2(1));

I want to take a copy of the table t by id (column),
process this data in memory, then write back
the same/new set of data to the table t and mark the
previous set as old. The processing includes many for
loops and validations, comparisons, lookups etc...
Here is my question: to process the data, which is better --
a PL/SQL table or a global temporary table?
If I use a temp table I need to do a lot of inserts and updates
for the processing, and sometimes deletes too, possibly inside
for loops. Insert is fine, but I am worried about the updates and
deletes. Or is a PL/SQL table better for this kind of situation?

Regards,
Ibrahim.


Tom Kyte
July 07, 2005 - 9:04 am UTC

you missed the most important bit of data

how big.

and databases are really good at doing things -- lookups, for example, are called joins (they do them particularly well), as are validations and comparisons. You may well find you can actually do a ton of stuff in sql.

plsql table or global temp table

Muhammad Ibrahim, July 08, 2005 - 5:27 am UTC

eg:
----

main table which keeps the transactions:
---------------------------------------
create table cessions ( cession_id number(20) not null,
start_date date not null,
ri_mode varchar2(10) not null,
end_date date not null,
pol_group_id number(10),
class_id number(10),
arrangement_id number(10),
sequence_number number(10),
premium number,
exch_rate number,
overwritten varchar2(1),
apply_copy varchar2(1));

alter table cessions add ( constraint cessions_pk primary key(cession_id,start_date,ri_mode));

global temp table for processing:
--------------------------------
create global temporary table cessions_tmp ( cession_id number(20) not null,
start_date date not null,
ri_mode varchar2(10) not null,
end_date date not null,
pol_group_id number(10),
class_id number(10),
arrangement_id number(10),
sequence_number number(10),
premium number,
exch_rate number,
overwritten varchar2(1),
apply_copy varchar2(1))
on commit preserve rows;

alter table cessions_tmp add ( constraint cessions_tmp_pk
primary key(cession_id,start_date,ri_mode));

procedure merge_old_situation(pol_group_id in number)
is
begin
delete cessions_tmp;
for rpol in ( select pol_group_id
from main_pol
where ..) -- this can be one or more
--(select criteria may return min 1000 or more)
loop
insert into cessions_tmp( .... )
select ...
from cessions
where pol_group_id = rpol.pol_group_id;
end loop;

end merge_old_situation;

procedure process_re(pol_group_id in number)
is
begin
-- i don't want to manipulate the transaction table directly.
-- So i take a mirror image of the original
-- table into the temp table by some conditions (not all the data).

merge_old_situation(pol_group_id);

for rintervals in ( select start_date, end_date
from intervals ) -- this is also a global temp table which
-- will have a set of intervals
-- (one or more, maybe 365 intervals per
-- year).
loop
for rexisting_pol in ( select ...
from another_table
where rintervals.start_date between start_date
and end_date )
-- this another_table is a big table but the select
-- criteria may return min 100 or more
loop
-- here i take the new data which comes from rexisting_pol then
-- compare with cessions_tmp table
-- insert or update depending on the necessity
update cessions_tmp
...
if sql%rowcount = 0 then
insert into cessions_tmp
...
end if;
for rsomething in (select ..
from someother_bigtable -- more than a million records
where rintervals.start_date between start_date
and end_date )
loop
update cessions_tmp
...
if sql%rowcount = 0 then
insert into cessions_tmp
...
end if;
end loop;
end loop;

end loop;

-- finally i do merge for the new one with the old one based
-- on the primary key columns
merge into cessions a
using cessions_tmp b
on ( a.cession_id = b.cession_id
and a.start_date = b.start_date
and a.ri_mode = b.ri_mode )
when matched then
update set a.end_date = b.end_date
...
when not matched then
insert (a.cession_id, a.....)
values (b.cession_id,b.... );
end process_re;


-- so my question is: in the above scenario, is using a global temp
-- table (cessions_tmp) better, or is a plsql table better? please
-- tell us your opinion.
-- note: this is an online transaction!

Tom Kyte
July 08, 2005 - 7:54 am UTC

for rpol in ( select pol_group_id
from main_pol
where ..) -- this can be one or more
--(select criteria may return min 1000 or more)
loop
insert into cessions_tmp( .... )
select ...
from cessions
where pol_group_id = rpol.pol_group_id;
end loop;

that should be a single sql statement -- why write code?
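
i.e., the loop collapses to one statement, something along these lines (the elided lists stay elided, as in the original):

insert into cessions_tmp ( .... )
select ...
  from cessions
 where pol_group_id in ( select pol_group_id
                           from main_pol
                          where .. );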


I've no clue if you have 1 row or 1000000000 rows or something in between.


My main concern for performance would be "how to remove the procedural slow by slow code and do as much in a single sql statement as possible"

using seq can cause issues in insert all

Hawk, July 08, 2005 - 11:02 am UTC

Tom,
I am using insert all as suggested and it looks great, but....
insert all
into t1 values(a,b,c...)
into t2 values (seq.nextval, d, f)
into t2 values (seq.nextval, d* -1 , f+1)
(select a, b, c, d, f
from
pp, qq
where ,....conditions)

The issue above is that I am using the same sequence for 2 different rows going into the same table, hence the same seq.nextval goes into both rows, which creates duplicates in my table t2.
Limitation: I cannot use 2 different sequences.

Please suggest.

Thankyou.

Tom Kyte
July 08, 2005 - 12:55 pm UTC

I can only suggest you revisit your "limitation" or create the sequence as


create sequence seq increment by 2;

and use seq.nextval, seq.nextval+1

nextval is defined to be consistent that way -- constant -- if you refer to it a dozen times in a sql statement -- it'll be the same value a dozen times (same with currval). It is as if it were "bound" in -- like it was a bind variable row by row.
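
Applied to the insert all from the question, that would look something like this (keeping the elided conditions as they were):

create sequence seq increment by 2;

insert all
  into t1 values ( a, b, c )
  into t2 values ( seq.nextval,     d,      f )       -- same value in every reference...
  into t2 values ( seq.nextval + 1, d * -1, f + 1 )   -- ...so +1 makes the second row distinct
select a, b, c, d, f
  from pp, qq
 where ....conditions;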

This works

Hawk, July 08, 2005 - 12:44 pm UTC

Dear Tom,
I hope this works

function seq_nextval(seqname in varchar2)
return number is
seq_next number;
begin
-- select 'seqname'||.nextval into seq_next from dual;
execute immediate 'select '||seqname||'.nextval from dual'
into seq_next ;
return seq_next ;
end; -- function seq_nextval



Tom Kyte
July 08, 2005 - 1:06 pm UTC

not reliably. a plan change can and will change the number of times PLSQL called from SQL gets invoked. Not recommended -- you are relying on the function being called once per reference, and that behavior can change over time.


the dynamic sql would not be necessary.

how's this?

Tyler, July 08, 2005 - 3:26 pm UTC

a possible solution to Hawk's problem? 

SQL> drop sequence test_seq ;

Sequence dropped.

SQL> create sequence test_seq start with 1 increment by 1;

Sequence created.

SQL>
SQL> create or replace function test_the_seq return number as
  2     hold_it number;
  3  begin
  4     select test_seq.nextval into hold_it from dual;
  5  return hold_it;
  6  end;
  7  /

Function created.

SQL>
SQL>
SQL> SELECT test_the_seq, test_the_seq from dual;

TEST_THE_SEQ TEST_THE_SEQ
------------ ------------
           1            2

SQL> SELECT test_the_seq, test_the_seq from dual;

TEST_THE_SEQ TEST_THE_SEQ
------------ ------------
           3            4

SQL> SELECT test_the_seq, test_the_seq from dual;

TEST_THE_SEQ TEST_THE_SEQ
------------ ------------
           5            6

SQL> SELECT test_the_seq, test_the_seq from dual;

TEST_THE_SEQ TEST_THE_SEQ
------------ ------------
           7            8

SQL> SELECT test_seq.nextval from dual;

  NEXTVAL
---------
        9
 

Tom Kyte
July 08, 2005 - 6:03 pm UTC

today that might work
yesterday it might not
tomorrow it might stop
the day after it could work again.

there is nothing saying how many times a function written in plsql will be called from SQL.

Nothing.

answer is same as answer immediately above

?

Tyler, July 08, 2005 - 6:39 pm UTC

Hi Tom,

Could you please elaborate on this behavior? I'm not questioning your statement, but it does confuse me. Personally, I'd expect SQL to execute a function call once per utilization, always and forever.

Thanks!



Tom Kyte
July 08, 2005 - 6:53 pm UTC

select 1+2, 1+2
  from t;

how many times do we need to evaluate 1+2?


now, add in different plans, joins, a nested loop versus hash join, a subquery materialized to temp, etc etc etc etc.


Please do not ascribe to the non-procedural language SQL anything procedural at all.

Consider:

select * from t where x = f();

How many times would you expect f() to be called?



ops$tkyte@ORA10G> create table t ( x int );
 
Table created.
 
ops$tkyte@ORA10G> insert into t select rownum from all_users ;
 
38 rows created.
 
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> create or replace function f return number
  2  as
  3  begin
  4          dbms_application_info.set_client_info(userenv('client_info')+1);
  5          return 42;
  6  end;
  7  /
 
Function created.
 
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> exec dbms_application_info.set_client_info(0);
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA10G> select * from t where x = f();
 
no rows selected
 
ops$tkyte@ORA10G> select userenv('client_info') from dual;
 
USERENV('CLIENT_INFO')
----------------------------------------------------------------
38
 
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> create index t_idx on t(x);
 
Index created.
 
ops$tkyte@ORA10G> exec dbms_application_info.set_client_info(0);
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA10G> select * from t where x = f();
 
no rows selected
 
ops$tkyte@ORA10G> select userenv('client_info') from dual;
 
USERENV('CLIENT_INFO')
----------------------------------------------------------------
2
 


why 38? (there are 38 rows in there)
but why 2?  

I can show you tricks to MINIMIZE the number of times a function is called, but you would be hard pressed to make it so that a function is called exactly as many times as you want -- USING SQL.

I would not rely on this behavior (at the very least I would dummy up parameters to this function and make sure it gets called with unique inputs -- eg: pass it rowid and something else -- so that the optimizer could never say "ah hah, don't need to keep calling this puppy over and over")

Why 2?

VA, July 08, 2005 - 7:14 pm UTC

That's very interesting: why does f() get called 2 times when you add the index on t(x)? What does the index on t(x) have to do with evaluating the predicate where t.x=f()?

Also, you confused me with your double-negative in your last paragraph. "so that the optimizer could never say "ah hah, don't need to keep calling this puppy over and over"

If I add a dummy parameter to the function like f(rowid,...), the optimizer will be forced to call the function for every row, right? Isn't this exactly what we were trying to avoid?

Thanks




Tom Kyte
July 09, 2005 - 8:47 am UTC

not really a double negative. The optimizer could say:

"ah hah, we do not have to call F() over and over, We'll call F() once and be done with it"

If you make the inputs into F() (add inputs) 'unique', the optimizer could not ever make this optimization happen.


IF you call a function AND every time you call it the inputs are unique, then the database would be forced to invoke it at least once per input. Eg: you have your function f() that selects sequence.nextval and you want to call it AT LEAST twice per row:


select f( rowid, 1 ), f(rowid, 2 ), ....
from table;


rowid is unique in the table, so (rowid, 1) is unique and (rowid, 2) is unique -- there would be no way to NOT call f() at least twice per row. Stress the words "at least"; it could call it lots more if it feels like it.


Why did it call it twice? Just because it does; it could call it once, and maybe in the future it will. Maybe in the future it'll call it 3 times though.

PL/SQL associative arrays

Quadro, September 21, 2005 - 12:07 am UTC

if you need to do simple lookups with small tables, PL/SQL associative arrays far outperform even hash clusters in terms of performance (3x-10x better) and scalability (latching goes down by orders of magnitude).

So, if you need to do lookups in procedural code -- by all means use them. We use them broadly in our PL/SQL tariffication routines (too much logic to do it just in sql joins).

Tom Kyte
September 21, 2005 - 6:54 pm UTC

prove it.


if they are small, they hash in memory.
if they are small, they cannot be causing much lio.

I'll stick with Just Join, until shown otherwise.

Prove it

Quadro, September 22, 2005 - 3:04 am UTC

"prove it."

Sure. In our simple test case we will use a PL/SQL table and a hash cluster to look up object_name by object_id.

SQL> select count(*) from user_objects;

  COUNT(*)
----------
       418

So, a hash cluster with 1024 keys will be more than enough.

SQL> create cluster hc_test
  2  (
  3   object_id number
  4  ) size 512 hashkeys 1024;

Cluster created.

SQL> create table hash_lookup
  2  (
  3   object_id number,
  4   object_name varchar2(30)
  5  ) cluster hc_test (object_id);

Table created.

SQL> insert /*+ append */ into hash_lookup select object_id, object_name from user_objects;

420 rows created.

SQL> commit;

Commit complete.

SQL> declare
  2   --type we will use for lookup table
  3   type varchar2_table is table of varchar2(30) index by binary_integer;
  4   l_lookup_table varchar2_table;
  5  
  6   --table for storing object identifiers
  7   type number_table is table of number index by binary_integer;
  8   l_lookup_values number_table;
  9  
 10   l_object_name varchar2(30);
 11  begin
 12   --Fill our lookup table
 13   for cur in (select * from user_objects where object_id is not null)
 14   loop
 15    l_lookup_table(cur.object_id):=cur.object_name;
 16   end loop;
 17  
 18   --fill table with lookup values
 19   select object_id bulk collect into l_lookup_values
 20    from user_objects
 21    where object_id is not null;
 22  
 23   --first - lookup in hash cluster
 24   runstats.rs_run1;
 25   for i in 1 .. 100
 26   loop
 27    for j in 1 .. l_lookup_values.count
 28    loop
 29     select object_name into l_object_name
 30      from hash_lookup
 31      where object_id=l_lookup_values(j);
 32    end loop;
 33   end loop;
 34   --second - lookup in associative array
 35   runstats.rs_run2;
 36   for i in 1 .. 100
 37   loop
 38    for j in 1 .. l_lookup_values.count
 39    loop
 40     l_object_name:=l_lookup_table(l_lookup_values(j));
 41    end loop;
 42   end loop;
 43   runstats.rs_stop(1000);
 44  end;
 45  /
Run1 run in 213 hsecs
Run2 run in 3 hsecs
Run1 run in 7100% of run 2
*
STATISTIC                                 RUN 1       RUN 2        DIFF
----------------------------------- ----------- ----------- -----------
STAT...no work - consistent read ge      41,900           0     -41,900
STAT...cluster key scans                 41,900           0     -41,900
STAT...buffer is not pinned count        41,900           0     -41,900
STAT...cluster key scan block gets       41,900           0     -41,900
LATCH..library cache pin                 83,810           2     -83,808
STAT...consistent gets from cache        41,907           8     -41,899
STAT...execute count                     41,901           1     -41,900
LATCH..simulator lru latch                2,300           0      -2,300
STAT...recursive calls                   41,902           1     -41,901
STAT...consistent gets                   41,907           8     -41,899
STAT...session logical reads             41,922          24     -41,898
LATCH..cache buffers chains              83,883          84     -83,799
LATCH..simulator hash latch               2,300           0      -2,300
LATCH..library cache                     83,812           2     -83,810
STAT...Cached Commit SCN referenced      40,900           0     -40,900
STAT...calls to get snapshot scn: k      41,902           1     -41,901
*
LATCH totals                              RUN 1       RUN 2         PCT
----------------------------------- ----------- ----------- -----------
Latch totals versus runs                256,381         110 233,073.64%

PL/SQL procedure successfully completed.

So - the difference is tremendous? 

Tom Kyte
September 22, 2005 - 1:44 pm UTC

you missed my point.

"JUST JOIN"

you would not be doing this sql in a loop, you would have JOINED to this data once.


You have some query you are processing, doing a slow by slow process apparently and doing a lookup on each row.

My point is JUST JOIN, don't do a lookup for each row, just JOIN and let SQL do the look up.


I agree, if you are going to run a little single row lookup over and over (slow by slow), this helps -- however, JUST JOIN in the first place.
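
That is, instead of the per-row lookup loop in the test above, do the same work as a single statement (the driving table here is hypothetical):

select d.object_id, hl.object_name
  from driving_table d, hash_lookup hl   -- one join replaces 41,900 single-row selects
 where hl.object_id = d.object_id;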

Need help in filtering values from pl/sql Table

Mave, September 29, 2005 - 5:09 pm UTC

I have an array [Associative array] with information like

Type rMyArray is record (empno integer,
category integer);

Type tMyArray is table of rMyarray index by pls_integer;

Myarray tMyArray;

-- Values in this array

MyArray(1):={1234,1}
MyArray(2):={1234,2}
MyArray(3):={1234,3}
MyArray(4):={1234,4}
MyArray(5):={1234,5}
MyArray(6):={1234,6}
MyArray(7):={2345,1}
MyArray(8):={2345,2}
MyArray(9):={2345,3}
MyArray(10):={2345,4}
MyArray(11):={2345,5}

My requirement is to filter this array by empno [in this case 2 employees]
and dump the rows into another array [optional] called emparray and do some processing:
loop over the first empno [=1234], fetch all the values for it and process them, then
loop over the second empno [=2345] and fetch all the values for it.

How can I achieve this?

Thanks for your time and help,


Tom Kyte
September 30, 2005 - 8:43 am UTC

you just answered your own question with the pseudo code?

Filter PL/SQL array by value

Quadro, September 29, 2005 - 8:08 pm UTC

This will have linear complexity - pretty much unscalable and slow.

Instead of iterating, use the lookup value as the key. In your case you can use two nested PL/SQL tables: the first, keyed by empno, points to a second holding the categories.

declare
type number_table is table of number /* categories */ index by binary_integer;
type number_table2 is table of number_table index by binary_integer /*empno*/;

Myarray number_table2;
begin
Myarray(1234)(1):=1;
Myarray(1234)(2):=2;
Myarray(1234)(3):=3;
Myarray(1234)(4):=4;
Myarray(1234)(5):=5;
Myarray(1234)(6):=6;

Myarray(2345)(1):=1;
Myarray(2345)(2):=2;
Myarray(2345)(3):=3;
Myarray(2345)(4):=4;
Myarray(2345)(5):=5;

--So, when you need to filter by empno, you just use it as key
if (Myarray.exists(1234)) -- see, if such empno exists
then
for i in 1 .. Myarray(1234).count
loop
null; -- process Myarray(1234)(i) here
end loop;
end if;
end;

Filter PL/SQL array

Quadro, September 29, 2005 - 8:17 pm UTC

Or wait a while - in a moment Tom himself will be here, pointing out that the best way to get filtered data is to use SQL and a *JUST WHERE* clause :-)

Thanks for the replies..

A reader, September 30, 2005 - 9:38 am UTC

Tom, I got the pseudo code but am not able to put it into code. I could do it like Quadro mentioned here [thanks Quadro], but he wrote myarray.exists(1234).. how do I get that value 1234?

if i have two arrays
myarray [with all the values] (empno,category)
myarray1 (empno)

can I do something like this?
for i in 1..myarray1.count
loop
if myarray.exists(myarray1(i).empno)
then
do process --
end if;
end loop

Thanks a bunch.

Tom Kyte
September 30, 2005 - 11:07 am UTC

he is saying to use an entirely different structure. do you want to do that?


you should be able to split this into two arrays easily yourself, if you can fill ONE array, you can read that array and put it into two arrays?
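For example, a minimal sketch of that split - the flat (empno, category) array re-keyed into the nested structure Quadro described above (sample data hard-coded in place of the real fill):

declare
    -- the flat array of (empno, category) pairs
    type rMyArray is record ( empno integer, category integer );
    type tMyArray is table of rMyArray index by pls_integer;
    myarray  tMyArray;

    -- the keyed structure: a list of categories per empno
    type tCatList is table of integer index by pls_integer;
    type tByEmpno is table of tCatList index by pls_integer;
    byEmpno  tByEmpno;
    l_empty  tCatList;
begin
    -- pretend the flat array was already filled
    myarray(1).empno := 1234;  myarray(1).category := 1;
    myarray(2).empno := 1234;  myarray(2).category := 2;
    myarray(3).empno := 2345;  myarray(3).category := 1;

    -- one pass to re-key the data by empno
    for i in 1 .. myarray.count
    loop
        if not byEmpno.exists( myarray(i).empno )
        then
            byEmpno( myarray(i).empno ) := l_empty;
        end if;
        byEmpno( myarray(i).empno )( byEmpno( myarray(i).empno ).count + 1 )
            := myarray(i).category;
    end loop;

    -- "filtering" by empno is now just a keyed access
    if byEmpno.exists( 1234 )
    then
        for j in 1 .. byEmpno(1234).count
        loop
            dbms_output.put_line( 'empno 1234, category ' || byEmpno(1234)(j) );
        end loop;
    end if;
end;
/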

I can split into two arrays but..

A reader, September 30, 2005 - 11:16 am UTC

How do I check the value in an array.. I am not quite following it. Could you please elaborate with a simple example?

That would be really helpful..

Tom Kyte
September 30, 2005 - 11:58 am UTC

if ( myarray(i).empno = something )

??

you created the array? have you checked out the plsql guides, they have many examples.



Falling into an infinite loop??

A reader, September 30, 2005 - 1:13 pm UTC

Ok, this is what I did [just testing].
I am trying to loop over just one empno and it's falling into an infinite loop??
It shows me when it's not equal to v_empno [1234], but it's not exiting.

Can you point out the reason?

Thanks for all your help..

declare

Type Emparray is record (empno integer,
ename varchar2(20));

Type temparray is table of Emparray index by pls_integer;

emp_array temparray;

v_empno integer:=0;
begin

-- select empno,ename bulk collect into emp_array from emp;
emp_array(1).empno:=1234;
emp_array(1).ename:='name1';

emp_array(2).empno:=1234;
emp_array(2).ename:='name2';

emp_array(3).empno:=1234;
emp_array(3).ename:='name3';

emp_array(4).empno:=1234;
emp_array(4).ename:='name4';

emp_array(5).empno:=2345;
emp_array(5).ename:='name5';

for i in 1..emp_array.count
loop

--v_empno:=emp_array(i).empno;
v_empno:=1234;

dbms_output.put_line(emp_array(i).empno);

loop
if emp_array(i).empno=v_empno
then
dbms_output.put_line(emp_array(i).empno);
else
--dbms_output.put_line('Not equal '||emp_array(i).empno);
exit ;
end if;
end loop;

end loop;



end;

Tom Kyte
September 30, 2005 - 2:22 pm UTC

I don't get the inner loop. it is an infinite loop for sure unless they are not equal.


why do you have an inner loop at all.

Just testing

A reader, September 30, 2005 - 2:32 pm UTC

The inner loop is just for testing. I am trying to insert each value back into a log table for all those empnos that are equal to the value v_empno.

So why is it an infinite loop? I am exiting if they are not equal.

Thanks,

Tom Kyte
September 30, 2005 - 2:38 pm UTC

but when they are equal.... infinite loop. you are not incrementing anything, it is the same condition over and over and over again.


loop
if emp_array(i).empno=v_empno
then
dbms_output.put_line(emp_array(i).empno);
else
--dbms_output.put_line('Not equal '||emp_array(i).empno);
exit ;
end if;
end loop;


no matter how many billions of times you execute that - (i) isn't changing.
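Put another way - a sketch of the same test with the inner loop simply removed (reusing the emp_array and v_empno declared above); one pass over the array is all that is needed:

for i in 1 .. emp_array.count
loop
    if emp_array(i).empno = v_empno
    then
        dbms_output.put_line( emp_array(i).empno );  -- or log the row here
    end if;
end loop;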

I think i is changing from outer loop

A reader, September 30, 2005 - 3:01 pm UTC

Tom, i is coming from the outer loop [for i in 1..], isn't it?


Tom Kyte
October 01, 2005 - 8:15 pm UTC

yes, but the inner loop is inside the outer loop - once the inner loop is entered - forget about the outer loop, there is no escape!

Sorry. I got it

A reader, September 30, 2005 - 3:02 pm UTC

Sorry, I got what you are trying to say.

Thanks.

Rocketship vs PL/SQL lookup

Ian, October 04, 2005 - 1:47 pm UTC

Tom,
nice discussion and some very useful info.
Here's the skinny - I have inherited some code often referred to as "the rocketship" because of the shape of its DECODE lookup. Unfortunately the rocketship could use a few boosters. Basically it creates an 8 million row flat file from a SQL query that looks up rates to apply to a sales amount. I initially rewrote this to do the lookups in a PL/SQL cursor and then tried using PL/SQL tables, but both were slower, which agrees with this thread.
To simplify things I stripped out 90% of the code from the query below... it's pretty ugly and has 4 DECODE rockets.
The query attempts 8 different scenarios to acquire an accrual rate from the same rate table... for example it first tries to get the rate from S1 if it finds a rate based on CUST_ID and UPC_CODE... then it tries S2 based on CUST_ID and BRAND, and so on until S8.

My question is: does using DECODEs joined to 8 scenario tables seem like a reasonable approach to doing lookups, or perhaps there is another way to skin this cat? Thanks for your time.


select account_num,
trim(to_char(sum(
decode(S1.ACCRUAL_RATE,NULL,
decode(S2.ACCRUAL_RATE,NULL,
decode(S3.ACCRUAL_RATE,NULL,
decode(S4.ACCRUAL_RATE,NULL,
decode(S5.ACCRUAL_RATE,NULL,
decode(S6.ACCRUAL_RATE,NULL,
decode(S7.ACCRUAL_RATE,NULL,
decode(S8.ACCRUAL_RATE,NULL,0,S8.ACCRUAL_RATE * sales_amt ),
S7.ACCRUAL_RATE * sales_amt ),
S6.ACCRUAL_RATE * sales_amt ),
S5.ACCRUAL_RATE * sales_amt ),
S4.ACCRUAL_RATE * sales_amt ),
S3.ACCRUAL_RATE * sales_amt ),
S2.ACCRUAL_RATE * sales_amt ),
S1.ACCRUAL_RATE * sales_amt )
),'9999999990.00'))
from weely_sales_summary x,
-- Scenario 1
( SELECT CUST_ID, UPC_CODE, ACCRUAL_RATE
FROM RATE_TABLE WHERE CUST_ID != '0000000000' and UPC_CODE != 'XXXXXXXXXXXX') S1,
-- Scenario 2
(select CUST_ID, brand_code, family_code, ACCRUAL_RATE
from RATE_TABLE
where CUST_ID != '0000000000' and brand_code != 'XXX' and family_code != 'XXX') S2,
-- Scenario 3
(select CUST_ID,brand_code, ACCRUAL_RATE
from RATE_TABLE where CUST_ID != '0000000000' and brand_code != 'XXX' and family_code = 'XXX') S3,
-- Scenario 4
(select CUST_ID,CAT_CD, ACCRUAL_RATE
from RATE_TABLE where CUST_ID != '0000000000' and brand_code = 'XXX' and family_code = 'XXX') S4,
-- Scenario 5
( SELECT UPC_CODE, ACCRUAL_RATE
FROM RATE_TABLE WHERE CUST_ID = '0000000000' and UPC_CODE != 'XXXXXXXXXXXX') s5,
-- Scenario 6
(select brand_code, family_code, ACCRUAL_RATE
from RATE_TABLE where CUST_ID = '0000000000' and brand_code != 'XXX'
and family_code != 'XXX' and UPC_CODE = 'XXXXXXXXXXXX' and CS_UPC_VRSN_NR_CD = 'XX') S6,
-- Scenario 7
(select CAT_CD,brand_code, ACCRUAL_RATE
from RATE_TABLE where CUST_ID = '0000000000' and brand_code != 'XXX' and family_code = 'XXX') S7,
-- Scenario 8
(select CAT_CD, ACCRUAL_RATE
from RATE_TABLE where CUST_ID = '0000000000' and brand_code = 'XXX' AND family_code = 'XXX') S8
where x.CUST_ID = S1.CUST_ID (+)
and x.UPC_CODE = S1.UPC_CODE(+)
--
AND x.CUST_ID = S2.CUST_ID (+)
and x.BRND_CD = S2.brand_code (+)
and x.family_code = S2.family_code (+)
--
and x.CUST_ID = S3.CUST_ID (+)
and x.BRND_CD = S3.brand_code (+)
--
and x.CUST_ID = S4.CUST_ID (+)
and x.cat_cd = S4.CAT_CD (+)
--
and x.UPC_CODE = S5.UPC_CODE (+)
--
and x.BRND_CD = S6.brand_code (+)
and x.family_code = S6.family_code (+)
--
and x.BRND_CD = S7.brand_code (+)
--
and x.CAT_CD = S8.CAT_CD (+)
group by account_num
;

Tom Kyte
October 04, 2005 - 6:15 pm UTC

ops$tkyte@ORA9IR2> create table t ( x int, y int, z int );
 
Table created.
 
ops$tkyte@ORA9IR2> insert into t values ( null, null, 1 );
 
1 row created.
 
ops$tkyte@ORA9IR2> insert into t values ( null, 2, 1 );
 
1 row created.
 
ops$tkyte@ORA9IR2> insert into t values ( 3, 2, 1 );
 
1 row created.
 
ops$tkyte@ORA9IR2> insert into t values ( null, null, null );
 
1 row created.
 
ops$tkyte@ORA9IR2> select coalesce(x,y,z,0) from t;
 
COALESCE(X,Y,Z,0)
-----------------
                1
                2
                3
                0
 
(nicer than decode...)
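Applied to the rocketship, that would collapse each nested DECODE rocket into one COALESCE - a sketch only, reusing the aliases from the query above (note: in the corner case where all eight rates and sales_amt are null at once, the DECODE version yields 0 where this yields null):

trim(to_char(sum(
    coalesce( S1.ACCRUAL_RATE, S2.ACCRUAL_RATE,
              S3.ACCRUAL_RATE, S4.ACCRUAL_RATE,
              S5.ACCRUAL_RATE, S6.ACCRUAL_RATE,
              S7.ACCRUAL_RATE, S8.ACCRUAL_RATE, 0 ) * sales_amt
),'9999999990.00'))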


 

coalesce

Ian, October 05, 2005 - 12:19 pm UTC

thanks for the input.

looks like a useful function, unfortunately I forgot to mention that we are running on 8.1.7 and COALESCE isn't available until 9.0.1.....my bad.

Procedural vs set-based

A reader, October 25, 2005 - 4:36 pm UTC

Tom has proven many times that rewriting procedural code into set-based code is faster.

I was wondering... how does Oracle implement/process the set-based approach? Any chance that Oracle internally breaks it up into procedural code? As far as I know, you need special hardware (e.g. a vector computer) to actually process arrays as if they were scalars. How does Oracle process a statement such as:

SELECT * FROM TABLE_A
MINUS
SELECT * FROM TABLE_B

This construct is much more efficient than trying to do it procedurally. Still, how can Oracle execute the query without looking at the tables row by row?

Hopefully, the answer doesn't violate any proprietary laws. :-)



Tom Kyte
October 26, 2005 - 11:31 am UTC

The point is - everything is procedural code at the lowest level.

We can just optimize the overall SQL / data processing request - if you give us a single sql statement.

We cannot optimize your procedure code that executes thousands or billions of sql statements.


Ask us to do all of the work - instead of you trying to do the work in your own code.
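For illustration, a sketch of the contrast for the MINUS above, assuming TABLE_A and TABLE_B each expose a single comparable column X:

-- doing the work yourself: one probe of TABLE_B per row of TABLE_A
declare
    l_cnt number;
begin
    for a in ( select x from table_a )
    loop
        select count(*)
          into l_cnt
          from table_b b
         where b.x = a.x;
        if l_cnt = 0
        then
            null; -- row is in A but not in B: emit it
        end if;
    end loop;
end;
/

-- asking the database to do the work: one statement, and the optimizer
-- is free to choose a sort- or hash-based set operation over both tables
select x from table_a
minus
select x from table_b;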

Using dbms_alert to signal data changes

malcolm, July 18, 2006 - 6:10 am UTC

Hi Tom,

I'm using pl/sql associative arrays to perform lookups on nearly static data - perhaps an update every hour or so. On 9iR2.

I completely agree with you that it's better to join any lookup tables, and let the optimiser do its stuff, but it's not really relevant to what I am doing here. The data I'm caching in arrays is just config data - just to control the flow of processing.

The (age-old) problem is that I need to detect when the cached data has changed, and reload. This is especially relevant as some sessions could be members of connection pools, and almost never disconnect.

In a followup a third of the way down the page, you suggest using a trigger and populating an index organised table. What do you think of the trigger using dbms_alert.signal(alert), and each time the cache is checked, using dbms_alert.waitone(alert, message, status, 0) instead?

I've traced dbms_alert.waitone and it uses a table called sys.dbms_alert_info behind the scenes. I thought this might be more efficient, because if the table is owned by sys, it goes into the dictionary cache, and the dictionary cache must be more efficient than the buffer cache, because it caches formatted rows, not blocks.

Or is my assumption that all tables owned by sys go into the dictionary cache incorrect? In which case, exactly which tables are cached in the dictionary cache?



Tom Kyte
July 18, 2006 - 8:33 am UTC

it does not go into the dictionary cache. It is just a table. in the dictionary cache goes our metadata - not every block from sys owned tables.
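For reference, a minimal sketch of the dbms_alert pattern the question describes (object names here are made up; the signal is only delivered when the signalling transaction commits):

create or replace trigger config_aiud
after insert or update or delete on config_table
begin
    dbms_alert.signal( 'CONFIG_CHANGED', 'reload' );
end;
/

-- in each caching session:
declare
    l_msg    varchar2(1800);
    l_status integer;
begin
    dbms_alert.register( 'CONFIG_CHANGED' );   -- once per session

    -- ... then, before using the cached arrays, poll with a zero timeout:
    dbms_alert.waitone( 'CONFIG_CHANGED', l_msg, l_status, 0 );
    if l_status = 0      -- 0 = alert fired since last check, 1 = timed out
    then
        null;            -- reload the arrays here
    end if;
end;
/

One caveat to weigh: transactions signalling the same alert serialize on it, so this suits config data updated every hour or so, not a hot OLTP table.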



Kumar, December 05, 2006 - 2:00 am UTC

There is a limitation in using PL/SQL arrays. In my case the maximum subscript that an associative array can have is 2147483647 - the upper bound of the BINARY_INTEGER index type (database version 9.2.0.7).

I would like to know how I can overcome this limitation.
I am trying to use a PL/SQL table lookup for loading fact records. I have 10 dimensions. These dimension tables act as lookup tables. Each dimension has a maximum of 10,000 records (even though I have only 10,000 records, a few of my subscript values are more than 2 billion). So when loading the fact records I would like to avoid reading the dimension tables for each record, and instead use a PL/SQL table as the lookup. Please advise.
== My script
declare

type NumberType is table of number index by pls_integer;

l_doper_nkid NumberType;
l_doper_skid NumberType;
l_doper_lkup NumberType;

v number :=8888888888;

Begin
select sk_destination_operator_id,destination_operator_id_nk bulk collect
into l_doper_skid,l_doper_nkid
from dt_destination_operator_v;
for i in 1 .. l_doper_nkid.count
loop
dbms_output.put_line('test '||l_doper_skid(i));
l_doper_lkup(l_doper_nkid(i)) := l_doper_skid(i);
end loop;
dbms_output.put_line('sk_operator_id value for '||v||' :'||l_doper_lkup(v));
Exception
when no_data_found then
dbms_output.put_line('Operator Surrogate Key not found for '||v);
End;
=================




Tom Kyte
December 05, 2006 - 9:55 pm UTC

you know, if you have more than 2gb of subscripts - you really ought to....

well.....

rethink?

that would not be a "smart" thing.

(databases JOIN really good, you might consider using them to do that?)
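For what it's worth - if a cache is kept at all - the binary_integer subscript ceiling can be sidestepped by keying the array on the character form of the natural key; index-by-varchar2 associative arrays are available in 9iR2. A sketch against the view named in the question:

declare
    -- index by the *string* form of the key: no 2**31 subscript limit
    type NumberByKey is table of number index by varchar2(40);

    l_doper_lkup NumberByKey;
    v            number := 8888888888;
begin
    for r in ( select sk_destination_operator_id sk,
                      destination_operator_id_nk nk
                 from dt_destination_operator_v )
    loop
        l_doper_lkup( to_char(r.nk) ) := r.sk;
    end loop;

    if l_doper_lkup.exists( to_char(v) )
    then
        dbms_output.put_line( 'sk_operator_id value for '||v||' :'||l_doper_lkup(to_char(v)) );
    else
        dbms_output.put_line( 'Operator Surrogate Key not found for '||v );
    end if;
end;
/

(That said, the advice above stands - a join needs no cache at all.)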

Hard-coded CASE vs SQL lookup

Sunj, March 09, 2007 - 11:06 am UTC

Compared to a table lookup and a PL/SQL lookup, I find hard-coded CASE statements are the fastest. I do not like the idea of hard-coding, but I have to pick performance over maintainability.

My example is a country lookup for a web-service application. Although the per-search timing is not significant, in case a user runs my web service in a batch I need my service to be as lean as possible.
My country table has been relatively static.

I generated this block of CASE statements with 250 conditions.

CREATE OR REPLACE FUNCTION map_iso_country_cd (iso_code VARCHAR2)
   RETURN NUMBER
IS
   vcountryid   PLS_INTEGER;
BEGIN
   CASE iso_code
         WHEN 'VCT' THEN vcountryID  := 63;
         WHEN 'SAU' THEN vcountryID  := 64;
         WHEN 'SCO' THEN vcountryID  := 65;
         WHEN 'USA' THEN vcountryID  := 1;
         WHEN 'CAN' THEN vcountryID  := 2;
         WHEN 'AFG' THEN vcountryID  := 7;
         -- about 200 more country codes
         WHEN 'ZWE' THEN vcountryid := 250;
   END CASE;

   RETURN vcountryid;
END;

This is my lookup version of function:
CREATE OR REPLACE FUNCTION map_iso_country_cd_TAB (iISO_code VARCHAR2)
   RETURN NUMBER
IS
   vcountryid   PLS_INTEGER;
BEGIN

      SELECT country_id 
        INTO vcountryid
        FROM country c
       WHERE iso_code = iISO_code;

   RETURN vcountryid;
END;


A million executions came back in .06 seconds for the first one, whereas the second came back in 50 seconds.


Could you suggest a more civilized approach please. :)
thanks

Tom Kyte
March 12, 2007 - 1:46 pm UTC

My idea is.....
do away with the lookup all together....

JUST JOIN

Databases were BORN TO JOIN, just join

see my comments above, for example:

... I betcha using a single INSERT -- removing the procedural code all together -- would get rid of the
rest of the time. .....


REMOVE the CODE


More food for thought.... YAWTDI

yet another way to do it.......


ops$tkyte%ORA10GR2> /*
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> drop table lookup;
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create table lookup
ops$tkyte%ORA10GR2> ( c_code varchar2(3) primary key,
ops$tkyte%ORA10GR2>   c_id   number
ops$tkyte%ORA10GR2> )
ops$tkyte%ORA10GR2> organization index
ops$tkyte%ORA10GR2> /
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into lookup
ops$tkyte%ORA10GR2> select *
ops$tkyte%ORA10GR2>   from (
ops$tkyte%ORA10GR2> select substr( object_name, 1, 3 ), max(object_id)
ops$tkyte%ORA10GR2>   from all_objects
ops$tkyte%ORA10GR2>  group by substr(object_name,1,3)
ops$tkyte%ORA10GR2>  order by dbms_random.random
ops$tkyte%ORA10GR2>        )
ops$tkyte%ORA10GR2>  where rownum <= 225
ops$tkyte%ORA10GR2> /
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> declare
ops$tkyte%ORA10GR2>     l_case long := 'case p_iso_code ' || chr(10);
ops$tkyte%ORA10GR2> begin
ops$tkyte%ORA10GR2>     for x in ( select * from lookup )
ops$tkyte%ORA10GR2>     loop
ops$tkyte%ORA10GR2>         l_case := l_case || ' when ''' || x.c_code || ''' then l_c_id := ' || x.c_id || ';' || chr(10);
ops$tkyte%ORA10GR2>     end loop;
ops$tkyte%ORA10GR2>     l_case := l_case || 'end case;' || chr(10);
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>     execute immediate q'|
ops$tkyte%ORA10GR2> create or replace function map_iso_country_cd( p_iso_code in varchar2 )
ops$tkyte%ORA10GR2> return number
ops$tkyte%ORA10GR2> as
ops$tkyte%ORA10GR2>     l_c_id number;
ops$tkyte%ORA10GR2> begin |' || l_case || '
ops$tkyte%ORA10GR2> return l_c_id;
ops$tkyte%ORA10GR2> end;';
ops$tkyte%ORA10GR2> end;
ops$tkyte%ORA10GR2> /
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create or replace function map_iso_country_cd_tab( p_c_code in varchar2 )
ops$tkyte%ORA10GR2> return number
ops$tkyte%ORA10GR2> as
ops$tkyte%ORA10GR2>     l_c_id number;
ops$tkyte%ORA10GR2> begin
ops$tkyte%ORA10GR2>     select c_id into l_c_id
ops$tkyte%ORA10GR2>       from lookup
ops$tkyte%ORA10GR2>      where c_code = p_c_code;
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>     return l_c_id;
ops$tkyte%ORA10GR2> end;
ops$tkyte%ORA10GR2> /
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create or replace package my_pkg
ops$tkyte%ORA10GR2> as
ops$tkyte%ORA10GR2>     type array is table of number index by varchar2(3);
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>     g_lookup array;
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>     function map_iso_country_cd( p_c_code in varchar2 ) return number;
ops$tkyte%ORA10GR2> end;
ops$tkyte%ORA10GR2> /
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create or replace package body my_pkg
ops$tkyte%ORA10GR2> as
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> function map_iso_country_cd( p_c_code in varchar2 )
ops$tkyte%ORA10GR2> return number
ops$tkyte%ORA10GR2> is
ops$tkyte%ORA10GR2> begin
ops$tkyte%ORA10GR2>     return g_lookup(p_c_code);
ops$tkyte%ORA10GR2> end;
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> begin
ops$tkyte%ORA10GR2>     for x in (select * from lookup)
ops$tkyte%ORA10GR2>     loop
ops$tkyte%ORA10GR2>         g_lookup(x.c_code) := x.c_id;
ops$tkyte%ORA10GR2>     end loop;
ops$tkyte%ORA10GR2> end;
ops$tkyte%ORA10GR2> /
ops$tkyte%ORA10GR2> */
ops$tkyte%ORA10GR2> connect /
Connected.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> set timing on
ops$tkyte%ORA10GR2> declare
  2      type array is table of varchar2(3) index by binary_integer;
  3      l_data array;
  4      l_num number;
  5  begin
  6      select c_code bulk collect into l_data from lookup;
  7
  8      for i in 1 .. 1000
  9      loop
 10          for i in 1 .. l_data.count
 11          loop
 12              l_num := map_iso_country_cd( l_data(i) );
 13          end loop;
 14      end loop;
 15  end;
 16  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:01.66
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> declare
  2      type array is table of varchar2(3) index by binary_integer;
  3      l_data array;
  4      l_num number;
  5  begin
  6      select c_code bulk collect into l_data from lookup;
  7
  8      for i in 1 .. 1000
  9      loop
 10          for i in 1 .. l_data.count
 11          loop
 12              l_num := map_iso_country_cd_tab( l_data(i) );
 13          end loop;
 14      end loop;
 15  end;
 16  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:08.00
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> declare
  2      type array is table of varchar2(3) index by binary_integer;
  3      l_data array;
  4      l_num number;
  5  begin
  6      select c_code bulk collect into l_data from lookup;
  7
  8      for i in 1 .. 1000
  9      loop
 10          for i in 1 .. l_data.count
 11          loop
 12              l_num := my_pkg.map_iso_country_cd( l_data(i) );
 13          end loop;
 14      end loop;
 15  end;
 16  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.25
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>



Hard-coded CASE vs SQL lookup

Sunj, March 21, 2007 - 5:42 pm UTC

Thanks Tom, using a PL/SQL array/table is a cleaner alternative.
But .25 seconds for a thousand look-ups still does not come close to the hard-coded CASE statement's timing of .06 seconds for a million lookups.

Also, each time a new user session calls the package, wouldn't the array be loaded all over again? I agree this will definitely be efficient if you stay within the same Oracle session, especially in PL/SQL loops.
Tom Kyte
March 21, 2007 - 8:06 pm UTC

you seem to have missed the crux here.

case was SLOWER than associative array.

My machine is not your machine
My times are not your times.

CASE:
12 l_num := map_iso_country_cd( l_data(i) );
Elapsed: 00:00:01.66

TABLE:
12 l_num := map_iso_country_cd_tab( l_data(i) );
Elapsed: 00:00:08.00

ASSOCIATIVE ARRAY:
12 l_num := my_pkg.map_iso_country_cd( l_data(i) );
Elapsed: 00:00:00.25


and how many sessions does it take to process this?!?! if the overhead of loading the table becomes an issue - you have a seriously bad architecture problem that we need to fix first!!!

Memory Lookups vs. Database Lookups

Towhidun Nabi, May 09, 2007 - 8:49 am UTC

Hello,

I was wondering if there is anything wrong with memory lookups.

I must say I am a novice programmer compared to some of the discussants above.

I am working with a selling company; we have right now 13 depots around the country, about 250+ items to sell, about 20,000+ customers per depot, and 1000-1500 invoices per depot per day. We have a defined price for each product, some special prices for a few customers, some bonus products, various discount policies and a few other tidbits.

I am developing an invoicing program with Oracle Forms 6i, a normal form (not web based, owing to my lack of knowledge of web based programs). On form init (new form instance), I load the Item table, Regular Price table, Bonus Products table, Special Party and Sale Stop Products tables... all into the client's memory, as I create several form-level populated Record Groups to hold the data.

When the user enters an item code, it is verified (looked up) against the memory-based Record Groups. I developed a binary-search module to search the record group, as the group is sorted on item codes. Since the number of products is very small (< 300), it is working quite ok.

My point is, in the whole program I never look up the database for the name of a product, as everything is in memory already. As memory is getting cheaper (we have 256MB of RAM in each client PC), pre-loading some 10-20k rows into memory does not seem to cause much trouble. The only trouble is when one or two new products are entered and the user gets an Invalid Item Code - but they have a magic button at the corner of the form saying "Refresh Code", and they are ok again.

I do all the lookups for product name, bonus info and special prices in memory with binary search, as most of this is static for a period of time (like a month); it is refreshed when the user runs the program, and the user only rarely needs to refresh.

The only thing I look up in the database in the course of the entry system is the STOCK of a product, of course.

As I see it, I do the least required database access and the fewest network round trips, and get faster response at the user end... blah blah blah. I do call a stored procedure to get the STOCK, by the way.

I was wondering if there is anything seriously wrong with the approach.

Towhidun Nabi.
IT Executive
Aristopharma Ltd.
Dhaka
Bangladesh.
Tom Kyte
May 11, 2007 - 9:49 am UTC

you still NEED to have the database constraints

so you have data integrity.

there is nothing inherently wrong with client side validation - it is just a fact that when you post to the database you MUST validate the data, using the database

else you will have data integrity problems.

Re: Memory Lookups vs. Database Lookups

Towhidun Nabi, May 12, 2007 - 5:47 am UTC

Dear Sir,

Thank you for your response. Don't get me wrong, I am very much a fan of this site and of you. This is by far one of the best and most genuine sites; it's just not easy for newcomers.

I do have database constraints in place as foreign keys, so it is not possible for the user to enter an invalid item code into the database: when we post the data, it is validated against the foreign keys.

I only try to minimize the chance of invalid data before even posting it, as the pre-loaded client-side Record Groups (sort of arrays) are usually valid; the chance of the data being invalidated after load is very, very minimal. There is the chance of a missing code, but that is minimized because the clients usually get a message beforehand that new products are launching. All they have to do is refresh the codes, or re-enter the program, and this happens only once or twice in a whole month.

So, I confidently use memory lookups for all the client-level validation, as I know it will be validated against the database upon data POST / insert / update.

For queries and reports, most of the time I try my level best NOT TO JOIN for complex queries, but to supply the data from memory.

So Instead of

Select i.prod_code, p.prod_name, c.customer_name,
  .......
  .. 
  ..
  ..
from 
  Issue_Table i,
  Product_table p,
  Customer_table c,
  Market_table m,
  Territorie_table t,
  sr_names sr, -- delivery personal
  ..
  .
where i.inv_date >= to_date('01-MAY-2007', 'dd-MON-rrrr')
and i.inv_date < to_date('01-JUN-2007', 'dd-MON-rrrr')
and i.prod_code = p.prod_code
and i.cust_code = c.cust_code
and c.market_code = m.market_code
and m.territorie_code = t.territorie_code
and i.sr_code = sr.sr_code
...
...


I simply do


show_status('Initializing...');

rg_id := find_group('Query_Data');
if not id_null(rg_id) then
  delete_group(rg_id);
end if;

rg_id := create_group('Query_Data');
gc_id := add_group_column(rg_id, 'Prod_Code', CHAR_COLUMN, 9);
gc_id := add_group_column(rg_id, 'Prod_Name', CHAR_COLUMN, 50);
gc_id := add_group_column(rg_id, 'Customer_Name', CHAR_COLUMN, 100);

gc_id := add_group_column(rg_id, 'Market_Name', CHAR_COLUMN, 100);

gc_id := add_group_column(rg_id, 'Territory_MR_Name', CHAR_COLUMN, 100);

gc_id := add_group_column(rg_id, 'SR_Name', CHAR_COLUMN, 100);
..
..
..
all the fields necessary.....
..
..

show_status('Loading Issue Data...');

rg_issue := find_group('issue');
if not id_null(rg_issue) then
  delete_group(rg_issue);
end if;

M_Issue_Table := 'Issue_Table';
-- change issue table name for history data
if history_data then
  M_Issue_Table := 'history_Issue_Table_' || month_year;
end if;

rg_issue := create_group_from_query('issue',
  'select * from ' || M_Issue_Table || ' '
  || 'where inv_date >= to_date(''' || M_StartDateStr || ''', ''dd-MON-rrrr'') '
  || 'and inv_date < to_date(''' || M_EndDateStr2 || ''', ''dd-MON-rrrr'') '
  || 'order by invoiceno', FORM_SCOPE, <b>100</b>);

if id_null(rg_issue) then
  message('Error: ' || dbms_error_text);
 ..
  raise form_trigger_failure;
end if;
 
M_ErrCode := populate_group(rg_issue);

M_NoOfIssue := get_group_row_count(rg_issue);

M_No_OF_Query_Data := 0;

for i in 1..M_NoOfIssue loop
  if mod(i, 100) = 0 then
    show_status('Processing Issue Data...' || i || ' of ' || M_NoOfIssue);
  end if;

  M_Prod_Code := get_group_char_cell('issue.prod_code', i);
  M_Prod_Name := ' ';

  M_Pos := bin_search_char('Products', M_NoOfProducts, M_Prod_Code);
  if M_Pos <> 0 then
    M_Prod_Name := get_group_char_cell('Products.prod_name', M_Pos);
  else
<b>    show_error ('Product Code: ' || M_Prod_Code || ' NOT found, may need to refresh');</b>
   return;
  end if; 

  .. Territories
...


... SR Name

  .. Market Name..


  M_No_OF_Query_Data := M_No_OF_Query_Data + 1;
  add_group_row('Query_Data', End_OF_Group);

  set_group_char_cell('query_data.prod_code', M_No_OF_Query_Data, M_Prod_Code);

  set_group_char_cell('query_data.prod_name', M_No_OF_Query_Data, M_Prod_Name);

  .. market name

  .. territorie name

  .. sr name

  .. delivery schedule

  ....  
end loop;

-- finaly we've got query_data to work with
-- just send it to the report as data_parameter


A temporary non-query client-side record group is created, and the data is populated through a loop and binary searches. I never see that error "Product Code: ... NOT found, may need to refresh", as everything is refreshed on form init (new form instance). I asked the users to sue me (or click refresh) if they ever see it.

I could have done it with a global temporary table, but again, that would disturb the database as we would have to select from it again. Inserting into a temporary table still requires SQL <=> PL/SQL switching, rollback segment (I don't know for sure, maybe), and network round trips (of course).

As I see it, I am issuing a very simple, straightforward
'select * from Issue_Table where inv_date between start_date and end_date' for the whole report (inv_date is indexed), whereas it would otherwise have been a very heavily loaded query with 9 to 10 joins, most of them with the (+) sign for safety on the inner/outer join - this way I am sure I get all the data in issue_table in the date range, with nothing missing as a result of a missing sr_code (not all invoices will have an sr_code).

The 'Issue_Table' is variable; the user may want to view history data - every month we move all the data into a history table with MONTH_YEAR as a suffix.

And also, I try to avoid

 
  for rec in (select * from table) loop
    ...
  end loop;

as I have seen that if the table has 1000 rows, there will be 1000 fetches, 1000 PL/SQL to SQL switches, and eventually 1000 network round trips.

So I do Record Groups with an array size of 100 records; then there will be about 1 PL/SQL to SQL switch and 10 (more or less, depending on the size of the records) network round trips per 1000 records.

I also have a show_status(msg) to see where the run is... 99.99% of the time it shows Loading data, and only a tiny fraction of the time (< 1%) does it show Processing...

I don't know - you guys may be laughing at or making fun of what I am doing, doing it the complex way..

At first glance, I am doing a lot of work just to speed things up.

But I am happy with the outcome, as I feel my hard work comes into play when I see a report run well. All my life I have tried to minimize network round trips; I don't want to select from the same table twice in the same report, and whenever I can pre-load a table, I do it.

I am not an Oracle/DBA expert, but I learned my lessons through practical experience. If Oracle can do everything, what will I do????


Towhidun Nabi.
IT Executive
Aristopharma Ltd.
Dhaka
Bangladesh.
naeeym@hotmail.com

Tom Kyte
May 14, 2007 - 12:56 pm UTC

and as I said - there is nothing wrong with doing client side validation

you just have to do it in the database as well - and you seem to be saying you do that

so - you are all good.

load small tables to memory

Alex Callaghan, July 08, 2007 - 2:28 pm UTC

Hi

In your first response (almost 5 years ago), you said

"Did you know that databases were born to be queried? It is one of the things they do best."

Is that still true nowadays?

Look at what the Oracle Performance Tuning Guide suggests (both 9i and 10g):

If possible, avoid requerying the same data, by caching frequently accessed data in the client program or middle tier.

Oracle 10g Performance tuning Course (Oracle University):

Data blocks and header blocks can be accessed by only one process at a time. Several processes trying to access the same blocks will create contention.


I am pointing this out because I have had several experiences with the database approach where packages query small lookup tables (from as small as 10 rows to a couple of hundred rows) all the time, in batch jobs and for online users (more than 5000 connected users, of which 100 to 200 are active). There were quite a few problems with hot blocks; when the tables were loaded into pl/sql arrays the problems went away.

Maybe I come from the C programming world, but I do think that if you can load things into memory (and not shared memory) it is a good way to improve things....



Tom Kyte
July 08, 2007 - 6:00 pm UTC

how about this:

JUST JOIN

why are you procedurally looking stuff up when you can JUST JOIN, JUST JOIN

databases were born to query
they were born to join
and they permit you to write to them as well (and are very good at doing that)

stop writing procedural code, start writing SQL.

PL/SQL Table question

A reader, August 02, 2007 - 11:42 am UTC

I hope you can help. I am trying to load a PL/SQL table with some data and I want it to be ordered by one of the columns. My example below does not work.

create table pet_preference (sid varchar2(1), person_name varchar2(10), pet_type varchar2(10));
ALTER TABLE pet_preference
ADD (CONSTRAINT person_name_PK PRIMARY KEY (person_name));

insert into pet_preference values ('1', 'lisa', 'cat');
insert into pet_preference values ('2', 'ken', 'dog');
insert into pet_preference values ('3', 'jeff', 'fish');
insert into pet_preference values ('4', 'steve', 'bird');
commit;

set serveroutput on size 1000000
declare
TYPE pet_t IS TABLE OF pet_preference%ROWTYPE
INDEX BY pet_preference.person_name%type;
pet pet_t;
BEGIN
for pet_rec in (select * from pet_preference)
loop
pet (pet_rec.sid) := pet_rec;
end loop;
for indx1 in pet.first .. pet.last
loop
dbms_output.put_line (indx1||' '||pet(indx1).person_name|| ' '||pet(indx1).pet_type);
end loop;
end;
The output is:

1 lisa cat
2 ken dog
3 jeff fish
4 steve bird

but what I want is:

3 jeff fish
2 ken dog
1 lisa cat
4 steve bird

Can you please tell me what I am doing wrong? I have searched the site and Oracle documentation but can't seem to find anything specific to this issue. Any help would be greatly appreciated. Thank you in advance. This is a great site and you are a great help to many people!
Tom Kyte
August 05, 2007 - 12:35 pm UTC

umm, arrays are - arrays

I don't see the logic in having "jeff fish" assigned to slot number 3 - why not slot #1


use "select * from pet_preferences ORDER BY SOMETHING"

to retrieve the data sorted and then print it out in order.

a reader, August 05, 2007 - 6:46 pm UTC

The order by did not work as I had tried it. I printed the output that I want above. The first column is an id that gets loaded. The point is that I want them sorted by the name, not the id, which is what happens no matter what I try.
Tom Kyte
August 06, 2007 - 11:38 am UTC

they are in an array.

if you want them sorted in that array - you SELECT THEM OUT ORDERED AND PLACE THEM INTO THE ARRAY SORTED

Or, you SORT THE ARRAY.

Look at your code:

declare
      TYPE pet_t IS TABLE OF pet_preference%ROWTYPE
         INDEX BY pet_preference.person_name%type;
      pet pet_t;
   BEGIN
      for pet_rec in (select * from pet_preference)
      loop
         pet (pet_rec.sid) := pet_rec;
      end loop;  
      for indx1 in pet.first .. pet.last
      loop
         dbms_output.put_line (indx1||' '||pet(indx1).person_name|| ' '||pet(indx1).pet_type);
      end loop;      
   end;


you retrieve the data in UTTERLY RANDOM order - so, just put an ORDER BY on that query and you are *done*

I'm not getting it.

The order by *does work*, but you'll get as output:

1 jeff fish
2 ken dog
3 lisa cat
4 steve bird


which is just dandy since that 1,2,3,4 are array indices and rather *meaningless* (since you don't have an order by - the data is returned to you from the database in any order we feel like - not in the order of insertion or anything like that - so your random assignment of 1,2,3,4 to the rows is just that - rather random)

Just use ORDER BY and you are *done*

If you claim "order by doesn't work" we'll need to see your attempt so we can correct it.


the first column you are printing IS NOT AN ID, it is an index into an array, randomly assigned to the data each time it is fetched; it is not persistent data, it is rather meaningless.
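To make that concrete - the same block with the ORDER BY added and a dense, sequential index assigned as the sorted rows arrive:

declare
    TYPE pet_t IS TABLE OF pet_preference%ROWTYPE
       INDEX BY BINARY_INTEGER;
    pet pet_t;
    i   BINARY_INTEGER := 0;
BEGIN
    for pet_rec in (select * from pet_preference ORDER BY person_name)
    loop
        i := i + 1;
        pet(i) := pet_rec;   -- slot 1 holds the alphabetically first name
    end loop;
    for indx1 in 1 .. pet.count
    loop
        dbms_output.put_line (indx1||' '||pet(indx1).person_name||' '||pet(indx1).pet_type);
    end loop;
end;

which prints the "1 jeff fish ... 4 steve bird" output shown above.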

arrays of different sizes

Kevin, February 02, 2008 - 9:26 pm UTC

Nice thread, I keep referring to it. Recently this question came along and I don't have the in-depth knowledge to answer it. Here it is:

What differences should one expect (memory use?) between these two code snippets, recognizing that the only difference between them is the size of the varchar2 item at the root of each associative array?

declare
type ta is table of varchar2(2000) index by binary_integer;
aa ta;
rc1 sys_refcursor;
begin
open rc1 for select object_name from dba_objects;
fetch rc1 bulk collect into aa;
close rc1;
end;


declare
type ta is table of varchar2(32000) index by binary_integer;
aa ta;
rc1 sys_refcursor;
begin
open rc1 for select object_name from dba_objects;
fetch rc1 bulk collect into aa;
close rc1;
end;

I am wondering if there is benefit in defining multiple arrays in a particular utility package I am working on, and then selecting the array that best fits the size of the data element I am working with.

OR

If one large array would be good for all purposes. The code is simpler to write with just one array definition, but not worth it if there are extreme memory requirements and/or performance changes when using arrays defined with wide string definitions.

Thanks, Kevin
Tom Kyte
February 04, 2008 - 3:47 pm UTC

large varchar2 sizes are dynamically allocated to "best fit", they are not preallocated.

this shows they are just about the same.

ops$tkyte%ORA10GR2> connect /
Connected.
ops$tkyte%ORA10GR2> set echo on
ops$tkyte%ORA10GR2> declare
  2     type ta is table of varchar2(2000) index by binary_integer;
  3     aa ta;
  4     rc1 sys_refcursor;
  5  begin
  6     open rc1 for select object_name from dba_objects;
  7     fetch rc1 bulk collect into aa;
  8     close rc1;
  9  end;
 10  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> @mystat pga
ops$tkyte%ORA10GR2> set echo off

NAME                        VALUE
---------------------- ----------
session pga memory         978516
session pga memory max    5762644

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> connect /
Connected.
ops$tkyte%ORA10GR2> set echo on
ops$tkyte%ORA10GR2> declare
  2     type ta is table of varchar2(32000) index by binary_integer;
  3     aa ta;
  4     rc1 sys_refcursor;
  5  begin
  6     open rc1 for select object_name from dba_objects;
  7     fetch rc1 bulk collect into aa;
  8     close rc1;
  9  end;
 10  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> @mystat pga
ops$tkyte%ORA10GR2> set echo off

NAME                        VALUE
---------------------- ----------
session pga memory         978516
session pga memory max    5697108


Just what I was looking for

Kevin, February 05, 2008 - 7:36 am UTC

Sorry, I should have checked the memory allocations myself (doh!).

Kevin

Christian, February 14, 2008 - 2:07 pm UTC

I like to process everything in a loop in pl/sql - is this a good programming habit?

Christian
Tom Kyte
February 15, 2008 - 7:44 am UTC

not if it could have been done in a single sql statement.

bad:


for x in  (select * from t)
loop
    x.c1 := lower(x.c1);
    update t set c1 = x.c1 where pk = x.pk;
end loop;


good:

update t set c1 = lower(c1);



for a really simple example.

Never write procedural code

Until you have to

Hard-coded CASE vs SQL lookup

Sunj, March 21, 2008 - 7:00 pm UTC

Tom,
A very belated thanks (1 year old) for the response last year on March 21, 2007.

Your Associative Array solution was definitely the best.
I guess it was faster than the SQL solution because of the lack of context switching between the SQL and PL/SQL engines. Am I right?

Here's my experience:
1. SELECT col1 FROM my_table;
2. SELECT col1, my_function(col1) FROM my_table;
3. Enclose SQL#2 within dynamic SQL.

#1 was obviously the fastest
#2 was slowest
#3, even though it is the same as #2, performed much faster, because it was running fully within the PL/SQL engine.
Tom Kyte
March 24, 2008 - 11:08 am UTC

I do not understand #3 at all


but I would add #4

select my_table.col1, my_lookup.col2 from my_table, my_lookup where <join>

as being the fastest of them all (as stated by me time and time again above :) )

Fetch values from pl_sql table using column index instead of column_names

Maverick, April 07, 2008 - 4:22 pm UTC

Tom, I have been trying to do something with pl/sql Tables and got stuck at one place.

Here is the scenario:

  declare
      Type mytype is table of emp%rowtype;
      mytype1 mytype;
      k integer:=0;
      v_sql varchar2(4000):='Select * from emp ';
  begin
      execute immediate v_sql bulk collect into mytype1;

      for i in 1..mytype1.count
      loop
         dbms_output.put_line(mytype1(i).empno);
      end loop;
  end;


My problem is: while outputting the data using DBMS_OUTPUT, is there any way to reference the array with a positional index for each column instead of the actual column name,
like
dbms_output.put_line(mytype1(i).1stcolumn||','||mytype1(i).2ndcolumn);
?

Please don't ask me why I want to do it this way, but I am interested to see if there is a way to do it.
oh..btw, I am using Oracle 10g [10.2.0] if it helps.

Thanks
Tom Kyte
April 09, 2008 - 9:32 am UTC


Please don't ask me why I want to do it this way

fine, I won't - but that just means I won't be able to do anything useful for you. If I knew what your GOAL was, I might be able to suggest something useful.

But since you just want me to be a "answer my question and nothing but the question" guy - so be it.


The answer is "nope"

I was just comparing different languages

Maverick, April 09, 2008 - 9:59 am UTC

Tom, I was just trying to do something in PL/SQL with arrays [I did it using PERL]. In PERL I can loop through a row, separate out each column of the row, and manipulate it any way I want; I don't have to know what the actual columns are.
I was just wondering if we can do the same in PL/SQL.

Trying to see the differences between PL/SQL and PERL or other programming languages.

So, I can still take the answer as no, correct?
Tom Kyte
April 09, 2008 - 2:54 pm UTC

correct, the answer is no, plsql is a static compiled language.
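For the curious: while native PL/SQL records are static, the supplied DBMS_SQL package can describe and fetch result columns by position at run time - the idiom behind the various "print any query" utilities. A rough sketch, assuming the usual EMP table:

declare
    l_cursor  integer := dbms_sql.open_cursor;
    l_colcnt  integer;
    l_desc    dbms_sql.desc_tab;
    l_value   varchar2(4000);
    l_status  integer;
    l_line    varchar2(32767);
begin
    dbms_sql.parse( l_cursor, 'select * from emp', dbms_sql.native );
    dbms_sql.describe_columns( l_cursor, l_colcnt, l_desc );

    -- define every column as a string; dbms_sql converts on fetch
    for i in 1 .. l_colcnt
    loop
        dbms_sql.define_column( l_cursor, i, l_value, 4000 );
    end loop;

    l_status := dbms_sql.execute( l_cursor );

    while dbms_sql.fetch_rows( l_cursor ) > 0
    loop
        l_line := null;
        for i in 1 .. l_colcnt
        loop
            -- columns are addressed purely by position i here; the
            -- names, if wanted, are in l_desc(i).col_name
            dbms_sql.column_value( l_cursor, i, l_value );
            l_line := l_line || case when i > 1 then ',' end || l_value;
        end loop;
        dbms_output.put_line( l_line );
    end loop;

    dbms_sql.close_cursor( l_cursor );
end;
/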

RE: just comparing different languages

Duke Ganote, April 09, 2008 - 11:09 am UTC

I think Tom's asking "what is your goal" because PL/SQL is an auxiliary to SQL. Hence his mantra:
if you can do it in a single sql, do it.
if not do it in as little plsql as possible...
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:5408706816128#30690569723524
Just as there's an "impedance mismatch" between "object oriented design" and "relational design", there's a similar mismatch between PERL (which seems to be procedural and string-oriented) and SQL (which is set- and database-oriented).

There are similar mismatches with folks who say "In SQL Server I use local temp tables a lot, so how do I use local temp tables in Oracle?" The answer is: temp tables are a crutch; don't use local temp tables in Oracle, there are better Oracle constructs for achieving your goal; e.g.
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1164655862293

Maverick, April 09, 2008 - 1:00 pm UTC

Dear Duke Ganote, I think we are talking about two different things here. You are talking about the database or SQL [structured query language], but I am talking about the programming language [PL/SQL]. So we can compare PL/SQL with PERL or C or C# etc. [I might be wrong there!!]

But I was just trying to see if there is a way to do this in PL/SQL, and Tom kind of mentioned we cannot [I guess].

Thanks

RE: compare Pl/SQL with PERL or C or C# etc

Duke Ganote, April 09, 2008 - 2:02 pm UTC

Maverick-- It just reminds me of one of my early software classes where I learned LISP, SNOBOL, SIMSCRIPT and a mathematical symbol-manipulation package (it would manipulate differential equations) in one quarter. Each had its strengths and weaknesses, but they weren't usually comparable in functionality.

I did annoy my professor by using LISP (a list processing language used for artificial intelligence) to calculate some outrageously big Fibonacci number. LISP did it, but not very efficiently, and I think he was annoyed at how much CPU I'd chewed up overnight on the university's IBM 360 mainframe :)

Lookup Data From Associative Arrays

A reader, June 12, 2015 - 1:40 am UTC

Hi Tom,

I need to do a production data patch fix which could be done by a simple SQL statement, but this is a highly regulated environment (it can be audited by the FDA) and the simple update will not suffice, as we need to produce a listing of the case data with the before and after values for each case impacted by the update. So the bottom line, unfortunately, is that we cannot use a plain update and we have to use pl/sql to loop through each case. In order to improve the performance I am caching the lookup data in an associative array and just looking up the values during the sql bulk collect.

I am not giving the original code, but similar code where I am getting the same error.

DELETE COUNTRIES;

DROP TABLE COUNTRIES;

CREATE TABLE COUNTRIES
                      (
                        COUNTRY         VARCHAR2(100) NOT NULL,
                        COUNTRY_ID      NUMBER        CONSTRAINT COUNTRIES_PK PRIMARY KEY
                      );
                      
INSERT INTO COUNTRIES VALUES ('USA', 1);
INSERT INTO COUNTRIES VALUES ('United Kingdom', 2);
INSERT INTO COUNTRIES VALUES ('India', 3);

COMMIT;

DROP TABLE TOURIST;

CREATE TABLE TOURIST
                    (
                      FIRST_NAME            VARCHAR2(100),
                      LAST_NAME             VARCHAR2(100),
                      TOURIST_ID            NUMBER        CONSTRAINT TOURIST_PK  PRIMARY KEY,
                      COUNTRY_VISIT1_ID     NUMBER,
                      COUNTRY_VISIT2_ID     NUMBER,
                      CONSTRAINT TOURIST_FK1
                      FOREIGN KEY(COUNTRY_VISIT1_ID)
                      REFERENCES COUNTRIES(COUNTRY_ID)
                      ON DELETE CASCADE,
                      CONSTRAINT TOURIST_FK2
                      FOREIGN KEY(COUNTRY_VISIT2_ID)
                      REFERENCES COUNTRIES(COUNTRY_ID)
                      ON DELETE CASCADE
                    );
                    
INSERT INTO TOURIST VALUES ('ABC1', 'XYZ1', 1, 1,2);
INSERT INTO TOURIST VALUES ('ABC2', 'XYZ2', 2, 2,NULL);
INSERT INTO TOURIST VALUES ('ABC3', 'XYZ3', 3, NULL,NULL);
INSERT INTO TOURIST VALUES ('ABC4', 'XYZ4', 4, NULL,3);

COMMIT;


The small DML/DDL script above will set up the objects needed for the example. The PL/SQL anonymous block below is the code that was written:

DECLARE
    TYPE TOURIST_TYPE_REC IS RECORD (
                                      TNAME           VARCHAR2(300),
                                      COUNTRY_VISIT1  COUNTRIES.COUNTRY%TYPE,
                                      COUNTRY_VISIT2  COUNTRIES.COUNTRY%TYPE
                                    );
    TYPE TOURIST_TYPE_TAB IS TABLE OF TOURIST_TYPE_REC;
    
    TYPE COUNTRY_CACHE_TYPE IS RECORD(COUNTRY COUNTRIES.COUNTRY%TYPE);
    TYPE COUNTRY_CACHE_TAB IS TABLE OF COUNTRY_CACHE_TYPE INDEX BY BINARY_INTEGER;
    
    TOURIST_DATA   TOURIST_TYPE_TAB;
    COUNTRY_CACHE  COUNTRY_CACHE_TAB;
    L_COUNTRY_REC  COUNTRY_CACHE_TYPE; 
BEGIN
    /* Cache the country details so that you don't have to look them up from the table again */
    FOR ALL_COUNTRIES_REC IN (SELECT COUNTRY, COUNTRY_ID FROM COUNTRIES UNION SELECT 'Unknown', -1 FROM DUAL)
    LOOP
          L_COUNTRY_REC.COUNTRY := ALL_COUNTRIES_REC.COUNTRY;
          COUNTRY_CACHE(ALL_COUNTRIES_REC.COUNTRY_ID) := L_COUNTRY_REC;
    END LOOP;
    
    SELECT LAST_NAME||', '||FIRST_NAME,
           COUNTRY_CACHE(NVL(COUNTRY_VISIT1_ID,-1)).COUNTRY,
           COUNTRY_CACHE(NVL(COUNTRY_VISIT2_ID,-1)).COUNTRY
    BULK COLLECT INTO TOURIST_DATA
    FROM TOURIST;
END;


The error coming from this program is as noted below.

Error report -
ORA-06550: line 25, column 30:
PLS-00201: identifier 'COUNTRY_VISIT2_ID' must be declared
ORA-06550: line 25, column 30:
PLS-00201: identifier 'COUNTRY_VISIT2_ID' must be declared
ORA-06550: line 25, column 12:
PL/SQL: ORA-00904: : invalid identifier
ORA-06550: line 23, column 5:
PL/SQL: SQL Statement ignored
06550. 00000 -  "line %s, column %s:\n%s"
*Cause:    Usually a PL/SQL compilation error.
*Action:


Can you please tell me what I am doing wrong? And can you recommend any other modifications to the anonymous block (except for the straight update :) - it's been very frustrating :( ). My idea was to minimize the context switches. As this is a data patch which will be executed once and once only, and during the deployment nobody will be logged in to the database, we have approval for using the maximum memory :). This is the reason I did not LIMIT my bulk collected rows.

Updated

Ravi, June 12, 2015 - 1:42 am UTC

The Oracle version is 11gR2.

Thanks and Regards
Ravi