Home>Question Details



-- Thanks for the question regarding "load a pl/sql table with lookup table and use it in my program for look up values", version 8.1.7

Submitted on 15-Sep-2002 21:38 Central time zone
Last updated 9-Apr-2008 14:54

You Asked

Tom

Recently I was told that instead of repeatedly querying a lookup table for lookup 
values(like querying a description table for description using description_id), it is 
much more efficient to load up a pl/sql table and do a lookup on the pl/sql table.

1.Can you show me how we can load up a pl/sql table

2.How to do a look up using pl/sql table

3.If I want to load a pl/sql table with values from more than columns, can we do it. 
Arent pl/sql tables single dimensional? 

Thanks

Tom 

and we said...

Why do you believe all that you hear?  Did they give you a test case to "prove" this? 
Hmmmmm.

Did you know that databases were born to be queried?  It is one of the things they do 
best.


I would ask this person who gave you this advice for their test case that proves it.  In 
there you would find out

1) how to load the plsql table (eg: look up BULK COLLECT in the plsql manuals)
2) how to look up using plsql table (eg: they either wrote their own binary search -- 
which we happen to have in the database already, or they are using a simple, yet 
non-scalable linear search, start at index 1 and look till you find a match)
3) in 8i, yes they are -- you can bulk collect into a RECORD of ARRAYS however.

But, the scientific proof offered by this person will answer all of your questions. 

Reviews    
5 stars I think this was published in Oracle Professional   September 16, 2002 - 2pm Central time zone
Reviewer: Scott Watson from Canada
I beleive I read this in Oracle Professional written by Steve Feuerstein.  Here is a link to the 
article.  Basically it is for 9iR2 and uses associative arrays. 

http://www.oracleprofessionalnewsletter.com/op/OPmag.nsf/0/F2E754B65828DEF985256C1A0056DF3F
I can rationlize this article by saying that the PL/SQL engine does not have to perform a context 
switch to the SQL layer, however, using this method would require that each connection maintain a 
lookup table which may require more memory if the number of users using this approach is large.  
Like any good solution it requires testing in your enviroment etc. etc.

HTH 
Scott. 


Followup   September 16, 2002 - 8pm Central time zone:

In 9iR2 (not 817 and not before) this may be true with the new associative arrays which are b-tree 
indexed RAM data structures (very efficient for searching).  This is "brand new" at this point.  A 
quick and dirty test will show that they are in fact faster and probably more scalable but at the 
cost of RAM (tradeoffs as usual).

This will NOT work in 817 (note to the original questioner, this will not work for you)

ops$tkyte@ORA920.US.ORACLE.COM> create table t ( code varchar2(5) constraint t_pk primary key, data 
varchar2(40) )
  2  organization index
  3  /

Table created.

ops$tkyte@ORA920.US.ORACLE.COM> exec gen_data( 'T', 5000 );

PL/SQL procedure successfully completed.

ops$tkyte@ORA920.US.ORACLE.COM> select count(*) from t;

  COUNT(*)
----------
      5000

ops$tkyte@ORA920.US.ORACLE.COM>
ops$tkyte@ORA920.US.ORACLE.COM> declare
  2      type  lookupTable is table of varchar2(40) index by varchar2(5);
  3      type  indexByTable is table of varchar2(40) index by binary_integer;
  4
  5      l_lookupTable lookupTable;
  6      l_codes       indexByTable;
  7      l_values      indexByTable;
  8      l_value       varchar2(40);
  9
 10      l_start number;
 11      l_run1  number;
 12      l_run2  number;
 13  begin
 14      select code, data BULK COLLECT into l_codes, l_values from t;
 15
 16      insert into run_stats select 'before', stats.* from stats;
 17
 18      l_start := dbms_utility.get_time;
 19      for i in 1 .. l_codes.count
 20      loop
 21          select data into l_value from t where code = l_codes(i);
 22      end loop;
 23      l_run1 := (dbms_utility.get_time-l_start);
 24      dbms_output.put_line( l_run1 || ' hsecs' );
 25
 26      insert into run_stats select 'after 1', stats.* from stats;
 27      l_start := dbms_utility.get_time;
 28
 29
 30      for i in 1 .. l_codes.count
 31      loop
 32          l_lookupTable( l_codes(i) ) := l_values(i);
 33      end loop;
 34      for i in 1 .. l_codes.count
 35      loop
 36          l_value := l_lookupTable( l_codes(i) );
 37      end loop;
 38      l_run2 := (dbms_utility.get_time-l_start);
 39      dbms_output.put_line( l_run2 || ' hsecs' );
 40      dbms_output.put_line
 41      ( 'run 1 ran in ' || round(l_run1/l_run2*100,2) || '% of the time' );
 42
 43      insert into run_stats select 'after 2', stats.* from stats;
 44  end;
 45  /
74 hsecs
3 hsecs
run 1 ran in 2466.67% of the time

PL/SQL procedure successfully completed.

ops$tkyte@ORA920.US.ORACLE.COM> select a.name, b.value-a.value run1, c.value-b.value run2,
  2         ( (c.value-b.value)-(b.value-a.value)) diff
  3    from run_stats a, run_stats b, run_stats c
  4   where a.name = b.name
  5     and b.name = c.name
  6     and a.runid = 'before'
  7     and b.runid = 'after 1'
  8     and c.runid = 'after 2'
  9     and (c.value-a.value) > 0
 10     and (c.value-b.value) <> (b.value-a.value)
 11   order by abs( (c.value-b.value)-(b.value-a.value))
 12  /

NAME                                 RUN1       RUN2       DIFF
------------------------------ ---------- ---------- ----------
STAT...recursive calls               5031         50      -4981
STAT...calls to get snapshot s       5010         16      -4994
cn: kcmgss

STAT...execute count                 5005          7      -4998
STAT...index fetch by key            5000          1      -4999
LATCH.shared pool                    5127        123      -5004
LATCH.cache buffers chains          10167        291      -9876
STAT...session logical reads        10056        103      -9953
STAT...consistent gets              10012         24      -9988
STAT...consistent gets - exami      10000          2      -9998
nation

LATCH.library cache pin             10105        101     -10004
LATCH.library cache                 10258        248     -10010
STAT...session pga memory               0     476412     476412

62 rows selected.

ops$tkyte@ORA920.US.ORACLE.COM>


All in all, if you got the RAM and you got 9iR2, this makes sense.  Before then -- I question it.
 

3 stars Tried and Tested   September 16, 2002 - 3pm Central time zone
Reviewer: Adrian Billington from UK
Tom

Initialising package global index-bys with data from lookup tables, using the numeric-PK as the 
index (can be varchar2 now with associative arrays of course) is a tried and tested technique of 
returning lookup values much quicker than repeated database access. I took an hour off a one hour 
fifteen minutes INSERT using this method and I have a couple of demos "proving" them to be quicker 
(both methods have also been through your run_stats harness as well to great effect).

I am off work at the moment so can't get to the demos, but if no-one beats me to it, then I'll post 
something next week.

The SF article in Oracle Professional this month takes it from an associative array context, but I 
already have a demo of that as well and again, they are much more efficient than repeated lookup 
table access.

BUT, the only thing I have not tested against is repeated table access with a BUFFER POOL KEEP / 
KEEP CACHE, so maybe that's for someone else to try and report on.  But hey, SQL can't ALWAYS win, 
surely ;).

Regards

Adrian 


Followup   September 16, 2002 - 8pm Central time zone:

I betcha using a single INSERT -- removing the procedural code all together -- would get rid of the 
rest of the time.


I rarely see codes that are 1, 2, 3, 4, 5, .... N -- making the index by tables of some what 
limited use.  It is usually something like 

AK Alaska
VA Virginia
...

 

5 stars What about the other way around   September 17, 2002 - 8am Central time zone
Reviewer: Scott Watson from Montreal, Canada
Hi Tom,

I was thinking performing lookups in PL/SQL is faster than doing the same lookup in sql but what 
about the other way around.  (I don't have 9i so I cannot test this myself)  What would be faster 
in this case.

Select a.col1, b.col2
  from a, b
 where a.id = b.a_id;

or 

select a.col1, lookup_in_pl_sql(a.id)
  from a

Thanks,
Scott. 


Followup   September 17, 2002 - 12pm Central time zone:

Select a.col1, b.col2
  from a, b
 where a.id = b.a_id;


Try it and see. 

5 stars That is not what I meant   September 17, 2002 - 1pm Central time zone
Reviewer: Scott watson from Montreal, Canada
Funny!!! 

I know I can do the table lookup but what about the lookup using a PLSQL function in terms of the 
perfomance we have beed discussing.  We have already asserted that these types of lookups are 
faster when executing within PLSQL than SQL but what about when we access PLSQL from our sql 
statement.

select name_en, convert_to_french(name_en) PLSQL_FUNCTION
  from tableA

This is what I was refering to when I said I did not have 9i and could not test it. 


Followup   September 17, 2002 - 8pm Central time zone:

well, you could have tested with an INDEXED BY BINARY_INTEGER table (eg: number to string 
lookup)....

but here you go -- sql will be FASTER here:


ops$tkyte@ORA920.LOCALHOST> /*
DOC>drop table t;
DOC>
DOC>create table t ( code varchar2(5) constraint t_pk primary key, data varchar2(40) )
DOC>organization index
DOC>/
DOC>
DOC>exec gen_data( 'T', 5000 );
DOC>
DOC>select count(*) from t;
DOC>
DOC>pause
DOC>*/
ops$tkyte@ORA920.LOCALHOST> 
ops$tkyte@ORA920.LOCALHOST> 
ops$tkyte@ORA920.LOCALHOST> create or replace package demo_pkg
  2  as
  3           function get_value( p_code in varchar2 ) return varchar2;
  4  end;
  5  /

Package created.

ops$tkyte@ORA920.LOCALHOST> create or replace package body demo_pkg
  2  as
  3      type  indexByTable is table of varchar2(40) index by binary_integer;
  4      type  lookupTable is table of varchar2(40) index by varchar2(5);
  5
  6      l_lookupTable lookupTable;
  7
  8          function get_value( p_code in varchar2 ) return varchar2
  9          as
 10          l_codes       indexByTable;
 11          l_values      indexByTable;
 12          begin
 13                  if nvl( l_lookupTable.count,0 ) = 0
 14                  then
 15                          dbms_output.put_line( 'filling table' );
 16                  select code, data BULK COLLECT into l_codes, l_values from t;
 17                  for i in 1 .. l_codes.count
 18                  loop
 19                          l_lookupTable( l_codes(i) ) := l_values(i);
 20                  end loop;
 21                  end if;
 22                  return l_lookupTable( p_code );
 23          exception
 24                  when no_data_found then
 25                          raise program_error;
 26          end;
 27  end;
 28  /

Package body created.

ops$tkyte@ORA920.LOCALHOST>
ops$tkyte@ORA920.LOCALHOST>
ops$tkyte@ORA920.LOCALHOST> declare
  2       l_start number;
  3       l_run1  number;
  4       l_run2  number;
  5  begin
  6       insert into run_stats select 'before', stats.* from stats;
  7
  8       l_start := dbms_utility.get_time;
  9       for x in ( select a.code, b.data from t a, t b where a.code = b.code )
 10       loop
 11           null;
 12       end loop;
 13       l_run1 := (dbms_utility.get_time-l_start);
 14       dbms_output.put_line( l_run1 || ' hsecs' );
 15
 16       insert into run_stats select 'after 1', stats.* from stats;
 17       l_start := dbms_utility.get_time;
 18
 19
 20       for x in ( select a.code, demo_pkg.get_value( a.code ) from t a )
 21       loop
 22           null;
 23       end loop;
 24       l_run2 := (dbms_utility.get_time-l_start);
 25       dbms_output.put_line( l_run2 || ' hsecs' );
 26       dbms_output.put_line
 27       ( 'run 1 ran in ' || round(l_run1/l_run2*100,2) || '% of the time');
 28
 29       insert into run_stats select 'after 2', stats.* from stats;
 30  end;
 31  /
23 hsecs
filling table
65 hsecs
run 1 ran in 35.38% of the time

PL/SQL procedure successfully completed.

ops$tkyte@ORA920.LOCALHOST>
ops$tkyte@ORA920.LOCALHOST>
ops$tkyte@ORA920.LOCALHOST> select a.name, b.value-a.value run1, c.value-b.value run2,
  2         ( (c.value-b.value)-(b.value-a.value)) diff
  3    from run_stats a, run_stats b, run_stats c
  4   where a.name = b.name
  5     and b.name = c.name
  6     and a.runid = 'before'
  7     and b.runid = 'after 1'
  8     and c.runid = 'after 2'
  9     and (c.value-a.value) > 0
 10     and (c.value-b.value) <> (b.value-a.value)
 11   order by abs( (c.value-b.value)-(b.value-a.value))
 12  /

NAME                                 RUN1       RUN2       DIFF
------------------------------ ---------- ---------- ----------
....
LATCH.cache buffers chains          25135      10515     -14620
LATCH.library cache pin                95      30215      30120
LATCH.shared pool                     112      35210      35098
LATCH.library cache                   236      50445      50209
STAT...session uga memory               0     589176     589176
STAT...session pga memory         -131072     589824     720896

70 rows selected.

ops$tkyte@ORA920.LOCALHOST>

Also, I'll REVISE my earlier answer.  

Instead of doing it PROCEDURALLY, lets do the lookup where it belongs!  In SQL in the first place.  
EG: I would not be caught reading a table and then doing another query to LOOKUP data in the first 
place.  Only if I was doing some data load would I do that -- else, I would have JOINED in the 
first place...

What I mean by that -- is that the basic PREMISE that you should


  for x in ( select * from t1 )
  loop
      select value into l_var from t2 where t2.code = t1.code;
      .....


is flawed in the FIRST place, you should have JOINED (or selected a select)

Now that will outperform using a index by table (associated array or not)... 

3 stars Strange behaviour...   September 18, 2002 - 4pm Central time zone
Reviewer: Padders from UK (where else?)
Since when did collection.COUNT return NULL for empty collection? I think you'll find you never 
even populate your collection. 

This raises another issue (and I am testing on 9.2, btw) - if the collection is never populated, 
RETURN l_lookuptable (p_code) should raise NO_DATA_FOUND, and yet we get no error from test script, 
although if we call function from PL/SQL, we get NO_DATA_FOUND raised as expected, e.g.

SQL*Plus: Release 9.2.0.1.0 - Production on Wed Sep 18 21:12:43 2002

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.


Connected to:
Oracle9i Enterprise Edition Release 9.2.0.1.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.1.0 - Production

SQL> SELECT demo_pkg.get_value ('Rubbish') 
  2  FROM   dual;

DEMO_PKG.GET_VALUE('RUBBISH')
---------------------------------------------------------------------


SQL> BEGIN
  2    DBMS_OUTPUT.PUT_LINE (demo_pkg.get_value ('Rubbish'));
  3  END; 
  4  /
BEGIN
*
ERROR at line 1:
ORA-01403: no data found
ORA-06512: at "SCOTT.DEMO_PKG", line 28
ORA-06512: at line 2

Bizarrely, it has nothing to do with collection, since we can create function which just raises 
NO_DATA_FOUND and get the same behaviour, e.g.

SQL> CREATE OR REPLACE FUNCTION ndf
  2    RETURN VARCHAR2 
  3  IS
  4  BEGIN
  5    RAISE NO_DATA_FOUND;
  6  END;
  7  /

Function created.

SQL> SELECT ndf 
  2  FROM   dual;

NDF
----------------------------------------------------------------------


SQL> BEGIN
  2    DBMS_OUTPUT.PUT_LINE (ndf);
  3  END; 
  4  /
BEGIN
*
ERROR at line 1:
ORA-01403: no data found
ORA-06512: at "SCOTT.NDF", line 5
ORA-06512: at line 2

Note that the same does not appear to apply to other exceptions, and no I haven't tried them all, 
but VALUE_ERROR, for example, appears to work as expected, e.g.

SQL> CREATE OR REPLACE FUNCTION ve
  2    RETURN VARCHAR2 
  3  IS
  4  BEGIN
  5    RAISE VALUE_ERROR;
  6  END;
  7  /

Function created.

SQL> SELECT ve 
  2  FROM   dual;
SELECT ve
       *
ERROR at line 1:
ORA-06502: PL/SQL: numeric or value error
ORA-06512: at "SCOTT.VE", line 5


SQL> BEGIN
  2    DBMS_OUTPUT.PUT_LINE (ndf);
  3  END; 
  4  /
BEGIN
*
ERROR at line 1:
ORA-01403: no data found
ORA-06512: at "SCOTT.NDF", line 5
ORA-06512: at line 2

Please confirm - have I lost the plot or is this expected behaviour (not the me losing the plot 
bit, the other bit)? 


Followup   September 18, 2002 - 6pm Central time zone:

whoops -- corrected that.  surprisingly, the results were not materially affected.  see above 

3 stars Still wondering...   September 19, 2002 - 4am Central time zone
Reviewer: Padders from UK
Thanks for correcting.

But why is NO_DATA_FOUND exception ignored by SQL, and NULL returned instead? 


Followup   September 19, 2002 - 7am Central time zone:

I guess cause the answer is "unknown"

just like if I 


select ( select * from dual where 1=0 ) from dual;


that does not fail, even though the subquery (function) returns NO_DATA_FOUND.  or

select * from dual where 1 = ( select 0 from dual where 1=0 )

the subquery returns NULL - well, actually the subquery doesn't return anything, it is "unknown" 
what it returns.  the function is treated in the same way. 

4 stars Some results...   September 19, 2002 - 5am Central time zone
Reviewer: Adrian Billington from UK
Tom

I agree with your comment about SQL being faster when the lookup table is available (i.e. just join 
to it), but I don't think that was ever disputed. We wouldn't write a PL/SQL function to do such 
simple SQL work in the case of a straight insert. But when you don't have access to that lookup 
table (i.e. during a data load), the use of an index-by (or extended assoc. array in 9i) beats the 
hell out of a database-accessing function.

On a simple 10,000 row insert demo, looking up the description from a lookup table for each FK row 
in the staging table, the results were as follows (consistently - ran about 4 times):-

SQL with join to lookup table........ 0 secs 16 hsecs
PL/SQL function using index-by....... 0 secs 67 hsecs
PL/SQL function using lookup table...10 secs 66 hsecs

Which I guess validates the assertions.

I would have thought that from an OLTP perspective, the use of index-bys for the common lookup 
tables should be encouraged, to save all those repeated accesses to the numerous lookup tables that 
OLTP systems seem to have, which is possibly relevant the original poster...

Regards

Adrian 


Followup   September 19, 2002 - 7am Central time zone:

Yes, that is exactly what i said tho:

...
Instead of doing it PROCEDURALLY, lets do the lookup where it belongs!  In SQL 
in the first place.  EG: I would not be caught reading a table and then doing 
another query to LOOKUP data in the first place.  Only if I was doing some data 
load would I do that -- else, I would have JOINED in the first place...
.....


What I see CONSTANTLY, continously and all of the time is code like

  for x in ( select * from t )
  loop
      select ... into ... from t2 where t2.code = t.code;
      ......


that should be a join, if you see yourself doing that, stop.

During a data load -- you are 100% correct.


(consider an OLTP application a loader and the premise continues ;) 

5 stars illustrate   September 19, 2002 - 8pm Central time zone
Reviewer: Nag 
"
What I see CONSTANTLY, continously and all of the time is code like

  for x in ( select * from t )
  loop
      select ... into ... from t2 where t2.code = t.code;
      ......
"


Tom, can you give us an example and show us , how a loop like the above can be converted into sql 


Followup   September 20, 2002 - 7am Central time zone:

It is called a join.  join t to t2. 

5 stars Have a look at this , and tell us wha tyou have to say   September 26, 2002 - 3pm Central time zone
Reviewer: Famy 
this is from quest-pipelines , a very well respected resource..


July's Tip of the Month 


Loading a Lookup Table in Memory to Speed Up Queries 

Compliments of Dan Clamage, PL/SQL Pipeline SYSOP (danielj@clamage.com) 

Recently, I worked with a developer who faced the challenge of speeding up his Pro*C application. 
It was taking about 2 hours to run. After examining his application, I discovered that he had one 
table on which every row returned in the driving loop required a random lookup. Furthermore, he 
could expect nearly every row in this lookup table to be accessed at some point over the life of 
the program. I instructed him to preload this moderately sized table (~50k rows) into a persistent 
PL/SQL table. In order to be persistent, the PL/SQL table had to be declared in the package body, 
outside of any routine: 

CREATE OR REPLACE
PACKAGE BODY lookup
IS
TYPE typ_val_data IS TABLE OF tbl_lookup.val_data%TYPE
  INDEX BY BINARY_INTEGER;
tab_val_data typ_val_data;
I explained to him that the numeric primary key could be used as the index offset, and the column 
value he needed could then be stored at that offset. For example:
Lookup Table 

OID NUMBER(10)  VAL_DATA VARCHAR2(4)
3               'GHGL'
27              'ACBF'
101             'LEFP'
157             'KPOI'
PL/SQL Table (also known as index-by table or array) 
Array Offset    Array Value
3               'GHGL'
**no data in array elements 4-26**
27              'ACBF'
**no data in array elements 28-100**
101             'LEFP'
**no data in array elements 158+**
157             'KPOI'
This would result in a sparsely populated array. He populated the array up front, as part of his 
program initialization, using a packaged procedure. This procedure simply read every row in his 
lookup table in a cursor FOR loop, and used the primary key as the element offset into the PL/SQL 
table. For example: 
PROCEDURE populate_lookup
IS
  CURSOR get_all_data IS
  SELECT oid, val_data
  FROM tbl_lookup;
BEGIN
  FOR rec IN get_all_data LOOP
    tab_val_data(rec.oid) := rec.val_data;
  END LOOP;
END populate_lookup;
Then he provided a packaged function to interface with this array. He could use this function in 
PL/SQL or in SQL. For example: 
FUNCTION get_val_data(p_oid IN tbl_lookup.oid%TYPE)
RETURN tbl_lookup.val_data%TYPE
IS
BEGIN
  RETURN(tab_val_data(p_oid));
EXCEPTION
  WHEN NO_DATA_FOUND THEN
  RETURN(NULL);
END get_val_data;
Inside his main loop, he coded something like this: 
INSERT INTO tbl_target (val_data, ...)
VALUES (lookup.get_val_data(fk_oid), ...);
He had to fully qualify the function, since it's being called from SQL. He reported that his 
program now runs in 20 minutes, a 5x speed improvement.  


Followup   September 26, 2002 - 4pm Central time zone:

And he could have speed it up 10x more by simply coding:

insert into tbl_target ( val_data, .... )
select max(t.val_data), :bv1, :bv2, :bv3, :b4, ....
  from tbl_lookup t
 where oid = :bv0;

or 80x faster by using array binds and the above!!!!

Here is the output from my testing (totally consistent, ran it many many times):


ops$tkyte@ORA920.US.ORACLE.COM> !./test

Connected to ORACLE as user: /

Using SQL 2002-09-26 16:38:11.041532
+000000000 00:00:02.851382000
Using plsql 2002-09-26 16:38:13.902126
+000000000 00:00:04.002790000
Using SQL 2002-09-26 16:38:17.915989
+000000000 00:00:00.534158000

ops$tkyte@ORA920.US.ORACLE.COM> select count(*) from t2;

  COUNT(*)
----------
      9000

ops$tkyte@ORA920.US.ORACLE.COM>

so, SQL -- no PLSQL = 2.8 seconds.  
PLSQL = 4 seconds

SQL with array binds of 100 = 0.58 seconds......



Here is my code:

drop table t;

create table t  ( oid primary key, val_data )
organization index
as
select rownum oid, object_name val_data from all_objects;


drop table t2;

create table t2 ( val_data varchar2(30), data varchar2(400) );

create or replace package pkg
as
    function f( n in number ) return varchar2;
end;
/

create or replace package body pkg
as

type lookup is table of varchar2(30) index by binary_integer;

g_data lookup;

function f( n in number ) return varchar2
is
begin
    return g_data(n);
exception
    when no_data_found then return null;
end;

begin
    for x in ( select * from t )
    loop
        g_data(x.oid) := x.val_data;
    end loop;
end;
/




#include <stdio.h>
#include <string.h>

static char *     USERID = "/";

#define SQLCA_INIT
EXEC SQL INCLUDE sqlca;

static void sqlerror_hard( int lineno)
{
    EXEC SQL WHENEVER SQLERROR CONTINUE;

    printf("\nORACLE error detected (%d):", lineno);
    printf("\n% .70s \n", sqlca.sqlerrm.sqlerrmc);

    EXEC SQL ROLLBACK WORK RELEASE;
    exit(1);
}

static void process1( void )
{
EXEC SQL BEGIN DECLARE SECTION;
varchar        data[400];
varchar        time[125];
int            i;
EXEC SQL END DECLARE SECTION;

    EXEC SQL WHENEVER SQLERROR DO sqlerror_hard(__LINE__);

    memset( data.arr,'*',400 );
    strncpy( data.arr, "process1", strlen("process1") );
    data.len = 400;

    EXEC SQL SELECT to_char(systimestamp,'YYYY-mm-dd hh24:mi:ss.ff')
               INTO :time
               FROM DUAL;
    printf( "Using SQL %.*s\n", time.len, time.arr );

    for( i = 0; i < 3000; i++ )
    {
        EXEC SQL insert into t2
        select max(val_data), :data
          from t
         where oid = :i;
    }
    EXEC SQL SELECT SYSTIMESTAMP-to_timestamp(:time,'yyyy-mm-dd hh24:mi:ss.ff') INTO :time FROM 
DUAL;
    printf( "%.*s\n", time.len, time.arr );
    EXEC SQL COMMIT;
}


static void process2( void )
{
EXEC SQL BEGIN DECLARE SECTION;
varchar        data[400];
varchar        time[125];
int            i;
EXEC SQL END DECLARE SECTION;

    EXEC SQL WHENEVER SQLERROR DO sqlerror_hard(__LINE__);

    memset( data.arr,'*',400 );
    strncpy( data.arr, "process2", strlen("process2") );
    data.len = 400;

    EXEC SQL SELECT to_char(systimestamp,'YYYY-mm-dd hh24:mi:ss.ff')
               INTO :time
               FROM DUAL;
    printf( "Using plsql %.*s\n", time.len, time.arr );

    for( i = 0; i < 3000; i++ )
    {
        EXEC SQL insert into t2
        values ( pkg.f(:i), :data );
    }
    EXEC SQL SELECT SYSTIMESTAMP-to_timestamp(:time,'yyyy-mm-dd hh24:mi:ss.ff') INTO :time FROM 
DUAL;
    printf( "%.*s\n", time.len, time.arr );
    EXEC SQL COMMIT;
}


static void process3( void )
{
EXEC SQL BEGIN DECLARE SECTION;
varchar        data[100][400];
varchar        time[125];
int            oids[100];
int            i;
EXEC SQL END DECLARE SECTION;

    EXEC SQL WHENEVER SQLERROR DO sqlerror_hard(__LINE__);

    for( i = 0; i < 100; i++ )
    {
        memset( data[i].arr,'*',400 );
        strncpy( data[i].arr, "process3", strlen("process3") );
        data[i].len = 400;
    }

    EXEC SQL SELECT to_char(systimestamp,'YYYY-mm-dd hh24:mi:ss.ff')
               INTO :time
               FROM DUAL;
    printf( "Using SQL with Array binds %.*s\n", time.len, time.arr );

    for( i = 0; i < 3000; i++ )
    {
        oids[ i%100 ] = i;

        if ( i%100 == 99 )
        {
            EXEC SQL insert into t2
            select max(val_data), :data
              from t
             where oid = :oids;
        }
    }
    EXEC SQL SELECT SYSTIMESTAMP-to_timestamp(:time,'yyyy-mm-dd hh24:mi:ss.ff') INTO :time FROM 
DUAL;
    printf( "%.*s\n", time.len, time.arr );
    EXEC SQL COMMIT;
}

main( argc, argv )
int    argc;
char *    argv[];
{
EXEC SQL BEGIN DECLARE SECTION;
VARCHAR   oracleid[50];
EXEC SQL END DECLARE SECTION;

    strcpy( oracleid.arr, USERID );
    oracleid.len = strlen(oracleid.arr);

    EXEC SQL WHENEVER SQLERROR DO sqlerror_hard(__LINE__);
    EXEC SQL CONNECT :oracleid;
    printf("\nConnected to ORACLE as user: %s\n\n", oracleid.arr);

    process1();
    process2();
    process3();

    exit(0);
}

 

4 stars Then why use pl/sql tables?   October 4, 2002 - 1pm Central time zone
Reviewer: Kashif from Houston, TX USA
Hi Tom,

Great discussion, thanks for the invaluable input. Just a quick couple of questions. Firstly, you 
mention:

"
What I see CONSTANTLY, continously and all of the time is code like

  for x in ( select * from t )
  loop
      select ... into ... from t2 where t2.code = t.code;
      ......
"

and you mention this is not the correct way of doing lookups. Does this apply when there is the 
need to outer join the driving table in the main query, i.e.

select t.*
from t, t2
where t.code (+) = t2.code

Is it not more efficient to read the main table and then read the lookup table, and return nulls 
(using exception when no_data_found then return null) when the lookup table doesn't retrieve 
anything?

Secondly, your response to the original poster seemed to indicate that pl/sql tables aren't always 
the best solution when it comes to lookup tables. Can you give examples of when they are the best 
solution? Thanks in advance.

Kashif 


Followup   October 4, 2002 - 6pm Central time zone:

I would outer join or select a select (eg:
  
   select dept.*, (select count(*) from emp where emp.deptno = dept.deptno )
     from dept

never procedurally loop running a select based on other selects if I can.

A plsql table to do some lookups might make sense when doing a flat file data load and you don't 
have external tables (eg: pre 9i)  in 9i and beyond, maybe NEVER. 

4 stars Whoops...   October 4, 2002 - 4pm Central time zone
Reviewer: Kashif from Houston, TX USA
I meant the outer join should be on the lookup table (t2), i.e. the query would be something like 

select t.*
from t, t2
where t.code = t2.code (+)

Wouldn't it be more efficient to simply read the rows needed from the driving table (t) and then do 
inner selects on the lookup (t2) table as needed? THanks.

Kashif
 


Followup   October 4, 2002 - 7pm Central time zone:

if you can do it in a single query -- do it, it's almost always faster, easier, more maintainable, 
faster to type, etc etc etc 

4 stars Decode on select (select)?   October 11, 2002 - 12pm Central time zone
Reviewer: Kashif from Houston, TX USA
Hi Tom,

Thanks for your tip, it works wonders in my query! Quick question, if I can control when I execute 
the 'select a select' somehow, for example using a decode, will that be more efficient versus 
executing the 'select a select' for each row? Example:

select deptno, decode (deptno, null, null, (select deptname from dept where dept.deptno = 
emp.deptno)
from emp;

versus

select deptno, (select deptname from dept where dept.deptno = emp.deptno) 
from emp

assuming the deptno in emp is not always populated. Thanks for your feedback in advance.

Kashif

P.S. Hope you're having fun at the VOUG conference! 


Followup   October 11, 2002 - 8pm Central time zone:

Well, lets see....


DOC>create table emp ( deptno int );
DOC>
DOC>create table dept as select * from scott.dept;
DOC>alter table dept add constraint dept_pk primary key(deptno);
DOC>
DOC>insert into emp
DOC>select decode( mod(rownum,5), 0, null, mod(rownum,5)*10 )
DOC>from all_objects;
DOC>*/
ops$tkyte@ORA920.US.ORACLE.COM>

ops$tkyte@ORA920.US.ORACLE.COM> set autotrace traceonly
ops$tkyte@ORA920.US.ORACLE.COM> select deptno, (select dname from dept where dept.deptno = 
emp.deptno)
  2    from emp
  3  /

29284 rows selected.


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (FULL) OF 'EMP'




Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       2012  consistent gets
          0  physical reads
          0  redo size
     570620  bytes sent via SQL*Net to client
      21971  bytes received via SQL*Net from client
       1954  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      29284  rows processed

ops$tkyte@ORA920.US.ORACLE.COM> select deptno, decode( deptno, null, null, (select dname from dept 
where dept.deptno = emp.deptno) )
  2    from emp
  3  /

29284 rows selected.


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (FULL) OF 'EMP'




Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       2012  consistent gets
          0  physical reads
          0  redo size
     570645  bytes sent via SQL*Net to client
      21971  bytes received via SQL*Net from client
       1954  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      29284  rows processed

ops$tkyte@ORA920.US.ORACLE.COM> select deptno from emp
  2  /

29284 rows selected.


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (FULL) OF 'EMP'




Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       2006  consistent gets
          0  physical reads
          0  redo size
     347959  bytes sent via SQL*Net to client
      21971  bytes received via SQL*Net from client
       1954  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      29284  rows processed

ops$tkyte@ORA920.US.ORACLE.COM> set autotrace off
ops$tkyte@ORA920.US.ORACLE.COM>



I threw in that last query to show you that there is "magic" happening with select of selects.  See 
the number of consistent gets on that last query?  Very very close to the others


There are optimizations happening here that go beyond what we can see. 

5 stars exec gen_data( 'T', 5000 );   October 12, 2002 - 4am Central time zone
Reviewer: A reader 
Hi Tom,
I've seen this procedure executed in the response above,
can you tell what the procedure is doing, Of course it is populating the table with test data, but 
how?

thanks in advance.

 


Followup   October 12, 2002 - 9am Central time zone:

see
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:1552006405645
code for gen_data is there. 

3 stars Comments on this solution?   October 25, 2002 - 4pm Central time zone
Reviewer: andrew from ca, usa
Tom
The URL below advocates pre-loading a pl/sql table for speed. The solution looks efficient to me 
and the performance claim sounds convincing. What do you think of it?
http://www.quest-pipelines.com/Pipelines/PLSQL/tips01.htm#july


Followup   October 26, 2002 - 12pm Central time zone:

its the same as the other discussions here.  

if you CANNOT DO IT in a single sql statement and must loop procedurally (which many people do when 
they SHOULD NOT), this can help.

If you can do it in a join in the first place, rather then lookup lookup lookup all of the time -- 
do it that way.

 

4 stars Whoops - nevermind...   October 25, 2002 - 4pm Central time zone
Reviewer: Andrew from ca, usa
Sorry Tom - I just saw your comments on the link above. 


4 stars Lookups in SGA?   July 31, 2003 - 8am Central time zone
Reviewer: Clark from Leeds, England, UK.
Tom, my first time, so please be gentle! And please excuse (and explain) any misunderstandings I 
may have of Oracle.

To summarise the whole thread, I think, and a common point you make in all discussions, do it in 
SQL unless it HAS to be PLSQL - so do a(n sql) JOIN to a lookup table and not an x-ref to a (plsql) 
memory structure pre-loaded with a lookup. (Tho, exception from 9iR2 noted, but I'm on 817.)

However, I often have the need to perform a lookup within plsql but without a join to another 
table, e.g to validate input. Right, now I think I've got this. This is an OLTP system, so it 
should be considered as a data loader and therefore keeping these lookups in plsql memory is an ok 
way of doing it. Yes?

So -- is there not some way of sticking them in some part of the SGA and avoiding duplicating UGA 
for each session? Something along the lines of a SERIALLY_REUSABLE package that doesn't lose state 
between calls would do the trick -- it would only ever need a SINGLE copy of the package, globals 
wouldn't need re-initialising on every call, etc.

Or am I worrying too much about memory/processing usage? Maybe there's so little overhead in doing 
a genuine sql lookup, from plsql, and letting Oracle store the lookups in the KEEP pool (thus 
managing SGA usage for me too) that I should just do it that way -- but comments about sql-plsql 
context switching lead me to believe that this SHOULD be avoided.

I could/can do straight performance tests for speed, but I am not experienced enough to understand 
and find out the wider implications regarding SGA usage etc. Speed is not always the primary 
concern in an OLTP system -- an extra 2 millisecs per user query, say, would not be of concern if 
you can shave 20% from the memory you need.

Many thanks for your time.
Cheers,
Clark. 


Followup   July 31, 2003 - 8am Central time zone:

give me a concrete example whereby plsql would have to perform a validation PRIOR to inserting the 
data to work with.  Me, I would let SQL do the validation itself.

If I needed to lookup some data, 


insert into t
select plsql_variable_a, plsql_variable_b, plsql_variable_c, column_x_from_table
  from lookup_table
 where lookup_table.key = plsql_variable_d;


that would do my "lookup" for me straight away.  SQL%rowcount would tell me if the lookup "failed"


If this was a data validation -- well, then I would use declaritive RI and let the database tell me 
when the data was garbage. 

1 stars lookup x-ref   July 31, 2003 - 9am Central time zone
Reviewer: Clark from Leeds,UK
Sorry, put it wrongly, I do use RI constraints, it wasn't for validation it was for x-ref.

The example I have just implemented uses a table of response codes (ids) with associated meanings 
and error-levels, eg.

create table response_codes as
 ( id       NUMBER
 , descr    VARCHAR2(100)
 , errlevel NUMBER
);

Crucially, MULTIPLE "detail" response codes are possible, which get stored in a child table to an 
"overall" response -- the overall being dictated by the worst-case detail, the highest 
response_codes.errlevel.

So, during validation, I log a detail response when a problem is found and keep a global 
g_worst_errlevel var which is a reflection of the worst response_codes.errlevel that has been 
found.

I could, of course, do an sql-query each time I have a detail response, but I have implemented the 
worst-lookup by pre-storing the lookup errlevels in a plsql table indexed by the id-col from 
response_codes. Here's the key proc:

  -- - - - - - - - - - - - - - - - - - - - - - - -
  PROCEDURE pr_detail_response
  -- - - - - - - - - - - - - - - - - - - - - - - -
    -- Acts on a 'detail' response
    ( pi_detail_response IN  s_o_TYPE )
  IS
  BEGIN
    g_detresps_TAB( pi_detail_response ) := NULL ;
    g_worst_err :=
      GREATEST
       ( g_worst_err
       , g_det_errlvls_TAB( pi_detail_response ) );
  END ;

g_detresps_TAB stores all detail responses that I've had**.

g_det_errlvls_TAB is the pre-stored lookup to the response_codes.errlevel col.

**In case you're wondering, it's not sensible to INSERT these at this point because the overall 
response-ID is the parent -- but I don't know my overall response until I've found out all my 
detail responses.

Thanks,
Clark 


5 stars   July 31, 2003 - 4pm Central time zone
Reviewer: Suri from PA


5 stars Loading Data into PL/SQL table   July 31, 2003 - 4pm Central time zone
Reviewer: Ketan. Bengali from Sunnyvale, CA, USA
Hi Tom,

My data is like this:
Level 1.1:
  Level 2.1 (Parent = Level1.1, Previous Sibling=NULL)
     Level 3.1.1(Parent = Level2.1, Prev Sibling=NULL)
     Level 3.1.2(Parent = Level2.1, Prev Sibling=Level 3.1.1)
  Level 2.2(Parent = Level1.1, Previous Sibling=Level2.1)
     Level 3.2.1(Parent = Level2.2, Prev Sibling=NULL)
     Level 3.2.2(Parent = Level2.2, Prev Sibling=Level 3.2.1)

Basically, it is hierarchial and ordered data. To display this data to the users, have created all 
the records in pl/sql table and provided view on top of it.

Package Test_Pack is 
    -- rec tree_hierarchy_array;
    function get_tree return tree_hierarchy_array;
    --
end test_pack;
--
View is :
    SELECT *
    FROM   TABLE(CAST(test_pack.get_tree() AS tree_hierarchy_array))

Today, I recreate pl/sql table whenever user selects from the view because variable, which stores 
pl/sql table, is local variable declared inside the get_tree function. I could store this variable 
in a session by declaring it in a package body. 

If I store it to session level then the PL/SQL table has to be recreated only when the data is 
changed in the database. This will improve the performance. How do I find out whether database is 
changed since last run?

Creating a package and displaying it using a view, is it a right approach?

Thanks,

Ketan. 


Followup   July 31, 2003 - 7pm Central time zone:

to discover that the data has changed is "hard" to do in a scalable fashion, unless you add 
auditing to the base table.

If you added a trigger that 

create table last_update( x int primary key ) organization index;
create sequence s;

create trigger t_trigger after insert or update or delete of t
begin
   insert into last_update values ( s.nextval );
end;


your routine can now keep a package variable that was the max(x) from t and every time it is 
called, it simply selects max(x) from t again to see if it changed -- if so, it must requery.


if modifications are really really "light" on this table, you could update a single row instead of 
inserting a new row each time.  that will cause all modifications of the table to serialize by 
session though.


is creating a package and displaying as view the right approach?  depends -- if it CANNOT be done 
in pure sql, maybe.  if it could be done in pure sql, probably not. 

3 stars The Questions.   August 1, 2003 - 8am Central time zone
Reviewer: Clark from Leeds, UK
Hi Tom,

wrt my comments yesterday - (a) sorry for the low review grade, that was mainly because I asked the 
wrong q.. and (b) the questions I still have are:

1.
Is there anything inherently wrong/ bad practice with using a plsql index-by table to store values 
from the db this way - would you personally do it with a query each and every time you needed to 
x-ref the errlevel? If so, what are your reasons?

2.
Is there any way of storing common data (available to plsql) in the SGA to avoid duplicating 
'application-static' data across all sessions? Are you aware of any plans to introduce such a 
concept if not?

Cheers,
Clark.

P.S. Keep up the good work! 


Followup   August 1, 2003 - 9am Central time zone:

1) if I had procedural code that I could not turn into a query (a single query, not a single query 
executed 500 times, a single query executed once) I would be looking at using lookup tables.  Yes, 
I do that.

2) there have been talks about it -- it is a hard problem to solve actually.  introduces a new 
address space, concurrency issues (session 1 updates a variable at the same time session 2 does) 
and so on.

right now, you would have to do it "per session" 

5 stars Perfect   August 1, 2003 - 10am Central time zone
Reviewer: Clark from Leeds
Thanks Tom, at least I know I'm not missing something.

Point 2 tho - my concept was simply as somewhere to store absolutely-static data, which you know 
absolutely will not change; or alternatively, if it does change, every session needs the current 
values. (Why? To provide for different installations of the same application with, say, different 
application parameters.)

Thanks again.
Clark. 


Followup   August 1, 2003 - 11am Central time zone:

and we call that a database table ;)

it'll be in the cache if

a) it is small
b) it is used. 

5 stars insert select for load (best practices)   September 12, 2003 - 4pm Central time zone
Reviewer: Joe from PA
Hi tom,
best practices question related to this thread

during a load into a normalized schema, i have to determine if the dimension row of the values I am 
loading already exists in the dim table.  if it does, than use the key assigned to that set of 
values in my fact table.

did I understand all the posts correctly that I should

insert into t
select pk_from_dimension_table,plsql_var_a,plsql_var_b
FROM dimension_table
where dim_col_a = 'plsql_variable_d;
and dim_col_b='plsql_variable_e'

and if I get no rows inserted, assume that combo didn't exist and go ahead and insert it into the 
dim table?

This single line approach potentially followed by an insert would be better than a lookup query, 
followed by one or two inserts. (fact row and dim row)

In this case, after loading starts, I would assume that almost all combinations will be exist after 
a fairly short time.

  


Followup   September 12, 2003 - 7pm Central time zone:

how would you populate the dim table?  if the dim table is just "number" to "value" and nothing 
else -- it is not very useful?

give me a real case to work with. 

4 stars good, but one 'problem'   September 22, 2003 - 11am Central time zone
Reviewer: joachim mayer from austria/europe
Hi tom,

i spend some time with this article before i started to use lookup-tables myself – which turned out 
to be very helpful and improved the performance of my applications a lot. But there is still one 
“problem” – or better unpleasant thing – in it. 

I will try to describe it as good as I can. Sorry, I am not an native English speaker!

Most of my oracle-applications are called by other programs which stay connected to the database 
and therefore keep their sessions. When the first time call one of my package it reads (inialize) 
the lookup-table(s) for the package and it keeps going as long as it takes. Sometimes for hours or 
longer… very rare, but it happens I would like to chance one value in the lookup-table and want 
this change to sow effect at once. But I can only change the value in the table where the process 
did get its original data but of course it wont get reread. 
So, my question is – is there a way to inform this package to this session it should now reread its 
table and renew the lookup-table. (function-call, signal?)? At the moment, we must restart the 
process which is connected to the DB and uses my packages – which is sometimes cumbersome.

Thx

hannibal
 


Followup   September 22, 2003 - 11am Central time zone:

you would have to poll the database using something like dbms_alert, or AQ or just a simple single 
row table that contains the date/time of last update to see if you needed to "refresh" 

5 stars Associative arrays   December 1, 2003 - 12pm Central time zone
Reviewer: Zoran Martic from Dublin, Ireland
Tom,

This is very nice discussion.

I want to add one more case in this story to have complete picture.
I want to cover this situation:
You have the application (forms, c, c++, ...) and you have all lookup tables in Oracle database.
It is commonly known that doing lookups one by one from the client side is going to be slow (or to 
say not fast enough) in some cases because of SQL*Network roundtrip (TCP, IPC or even BEQ). On the 
newest hardware you can do to say up to few thousands lookups from C or C++. If you do this with 
somekind of in-process-memory structure (as associative arrays, hash table, ...) you can get 10 
times more lookups. In this case the biggest overhead in doing Oracle lookups are roundtrips 
(including context switches) then executing lookup SELECT.
Of course this is the case only if you cannot load the data into Oracle with bulk inserts and after 
that do the magic in PL/SQL. You need to do validation and other things on the client side and in 
this case perform lookups one by one.
In this scenario you need somekind of client cache.

It looks to me that Oracle filled the gap with Oracle Forms if Oracle Forms can use associative 
arrays on the client side PL/SQL engine. 
Associative arrays are much faster because it is in-process-memory, no need for locking, .....

For example if you use associative array on the Forms client side (if possible) you can get much 
better performances then doing lookups on a table.

This is from my perspective the biggest gap that I cannot solve with Oracle yet.
What with C, C++, Java, .... Oracle 9iAS.
Are there any way that Oracle is going to address this issue on the client side (as in PL/SQL with 
associative arrays)?
What you can tell about OCI object cache ? Is it to do anything with this story?

Thanks for any response.

Regards,
Zoran
 


Followup   December 2, 2003 - 7am Central time zone:

question -- why are you doing lookups on the client.   

5 stars Why to do lookups on the client?   December 2, 2003 - 10am Central time zone
Reviewer: Zoran Martic from Dublin, Ireland
In some special cases we need (or want) to do lookups on the client because of the needed speed.
The main idea is that it is faster because you do not have networking overhead (as SQL in this 
case), everything is in your process memory (as for associative arrays).

If you need to respond to some "real-time" (if something like that exists) event faster then you 
want to spend on the everhead on communicating between the client and the database (OCI) you need 
to consider somekind of client cache.
All other cases I covered by doing bulk insert and later on join with other tables and process 
that.

You saw how much faster is associative array then SQL lookup. I tested simple 2 column IOT lookup 
with 1M rows versus associative array lookup based on the same table. The difference was 25 times.
Of course this is limited to just one value lookup and it is unique lookup.
Also I tested some so called in-process or in-memory databases (Berkley Sleepycat) and they are 
about 5 times faster for simple 1-column lookup then PL/SQL, but much more faster if it is OCI call 
from the client even on the same server (SQL*Network time).

This is the only case when I cannot find the solution for the problem in Oracle.

From the results I have associative arrays are faster then B*Tree implementations in those fastest 
in-memory databases (they are simple, small, not reliable and robust like Oracle), even 5 times, 
but I cannot use PL/SQL in C or C++.
I want to use Oracle 100% because I am Oracle DBA/Dev, but I am facing problems to be fast enough 
in these special cases.

OCI object cache is looking like to simekind of caching mechanism, but hard to apply to normal 
tables and it is probably not like associative arrays (B*Tree) in PL/SQL.
Also 9iAS needs to be fast with dealing with data. They do not want for any validation to go to the 
server.
I found network roundtrips plus type conversion between C or Java and Oracle types very expencive.

I am asking you because I know that you are curious about this as I am.
I was thinking a few months ago that there is nothing faster then standard SQL lookup from IOT (or 
hash table or normal table). But as you saw with associative arrays this is not true (also by 
similar implementations in some small in-process databases like Sleepycat and TimesTen).
One-task process is not supported by Oracle anymore, that means no way to be fast as PL/SQL if you 
are using OCI.

What are you thinking about all of this?
Any new things in 10g with associative arrays or something similar on the OCI client side?

Regards,
Zoran
 


Followup   December 2, 2003 - 11am Central time zone:

I would encourage you to post your simple test -- I'll betcha we can change the algorithm to go the 
OTHER WAY (iot creams associative array ;)

it is all about thinking in sets, if you go slow by slow (opps -- meant row by row), the assoc 
array might win but if you do set based processing (single sql), the DB will cream you.

I blame the "procedural mindset", not the database here (i prototype this in my latest book 
effective oracle by design if you are interested).  


I don't do lookups on the client side.  to me -- the database IS the client side (mod_plsql).

For OCI, you'd be looking at a C subroutine (binary tree, avl tree, something like that).

Java's got hash tables...

for the 10g database -- its all about "the server", wouldn't really do "the client"
 

5 stars Client lookups   December 2, 2003 - 11am Central time zone
Reviewer: Zoran Martic from Dublin, Ireland
Tom,

Thanks for your answer.
I have read all your books.

I agree that one SQL join per 50,000 rows in set is better then doing 50,000 lookups.
I am using that idea for a long time (as you described it in your books):
1. simple SQL
2. PL/SQL
3. everything else

I tested stupid associative lookups here because I wanted to know the difference between these 
fastest algorithms (or databases, TimesTen, Sleepycat, ...) and Oracle pure B*Tree implementation.
I was happy to see that even doing lookups on associative arrays was 5 times faster in my case then 
using these "fast" databases (algorithms).

You know that the most app clients are written in something diffeernt then PL/SQL and because of 
that, in my case C, you need to perform somekind of lookups (database, C hash table or whatever).

For you the client is PL/SQL (and me if not looking the company I am working for :), for many of us 
is something else.

Just that example about these lookups and associative arrays (to confirm that maybe my test is not 
wrong):

SQL> desc x2
 Name                                      Null?    Type
 ----------------------------------------- -------- -------- X                                      
   NOT NULL VARCHAR2(100)
 Y                                                  VARCHAR2(100)

Populated with 1M rows with this script:

declare
   x varchar2(100);
   y varchar2(100);
   i number;
begin
   for j in 1..1000000 loop
    insert into x2 values(to_char(j)||'abcdffgghhghg','abcdffgghhghg');
    if mod(j,10000) = 0 then
        commit;
    end if;
   end loop;
   commit;
end;
/

*************** Test ************************************
declare
      type  lookupTable is table of varchar2(100) index by varchar2(100);
      type  indexByTable is table of varchar2(100) index by binary_integer;

      l_lookupTable lookupTable;
      l_codes       indexByTable;
      l_values      indexByTable;
      l_value       varchar2(100);

      l_start number;
      l_run1  number;
      l_run2  number;
  begin

      dbms_output.put_line(dbms_utility.get_time);

      select x, y BULK COLLECT into l_codes, l_values from x2;

      dbms_output.put_line(dbms_utility.get_time);

      for i in 1 .. l_codes.count
      loop
          l_lookupTable( l_codes(i) ) := l_values(i);
      end loop;

      dbms_output.put_line(l_codes.count);
      dbms_output.put_line(dbms_utility.get_time);

      for i in 1 .. 1000000
      loop
          l_value := l_lookupTable( '501abcdffgghhghg' );
      end loop;

      dbms_output.put_line(dbms_utility.get_time);

      for i in 1 .. 1000000
      loop
           select y into l_value from x2
                   where x = '501abcdffgghhghg';
      end loop;

      dbms_output.put_line(dbms_utility.get_time);

  end;
/

Result:
411879663
411886578
1000000
411898013
411898682
411912955

You can see the difference between last 3 numbers.
I fixed the string I am searching for because I did not want to have possibility using the disk 
(the table is about 100M I think and also PGA was 180M for this test).
I did it for 100000 also. It was also much faster.
I did not try more columns or anything else.
I know that could be a way different.

Let me know if my test is wrong.
If not please give us some example where the table lookup is faster then associative array (no big 
JOIN technique please).

I agree again in PL/SQL is probably stupid to do lookups when you have joins, but sometimes you 
need to process event by event and not the bunch of events together (Imagine processing connect 
request or something similar in real world, you cannot batch that). 

Thanks,
Zoran 


Followup   December 2, 2003 - 12pm Central time zone:

i would have just joined, once again, that original bulk collect -- 0% chance I would have looked 
up on the client, 1000% chance I would have looked up on server (even if I had to create a global 
temporary table and shoot the data over first!)

In PLSQL, just use sql.

In C, -- well -- call PLSQL :)

(or look to that LANGUAGE to provide you a lookup tool -- C has that nifty bsearch routine) 

5 stars Join   December 3, 2003 - 4am Central time zone
Reviewer: Zoran Martic from Dublin, Ireland
Tom,

If you read carefully, there are not N records. There is not set of rows.
You are dealing with one event you need to respond as fast as possible.
Imagine the situation where you need to respond to some network event immediately, or even to some 
database event, you cannot use joins, you need to respond immediately. You cannot batch it.
That is the point, no way to have a batch.
You have "real-time" event you need to respond to.
In this case this server call (from C) from my perspective is very expensive.
In the same way you can look from PL/SQL you do not want sometimes to go and look the table, you 
will use local variables or associative arrays. That is the point.

My example was to show that looking into associative array is much faster then looking into IOT for 
direct hit lookups.

You owe me an example where IOT is faster then associative array as you said previously (just for 
PK matches).

I am pretty confident that even if you have the multicolumn primary key (doing concatenation in 
associative array to make the index), the lookup will be faster.

That means if you need to do something faster then your tables can do, you will opt for associative 
arrays.

At the end why should Oracle put associative arrays in the product if you can do everything with 
join :)
Are you saying to us that associative arrays are not needed?
I am sure that you want to use it in some special cases where you need to do things record by 
record (or event by event).
Oracle put it to fullfil PL/SQL speed.
I asked you in the first place is Oracle going to make something on the client side similar to the 
OCI object cache but more advanced.
I want to rely on Oracle, do not want to use other products.
I saw (tested) that Oracle B*tree search in associative arrays are faster then coresponding 
"fastest" in-process in-memory databases.

Thanks for all your time.

Zoran





 


Followup   December 3, 2003 - 7am Central time zone:

then, as I keep saying, you need to look to your LANGUAGE to help you out.  be it c, java, 
whatever.  the LANGUAGE will supply this feature

then you have to figure out how to make it so that when the database changes the client knows to 
refresh its cache.

you would not be using a plsql stored procedure in your case -- there would be no need to have it 
do a lookup as plsql stored procedures are running in the database and have no UI.  

you would not be using forms for a real time interface either (i wouldn't imagine you would)


we put lots of stuff in the database cause people ask for it.  its not always the best.  i've 
demonstrated the best, most performant way.

I've yet to use an associative array in code, for real.

 

5 stars Associative arrays   December 3, 2003 - 10am Central time zone
Reviewer: Zoran Martic from Dublin, Ireland
Tom,

Thanks a lot for your time.
That is what I am doing at the moment.

I am impressed with the speed of associative arrays.
I mean 1M lookups in 6 seconds on 1M table (simple two columns but..) is very good. The server is 
just 450MHz Sparc.
That is proving that Oracle has crazy good B*Tree algorithm, just all code on top of it (optimizer, 
SQL, locking, ....) are having initial overhead per SQL.
At the end this is why joins are fast, the overhead is too small in comparison to memory operations 
performed to read database blocks from the buffer cache (about 5k - 10k LIO's per 100MHz CPU).

Regards,
Zoran 


5 stars This was incredible!   December 10, 2003 - 12pm Central time zone
Reviewer: Roland G. Bender from Ashburn, VA USA


5 stars validation and auditing   June 30, 2004 - 5am Central time zone
Reviewer: dxl from uk
Tom

We have a requirement to bulk load some data from a legacy system into oracle.  The data needs to 
be validated. The validation is not particularly complicated but we do have to audit any rows 
failing validation.

So first the data is passed from a java web app via jdbc using oracle batch updating into a staging 
table (possible gtt) that just contains a load of varchar2 fields in order to get the data into 
oracle. 
Then we need to get move the data from the staging table into the real table with the correct 
datatypes, validating the data as we go and logging any rows which don't pass but also sometimes 
preventing these rows from getting into the real table and in other cases where the validation 
fails as a warning we would let the row get into the table but just log it as a warning.

Hence we have something like:  (this is a very simplified view of our tables)

create table stage (
    forename    varchar2(100),
    surname        varchar2(100),
    dateofbirth    varchar2(100),
    postcode    varchar2(100),
    codeA        varchar2(100));



create table myTable (
    forename    varchar2(30),
    surname        varchar2(30),
    dateofbirth    date,
    postcode    varchar2(7),
    codeA        number(2));



My question is what will be the most efficient method of loading into myTable?  I would like to use 


INSERT INTO myTable select ..validate.. from stage;

because using this in the past has proved to be amazingly quick.

However a lot of our fields require validation using lookup tables from the database.  I belive the 
validation is bulky and awkward enough not to do it using CASE and decodes etc, plus it also uses 
some lookup tables, so I was thinking that i could write lots of functions to do the validation 
hence we would have:

begin

    INSERT INTO myTable 
    select     function1(forename),
        function2(surname),
        function3(dateofbirth),
        function4(codeA),
        ..
        .. etc
        ..
    from stage;

end;

1) would this involve a lot of context switching? How efficient could this be? 

2) Doing it this way i can put the auditing of each field into the functions BUT how do i prevent 
the rows failing validation from getting into the table without the whole insert statement failing? 
 Should i let them get in and then run another procedure afterwards to delete them??


3) Maybe my whole approach is wrong, should i just go for a cursor loop so that i can validate as i 
go along and easily audit any errors and warnings. This would give me complete control over which 
rows get into myTable, but i don't like the thought of row by row processing since we may be 
dealing with a lot of rows.

4) Would loading up plsql lookup tables be quicker instead of using the real db tables and then 
doing something like


    INSERT INTO myTable 
    select     (select a from plsql_tble1 where b = forename),
        (select a from plsql_tble2 where b = surname),
        (select a from plsql_tble3 where b = dateofbirth),
        (select a from plsql_tble4 where b = codeA),
        ..
        .. etc
        ..
    from stage;



I would still have the same problem of logging the errors and warnings and contolling which rows 
get into myTable.


Any advice here would be greatly appreciated.

Thanks 


Followup   June 30, 2004 - 10am Central time zone:

is this an ongoing thing or one time?
how big is the input data set (rows and megabytes)?
why are you loading from a middle tier like that?  seems like external tables would be more than 
sufficient.
 

4 stars thanks for reply   June 30, 2004 - 12pm Central time zone
Reviewer: dxl from uk
Forgot to say we are on 8.1.7.4 so no external tables.
This is an ongoing process which will be created in a stored procedure.  
The data is coming from a java app on a middle tier app server, although we may at some point try 
and connect direct to the legacy system using HS, but not yet, it is coming from the java app for 
now.

As for size it will be done on a batch basis and i'm  expecting it to be on average in the order of 
between 10,000 and 50,000 rows, i know this is vague but thats all i know right now, i've been told 
it "shouldn't" ever be more than 100,000 rows but that isn't definite either.  There are probably 
about 40 - 50 columns to validate and the requirement is to make it go as quick as we can!

 


Followup   June 30, 2004 - 1pm Central time zone:

sqlldr then.. but anyway.


bulk process the data 100 rows at a time.  bulk collect limit 100 into plsql arrays, process the 
data, forall i insert the "good" data and log the bad data (eg: process the one array you bulk 
collected into into two sets of arrays -- one good and one bad)


 

5 stars please give me some more details   July 1, 2004 - 5am Central time zone
Reviewer: dxl from uk
I'm not sure that i totally understand the whole process you've suggested. Please can you give me 
more details with an example.

At the moment i'm thinking that you mean something like:


1) bulk collect table (of varchar columns) into a scalar object of varchars with a 100 limit
2) then loop through each element of the array and do the validation 
3) if any elements pass validation then add to the "good" array else add to the "bad" array
4) load each good and bad array into  myTable or the error table using forall


For example: 

(please excuse the pseudo code, i'm just interested in the process rather than correcting syntax at 
the moment, although any comments on whether to use scalartypes, arraytypes or rowtypes etc would 
also be helpful)


create or replace type myScalarType as object
    (forename        varchar2(100),
     surname        varchar2(100),
    dateofbirth        varchar2(100),
    postcode        varchar2(100),
    codeA            varchar2(100))
/


create or replace type myArrayType as table of myScalarType
/





declare
        cursor c is select myScalarType( forename, surname, dateofbirth, postcode, codeA ) from 
stage;

        l_data myArrayType;
        l_good_data myTable%rowtype;
        l_bad_data myTable%rowtype;

    type good_Array is table of l_good_data index by binary_integer;
    type bad_Array is table of l_bad_data index by binary_integer;
    begin
    
        open c;
        loop
            l_data := myArrayType();
            fetch c bulk collect into l_data limit 100;
   
   
           for i in 1 .. l_data.count loop

        -- then should i do the validation here??

        -- I need to do things like, for example :


        select code into v1 from lookup table where code = l_data(i).codeA;
        if sql%rowcount = 1 then 
            -- pass_validation
        else
            -- fail_validation
        end if;

        ..
        ..
        .. etc (lots more validation like this)
        ..
        ..
        

        -- After the validation decide if it is a good row 
         
        if pass validation then
            -- 
            l_good_data := l_data(i);
        else
            l_bad_data := l_data(i);
        end if;
    
        ..
        .. etc
        ..

        -- now build good_array and bad_array??

        if good row then
        good_array(n):= l_good_data ;
    
        if bad row then
        bad_array(m):= l_bad_data ;
        
        n:= n + 1;
        m:= m + 1;


        end loop;


           if ( good_array.count is not null )
            then
               forall i in 1 .. good_Array.count
                     insert into myTable values
                            ( good_array(i) );
           end if;


           if ( bad_array.count is not null )
            then
               forall i in 1 .. bad_Array.count
                     insert into myErrorTable values
                            ( bad_array(i) );
           end if;


              bad_Array := empty;
              good_Array := empty; 

   
           exit when c%notfound;
       end loop;
      close c;
end;



Is this the kinda of processing that you meant?? or have i got the wrong end of the stick!? 


Followup   July 1, 2004 - 10am Central time zone:

in this example -- mod() is my function to determine goodness or badness.  

this is 9ir2.

ops$tkyte@ORA9IR2> create table t as select * from big_table.big_table where rownum < 200;
 
Table created.
 
ops$tkyte@ORA9IR2> create table good as select * from t where 1=0;
 
Table created.
 
ops$tkyte@ORA9IR2> create table bad as select * from t where 1=0;
 
Table created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> declare
  2      type array is table of t%rowtype index by binary_integer;
  3      l_source   array;
  4      l_good     array;
  5      l_bad      array;
  6
  7      cursor c is select * from t;
  8  begin
  9      open c;
 10      loop
 11          fetch c bulk collect into l_source limit 100;
 12          l_good.delete;
 13          l_bad.delete;
 14          for i in 1 .. l_source.count
 15          loop
 16              if (mod( l_source(i).id,2 ) = 0 )
 17              then
 18                  l_good(l_good.count+1) := l_source(i);
 19              else
 20                  l_bad(l_bad.count+1) := l_source(i);
 21              end if;
 22          end loop;
 23          forall i in 1 .. l_good.count
 24              insert into good values l_good(i);
 25          forall i in 1 .. l_bad.count
 26              insert into bad values l_bad(i);
 27          exit when c%notfound;
 28      end loop;
 29  end;
 30  /
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select count(*) from t;
 
  COUNT(*)
----------
       199
 
ops$tkyte@ORA9IR2> select count(*) from good;
 
  COUNT(*)
----------
       100
 
ops$tkyte@ORA9IR2> select count(*) from bad;
 
  COUNT(*)
----------
        99
 

5 stars excellent   July 2, 2004 - 4am Central time zone
Reviewer: dxl from uk
Thanks very much for that.

I am on 8.1.7.4 so will your method work or do i just need to replace the record types with object 
types?? 


Followup   July 2, 2004 - 9am Central time zone:

with an array per column I would suggest.  You didn't have record operations in forall in 8i -- you 
need an array per column. 

4 stars Bulk collecting into an associative array?   July 23, 2004 - 7pm Central time zone
Reviewer: A reader 
How can I bulk collect into an associative array?

I have a lookup table containing code (vc 10) and  value (vc2 100).

Can I 

type assoc_t is table of varchar2(100) index by varchar2(10);
mytab assoc_t;

select code,value BULK COLLECT into mytab from lookup_table;

How can I do this without resorting to "slow-by-slow" processing?

Thanks

[Sorry, I dont have my database in front of me right now to try this] 


Followup   July 23, 2004 - 8pm Central time zone:

you cannot right now, you bulk collect the "subscripts" into array1, the "subscriptee" into array2 
and then you have to build array3 by assigning array2 to array3:

  for i in 1 .. array1.count
  loop
     associative_array3(array1(i)) := array2(i);
 

5 stars Magic?   August 2, 2004 - 11pm Central time zone
Reviewer: A reader 
Refer to one of your earlier followups...

"I threw in that last query to show you that there is "magic" happening with select of selects.  
See the number of consistent gets on that last query?  Very very close to the others. There are 
optimizations happening here that go beyond what we can see"

Would you please explain what is going on here?

How is it possible for a FTS of EMP to have the same consistent gets as 

select deptno, (select dname from dept where 
dept.deptno = emp.deptno)
from emp

Basically, the DEPT access is free!! How is this possible?

Thanks 


Followup   August 3, 2004 - 8am Central time zone:

it is not free -- it is however cached.

the scalar subquery -- the (select dname from dept where dept.deptno=emp.deptno) is "stripped out" 
and turned into:

select dname from dept where dept.deptno = :bv


the server then starts fetching data -- gets the first row from EMP, sees "DEPTNO 10" and runs that 
scalar subquery.  It then "saves this result in a small cache" -- so in a small cache we have 
"bv=10, dname=<xxxxx>".

The next time, it fetched 20, then 30, then 40, then bv gets NULL.  Then back to 10, 20, 30, 40, 
null and so on (because of the way I loaded the data in this example).

This small cache is able to hold the results of the 10, 20, 30, 40, null results so the scalar 
subquery is fired off just 5 times -- resulting in few logical IO's.

with scalar subqueries -- the "order of the data" can be very relevant.  consider:



ops$tkyte@ORA9IR2> set echo on
ops$tkyte@ORA9IR2> set linesize 121
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> define nvalues=100
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create or replace function f(x in number) return number
  2  as
  3  begin
  4      dbms_application_info.set_client_info( userenv('client_info')+1 );
  5      return 42;
  6  end;
  7  /
 
Function created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> drop table t;
 
Table dropped.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create table t
  2  as
  3  select mod(rownum,&nvalues) r from all_objects;
old   3: select mod(rownum,&nvalues) r from all_objects
new   3: select mod(rownum,100) r from all_objects
 
Table created.
 
in this table, the data will be fetched 1,2,3,4,...99,0,1,2,3...,99,0,1...

that is, the r values do not repeat from row to row but rather every 100 rows.  Now:


ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_application_info.set_client_info(0);
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> set autotrace traceonly
ops$tkyte@ORA9IR2> select r, (select f(r) from dual) from t;
 
32160 rows selected.
 
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (FULL) OF 'DUAL'
   2    0   TABLE ACCESS (FULL) OF 'T'
 
 
 
 
Statistics
----------------------------------------------------------
         22  recursive calls
          0  db block gets
      25556  consistent gets
         49  physical reads
          0  redo size
     516751  bytes sent via SQL*Net to client
      24072  bytes received via SQL*Net from client
       2145  SQL*Net roundtrips to/from client
          5  sorts (memory)
          0  sorts (disk)
      32160  rows processed
 
ops$tkyte@ORA9IR2> set autotrace off
ops$tkyte@ORA9IR2> select userenv('client_info') from dual;
 
USERENV('CLIENT_INFO')
----------------------------------------------------------------
7783

we can see our function was called 7,783 times there.  Taking the same exact set of data but 
"having it reside on disk in sorted order":

 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> truncate table t;
 
Table truncated.
 
ops$tkyte@ORA9IR2> insert into t select mod(rownum,&nvalues) r from all_objects order by 
mod(rownum,&nvalues);
old   1: insert into t select mod(rownum,&nvalues) r from all_objects order by mod(rownum,&nvalues)
new   1: insert into t select mod(rownum,100) r from all_objects order by mod(rownum,100)
 
32160 rows created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_application_info.set_client_info(0);
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> set autotrace traceonly
ops$tkyte@ORA9IR2> select r, (select f(r) from dual) from t;
 
32160 rows selected.
 
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (FULL) OF 'DUAL'
   2    0   TABLE ACCESS (FULL) OF 'T'
 
 
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       2506  consistent gets
          0  physical reads
          0  redo size
     420930  bytes sent via SQL*Net to client
      24072  bytes received via SQL*Net from client
       2145  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      32160  rows processed
 
ops$tkyte@ORA9IR2> set autotrace off
ops$tkyte@ORA9IR2> select userenv('client_info') from dual;
 
USERENV('CLIENT_INFO')
----------------------------------------------------------------
103


only 103 calls (the cache worked really well this time as we basically only needed to call the 
function pretty much once per bind variable value).

question would then be "does data have to be sorted on disk for this to work?" and the answer is 
no, an inline view that sorts the data by the values you will be binding into the scalar subquery 
can do this:


ops$tkyte@ORA9IR2> truncate table t;
 
Table truncated.
 
ops$tkyte@ORA9IR2> insert into t
  2  select mod(rownum,&nvalues) r from all_objects;
old   2: select mod(rownum,&nvalues) r from all_objects
new   2: select mod(rownum,100) r from all_objects
 
32160 rows created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_application_info.set_client_info(0);
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> set autotrace traceonly
ops$tkyte@ORA9IR2> select r, (select f(r) from dual)
  2    from (select r from t order by r);
 
32160 rows selected.
 
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (FULL) OF 'DUAL'
   2    0   VIEW
   3    2     SORT (ORDER BY)
   4    3       TABLE ACCESS (FULL) OF 'T'
 
 
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        353  consistent gets
          0  physical reads
          0  redo size
     420891  bytes sent via SQL*Net to client
      24072  bytes received via SQL*Net from client
       2145  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
      32160  rows processed
 
ops$tkyte@ORA9IR2> set autotrace off
ops$tkyte@ORA9IR2> select userenv('client_info') from dual;
 
USERENV('CLIENT_INFO')
----------------------------------------------------------------
100



and here is the challenge du-jour..... why did the consistent gets go way way down on this last 
query as opposed to the one right before :)


 

5 stars Challenge du-jour!   August 3, 2004 - 9am Central time zone
Reviewer: A reader 
Because the 'select r from t order by r' filled up the cache? Argh, I give up!

This was simply brilliant. You are single-handedly changing the world in how Oracle technologies 
are effectively used. 

Keep up the good work (and do post the solution to your challenge!) 


5 stars Challenge Du-jour   August 3, 2004 - 9am Central time zone
Reviewer: Bill from Vermont
Is it possible the consistent gets go way down because of the order by clause in the subquery?  
Still trying to get the hang of how this works, but very educational! 


Followup   August 3, 2004 - 10am Central time zone:

Here is a "hint" of sorts:



ops$tkyte@ORA9IR2> create table t as select rownum r from all_objects;
Table created.
 
ops$tkyte@ORA9IR2> set autotrace traceonly statistics;
ops$tkyte@ORA9IR2> set arraysize 15   
 /* the default... */

ops$tkyte@ORA9IR2> select * from t;
32174 rows selected.
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       2197  consistent gets
         49  physical reads
          0  redo size
     448960  bytes sent via SQL*Net to client
      24083  bytes received via SQL*Net from client
       2146  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      32174  rows processed
 
ops$tkyte@ORA9IR2> select * from t order by r;
 
32174 rows selected.
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         52  consistent gets
          0  physical reads
          0  redo size
     448960  bytes sent via SQL*Net to client
      24083  bytes received via SQL*Net from client
       2146  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
      32174  rows processed
 
ops$tkyte@ORA9IR2> set arraysize 5000
ops$tkyte@ORA9IR2> select * from t;
 
32174 rows selected.
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         59  consistent gets
          0  physical reads
          0  redo size
     203788  bytes sent via SQL*Net to client
        565  bytes received via SQL*Net from client
          8  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      32174  rows processed
 
ops$tkyte@ORA9IR2> set autotrace off
 

5 stars order by   August 3, 2004 - 10am Central time zone
Reviewer: A reader 
So, the 'order by' automagically turns on array processing on the server side? Even so, wouldnt the 
client (sqlplus) arraysize (15) still limit the throughput?

Still not getting it! 


Followup   August 3, 2004 - 10am Central time zone:

Nope... not quite.

think about the definition of a consistent get -- were we get that data from vs what might happen 
with "temp data" 

5 stars Challenge Du-juor - take two   August 3, 2004 - 10am Central time zone
Reviewer: Bill from Vermont
Hmmm....further up in this thread, you stated the following:
"I threw in that last query to show you that there is "magic" happening with 
select of selects.  See the number of consistent gets on that last query?  Very 
very close to the others".

So, maybe it has to do with the select of the select?
 


5 stars Challenge Du-jour Take Two yet again   August 3, 2004 - 10am Central time zone
Reviewer: Bill from Vermont
Oops - just realized that that would not explain the last example you gave.  Back to the drawing 
board... 


5 stars Challenge Du-jour : I think I am on the right track now!   August 3, 2004 - 11am Central time zone
Reviewer: Bill from Vermont
It has to do with the array fetches.  They reduce the consistent gets. 


5 stars Challenge du-jour!   August 3, 2004 - 1pm Central time zone
Reviewer: A reader 
"think about the definition of a consistent get -- were we get that data from vs what might happen 
with "temp data""

I give up. consistent get is a point-in-time current-mode read. It comes from the buffer cache, it 
may have to incur a PIO in order to get the block into the cache in the first place.

temp data comes from tempfiles (if you are using them).

i still dont see how adding a order by to the inline view had the same effect as increasing the 
client's arraysize! or was that just a red herring?! 


Followup   August 3, 2004 - 5pm Central time zone:

when we fetched 15 rows by 15 rows without sorting.... we would

a) get a block
b) get 15 rows from block
c) give up block
d) goto a)

problem was -- there were about 500 or 600 rows per block here!  We would get block "1" 40 times, 
block "2" 40 times and so on.

setting the arraysize to 5000 meant we would get block 1 once, block two 2 and so on (in order to 
get the first 5000 rows).  We might have gotten block 10 or 11 twice -- but in essence, each block 
would be "consistent gotten" once -- instead of 40 or more times.

that is why the difference with array fetch of 15 vs 5000.

So, how did the sort affect that?  In order to get the FIRST row, we had to get ALL rows and sort 
them.  So the first row got all of the blocks -- put them into temp and sorted them.  We returned 
the rows from there without having to use consistent gets on each and every fetch call -- since 
they were copied on the first fetch into temp. 

5 stars Great!   August 3, 2004 - 8pm Central time zone
Reviewer: A reader 
Hm, great, that was very useful, thanks, but that raises a few more questions

"put them into temp and sorted them.  We returned the rows from there without having to use 
consistent gets on each and every fetch call"

1. So the table had about 59 blocks and they were all pulled into the buffer cache by the sort 
operation. the "fetch" phase of the query simply returned them from there instead of doing more 
consistent gets for them. So the sort essentially "subsidized" the cost of the fetch operation! 
Nice.

This is a (good) side-effect of all sort-operations, right? 
i.e. whenever you sort, you already get all the data you need before starting to output the first 
row.

"that is why the difference with array fetch of 15 vs 5000"

2. you have shown numerous times the benefit of using higher arraysizes. 

If higher arraysizes are always beneficial, why doesnt Oracle make that the default behaviour for 
all "fetches", sqlplus or otherwise? (well, lets stick to sqlplus for this discussion). 

I mean, getting a block, returning 15 rows and "throwing away" the block isnt terribly smart. If 
the user has set arraysize 15, he doesnt know what is good for him. Just bump it up without telling 
the user. Its not exactly like  "Open the pod bay doors Hal. I'm sorry Dave, I'm afraid I can't do 
that", is it?

3. Why does the example with the 59 consistent gets have

203788  bytes sent via SQL*Net to client

I can understand that higher arraysizes will reduce the "SQL*Net roundtrips to/from client" number, 
but the answer is the same, so the "bytes sent via SQL*Net to client" should be the same, right?

Thanks 


Followup   August 4, 2004 - 8am Central time zone:

1) the sort pulled all 50 some blocks from the cache into the "sort area size" allocated in the UGA 
-- once in the sort area, they never needed another "consistent get", they were already there in 
the sessions memory.  If the sort area exceeded the sort area retained OR the sort area was not 
large enough -- we would have swapped to disk.

So, caveat emptor - do not run out and throw bogus order by's on queries that do not need them.  
Also -- a sort performed by an index access (skipping the sort actually) won't have this effect.

2) they are not always beneficial.  I find 100 to be sort of a sweet spot.  sqlplus is "just a very 
very simple tool" and has the method to change the default.  The real problem is code like VB or 
Java -- there the default is "i don't know" for VB and 10 for jdbc - 10 is probably too small and 
the preFetch should change.  For PLSQL, in 10g they made a default of 100 for prefetching for 
implicit cursors (was 1 -- unless you used bulk collect)


3) every round trip incurs protocol layers (metadata so we understand what we just sent to 
ourselves), less round trips, less protocol.  Here because the data was so darn "small" (a simple 
number) the protocol was as large as the data itself 

5 stars PGA usage by assoc arrays   December 13, 2004 - 2pm Central time zone
Reviewer: A reader 
Following the thread on this page...Lets say I prove that my system would benefit by the 
associative array approach i.e. use a package instantiation code to bulk collect 2 arrays, load up 
your assoc array and then simply use that instead of making database calls.

What are the memory implications of doing this? 

Would all sessions using this package use as much memory as needed by the assoc array? Or do 
some optimizations kick in? 

If I am using dedicated server and pga_aggregate_target, where is all this memory used? In each 
session's PGA? Would that be on the server or on the client? 

Any way to reduce the memory usage?

Thanks
 


Followup   December 13, 2004 - 2pm Central time zone:

all sessions have their own data segment, they would each get a copy of the array in their address 
space.

it would be in the UGA in the PGA under dedicated server and in the UGA in the SGA under shared 
server and is not subject to the pga_aggregate_target which only really affects dynamic workareas 
like sort areas, hash areas.

the only way to reduce memory usage it to not use memory ;)

if at all possible, incorporate the "lookup" right in your SQL itself, that'll be faster still and 
not consume ram in the dedicated servers. 

4 stars Sorry I have a large backlog right now, please ask a question later   December 22, 2004 - 6am Central time zone
Reviewer: Tanweer from USA
Sorry I am not able to send the Fresh message bCOZ I have gotten  the message (I have a large 
backlog right now, please ask a question later )

How execute the Package,Procedures & Function on the system time.
Suppose I want to execute the (1) Package on time 3:00 A.M, (2) Procedures on time 4:00 A.M & (3) 
Functions on time 4:30 a.m.

Can you give me the Syntax/Code to take my Procedure/Package name?

I have given the Package & Procedure name for example:
(1)create or replace package lob_client_operations as

function get_lob_size(p_tablename in varchar2,
                      p_pkcolname in varchar2,
                      p_textcolname in varchar2,
                      p_docid in varchar2) return number;

procedure get_lob_chunk ( p_tablename in varchar2,
                          p_pkcolname in varchar2,
                          p_textcolname in varchar2,
                          p_docid in varchar2,
                          amount IN OUT NUMBER,
                          offset     IN NUMBER,
                          buff      OUT RAW);


procedure insert_new_record ( p_tablename in varchar2,
                          p_pkcolname in varchar2,
                          p_textcolname in varchar2,
                          p_docid in varchar2);
                    
end;
/
-- ---------------------------- ---
-- ---------------------------- ---

(2) 
(a)CREATE or REPLACE PROCEDURE SimpleTextMessage IS
 (b) create or replace procedure html_email(  p_to in varchar2, p_from in varchar2, p_subject in 
varchar2, 
p_text in varchar2 default null) is 
-- ---------------------------- ---
-- ---------------------------- ---

(3) create or replace function send(
        p_from                  in varchar2,
        p_to                    in varchar2,
        p_subject               in varchar2,
        p_body                  in varchar2) return number as

-- ---------------------------- ---
-- ---------------------------- ---

Thanks in Adv
 


Followup   December 22, 2004 - 9am Central time zone:

hmm, do you understand the meaning of that message?  


search this site for dbms_job.


doesn't matter if it is a package, procedure, function whatever -- you just need a job schedulers. 

5 stars When is it necessary to usE PL/SQL tables ?Version ORACLE 8I   December 23, 2004 - 1pm Central time zone
Reviewer: Stivi 
Tom,hello!
As i understood it is more efficient to use
in procedural code SQL statements with joins(because it works faster) than lookup in arrays,but 
when is it good to use PL/SQL tables instead of SQL'S? May you show the example where PL/SQL table 
is preferable than SQL statement.
Thank you.
Stivi
 


Followup   December 23, 2004 - 2pm Central time zone:

I don't have any :)

If you feel you MUST code procedurally, that you MUST do it in procedural code -- then a PLSQL 
lookup table is pretty darn efficient.

my mantra:

if you can do it in a single sql, do it.
if not do it in as little plsql as possible... 

5 stars QUESTION ABOUT INSERT IN SQL STATEMENT   December 23, 2004 - 2pm Central time zone
Reviewer: DAV 
Thank you Tom,
I have an insert statement that i do in procedural code:
insert into table_try (select 1,cstmr_no,func(1,ctmr_no)day
                       from customers  
                       where bsnss_no=:param_code 
                       and retuirning_no!=1);
i prefer sql for insert and not plsql code because you told it is faster.But i don't want to insert 
record if 
func(1,ctmr_no) return null or select doesn't return anything.How can i do it?
Thank you very much.
DAV
  


Followup   December 23, 2004 - 3pm Central time zone:

if select doesn't return anything -- what could possibly be inserted? :)  that is OK.

add a where clause to select only those rows you actually want to insert. 

5 stars Insert_continue   December 23, 2004 - 3pm Central time zone
Reviewer: A reader 
Tom,thank you very much,

in (select 1,cstmr_no,func(1,ctmr_no)day
                       from customers  
                       where bsnss_no=:param_code 
                       and retuirning_no!=1);

FUNC(1,cstmr_no) acceses to another table and 
and if func(1,cstmr_no) is null i don't want it to be inserted but if i use in where 
func(1,cstmr_no) is not null
this func will be executed twice in select and in where.
am i right?
if yes,how to improve it?
Have a nice day.

 


Followup   December 23, 2004 - 4pm Central time zone:

might be, might not be


select 1, cstmr_no, day
  from (select 1,cstmr_no,func(1,ctmr_no)day, ROWNUM
          from customers  
         where bsnss_no=:param_code 
           and retuirning_no!=1)
 where day is not null;


ops$tkyte@ORA9IR2> create or replace function f( x in number, y in number ) return number
  2  as
  3  begin
  4          dbms_application_info.set_client_info(userenv('client_info')+1);
  5      return x;
  6  end;
  7  /
 
Function created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_application_info.set_client_info(0)
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> select empno,f(sal,comm)
  2    from emp
  3   where f(sal,comm) is not null;
 
     EMPNO F(SAL,COMM)
---------- -----------
      7369         800
      7499        1600
      7521        1250
      7566        2975
      7654        1250
      7698        2850
      7782        2450
      7788        3000
      7839        5000
      7844        1500
      7876        1100
      7900         950
      7902        3000
      7934        1300
 
14 rows selected.
 
ops$tkyte@ORA9IR2> select userenv('client_info') from dual;
 
USERENV('CLIENT_INFO')
----------------------------------------------------------------
28
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_application_info.set_client_info(0)
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> select empno, f
  2   from (
  3  select empno,f(sal,comm) f, rownum r
  4    from emp
  5         )
  6   where f is not null;
 
     EMPNO          F
---------- ----------
      7369        800
      7499       1600
      7521       1250
      7566       2975
      7654       1250
      7698       2850
      7782       2450
      7788       3000
      7839       5000
      7844       1500
      7876       1100
      7900        950
      7902       3000
      7934       1300
 
14 rows selected.
 
ops$tkyte@ORA9IR2> select userenv('client_info') from dual;
 
USERENV('CLIENT_INFO')
----------------------------------------------------------------
16
 
ops$tkyte@ORA9IR2>
 

5 stars Procedural mindset   January 7, 2005 - 12pm Central time zone
Reviewer: Bob from Ky, USA
Tom,
Very helpful and insightful information.  I've been reading your book and I am now trying to 
breakout of the procedural mindset.  The one question this thread doesn't answer for me is this; I 
SQL*Load a table, then I need to validate each record from that table against two other tables to 
determine if it should be inserted into the main table.  If the validation fails, an error needs to 
be logged in a seperate table.  The structure of all tables is different and it's version 8.1.7.  
The idea of using a good and bad table covers most of it, but how do you log the error associated 
with the bad records?
 


Followup   January 8, 2005 - 3pm Central time zone:

what is the validation?  I always try to let the database do my work for me, why can't it do it 
here? 

5 stars Procedural mindset - resolved   January 8, 2005 - 8am Central time zone
Reviewer: Bob from Ky, USA
After further reading and testing it's starting to make sense.

Thanks for the execellent site.
 


2 stars You can't think of one reason?   February 19, 2005 - 3pm Central time zone
Reviewer: bob mourning from Mullica Hill, NJ
You could not come up with a single reason to ever use a PL/SQL table over the SQL?  

How about when the table one must read does not belong to the reader and it is poorly indexed.  
Then it might make sense to do the select one time and efficiently access the data where possible 
via PL/SQL binary_integer.  

There are other solutions of course, but poorly performing SQL run 10M times would not work.

Also, it may be possible to limit the candidate values up front. For example, if my transaction 
data has only 12 lookup keys and the lookup table has 2M rows, it might make sense to go against 12 
rows in a PL/SQL table versus use a non-unique index.

BTW, I think your posts are generally excellent, but sometimes you are just arrogantly dogmatic. 


Followup   February 19, 2005 - 5pm Central time zone:

Hmm, lets see, dogmatic is a recursively defined thing

Main Entry: dogmatic
Pronunciation: dog-'ma-tik, dg-
Variant(s): also dogmatical /-ti-k&l/
Function: adjective
1 : characterized by or given to the use of dogmatism <a dogmatic critic>
2 : of or relating to dogma

so, we must dig deeper to find the meaning:

Main Entry: dogmatism
Pronunciation: 'dog-m&-"ti-z&m, 'dg-
Function: noun
1 : positiveness in assertion of opinion especially when unwarranted or arrogant
2 : a viewpoint or system of ideas based on insufficiently examined premises

If you choose number 1, then your use of arrogantly would be redundant.  It cannot be 
"unwarranted", for I tend to put the proof of what I talk about out there for all to see.  It 
definitely cannot be #2.

but actually, wait, it cannot be #1, because that includes "opinion".  All I asked was "where is 
the proof here".  I'm not offering opinion, but rather hard facts -- with numbers, with code.  

Anyway, I did give a bunch of them in "Effective Oracle by Design" as well -- spent the better part 
of a chapter on this topic.  The answer came down to "just join" (that would make a nice bumper 
sticker I think -- it would mean different things to different people but it has a nice ring to it. 
 Maybe "databases were born to join", but then the print would have to be pretty small to fit it 
all)


but hey -- I even gave a case, an example of using a plsql based lookup table versus querying 
repeatedly -- I agree, that can be faster, HOWEVER, I also (and will keep doing so over and over) 
point out that you've only gone a tiny fraction of the way to the right answer which in most cases 
is "just join darn it, databases were born to join"


Maybe you did not read the entire answer, the entire discussion, the give and take.


So, you can drop the arrogantly -- it is sort of "inferred" in the definition of the term you are 
using, but I do believe the term was used incorrectly, especially if you take into consderation the 
definition of 


Main Entry: dogma
Pronunciation: 'dog-m&, 'dg-
Function: noun
Inflected Form(s): plural dogmas also dogmata /-m&-t&/
Etymology: Latin dogmat-, dogma, from Greek, from dokein to seem -- more at DECENT
1 a : something held as an established opinion; especially : a definite authoritative tenet b : a 
code of such tenets <pedagogical dogma> c : a point of view or tenet put forth as authoritative 
without adequate grounds
2 : a doctrine or body of doctrines concerning faith or morals formally stated and authoritatively 
proclaimed by a church

"established opinion" -- nope.
"without adequate grounds" -- nope
and while database conversations sometimes take on a religious fervor -- i don't think #2 applies 
either.

Just my 2cents -- and yes, I'll keep saying the same things over and over and over and maybe 
they'll stick (use binds, use binds, just join, databases are really good at inserting too -- audit 
trails are good, every other line of code should be DEBUG (trace, instrumentation) code -- all of 
these things people don't do, i'll just keep plugging away at it)

(yes, this answer is tongue in cheek -- actually, I just read "arrogantly dogmatic" as "consistent 
and committed to provide the information with proof") 

5 stars Which would be faster   July 6, 2005 - 12pm Central time zone
Reviewer: hawk from thousand islands
Dear Tom

Please advise which of the 2 methods would be effecient and faster

Aim: Need to insert data into a table on daily basis (table P) and also I need to insert that data 
in form of  into other table (t)
I would be using data from table p for certain other purposes


method 1:
---------
For curref in (select a.*, b.* from a, b, c, d, e, f
               where  a.column = b.column
               and some join conditions among all 6 tables)
loop
    insert into p values (abv selected)

    insert into t values (abv selected some cals)
    insert into t values (abv selected but some diff cals)
end loop;

Method 2:
---------
insert into p (select a.*, b.* from a, b, c, d, e, f
               where  a.column = b.column
               and some join conditions among all 6 tables);

for cur_ref in (select * from p where date_time = sysdate)
loop
    insert into t values (abv selected)
    insert into t values (abv selected but some diff cals)
end loop;


Please advise which of the abv 2 methods is efficient and faster.

Thankyou. 


Followup   July 6, 2005 - 12pm Central time zone:

method three, multi-table insert, no code of course.

insert
   when (1=1) then into p ( ... ) values ( ... )
   when (whatever) then into t ( ... ) values ( ... )
   .....
select a.*, b.* from a,b, c,d.......
     
    

5 stars A note   July 6, 2005 - 1pm Central time zone
Reviewer: Hawk from Thousand Islands
Tom,
Great! but one thing every row in the select needs to get to all the tables hence there wouldn'nt 
be any when clause.

I hope still this(ur suggested method 3) is the best way to do it.

Please comment.

Thankyou

 


Followup   July 6, 2005 - 2pm Central time zone:

sure is, read about multi-table inserts, there is an INSERT ALL ..... as well no when clauses. 

5 stars Thankyou very much TOM   July 6, 2005 - 3pm Central time zone
Reviewer: Hawk from Thousand Island


5 stars very interesting   July 6, 2005 - 10pm Central time zone
Reviewer: Muhammad Ibrahim from Hongkong
Dear Tom,

   I have a situation as below

create table t ( id number, from_date date, to_date date, amount number, expired varchar2(1));

   I want to take a copy of the table t by id(column).
Process this data in the memory then write back
the same/new set of data to the table t and mark the
previous set as old. The processing includes many for 
loops and validations,comparisions,lookups etc...
   Here my question to process the data whether plsqltabl/
global temporary table which one is better? 
   If i use temp table i need to do lot of inserts and updates
for processing sometimes delete also inside that too may inside
for loops. Insert is fine but i am worried about updates and 
deletes Or plsql table is better for this kind of situation?

Regards,
Ibrahim.
 


Followup   July 7, 2005 - 9am Central time zone:

you missed the most important bit of data

how big.

and databases are really good at doing things -- lookups for example are called joins (they do them 
particularly weel).  validations and comparisions.  You may well find you can actually do a ton of 
stuff in sql. 

4 stars plsql table or global temp table   July 8, 2005 - 5am Central time zone
Reviewer: Muhammad Ibrahim from Hongkong
eg:
----

main table which keeps the transactions:
---------------------------------------
create table cessions ( cession_id      number(20)   not null,
                        start_date      date         not null,
                        mode            varchar2(10) not null,
                        end_date        date         not null
                        pol_group_id    number(10),
                        class_id        number(10),
                        arrangement_id  number(10),
                        sequence_number number(10),
                        premium         number,
                        exch_rate       number, 
                        overwritten     varchar2(1),
                        apply_copy      varchar2(1));

alter table cessions add ( constraint cessions_pk primary key(cession_id,start_date,ri_mode));

global temp table for processing:
--------------------------------
create global temporary table cessions_tmp ( cession_id number(20) not null,
                                             start_date date not null,
                                             mode varchar2(10) not null,
                                             end_date date not null,
                                             pol_group_id number(10),
                                             class_id number(10),
                                             arrangement_id number(10),
                                             sequence_number number(10),
                                             premium number,
                                             exch_rate number,
                                             overwritten varchar2(1),
                                             apply_copy varchar2(1)) 
on commit preserve rows;

alter table cessions_tmp add ( constraint cessions_tmp_pk
                               primary key(cession_id,start_date,ri_mode));

procedure merge_old_situation(pol_group_id in number)
is
begin
  delete cessions_tmp;
  for rpol in ( select pol_group_id
                from   main_pol
                where  ..) -- this can be one or more
                           --(select criteria may return min 1000 or more)
  loop
    insert into cessions_tmp( .... )
    select ...
    from   cessions
    where  pol_group_id = rpol.pol_group_id;
  end loop;
                 
end merge_old_situation;

procedure process_re(pol_group_id in number
Is
begin
  -- i dont want to directly manipulate the transaction table that
  -- is changing it directly. So i take a mirror image of the original
  -- table to the temp table by some conditions(not all the data).
  
  merge_old_situation(pol_group_id);
  
  for rintervals in ( select start_date, end_date 
                      from intervals )-- this is also a global temp table which 
                                      -- will have set of intervals 
                                      --(one or more may be 365 intevals per 
                                      -- year).
    for rexisting_pol in ( select ...
                           from   another_table
                           where  rintervals.start_date between start_date 
                                  and end_date )
                           -- this another_table is a big table but the select 
                           -- criteria may return min 100 or more
    loop
      -- here i take the new data which comes from rexisting_pol then 
      -- compare with cessions_tmp table
      -- insert or update depends on the necessaity
      update cessions_tmp
      ...
      if sql%rowcount = 0 then
        insert into cessions_tmp
        ...
      end if;
      for rsomething in (select ..
                         someother_bigtable -- more than a million records
                         where  rintervals.start_date between start_date 
                                and end_date )
      loop
        update cessions_tmp
        ...
        if sql%rowcount = 0 then
          insert into cessions_tmp
          ...
        end if;
      end loop;
    end loop;

  loop
  end loop;              

  -- finally i do merge for the new one with the old one based 
  -- on the primary key columns
  merge into cessions a
  using cessions_tmp
  on    ( a.cession_id = b.cession_id
          and a.start_date = b.start_date
          and a.mode = b.mode )
  when matched then
    update set a.end_date = b.end_date
               ...
  when not matched then
    insert (a.cession_id, a.....)
    values (b.cession_id,b....  );
end process_re;              
          

  -- so my question is using global temp table(cessions_tmp) in this above 
  -- scenario is better or plsql table is better? please tell us your 
  -- opinion.
  -- note: this is online transaction! 


Followup   July 8, 2005 - 7am Central time zone:

  for rpol in ( select pol_group_id
                from   main_pol
                where  ..) -- this can be one or more
                           --(select criteria may return min 1000 or more)
  loop
    insert into cessions_tmp( .... )
    select ...
    from   cessions
    where  pol_group_id = rpol.pol_group_id;
  end loop;

that should be a single sql statement -- why write code?


I've no clue if you have 1 row or 1000000000 rows or something in between.  


My main concern for performance would be "how to remove the procedural slow by slow code and do as 
much in a single sql statement as possible" 

5 stars using seq can cause issues in insert all   July 8, 2005 - 11am Central time zone
Reviewer: Hawk from Thousand Island
Tom,
I am using Insert all as suggested and looks great but....
insert all
into t1 values(a,b,c...)
into t2 values (seq.nextval, d, f)
into t2 values (seq.nextval, d* -1 , f+1)
(select a, b, c, d, f 
from
pp, qq 
where ,....conditions)

Above, issue is the I am using a same sequence in same table as 2 different rows hence the same 
seq.nextval goes in both rows which are duplicates for my table t2.
Limitation: I cannot use 2 different sequences.

Please suggest.

Thankyou. 


Followup   July 8, 2005 - 12pm Central time zone:

I can only suggest you revisit your "limitation" or create the sequence as


create sequnce seq increment by 2;

and use seq.nextval, seq.nextval+1

nextval is defined to be consistent that way -- constant -- if you refer to it a dozen times in a 
sql statement -- it'll be the same value a dozen times (same with currval).  It is as if it were 
"bound" in -- like it was a bind variable row by row. 

5 stars This works   July 8, 2005 - 12pm Central time zone
Reviewer: Hawk from Thousand Island
Dear Tom, 
I hope this works

function seq_nextval(seqname in varchar2)
  return number is
seq_next number;
begin 
--    select 'seqname'||.nextval into seq_next from dual;
execute immediate 'select '||seqname||'.nextval from dual'
                   into seq_next ;
    return seq_next ;
end; -- function seq_nextval

 


Followup   July 8, 2005 - 1pm Central time zone:

not reliably, a plan change can and will change the number of times PLSQL called from SQL is 
called.  Not recommended, you are relying on the function to be called for each invocation and that 
behavior can change over time.


the dynamic sql would not be necessary. 

5 stars how's this?   July 8, 2005 - 3pm Central time zone
Reviewer: Tyler from Edmonton Alberta
a possible solution to Hawk's problem? 

SQL> drop sequence test_seq ;

Sequence dropped.

SQL> create sequence test_seq start with 1 increment by 1;

Sequence created.

SQL>
SQL> create or replace function test_the_seq return number as
  2     hold_it number;
  3  begin
  4     select test_seq.nextval into hold_it from dual;
  5  return hold_it;
  6  end;
  7  /

Function created.

SQL>
SQL>
SQL> SELECT test_the_seq, test_the_seq from dual;

TEST_THE_SEQ TEST_THE_SEQ
------------ ------------
           1            2

SQL> SELECT test_the_seq, test_the_seq from dual;

TEST_THE_SEQ TEST_THE_SEQ
------------ ------------
           3            4

SQL> SELECT test_the_seq, test_the_seq from dual;

TEST_THE_SEQ TEST_THE_SEQ
------------ ------------
           5            6

SQL> SELECT test_the_seq, test_the_seq from dual;

TEST_THE_SEQ TEST_THE_SEQ
------------ ------------
           7            8

SQL> SELECT test_seq.nextval from dual;

  NEXTVAL
---------
        9
 


Followup   July 8, 2005 - 6pm Central time zone:

today that might work
yesterday it might not
tomorrow it might stop
the day after it could work again.

there is nothing saying how many times a function written in plsql will be called from SQL.

Nothing.

answer is same as answer immediately above 

5 stars ?   July 8, 2005 - 6pm Central time zone
Reviewer: Tyler from edmonton alberta
Hi Tom, 

Could you please elaborate on this behavior? I’m not questioning your statement, but it does 
confuse me. Personally, I’d expect SQL to execute a function call once per utilization always and 
forever. 

Thanks!

 


Followup   July 8, 2005 - 6pm Central time zone:

select 1+2, 1+2
  from t;

how many times do we need to evaluate 1+2?


now, add in different plans, joins, a nested loop versus hash join, a subquery materialized to 
temp, etc etc etc etc.


Please do not ascribe to the non-procedural language SQL anything procedural at all.

Consider:

select * from t where x = f();

How many times would you expect f() to be called?



ops$tkyte@ORA10G> create table t ( x int );
 
Table created.
 
ops$tkyte@ORA10G> insert into t select rownum from all_users ;
 
38 rows created.
 
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> create or replace function f return number
  2  as
  3  begin
  4          dbms_application_info.set_client_info(userenv('client_info')+1);
  5          return 42;
  6  end;
  7  /
 
Function created.
 
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> exec dbms_application_info.set_client_info(0);
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA10G> select * from t where x = f();
 
no rows selected
 
ops$tkyte@ORA10G> select userenv('client_info') from dual;
 
USERENV('CLIENT_INFO')
----------------------------------------------------------------
38
 
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> create index t_idx on t(x);
 
Index created.
 
ops$tkyte@ORA10G> exec dbms_application_info.set_client_info(0);
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA10G> select * from t where x = f();
 
no rows selected
 
ops$tkyte@ORA10G> select userenv('client_info') from dual;
 
USERENV('CLIENT_INFO')
----------------------------------------------------------------
2
 


why 38? (there are 38 rows in there)
but why 2?  

I can show you tricks to MINIMIZE the number of times a function is called, but you would be hard 
pressed to make it so that a function is called exactly as many times as you want -- USING SQL.

I would not rely on this behavior (at the very least I would dummy up parameters to this function 
and make sure it gets called with unique inputs -- eg: pass it rowid and something else -- so that 
the optimizer could never say "ah hah, don't need to keep calling this puppy over and over) 

5 stars Why 2?   July 8, 2005 - 7pm Central time zone
Reviewer: VA from New Jersey, USA
Thats very interesting, why does f() get called 2 times when you add the index on t(x)? What does 
the index on t(x) have to do with evaluating the predicate where t.x=f()?

Also, you confused me with your double-negative in your last paragraph.  "so that the optimizer 
could never say "ah hah, don't need to keep calling this puppy over and over"

If I add a dummy parameter to the function like f(rowid,...), the optimizer will be forced to call 
the function for every row, right? Isnt this exactly what we were trying to avoid?

Thanks


 


Followup   July 9, 2005 - 8am Central time zone:

not really a double negative.  The optimizer could say:

"ah hah, we do not have to call F() over and over, We'll call F() once and be done with it"

If you make the inputs into F() (add inputs) 'unique', the optimizer could not ever make this 
optimization happen.


IF you call a function AND everytime you call it, the inputs are unique, then the database would be 
forced to invoke it AT LEAST ONCE, eg, you have your function f() that selects the 
sequence.nextval.  You want to call it AT LEAST twice per row:


select f( rowid, 1 ), f(rowid, 2 ), ....
  from table;


rowid is unique in table, rowid+1 is unique -- rowid+2 is unique, there would be no way to NOT call 
f() at least twice per row.  Stress words "at least", it could call it lots more if it feels like 
it.


Why did it call it twice?  Just because it does, it could call it once, maybe in the future it 
will.  Maybe in the future it'll call it 3 times though.   

5 stars PL/SQL associative arrays   September 21, 2005 - 12am Central time zone
Reviewer: Quadro from Russia
if you need to do simple lookups with small tables, PL/SQL associative arrays go far far far away 
from even hash clusters in terms of performance (3x-10x times better) and scalability (latches go 
down in orders of magnitude).

So, if you need to do lookups in procedural code - by all means use it. We use them broadly in our 
PL/SQL tariffication routines (too much logic to do it just in sql joins). 


Followup   September 21, 2005 - 6pm Central time zone:

prove it.


if they are small, they hash in memory.
if they are small, they cannot be causing much lio.

I'll stick with Just Join, until shown otherwise. 

5 stars Prove it   September 22, 2005 - 3am Central time zone
Reviewer: Quadro from Russia
"prove it."

Sure. In our simple test case we will use PL/SQL table and hash cluster to lookup object_name by 
object_id.

SQL> select count(*) from user_objects;

  COUNT(*)
----------
       418

So, hash cluster with 1024 keys will be more than enougth.

SQL> create cluster hc_test
  2  (
  3   object_id number
  4  ) size 512 hashkeys 1024;

Cluster created.

SQL> create table hash_lookup
  2  (
  3   object_id number,
  4   object_name varchar2(30)
  5  ) cluster hc_test (object_id);

Table created.

SQL> insert /*+ append */ into hash_lookup select object_id, object_name from user_objects;

420 rows created.

SQL> commit;

Commit complete.

SQL> declare
  2   --type we will use for lookup table
  3   type varchar2_table is table of varchar2(30) index by binary_integer;
  4   l_lookup_table varchar2_table;
  5  
  6   --table for storing object identifiers
  7   type number_table is table of number index by binary_integer;
  8   l_lookup_values number_table;
  9  
 10   l_object_name varchar2(30);
 11  begin
 12   --Fill our lookup table
 13   for cur in (select * from user_objects where object_id is not null)
 14   loop
 15    l_lookup_table(cur.object_id):=cur.object_name;
 16   end loop;
 17  
 18   --fill table with lookup values
 19   select object_id bulk collect into l_lookup_values
 20    from user_objects
 21    where object_id is not null;
 22  
 23   --first - lookup in hash cluster
 24   runstats.rs_run1;
 25   for i in 1 .. 100
 26   loop
 27    for j in 1 .. l_lookup_values.count
 28    loop
 29     select object_name into l_object_name
 30      from hash_lookup
 31      where object_id=l_lookup_values(j);
 32    end loop;
 33   end loop;
 34   --second - lookup in associative array
 35   runstats.rs_run2;
 36   for i in 1 .. 100
 37   loop
 38    for j in 1 .. l_lookup_values.count
 39    loop
 40     l_object_name:=l_lookup_table(l_lookup_values(j));
 41    end loop;
 42   end loop;
 43   runstats.rs_stop(1000);
 44  end;
 45  /
Run1 run in 213 hsecs
Run2 run in 3 hsecs
Run1 run in 7100% of run 2
*
STATISTIC                                 RUN 1       RUN 2        DIFF
----------------------------------- ----------- ----------- -----------
STAT...no work - consistent read ge      41,900           0     -41,900
STAT...cluster key scans                 41,900           0     -41,900
STAT...buffer is not pinned count        41,900           0     -41,900
STAT...cluster key scan block gets       41,900           0     -41,900
LATCH..library cache pin                 83,810           2     -83,808
STAT...consistent gets from cache        41,907           8     -41,899
STAT...execute count                     41,901           1     -41,900
LATCH..simulator lru latch                2,300           0      -2,300
STAT...recursive calls                   41,902           1     -41,901
STAT...consistent gets                   41,907           8     -41,899
STAT...session logical reads             41,922          24     -41,898
LATCH..cache buffers chains              83,883          84     -83,799
LATCH..simulator hash latch               2,300           0      -2,300
LATCH..library cache                     83,812           2     -83,810
STAT...Cached Commit SCN referenced      40,900           0     -40,900
STAT...calls to get snapshot scn: k      41,902           1     -41,901
*
LATCH totals                              RUN 1       RUN 2         PCT
----------------------------------- ----------- ----------- -----------
Latch totals versus runs                256,381         110 233,073.64%

PL/SQL procedure successfully completed.

So - the difference is tremendous? 


Followup   September 22, 2005 - 1pm Central time zone:

you missed my point.

"JUST JOIN"

you would not be doing this sql in a loop, you would have JOINED to this data once.


You have some query you are processing, doing a slow by slow process apparently and doing a lookup 
on each row.

My point is JUST JOIN, don't do a lookup for each row, just JOIN and let SQL do the look up.


I agree, if you are going to run a little single row lookup over and over (slow by slow), this 
helps -- however, JUST JOIN in the first place. 

3 stars Need help in filtering values from pl/sql Table   September 29, 2005 - 5pm Central time zone
Reviewer: Mave from USA
I have an array [Associative array] with information like 

  Type rMyArray is record of (empno integer,
                              category integer);

  Type tMyArray is table of rMyarray index by pls_integer;

  Myarray tMyArray;

--  Values in this array 

  MyArray(1):={1234,1}
  MyArray(2):={1234,2}
  MyArray(3):={1234,3}
  MyArray(4):={1234,4}
  MyArray(5):={1234,5}
  MyArray(6):={1234,6}
  MyArray(7):={2345,1}
  MyArray(8):={2345,2}
  MyArray(9):={2345,3}
  MyArray(10):={2345,4}
  MyArray(11):={2345,5}

My requirement is to filter out this array for each empno [in this case 2 employees]
and dump them in another array[optional] called emparray and do some processing.
  loop over first empno[=1234] and fetch all the values for this empno and process them, then 
  loop over second empno[=2345] and fetch all the values for this empno.

  How can I acheive this?

Thanks for your time and help,
 


Followup   September 30, 2005 - 8am Central time zone:

you just answered your own question with the psuedo code? 

5 stars Filter PL/SQL array by value   September 29, 2005 - 8pm Central time zone
Reviewer: Quadro from Russia
This will have linear complexity. Pretty much unscalable and slow.

Instead of iterate use lookup value as key. In your case you can use two PL/SQL tables. First 
PL/SQL tables keyed by empno will point to second with categories.

declare
    type number_table is table of number /* categories */ index by binary_integer;
    type number_table2 is table of number_table index by binary_integer /*empno*/;

    Myarray number_table2;
begin
    Myarray(1234)(1):=1;
    Myarray(1234)(1):=2;
    Myarray(1234)(1):=3;
    Myarray(1234)(1):=4;
    Myarray(1234)(1):=5;
    Myarray(1234)(1):=6;

    Myarray(2345)(1):=1;
    Myarray(2345)(1):=2;
    Myarray(2345)(1):=3;
    Myarray(2345)(1):=4;
    Myarray(2345)(1):=5;

    --So, when you need to filter by empno, you just use it as key
    if (Myarray.exists(1234)) -- see, if such empno exists
    then
        for i in 1 .. Myarray(1234).count
        loop
            --process them here
        end loop;
    end if;
end; 


5 stars FIlter PL/SQL array   September 29, 2005 - 8pm Central time zone
Reviewer: Quadro from Russia
Or wait for a while - in a moment here will be the Tom himself, pointing out what the best way to 
get filtere data is to use SQL and *JUST WHERE* clause :-) 


3 stars Thanks for the replies..   September 30, 2005 - 9am Central time zone
Reviewer: A reader 
Tom, I got the pseudo code but not able to put it in code. I could do like Quadro mentioned here 
[thanks quadro], but he said myarray.exists(1234)..how to get that value 1234? 

if i have two arrays 
   myarray [with all the values] (empno,category)
   myarray1 (empno)

  can i say like this ?
  for i in 1..myarray1.count
  loop
    if myarray.exists(myarray1(i).empno)
    then
     do process --
    end if;
  end loop

Thanks a bunch. 


Followup   September 30, 2005 - 11am Central time zone:

he is saying to use an entirely different structure.  do you want to do that.


you should be able to split this into two arrays easily yourself, if you can fill ONE array, you 
can read that array and put it into two arrays? 

3 stars I can split into two arrays but..   September 30, 2005 - 11am Central time zone
Reviewer: A reader 
How do i check the value in an array..I am not quite following it. Could you please eloborate with 
a simple example?

That would be really helpful.. 


Followup   September 30, 2005 - 11am Central time zone:

if ( myarray(i).empno = something )

??

you created the array?  have you checked out the plsql guides, they have many examples.

 

3 stars Falling in infinite loop??   September 30, 2005 - 1pm Central time zone
Reviewer: A reader 
Ok, this is what i did [just testing ]
I am trying to loop over just one empno and it's falling in infinite loop??
It's showing me when it's not equal to v_empno [1234], but not exiting.

Can you point out the reason? 

Thanks for all your help..

declare

 Type Emparray is record (empno integer,
                          ename varchar2(20));

 Type temparray is table of Emparray index by pls_integer;
 
 emp_array temparray;

 v_empno  integer:=0;
begin

--  select empno,ename bulk collect into emp_array from emp;
  emp_array(1).empno:=1234;
  emp_array(1).ename:='name1';
  
  emp_array(2).empno:=1234;
  emp_array(2).ename:='name2';

  emp_array(3).empno:=1234;
  emp_array(3).ename:='name3';

  emp_array(4).empno:=1234;
  emp_array(4).ename:='name4';
  
  emp_array(5).empno:=2345;
  emp_array(5).ename:='name5';

  for i in 1..emp_array.count
  loop
    
     --v_empno:=emp_array(i).empno;
     v_empno:=1234;
    
     dbms_output.put_line(emp_array(i).empno);

     loop
          if emp_array(i).empno=v_empno
       then
         dbms_output.put_line(emp_array(i).empno);
       else
         --dbms_output.put_line('Not equal '||emp_array(i).empno);
         exit ;
       end if;          
     end loop;

  end loop;


    
end; 


Followup   September 30, 2005 - 2pm Central time zone:

I don't get the inner loop.  it is an infinite loop for sure unless they are not equal.


why do you have an inner loop at all. 

3 stars Just testing   September 30, 2005 - 2pm Central time zone
Reviewer: A reader 
Inner Loop is just for testing . I am trying to insert each value back into a log table for all 
those empno's that are equal to the value v_empno.

So, why it's an infinite loop? I am exiting if they are not equal. 

Thanks, 


Followup   September 30, 2005 - 2pm Central time zone:

but when the are equal.... infinite loop, you are not incrementing anything, it is the same 
condition over and over and over again.


     loop
          if emp_array(i).empno=v_empno
       then
         dbms_output.put_line(emp_array(i).empno);
       else
         --dbms_output.put_line('Not equal '||emp_array(i).empno);
         exit ;
       end if;          
     end loop;


no matter how many billions of times you execute that - (i) isn't changing. 

3 stars I think i is changing from outer loop   September 30, 2005 - 3pm Central time zone
Reviewer: A reader 
Tom, i is coming from outer loop [for i in 1..]. isnt't it?
 


Followup   October 1, 2005 - 8pm Central time zone:

yes, but inner loop is inside outer loop - once inner loop entered - forget about outer loop, there 
is no escape! 

4 stars Sorry. I got it   September 30, 2005 - 3pm Central time zone
Reviewer: A reader 
Sorry, I got what you are trying to say.

Thanks. 


4 stars Rocketship vs PL/SQL lookup   October 4, 2005 - 1pm Central time zone
Reviewer: Ian from Trumbull,CT
Tom,
   nice discussion and some very useful info.
Here's the skinny - i have inherited some code often referred to as "the rocketship" because of the 
DECODE lookup shape.  Unfortunetly the rocketship could use a few boosters.  Basically it creates 
an 8 million row flat file from a SQL query that looks up rates to apply to a sales amount.  I had 
initially rewrote this to do the lookups in a PL/SQL cursor and then tried using PL/SQL tables but 
both were slower which agrees with this thread. 
To simplify things i stipped out 90% of the code from the query below...its pretty ugly and has 4 
DECODE rockets.
The query attempts 8 different senarios to aquire an accrual rate from the same rate table...for 
example it first tries to get the rate from S1 if it finds a rate based on CUST_ID and 
UPC_CODE...then it will try from S2 based on CUST_ID,BRAND  and so on until S8.

My question is does using DECODES joined to 8 senario tables seem like a reasonable approach to 
doing lookups or perhaps there is another way to skin this cat.   Thanks for your time.    


select  account_num,
   trim(to_char(sum(
          decode(S1.ACCRUAL_RATE,NULL,
            decode(S2.ACCRUAL_RATE,NULL,
              decode(S3.ACCRUAL_RATE,NULL,
                decode(S4.ACCRUAL_RATE,NULL,
                  decode(S5.ACCRUAL_RATE,NULL,
                    decode(S6.ACCRUAL_RATE,NULL,
                      decode(S7.ACCRUAL_RATE,NULL,
                        decode(S8.ACCRUAL_RATE,NULL,0,S8.ACCRUAL_RATE * sales_amt ),
                      S7.ACCRUAL_RATE * sales_amt ),
                    S6.ACCRUAL_RATE * sales_amt ),
                  S5.ACCRUAL_RATE * sales_amt ),
                S4.ACCRUAL_RATE * sales_amt ),
              S3.ACCRUAL_RATE * sales_amt ),
            S2.ACCRUAL_RATE * sales_amt ),
           S1.ACCRUAL_RATE * sales_amt )
           ),'9999999990.00'))
 from weely_sales_summary  x,
-- Scenario 1
    ( SELECT CUST_ID, UPC_CODE,   ACCRUAL_RATE
      FROM RATE_TABLE    WHERE CUST_ID != '0000000000' and UPC_CODE != 'XXXXXXXXXXXX') S1,
-- Scenario 2
    (select CUST_ID, brand_code, family_code, ACCRUAL_RATE
     from RATE_TABLE
     where CUST_ID != '0000000000'       and brand_code != 'XXX'       and family_code != 'XXX') 
S2,
-- Scenario 3
    (select CUST_ID,brand_code, ACCRUAL_RATE
     from RATE_TABLE     where CUST_ID != '0000000000' and brand_code != 'XXX' and family_code = 
'XXX') S3,
-- Scenario 4
    (select CUST_ID,CAT_CD, ACCRUAL_RATE
     from RATE_TABLE where CUST_ID != '0000000000' and brand_code = 'XXX' and family_code = 'XXX') 
S4,
-- Scenario 5
    ( SELECT UPC_CODE, ACCRUAL_RATE
      FROM RATE_TABLE    WHERE CUST_ID = '0000000000' and UPC_CODE != 'XXXXXXXXXXXX') s5,
-- Scenario 6
    (select  brand_code, family_code, ACCRUAL_RATE
     from RATE_TABLE where CUST_ID = '0000000000' and brand_code != 'XXX'
      and family_code != 'XXX' and UPC_CODE = 'XXXXXXXXXXXX' and CS_UPC_VRSN_NR_CD = 'XX') S6,
-- Scenario 7
    (select CAT_CD,brand_code, ACCRUAL_RATE
     from RATE_TABLE where CUST_ID = '0000000000' and brand_code != 'XXX' and family_code = 'XXX') 
S7,
-- Scenario 8
    (select CAT_CD, ACCRUAL_RATE
 from RATE_TABLE where CUST_ID = '0000000000' and brand_code = 'XXX' AND family_code = 'XXX') S8
  where x.CUST_ID   = S1.CUST_ID (+)
    and x.UPC_CODE         = S1.UPC_CODE(+)
--
    AND x.CUST_ID   = S2.CUST_ID (+)
    and x.BRND_CD        = S2.brand_code (+)
    and x.family_code   = S2.family_code (+)
--
    and x.CUST_ID   = S3.CUST_ID (+)
    and x.BRND_CD        = S3.brand_code (+)
--
    and x.CUST_ID   = S4.CUST_ID (+)
    and x.cat_cd  = S4.CAT_CD (+)
--
    and x.UPC_CODE      = S5.UPC_CODE (+)
--
    and x.BRND_CD        = S6.brand_code (+)
    and x.family_code   = S6.family_code (+)
--
    and x.BRND_CD        = S7.brand_code (+)
--
    and x.CAT_CD  = S8.CAT_CD (+)
  group by account_num
; 


Followup   October 4, 2005 - 6pm Central time zone:

ops$tkyte@ORA9IR2> create table t ( x int, y int, z int );
 
Table created.
 
ops$tkyte@ORA9IR2> insert into t values ( null, null, 1 );
 
1 row created.
 
ops$tkyte@ORA9IR2> insert into t values ( null, 2, 1 );
 
1 row created.
 
ops$tkyte@ORA9IR2> insert into t values ( 3, 2, 1 );
 
1 row created.
 
ops$tkyte@ORA9IR2> insert into t values ( null, null, null );
 
1 row created.
 
ops$tkyte@ORA9IR2> select coalesce(x,y,z,0) from t;
 
COALESCE(X,Y,Z,0)
-----------------
                1
                2
                3
                0
 
(nicer than decode...)


 

3 stars coalesce   October 5, 2005 - 12pm Central time zone
Reviewer: Ian from trumbull,ct
thanks for the input. 

looks like a useful function, unfortunately I forgot to mention that we are running on 8.1.7 and 
COALESCE isn't available until 9.0.1.....my bad. 


4 stars Procedural vs set-based   October 25, 2005 - 4pm Central time zone
Reviewer: A reader from Florida
Tom has proven many times that rewriting procedural code into set-based code is faster. 

I was wondering... how does Oracle implement/process the set-based approach? Any chance that Oracle 
internally breaks it up into procedural code? As fas as I know, you need special hardware (e.g. 
vector computer) to actually process arrays as if they are scalars. How does Oracle process a 
statement such as:

SELECT * FROM TABLE_A
MINUS
SELECT * FROM TABLE_B

This construct is much more efficient that trying to do it procedurally. Still, how can Oracle 
execute the query without looking at the tables row by row?

Hopefully, the answer doesn't violate any proprietary laws. :-)

 


Followup   October 26, 2005 - 11am Central time zone:

The point is - everything is procedural code at the lowest level.

We can just optimize the overall SQL / data processing request - if you give us a single sql 
statement.

We cannot optimize your procedure code that executes thousands or billions of sql statements.


Ask us to do all of the work - instead of you trying to do the work in your own code. 

4 stars Using dbms_alert to signal data changes   July 18, 2006 - 6am Central time zone
Reviewer: malcolm 
Hi Tom,

I'm using pl/sql associative arrays to perform lookups on nearly static data - perhaps an update 
every hour or so.  On 9iR2.

I completely agree with you that it's better to join any lookup tables, and let the optimiser do 
its stuff, but it's not really relevent to what am I doing here.  The data I'm am caching in arrays 
is just config data - just to control the flow of processing.

The (age-old) problem is that I need to detect when the cached data has changed, and reload.  This 
is especially relevent as some sessions could be members of connection pools, and almost never 
disconnect.

In a followup a third of way down the page, you suggest using a trigger and populating a index 
organised table. What do you think of the trigger using dbms_alert.signal(alert), and each time the 
cache is checked, using dbms_alert.waitone(alert, message, status, 0) instead?

I've traced dbms_alert.waitone and it uses a table called sys.dbms_alert_info under the scenes.  I 
thought this might be more efficient, because if the table is owned by sys, it goes into the 
dictionary cache, and the dictionary cache must be more efficient than the buffer cache, because it 
caches formatted rows, not blocks.

Or is my assumption that all tables owned by sys go into the dictionary cache incorrect?  In which 
case, exactly which tables are cached in the dictionary cache?

 


Followup   July 18, 2006 - 8am Central time zone:

it does not go into the dictionary cache.  It is just a table.  in the dictionary cache goes our 
metadata - not every block from sys owned tables.

 

4 stars   December 5, 2006 - 2am Central time zone
Reviewer: Kumar 
There is a limitation in using the PL/SQL arrays. In my case the maximum subscript that an 
associative array have is 2147483648. It seems to be the limitation of OS & database version 
(9.2.0.7). 

I would like to know How I can overcome this Limitation.
I am trying to use PL/SQL table lookup for loading fact records. I have 10 dimensions. These 
dimension tables will act as lookup table. Each dimension will have maximum of 10,000 records 
(eventhough I have only 10000 records,few of the my subscript values are more that 2GB). So when 
loading the fact records I would like to avoid reading dimension tables for each records and 
instead use PL/SQL table as lookup. Please advise.
== My script
declare

type  NumberType  is table of number index by pls_integer;

  l_doper_nkid      NumberType;
  l_doper_skid      NumberType;
  l_doper_lkup      NumberType;

  v number :=8888888888;

Begin
  select sk_destination_operator_id,destination_operator_id_nk bulk collect 
         into l_doper_skid,l_doper_nkid 
  from dt_destination_operator_v;
  for i in 1 .. l_doper_nkid.count
  loop
       dbms_output.put_line('test '||l_doper_skid(i));
       l_doper_lkup(l_doper_nkid(i)) := l_doper_skid(i);
  end loop;
  dbms_output.put_line('sk_operator_id value for  '||v||' :'||l_doper_lkup(v));
Exception
  when no_data_found then
       dbms_output.put_line('Operator Surrogate Key not found for '||v);     
End;
=================


 


Followup   December 5, 2006 - 9pm Central time zone:

you know, if you have more than 2gb of subscripts - you really ought to....

well.....

rethink?

that would not be a "smart" thing.  

(databases JOIN really good, you might consider using them to do that?) 

4 stars Hard-coded CASE vs SQL lookup   March 9, 2007 - 11am Central time zone
Reviewer: Sunj from NJ
As compared to table lookup and PL/SQL lookup, I find hard-coded CASE statements are the fastest. I do not like the idea of hard-coding, but I have to pick performance over maintainability.

My example is for a country lookup for a web-service application. Although per-search the timing is not significant, but in case the user uses my web-service in a batch, I need to my service as lean as possible.
My country table has been relatively static.

I generated this block of CASE statement of 250 conditions.

CREATE OR REPLACE FUNCTION map_iso_country_cd (iso_code VARCHAR2)
   RETURN NUMBER
IS
   vcountryid   PLS_INTEGER;
BEGIN
   CASE iso_code
         WHEN 'VCT' THEN vcountryID  := 63;
         WHEN 'SAU' THEN vcountryID  := 64;
         WHEN 'SCO' THEN vcountryID  := 65;
         WHEN 'USA' THEN vcountryID  := 1;
         WHEN 'CAN' THEN vcountryID  := 2;
         WHEN 'AFG' THEN vcountryID  := 7;
         -- about 200 more country codes
         WHEN 'ZWE' THEN vcountryid := 250;
   END CASE;

   RETURN vcountryid;
END;

This is my lookup version of function:
CREATE OR REPLACE FUNCTION map_iso_country_cd_TAB (iISO_code VARCHAR2)
   RETURN NUMBER
IS
   vcountryid   PLS_INTEGER;
BEGIN

      SELECT country_id 
        INTO vcountryid
        FROM country c
       WHERE iso_code = iISO_code;

   RETURN vcountryid;
END;


A million executions, came back in .06 seconds for the first one, whereas the second one came back in 50 seconds.


Could you suggest a more civilized approach please. :)
thanks


Followup   March 12, 2007 - 1pm Central time zone:

My idea is.....
do away with the lookup all together....

JUST JOIN

Databases where BORN TO JOIN, just join

see my comments above, for example:

... I betcha using a single INSERT -- removing the procedural code all together -- would get rid of the
rest of the time. .....


REMOVE the CODE


More food for thought.... YAWTDI

yet another way to do it.......


ops$tkyte%ORA10GR2> /*
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> drop table lookup;
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create table lookup
ops$tkyte%ORA10GR2> ( c_code varchar2(3) primary key,
ops$tkyte%ORA10GR2>   c_id   number
ops$tkyte%ORA10GR2> )
ops$tkyte%ORA10GR2> organization index
ops$tkyte%ORA10GR2> /
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into lookup
ops$tkyte%ORA10GR2> select *
ops$tkyte%ORA10GR2>   from (
ops$tkyte%ORA10GR2> select substr( object_name, 1, 3 ), max(object_id)
ops$tkyte%ORA10GR2>   from all_objects
ops$tkyte%ORA10GR2>  group by substr(object_name,1,3)
ops$tkyte%ORA10GR2>  order by dbms_random.random
ops$tkyte%ORA10GR2>        )
ops$tkyte%ORA10GR2>  where rownum <= 225
ops$tkyte%ORA10GR2> /
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> declare
ops$tkyte%ORA10GR2>     l_case long := 'case p_iso_code ' || chr(10);
ops$tkyte%ORA10GR2> begin
ops$tkyte%ORA10GR2>     for x in ( select * from lookup )
ops$tkyte%ORA10GR2>     loop
ops$tkyte%ORA10GR2>         l_case := l_case || ' when ''' || x.c_code || ''' then l_c_id := ' || 
x.c_id || ';' || chr(10);
ops$tkyte%ORA10GR2>     end loop;
ops$tkyte%ORA10GR2>     l_case := l_case || 'end case;' || chr(10);
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>     execute immediate q'|
ops$tkyte%ORA10GR2> create or replace function map_iso_country_cd( p_iso_code in varchar2 )
ops$tkyte%ORA10GR2> return number
ops$tkyte%ORA10GR2> as
ops$tkyte%ORA10GR2>     l_c_id number;
ops$tkyte%ORA10GR2> begin |' || l_case || '
ops$tkyte%ORA10GR2> return l_c_id;
ops$tkyte%ORA10GR2> end;';
ops$tkyte%ORA10GR2> end;
ops$tkyte%ORA10GR2> /
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create or replace function map_iso_country_cd_tab( p_c_code in varchar2 )
ops$tkyte%ORA10GR2> return number
ops$tkyte%ORA10GR2> as
ops$tkyte%ORA10GR2>     l_c_id number;
ops$tkyte%ORA10GR2> begin
ops$tkyte%ORA10GR2>     select c_id into l_c_id
ops$tkyte%ORA10GR2>       from lookup
ops$tkyte%ORA10GR2>      where c_code = p_c_code;
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>     return l_c_id;
ops$tkyte%ORA10GR2> end;
ops$tkyte%ORA10GR2> /
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create or replace package my_pkg
ops$tkyte%ORA10GR2> as
ops$tkyte%ORA10GR2>     type array is table of number index by varchar2(3);
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>     g_lookup array;
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>     function map_iso_country_cd( p_c_code in varchar2 ) return number;
ops$tkyte%ORA10GR2> end;
ops$tkyte%ORA10GR2> /
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create or replace package body my_pkg
ops$tkyte%ORA10GR2> as
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> function map_iso_country_cd( p_c_code in varchar2 )
ops$tkyte%ORA10GR2> return number
ops$tkyte%ORA10GR2> is
ops$tkyte%ORA10GR2> begin
ops$tkyte%ORA10GR2>     return g_lookup(p_c_code);
ops$tkyte%ORA10GR2> end;
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> begin
ops$tkyte%ORA10GR2>     for x in (select * from lookup)
ops$tkyte%ORA10GR2>     loop
ops$tkyte%ORA10GR2>         g_lookup(x.c_code) := x.c_id;
ops$tkyte%ORA10GR2>     end loop;
ops$tkyte%ORA10GR2> end;
ops$tkyte%ORA10GR2> /
ops$tkyte%ORA10GR2> */
ops$tkyte%ORA10GR2> connect /
Connected.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> set timing on
ops$tkyte%ORA10GR2> declare
  2      type array is table of varchar2(3) index by binary_integer;
  3      l_data array;
  4      l_num number;
  5  begin
  6      select c_code bulk collect into l_data from lookup;
  7
  8      for i in 1 .. 1000
  9      loop
 10          for i in 1 .. l_data.count
 11          loop
 12              l_num := map_iso_country_cd( l_data(i) );
 13          end loop;
 14      end loop;
 15  end;
 16  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:01.66
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> declare
  2      type array is table of varchar2(3) index by binary_integer;
  3      l_data array;
  4      l_num number;
  5  begin
  6      select c_code bulk collect into l_data from lookup;
  7
  8      for i in 1 .. 1000
  9      loop
 10          for i in 1 .. l_data.count
 11          loop
 12              l_num := map_iso_country_cd_tab( l_data(i) );
 13          end loop;
 14      end loop;
 15  end;
 16  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:08.00
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> declare
  2      type array is table of varchar2(3) index by binary_integer;
  3      l_data array;
  4      l_num number;
  5  begin
  6      select c_code bulk collect into l_data from lookup;
  7
  8      for i in 1 .. 1000
  9      loop
 10          for i in 1 .. l_data.count
 11          loop
 12              l_num := my_pkg.map_iso_country_cd( l_data(i) );
 13          end loop;
 14      end loop;
 15  end;
 16  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.25
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>



3 stars Hard-coded CASE vs SQL lookup   March 21, 2007 - 5pm Central time zone
Reviewer: Sunj from NJ
Thanks Tom, using a PL/SQL array/table is a cleaner alternative.
But .25 seconds for a thousand look-ups still does not come close to the hard-coded CASE statement's timing of .06 seconds for a million lookups.

Also each time a new user session calls the Package, wouldn't the array be loaded all over again. I agree this will be definitely efficient, if you are still within same Oracle session, especially in PL/SQL LOOPs.

Followup   March 21, 2007 - 8pm Central time zone:

you seem to have missed the crux here.

case was SLOWER than associative array.

My machine is not your machine
My times are not your times.

CASE:
12        l_num := map_iso_country_cd( l_data(i) );
Elapsed: 00:00:01.66

TABLE:
12        l_num := map_iso_country_cd_tab( l_data(i) );
Elapsed: 00:00:08.00

ASSOCIATIVE ARRAY:
12        l_num := my_pkg.map_iso_country_cd( l_data(i) );
Elapsed: 00:00:00.25


and how many sessions does it take to process this?!?! if the overhead of loading the table becomes an issue - you have a seriously bad architecture problem that we need to fix first!!!
5 stars Memory Lookups vs. Database Lookups   May 9, 2007 - 8am Central time zone
Reviewer: Towhidun Nabi from Dhaka, Bangladesh.
Hello,

I was wondering if there is anything wrong with memory lookups.

I must say I am a novice programmer comparing to some of the discussions above.

I am working with a selling company where we have right now 13 depots around the country. We have about 250+ items to sell, and about 20,000+ customers per depots, and per day 1000-1500 invoices per depots. We have a defined Price for each products, Some special prices for a few customers, Some bonus Products, various discount policies and few other tid bits.

I am developing an Invoicing Program with Oracle Forms 6i, Normal form (not web based since my lack of knowledge in web base programs). On the Form Init (new form instance), I load the Item table, Regular Price Table, Bonus Products Table, Special Party, Sale Stop Products Table....all in the clients memory, as I create several form level populated Record Groups to hold the data.

When the user enters an Item code, it is verified (looked-up) against the memory based Record Groups, I did develop a binary searching module to search the record group, as the group is sorted on Item Codes. Since the No of products is very few (< 300). It is working quite ok.

My point is, in the whole program, I never looked up at the database for the name of a product, as everything is in the memory already, as memory is getting cheaper, we have 256MB ram in each Client PCs, so pre-loading some 10-20k data in the memory does not seem to cause much trouble, Only trouble is when one or two new products are entered, the user gets an Invalid Item Code, they have a magin button at the corner of the form saying "Refresh Code", and they are ok again.

I do all the look up for Product Name, Bonus Infos, Special Prices in the Memory with Binary search as most of this are static for a period of time(Like a Month), and is is refreshed when the user run the program and very few time the user needs to refresh.

Only thing I look up in the database in the course of the entry system, is the STOCK of a product, off course.

As I thought, I do least required database access, least network round trips, faster response at the user end...blah blah blah. I do call a stored procedure to get the STOCK, by the way.

I was wondering if theres anything seriously wrong with the approach.

Towhidun Nabi.
IT Executive
Aristopharma Ltd.
Dhaka
Bangladesh.

Followup   May 11, 2007 - 9am Central time zone:

you still NEED to have the database constraints

so you have data integrity.

there is nothing inherently wrong with client side validation - it is just a fact that when you post to the database you MUST validate the data, using the database

else you will have data integrity problems.
5 stars Re: Memory Lookups vs. Database Lookups   May 12, 2007 - 5am Central time zone
Reviewer: Towhidun Nabi from Dhaka, Bangladesh
Dear Sir,

Thank you for your response. Don't get me wrong, I am very much a fan of this site, of you. This is by so far, one of the best, and genuine site, only it is kind of not easy for new - comers.

I do have database constraints on the database as foreign keys, So it is not possible for the user to enter an invalid item code in the database, as when we post the data, it is validated against the foreign keys.

Only I try to minimize the chance of invalid data before even posting the data, as the Pre-loaded client side Record Groups (sort of arrays) are usually valid, chance of data being invalidated after data load is very very minimal. There is the chance of missing code, that is often minimized, as the client usualy gets a message before hand as New Products are launching, All they have to do is refresh Code, or re-enter the program again, and this is only one / two times per whole month.

So, I confidently use memory lookups for all the client level validation, as I know it will be validated against the Database upon data POST / insert /update.

For queries and report, Most of time I try my level best NOT TO JOIN for complex queries, but to supply data from Memory.

So Instead of

Select i.prod_code, p.prod_name, c.customer_name,
  .......
  .. 
  ..
  ..
from 
  Issue_Table i,
  Product_table p,
  Customer_table c,
  Market_table m,
  Territorie_table t,
  sr_names sr, -- delivery personal
  ..
  .
where i.inv_date >= to_date('01-MAY-2007', 'dd-MON-rrrr')
and i.inv_date < to_date('01-JUN-2007', 'dd-MON-rrrr')
and i.prod_code = p.prod_code
and i.cust_code = c.cust_code
and c.market_code = m.market_code
and m.territorie_code = t.territorie_code
and i.sr_code = sr.sr_code
...
...


I simply do


show_status('Initializing...');

rg_id := find_group('Query_Data');
if not id_null(rg_id) then
  delete_group(rg_id);
end if;

rg_id := create_group('Query_Data');
gc_id := add_group_column(rg_id, 'Prod_Code', CHAR_COLUMN, 9);
gc_id := add_group_column(rg_id, 'Prod_Name', CHAR_COLUMN, 50);
gc_id := add_group_column(rg_id, 'Customer_Name', CHAR_COLUMN, 100);

gc_id := add_group_column(rg_id, 'Market_Name', CHAR_COLUMN, 100);

gc_id := add_group_column(rg_id, 'Territory_MR_Name', CHAR_COLUMN, 100);

gc_id := add_group_column(rg_id, 'SR_Name', CHAR_COLUMN, 100);
..
..
..
all the fields necessary.....
..
..

show_status('Loading Issue Data...');

rg_issue := find_group('issue');
if not id_null(rg_issue) then
  delete_group(rg_issue);
end if;

M_Issue_Table := 'Issue_Table';
-- change issue table name for history data
if history_data then
  M_Issue_Table := 'history_Issue_Table_' || month_year;
end if;

rg_issue := create_group_from_query('issue',
  'select * from ' || M_Issue_Table || ' '
  || 'where inv_date >= to_date(''' || M_StartDateStr || ''', ''dd-MON-rrrr'') '
  || 'and inv_date < to_date(''' || M_EndDateStr2 || ''', ''dd-MON-rrrr'') '
  || 'order by invoiceno', FORM_SCOPE, <b>100</b>);

if id_null(rg_issue) then
  message('Error: ' || dbms_error_text);
 ..
  raise form_trigger_failure;
end if;
 
M_ErrCode := populate_group(rg_issue);

M_NoOfIssue := get_group_row_count(rg_issue);

M_No_OF_Query_Data := 0;

for i in 1..M_NoOfIssue loop
  if mod(i, 100) = 0 then
    show_status('Processing Issue Data...' || i || ' of ' || M_NoOfIssue);
  end if;

  M_Prod_Code := get_group_char_cell('issue.prod_code', i);
  M_Prod_Name := ' ';

  M_Pos := bin_search_char('Products', M_NoOfProducts, M_Prod_Code);
  if M_Pos <> 0 then
    M_Prod_Name := get_group_char_cell('Products.prod_name', M_Pos);
  else
<b>    show_error ('Product Code: ' || M_Prod_Code || ' NOT found, may need to refresh');</b>
   return;
  end if; 

  .. Territories
...


... SR Name

  .. Market Name..


  M_No_OF_Query_Data := M_No_OF_Query_Data + 1;
  add_group_row('Query_Data', End_OF_Group);

  set_group_char_cell('query_data.prod_code', M_No_OF_Query_Data, M_Prod_Code);

  set_group_char_cell('query_data.prod_name', M_No_OF_Query_Data, M_Prod_Name);

  .. market name

  .. territorie name

  .. sr name

  .. delivery schedule

  ....  
end loop;

-- finaly we've got query_data to work with
-- just send it to the report as data_parameter


A temporary non-query client-side record group is created, and the data is populated through loop/binary search. I never saw that error: Prod code ... NOT found, may need to refresh, as it is refreshed on form init(new form instance). I asked the user to sue me (or click refresh) when they see it.

I could have done it with a global temporary table, but again, it would disturb the database as we have to select it from again. Inserting into temporary table, still required SQL <=> PLSQL switching, rollback segment(I don't know for sure, may be), network round trip(off course).

As I thought, I am issueing a very simple, straight forward
'select * from Issue_Table where inv_date between start_date and end_date', in the whole report, Inv_date is indexed, where as it should have been a very heavy loaded query with 9 to 10 joines, most of then with (+) sign, for safety from inner/outer join, as I am sure all the data in issue_table in the date range, not something is missing as a result of missing sr_code as not all invoices will have a sr_code.

The 'Issue_Table' is variable, the user may want to view history data, every month, we put all the data into the history with MONTH_YEAR as suffix.

And also, I try to avoid

 
  for rec in (select * from table) loop
    ...
  end loop;

as I have seen if the table have 1000 data, there will be 1000 fetchs, 1000 PLSQL to SQL switch, and eventually 1000 network round trips.

So I try to do Record Groups with 100 Records, so there will be about 1 PLSQL to SQL switch, 10(more or less, depending on the size of the records) network round trips
for 1000 record.

I also have a show_status(msg) to see where I am now running...99.99% it shows loading data, only a very fraction of time (< 1%) it shows Processing...

I don't know, you guys would be laughing or making fun, of what I am doing, doing it the complex way..

As a first thought, I am doing a lot of work just to speed things up.

But I am happy for the outcome, as I think my hard work came to play when I see a report running and so. All my life I tried to minimize Network Round Trips, I don't want to select the same table twice in a same Report, and whenever I could pre-load a table, I do it.

I am not an Oracle/DBA expert, but I learned my lessons through practical experiences. If Oracle can do everything, what will I do????


Towhidun Nabi.
IT Executive
Aristopharma Ltd.
Dhaka
Bangladesh.
naeeym@hotmail.com


Followup   May 14, 2007 - 12pm Central time zone:

and as I said - there is nothing wrong with doing client side validation

you just have to do it in the database as well - and you seem to be saying you do that

so - you are all good.
4 stars load small tables to memory   July 8, 2007 - 2pm Central time zone
Reviewer: Alex Callaghan 
Hi

In your first response (almost 5 years ago), you said

"Did you know that databases were born to be queried? It is one of the things they do best."

Is that still true now days?

Look what Oracle PErformance Tuning Guide suggests (both 9i and 10g):

If possible, avoid requerying the same data, by caching frequently accessed data in the client program or middle tier.

Oracle 10g Performance tuning Course (Oracle University):

Data blocks and header blocks can be accessed by only one process at time. Several process trying to access the same blocks will create contention.


I am pointing this out because I have had several experience with database approach where you have packages querying small lookup table (from small as 10 rows to a couple hundred of rows) all the time in batch jobs and online users (more than 5000 connected users and active from 100 to 200). There were quite a bit problems with hot blocks, when the tables were load to pl/sql arrays the problems went away.

May be I come from c programming world but I do think if you can load things up to memory (and not shared) is a good way to improve things....




Followup   July 8, 2007 - 6pm Central time zone:

how about this:

JUST JOIN

why are you procedurally looking stuff up when you can JUST JOIN, JUST JOIN

databases where born to query
they were born to join
and they permit you to write to them as well (and are very good at doing that)

stop writing procedural code, start writing SQL.
2 stars PL/SQL Table question   August 2, 2007 - 11am Central time zone
Reviewer: A reader from Arlington, VA
I hope you can help.  I am trying to load a PL/SQL table with some data and I want it to be ordered 
by one of the columns.  My example is below that does not work.

create table pet_preference (sid varchar2(1), person_name varchar2(10), pet_type varchar2(10));
ALTER TABLE pet_preference 
   ADD (CONSTRAINT person_name_PK PRIMARY KEY (person_name));
   
insert into pet_preference values ('1', 'lisa', 'cat');
insert into pet_preference values ('2', 'ken', 'dog');
insert into pet_preference values ('3', 'jeff', 'fish');
insert into pet_preference values ('4', 'steve', 'bird');
commit;

set serveroutput on size 1000000
declare
      TYPE pet_t IS TABLE OF pet_preference%ROWTYPE
         INDEX BY pet_preference.person_name%type;
      pet pet_t;
   BEGIN
      for pet_rec in (select * from pet_preference)
      loop
         pet (pet_rec.sid) := pet_rec;
      end loop;  
      for indx1 in pet.first .. pet.last
      loop
         dbms_output.put_line (indx1||' '||pet(indx1).person_name|| ' '||pet(indx1).pet_type);
      end loop;      
   end;
The output is:

1 lisa cat
2 ken dog
3 jeff fish
4 steve bird

but what I want is:

3 jeff fish
2 ken dog
1 lisa cat
4 steve bird

Can you please tell me what I am doing wrong?  I have searched the site and Oracle documentation 
but can't seem to find anything specific to this issue.  Any help would be greatly appreciated.  
Thank you in advance.  This is a great site and you are a great help to many people!


Followup   August 5, 2007 - 12pm Central time zone:

umm, arrays are - arrays

I don't see the logic in having "jeff fish" assigned to slot number 3 - why not slot #1


use "select * from pet_preferences ORDER BY SOMETHING"

to retrieve the data sorted and then print it out in order.
1 stars   August 5, 2007 - 6pm Central time zone
Reviewer: a reader 
The order by did not work as I had tried it.  I printed out the output that I wanted.  The first 
column is an id that gets loaded.  The point was that I want them sorted by the name not the id 
which is what is happening no matter I try,.


Followup   August 6, 2007 - 11am Central time zone:

they are in an array.

if you want them sorted in that array - you SELECT THEM OUT ORDERED AND PLACE THEM INTO THE ARRAY SORTED

Or, you SORT THE ARRAY.

Look at your code:

declare
      TYPE pet_t IS TABLE OF pet_preference%ROWTYPE
         INDEX BY pet_preference.person_name%type;
      pet pet_t;
   BEGIN
      for pet_rec in (select * from pet_preference)
      loop
         pet (pet_rec.sid) := pet_rec;
      end loop;  
      for indx1 in pet.first .. pet.last
      loop
         dbms_output.put_line (indx1||' '||pet(indx1).person_name|| ' '||pet(indx1).pet_type);
      end loop;      
   end;


you retrieve the data UTTERLY RANDOM - so, just put an ORDER BY on that query and you are *done*

I'm not getting it.

The order by *does work*, but you'll get as output:

1 jeff fish
2 ken dog
3 lisa cat
4 steve bird


which is just dandy since that 1,2,3,4 are array indices and rather *meaningless* (since you don't have an order by - the data is returned to you from the database in any order we fell like - not in the order of insertion or anything like that - so your random assignment of 1,2,3,4 to the rows is just that - rather random)

Just use ORDER BY and you are *done*

If you claim "order by doesn't work" we'll need to see your attempt so we can correct it.


the first column you are printing IS NOT AN ID, it is an index into an array, randomly assigned to the data each time it is fetch, it is not persistent data, it is rather meaningless.
5 stars arrays of different sizes   February 2, 2008 - 9pm Central time zone
Reviewer: Kevin from CT USA
Nice thread, I keep on refering to it.  Recently had this question come along and I dont' have the 
indepth to answer it.  Here it is:

What are the differences to expect (memory use?) between these two code snippets, recognizing that 
the only difference between themthe size of the varchar2 item at the root of each associative 
array?

declare
   ta is table of varchar2(2000) index by binary_integer;
   aa ta;
   rc1 sys_refcursor;
begin
   open rc1 for select object_name from dba_objects;
   fetch rc1 bulk collect into aa;
   close rc1;
end;


declare
   ta is table of varchar2(32000) index by binary_integer;
   aa ta;
   rc1 sys_refcursor;
begin
   open rc1 for select object_name from dba_objects;
   fetch rc1 bulk collect into aa;
   close rc1;
end;

I am wondering if there is benefit in defining multiple arrays in a particilar utility package I am 
working on, and then selecting the array that best fits the size of the data element I am working 
with.

OR

If one large array would be good for all purposes.  The code is simpler to write with just one 
array definition, but not worth it if there are extreme memory requirements and/or performance 
changes when using arrays defined with wide string definitions.

Thanks, Kevin


Followup   February 4, 2008 - 3pm Central time zone:

large varchar2 sizes are dynamically allocated to "best fit", they are not preallocated.

this shows they are just about the same.

ops$tkyte%ORA10GR2> connect /
Connected.
ops$tkyte%ORA10GR2> set echo on
ops$tkyte%ORA10GR2> declare
  2     type ta is table of varchar2(2000) index by binary_integer;
  3     aa ta;
  4     rc1 sys_refcursor;
  5  begin
  6     open rc1 for select object_name from dba_objects;
  7     fetch rc1 bulk collect into aa;
  8     close rc1;
  9  end;
 10  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> @mystat pga
ops$tkyte%ORA10GR2> set echo off

NAME                        VALUE
---------------------- ----------
session pga memory         978516
session pga memory max    5762644

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> connect /
Connected.
ops$tkyte%ORA10GR2> set echo on
ops$tkyte%ORA10GR2> declare
  2     type ta is table of varchar2(32000) index by binary_integer;
  3     aa ta;
  4     rc1 sys_refcursor;
  5  begin
  6     open rc1 for select object_name from dba_objects;
  7     fetch rc1 bulk collect into aa;
  8     close rc1;
  9  end;
 10  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> @mystat pga
ops$tkyte%ORA10GR2> set echo off

NAME                        VALUE
---------------------- ----------
session pga memory         978516
session pga memory max    5697108


5 stars Just what I was looking for   February 5, 2008 - 7am Central time zone
Reviewer: Kevin from CT USA
Sorry, I should have checked the memory allocations myself (doh!).

Kevin


3 stars   February 14, 2008 - 2pm Central time zone
Reviewer: Christian 
I like to process everything in a loop in pl/sql - is this a good programming habit ?

Christian


Followup   February 15, 2008 - 7am Central time zone:

not if it could have been done in a single sql statement.

bad:


for x in  (select * from t)
loop
    x.c1 := lower(x.c1);
    update t set c1 = x.c1 where pk = x.pk;
end loop;


good:

update t set c1 = lower(c1);



for a really simple example.

Never write procedural code

Until you have to
5 stars Hard-coded CASE vs SQL lookup   March 21, 2008 - 7pm Central time zone
Reviewer: Sunj from NJ
Tom,
A very belated thanks (1 year old) for the response last year on March 21, 2007.

Your Associative Array solution was definitely the best.
I guess it was faster than SQL solution, because of lack of context switch between SQL and PL/Sql 
engines. Am I right?

Here's my experience:
1. SELECT col1 FROM my_table;
2. SELECT col1, my_function(col1) FROM my_table;
3. Enclose SQL#2 within dynamic SQL.

#1 was obviously the fastest
#2 was slowest
#3, even though is the same as #2, it performed much faster, because it was running fully within 
PL/SQL engine.


Followup   March 24, 2008 - 11am Central time zone:

I do not understand #3 at all


but I would add #4

select my_table.col1, my_lookup.col2 from my_table, my_lookup where <join>

as being the fastest of them all. (as stated by my time and time again above :) )
3 stars Fetch values from pl_sql table using column index instead of column_names   April 7, 2008 - 4pm Central time zone
Reviewer: Maverick 
Tom, I have been trying to do something with pl/sql Tables and got stuck at one place.

Here is the scenario:

  declare
      Type mytype is table of emp%rowtype;
      mytype1 mytype;
      k integer:=0;
      v_sql varchar2(4000):='Select * from emp ';
  begin
      execute immediate v_sql bulk collect into mytype1;

      for i in 1..mytype1.count
      loop
         dbms_output.put_line(mytype1(i).empno);
      end loop;
  end;


My problem is while outputting the data using DBMS_OUTPUT, is there any way to reference array with an index for column name instead of actual column name
like
dbms_output.put_line(mytype1(i).1stcolumn||','||mytype1(i).2ndcolumn);
?

Please don't ask me why I want to do this way , but I am interested to see if there is a way to do that?
oh..btw, I am using Oracle 10g [10.2.0] if it helps.

Thanks

Followup   April 9, 2008 - 9am Central time zone:


Please don't ask me why I want to do this way

fine, I won't - but that just means I won't be able to do anything useful for you. If I knew what your GOAL was, I might be able to suggest something useful.

But since you just want me to be a "answer my question and nothing but the question" guy - so be it.


The answer is "nope"
2 stars I was just comparing different languages   April 9, 2008 - 9am Central time zone
Reviewer: Maverick 
Tom, I was just trying to do something [I did it using PERL]in PL/SQL with Arrays. In PERL I can 
loop through a row and seperate each column of row and manipulate any way I want to. I don't have 
to know what are the actual columns in it. 
I was just wondering if we can do the same in PL/SQL.

Trying to see differences between pl/sql and PERL or other programming languages.

So, I can still take the answer as no. correct?


Followup   April 9, 2008 - 2pm Central time zone:

correct, the answer is no, plsql is a static compiled language.
3 stars RE: just comparing different languages   April 9, 2008 - 11am Central time zone
Reviewer: Duke Ganote from Amelia (named in honor of Amelia Bowdoin, not Earhart), Ohio USA
I think Tom's asking "what is your goal" because PL/SQL is an auxiliary to SQL. Hence his mantra:
  if you can do it in a single sql, do it.
  if not do it in as little plsql as possible...
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:5408706816128#30690569723524

Just as there's an "impedence mismatch" between "object oriented design" and "relational design", there's a similar mismatch between PERL (which seems to procedural and string-oriented), and SQL (which is set- and database-oriented).

There are similar mismatches between folks who say "In SQL Server I use local temp tables a lot, so how do I do use local temp tables in Oracle?" The answer is: temp tables are a crutch, don't use local temp tables in Oracle, there are better Oracle constructs for achieving your goal; e.g.
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1164655862293


2 stars   April 9, 2008 - 1pm Central time zone
Reviewer: Maverick 
Dear Duke Ganote , I think we are talking two different things here.you are talking about Database 
or SQL [structured query language], but I am talking about Programming language [PL/SQL]. so, we 
can compare Pl/SQL with PERL or C or C# etc..[I might be wrong there!!]

But I was just trying to see if there is a way to do this in PL/SQL. and Tom, kind of mentioned we 
cannot [I guess].

Thanks 

Thanks,


3 stars RE: compare Pl/SQL with PERL or C or C# etc   April 9, 2008 - 2pm Central time zone
Reviewer: Duke Ganote from Amelia (not Bedelia), Ohio USA
Maverick-- It just reminds my of one of my early software classes where I learned LISP, SNOBOL, SIMSCRIPT and a mathematical symbol manipulation package (it would manipulate differential equations) in one quarter. Each had its strengths and weaknesses, but they weren't usually comparable in functionality.

I did annoy my professor by using LISP (a list processing language used for artificial intelligence) to calculate some outrageously big Fibonacci number. LISP did it, but not very efficiently, and I think he was annoyed at how much CPU I'd chewed up overnight on the university's IBM 360 mainframe :)


Write a Review
 


All information and materials provided here are provided "as-is"; Oracle disclaims all express and implied warranties, including, the implied warranties of merchantability or fitness for a particular use. Oracle shall not be liable for any damages, including, direct, indirect, incidental, special or consequential damages for loss of profits, revenue, data or data use, incurred by you or any third party in connection with the use of this information or these materials.

About Oracle | Legal Notices and Terms of Use | Privacy Statement