I think this was published in Oracle Professional
Scott Watson, September 16, 2002 - 2:46 pm UTC
I believe I read this in Oracle Professional written by Steve Feuerstein. Here is a link to the article. Basically it is for 9iR2 and uses associative arrays.
http://www.oracleprofessionalnewsletter.com/op/OPmag.nsf/0/F2E754B65828DEF985256C1A0056DF3F
I can rationalize this article by saying that the PL/SQL engine does not have to perform a context switch to the SQL layer. However, this method requires that each connection maintain its own lookup table, which may require significantly more memory if the number of users taking this approach is large. Like any good solution, it requires testing in your environment etc. etc.
HTH
Scott.
September 16, 2002 - 8:13 pm UTC
In 9iR2 (not 817 and not before) this may be true with the new associative arrays which are b-tree indexed RAM data structures (very efficient for searching). This is "brand new" at this point. A quick and dirty test will show that they are in fact faster and probably more scalable but at the cost of RAM (tradeoffs as usual).
This will NOT work in 817 (note to the original questioner, this will not work for you)
ops$tkyte@ORA920.US.ORACLE.COM> create table t ( code varchar2(5) constraint t_pk primary key, data varchar2(40) )
2 organization index
3 /
Table created.
ops$tkyte@ORA920.US.ORACLE.COM> exec gen_data( 'T', 5000 );
PL/SQL procedure successfully completed.
ops$tkyte@ORA920.US.ORACLE.COM> select count(*) from t;
COUNT(*)
----------
5000
ops$tkyte@ORA920.US.ORACLE.COM>
ops$tkyte@ORA920.US.ORACLE.COM> declare
2 type lookupTable is table of varchar2(40) index by varchar2(5);
3 type indexByTable is table of varchar2(40) index by binary_integer;
4
5 l_lookupTable lookupTable;
6 l_codes indexByTable;
7 l_values indexByTable;
8 l_value varchar2(40);
9
10 l_start number;
11 l_run1 number;
12 l_run2 number;
13 begin
14 select code, data BULK COLLECT into l_codes, l_values from t;
15
16 insert into run_stats select 'before', stats.* from stats;
17
18 l_start := dbms_utility.get_time;
19 for i in 1 .. l_codes.count
20 loop
21 select data into l_value from t where code = l_codes(i);
22 end loop;
23 l_run1 := (dbms_utility.get_time-l_start);
24 dbms_output.put_line( l_run1 || ' hsecs' );
25
26 insert into run_stats select 'after 1', stats.* from stats;
27 l_start := dbms_utility.get_time;
28
29
30 for i in 1 .. l_codes.count
31 loop
32 l_lookupTable( l_codes(i) ) := l_values(i);
33 end loop;
34 for i in 1 .. l_codes.count
35 loop
36 l_value := l_lookupTable( l_codes(i) );
37 end loop;
38 l_run2 := (dbms_utility.get_time-l_start);
39 dbms_output.put_line( l_run2 || ' hsecs' );
40 dbms_output.put_line
41 ( 'run 1 ran in ' || round(l_run1/l_run2*100,2) || '% of the time' );
42
43 insert into run_stats select 'after 2', stats.* from stats;
44 end;
45 /
74 hsecs
3 hsecs
run 1 ran in 2466.67% of the time
PL/SQL procedure successfully completed.
ops$tkyte@ORA920.US.ORACLE.COM> select a.name, b.value-a.value run1, c.value-b.value run2,
2 ( (c.value-b.value)-(b.value-a.value)) diff
3 from run_stats a, run_stats b, run_stats c
4 where a.name = b.name
5 and b.name = c.name
6 and a.runid = 'before'
7 and b.runid = 'after 1'
8 and c.runid = 'after 2'
9 and (c.value-a.value) > 0
10 and (c.value-b.value) <> (b.value-a.value)
11 order by abs( (c.value-b.value)-(b.value-a.value))
12 /
NAME RUN1 RUN2 DIFF
------------------------------ ---------- ---------- ----------
STAT...recursive calls 5031 50 -4981
STAT...calls to get snapshot scn: kcmgss 5010 16 -4994
STAT...execute count 5005 7 -4998
STAT...index fetch by key 5000 1 -4999
LATCH.shared pool 5127 123 -5004
LATCH.cache buffers chains 10167 291 -9876
STAT...session logical reads 10056 103 -9953
STAT...consistent gets 10012 24 -9988
STAT...consistent gets - examination 10000 2 -9998
LATCH.library cache pin 10105 101 -10004
LATCH.library cache 10258 248 -10010
STAT...session pga memory 0 476412 476412
62 rows selected.
ops$tkyte@ORA920.US.ORACLE.COM>
All in all, if you've got the RAM and you've got 9iR2, this makes sense. Before then -- I question it.
Tried and Tested
Adrian Billington, September 16, 2002 - 3:58 pm UTC
Tom
Initialising package global index-bys with data from lookup tables, using the numeric-PK as the index (can be varchar2 now with associative arrays of course) is a tried and tested technique of returning lookup values much quicker than repeated database access. I took an hour off a one hour fifteen minutes INSERT using this method and I have a couple of demos "proving" them to be quicker (both methods have also been through your run_stats harness as well to great effect).
I am off work at the moment so can't get to the demos, but if no-one beats me to it, then I'll post something next week.
The SF article in Oracle Professional this month takes it from an associative array context, but I already have a demo of that as well and again, they are much more efficient than repeated lookup table access.
BUT, the only thing I have not tested against is repeated table access with a BUFFER POOL KEEP / KEEP CACHE, so maybe that's for someone else to try and report on. But hey, SQL can't ALWAYS win, surely ;).
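For anyone who wants to run that untested comparison, a minimal sketch of the KEEP-cache setup (the table and index names here are hypothetical, and a KEEP pool must already be sized via the buffer_pool_keep / db_keep_cache_size parameter):

```sql
-- Pin the lookup table's blocks (and its PK index) in the KEEP buffer pool,
-- so repeated SQL lookups tend to be logical rather than physical reads.
alter table lookup_table   storage ( buffer_pool keep );
alter index lookup_table_pk storage ( buffer_pool keep );
```

Then rerun the same run_stats harness with repeated single-row lookups against lookup_table.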
Regards
Adrian
September 16, 2002 - 8:17 pm UTC
I betcha using a single INSERT -- removing the procedural code all together -- would get rid of the rest of the time.
I rarely see codes that are 1, 2, 3, 4, 5, .... N -- making the index by tables of somewhat limited use. It is usually something like
AK Alaska
VA Virginia
...
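In 9iR2, character codes like those can index the array directly -- a minimal sketch:

```sql
declare
    -- 9iR2+ associative array keyed by the two-character state code itself
    type stateTab is table of varchar2(30) index by varchar2(2);
    l_states stateTab;
begin
    l_states('AK') := 'Alaska';
    l_states('VA') := 'Virginia';
    dbms_output.put_line( l_states('AK') );  -- Alaska
end;
/
```

Before 9iR2 (index by binary_integer only), you would need a numeric surrogate for 'AK' -- which is exactly why the technique was of limited use with codes like these.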
What about the other way around
Scott Watson, September 17, 2002 - 8:46 am UTC
Hi Tom,
I was thinking that performing lookups in PL/SQL is faster than doing the same lookup in SQL, but what about the other way around? (I don't have 9i so I cannot test this myself.) What would be faster in this case?
Select a.col1, b.col2
from a, b
where a.id = b.a_id;
or
select a.col1, lookup_in_pl_sql(a.id)
from a
Thanks,
Scott.
September 17, 2002 - 12:34 pm UTC
Select a.col1, b.col2
from a, b
where a.id = b.a_id;
Try it and see.
That is not what I meant
Scott watson, September 17, 2002 - 1:31 pm UTC
Funny!!!
I know I can do the table lookup, but what about the lookup using a PL/SQL function, in terms of the performance we have been discussing? We have already asserted that these types of lookups are faster when executed within PL/SQL than in SQL, but what about when we access PL/SQL from our SQL statement?
select name_en, convert_to_french(name_en) PLSQL_FUNCTION
from tableA
This is what I was referring to when I said I did not have 9i and could not test it.
September 17, 2002 - 8:03 pm UTC
well, you could have tested with an INDEXED BY BINARY_INTEGER table (eg: number to string lookup)....
but here you go -- sql will be FASTER here:
ops$tkyte@ORA920.LOCALHOST> /*
DOC>drop table t;
DOC>
DOC>create table t ( code varchar2(5) constraint t_pk primary key, data varchar2(40) )
DOC>organization index
DOC>/
DOC>
DOC>exec gen_data( 'T', 5000 );
DOC>
DOC>select count(*) from t;
DOC>
DOC>pause
DOC>*/
ops$tkyte@ORA920.LOCALHOST>
ops$tkyte@ORA920.LOCALHOST>
ops$tkyte@ORA920.LOCALHOST> create or replace package demo_pkg
2 as
3 function get_value( p_code in varchar2 ) return varchar2;
4 end;
5 /
Package created.
ops$tkyte@ORA920.LOCALHOST> create or replace package body demo_pkg
2 as
3 type indexByTable is table of varchar2(40) index by binary_integer;
4 type lookupTable is table of varchar2(40) index by varchar2(5);
5
6 l_lookupTable lookupTable;
7
8 function get_value( p_code in varchar2 ) return varchar2
9 as
10 l_codes indexByTable;
11 l_values indexByTable;
12 begin
13 if nvl( l_lookupTable.count,0 ) = 0
14 then
15 dbms_output.put_line( 'filling table' );
16 select code, data BULK COLLECT into l_codes, l_values from t;
17 for i in 1 .. l_codes.count
18 loop
19 l_lookupTable( l_codes(i) ) := l_values(i);
20 end loop;
21 end if;
22 return l_lookupTable( p_code );
23 exception
24 when no_data_found then
25 raise program_error;
26 end;
27 end;
28 /
Package body created.
ops$tkyte@ORA920.LOCALHOST>
ops$tkyte@ORA920.LOCALHOST>
ops$tkyte@ORA920.LOCALHOST> declare
2 l_start number;
3 l_run1 number;
4 l_run2 number;
5 begin
6 insert into run_stats select 'before', stats.* from stats;
7
8 l_start := dbms_utility.get_time;
9 for x in ( select a.code, b.data from t a, t b where a.code = b.code )
10 loop
11 null;
12 end loop;
13 l_run1 := (dbms_utility.get_time-l_start);
14 dbms_output.put_line( l_run1 || ' hsecs' );
15
16 insert into run_stats select 'after 1', stats.* from stats;
17 l_start := dbms_utility.get_time;
18
19
20 for x in ( select a.code, demo_pkg.get_value( a.code ) from t a )
21 loop
22 null;
23 end loop;
24 l_run2 := (dbms_utility.get_time-l_start);
25 dbms_output.put_line( l_run2 || ' hsecs' );
26 dbms_output.put_line
27 ( 'run 1 ran in ' || round(l_run1/l_run2*100,2) || '% of the time');
28
29 insert into run_stats select 'after 2', stats.* from stats;
30 end;
31 /
23 hsecs
filling table
65 hsecs
run 1 ran in 35.38% of the time
PL/SQL procedure successfully completed.
ops$tkyte@ORA920.LOCALHOST>
ops$tkyte@ORA920.LOCALHOST>
ops$tkyte@ORA920.LOCALHOST> select a.name, b.value-a.value run1, c.value-b.value run2,
2 ( (c.value-b.value)-(b.value-a.value)) diff
3 from run_stats a, run_stats b, run_stats c
4 where a.name = b.name
5 and b.name = c.name
6 and a.runid = 'before'
7 and b.runid = 'after 1'
8 and c.runid = 'after 2'
9 and (c.value-a.value) > 0
10 and (c.value-b.value) <> (b.value-a.value)
11 order by abs( (c.value-b.value)-(b.value-a.value))
12 /
NAME RUN1 RUN2 DIFF
------------------------------ ---------- ---------- ----------
....
LATCH.cache buffers chains 25135 10515 -14620
LATCH.library cache pin 95 30215 30120
LATCH.shared pool 112 35210 35098
LATCH.library cache 236 50445 50209
STAT...session uga memory 0 589176 589176
STAT...session pga memory -131072 589824 720896
70 rows selected.
ops$tkyte@ORA920.LOCALHOST>
Also, I'll REVISE my earlier answer.
Instead of doing it PROCEDURALLY, lets do the lookup where it belongs! In SQL in the first place. EG: I would not be caught reading a table and then doing another query to LOOKUP data in the first place. Only if I was doing some data load would I do that -- else, I would have JOINED in the first place...
What I mean by that -- is that the basic PREMISE that you should
for x in ( select * from t1 )
loop
select value into l_var from t2 where t2.code = t1.code;
.....
is flawed in the FIRST place, you should have JOINED (or selected a select)
Now that will outperform using an index by table (associative array or not)...
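A hedged sketch of those two set-based forms, with hypothetical tables t1 (driving) and t2 (lookup) -- either replaces the per-row SELECT INTO entirely:

```sql
-- the join form (outer join if a code may be missing from the lookup):
select t1.*, t2.descr
  from t1, t2
 where t1.code = t2.code(+);

-- the "select a select" (scalar subquery) form:
select t1.*,
       ( select t2.descr from t2 where t2.code = t1.code ) descr
  from t1;
```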
Strange behaviour...
Padders, September 18, 2002 - 4:31 pm UTC
Since when did collection.COUNT return NULL for empty collection? I think you'll find you never even populate your collection.
This raises another issue (and I am testing on 9.2, btw) - if the collection is never populated, RETURN l_lookuptable (p_code) should raise NO_DATA_FOUND, and yet we get no error from the test script, although if we call the function from PL/SQL, we get NO_DATA_FOUND raised as expected, e.g.
SQL*Plus: Release 9.2.0.1.0 - Production on Wed Sep 18 21:12:43 2002
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.1.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.1.0 - Production
SQL> SELECT demo_pkg.get_value ('Rubbish')
2 FROM dual;
DEMO_PKG.GET_VALUE('RUBBISH')
---------------------------------------------------------------------
SQL> BEGIN
2 DBMS_OUTPUT.PUT_LINE (demo_pkg.get_value ('Rubbish'));
3 END;
4 /
BEGIN
*
ERROR at line 1:
ORA-01403: no data found
ORA-06512: at "SCOTT.DEMO_PKG", line 28
ORA-06512: at line 2
Bizarrely, it has nothing to do with collection, since we can create function which just raises NO_DATA_FOUND and get the same behaviour, e.g.
SQL> CREATE OR REPLACE FUNCTION ndf
2 RETURN VARCHAR2
3 IS
4 BEGIN
5 RAISE NO_DATA_FOUND;
6 END;
7 /
Function created.
SQL> SELECT ndf
2 FROM dual;
NDF
----------------------------------------------------------------------
SQL> BEGIN
2 DBMS_OUTPUT.PUT_LINE (ndf);
3 END;
4 /
BEGIN
*
ERROR at line 1:
ORA-01403: no data found
ORA-06512: at "SCOTT.NDF", line 5
ORA-06512: at line 2
Note that the same does not appear to apply to other exceptions, and no I haven't tried them all, but VALUE_ERROR, for example, appears to work as expected, e.g.
SQL> CREATE OR REPLACE FUNCTION ve
2 RETURN VARCHAR2
3 IS
4 BEGIN
5 RAISE VALUE_ERROR;
6 END;
7 /
Function created.
SQL> SELECT ve
2 FROM dual;
SELECT ve
*
ERROR at line 1:
ORA-06502: PL/SQL: numeric or value error
ORA-06512: at "SCOTT.VE", line 5
SQL> BEGIN
2 DBMS_OUTPUT.PUT_LINE (ndf);
3 END;
4 /
BEGIN
*
ERROR at line 1:
ORA-01403: no data found
ORA-06512: at "SCOTT.NDF", line 5
ORA-06512: at line 2
Please confirm - have I lost the plot or is this expected behaviour (not the me losing the plot bit, the other bit)?
September 18, 2002 - 6:53 pm UTC
whoops -- corrected that. surprisingly, the results were not materially affected. see above
Still wondering...
Padders, September 19, 2002 - 4:10 am UTC
Thanks for correcting.
But why is NO_DATA_FOUND exception ignored by SQL, and NULL returned instead?
September 19, 2002 - 7:55 am UTC
I guess cause the answer is "unknown"
just like if I
select ( select * from dual where 1=0 ) from dual;
that does not fail, even though the subquery (function) returns NO_DATA_FOUND. or
select * from dual where 1 = ( select 0 from dual where 1=0 )
the subquery returns NULL - well, actually the subquery doesn't return anything, it is "unknown" what it returns. the function is treated in the same way.
Some results...
Adrian Billington, September 19, 2002 - 5:12 am UTC
Tom
I agree with your comment about SQL being faster when the lookup table is available (i.e. just join to it), but I don't think that was ever disputed. We wouldn't write a PL/SQL function to do such simple SQL work in the case of a straight insert. But when you don't have access to that lookup table (i.e. during a data load), the use of an index-by (or extended assoc. array in 9i) beats the hell out of a database-accessing function.
On a simple 10,000 row insert demo, looking up the description from a lookup table for each FK row in the staging table, the results were as follows (consistently - ran about 4 times):-
SQL with join to lookup table........ 0 secs 16 hsecs
PL/SQL function using index-by....... 0 secs 67 hsecs
PL/SQL function using lookup table...10 secs 66 hsecs
Which I guess validates the assertions.
I would have thought that from an OLTP perspective, the use of index-bys for the common lookup tables should be encouraged, to save all those repeated accesses to the numerous lookup tables that OLTP systems seem to have, which is possibly relevant to the original poster...
Regards
Adrian
September 19, 2002 - 7:57 am UTC
Yes, that is exactly what i said tho:
...
Instead of doing it PROCEDURALLY, lets do the lookup where it belongs! In SQL
in the first place. EG: I would not be caught reading a table and then doing
another query to LOOKUP data in the first place. Only if I was doing some data
load would I do that -- else, I would have JOINED in the first place...
.....
What I see CONSTANTLY, continuously and all of the time is code like
for x in ( select * from t )
loop
select ... into ... from t2 where t2.code = t.code;
......
that should be a join, if you see yourself doing that, stop.
During a data load -- you are 100% correct.
(consider an OLTP application a loader and the premise continues ;)
illustrate
Nag, September 19, 2002 - 8:44 pm UTC
"
What I see CONSTANTLY, continuously and all of the time is code like
for x in ( select * from t )
loop
select ... into ... from t2 where t2.code = t.code;
......
"
Tom, can you give us an example and show us how a loop like the above can be converted into SQL
September 20, 2002 - 7:55 am UTC
It is called a join. join t to t2.
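To sketch the rewrite with hypothetical tables t and t2 sharing a code column -- the loop-and-lookup pattern and its single-statement equivalent:

```sql
-- procedural version (the pattern to avoid):
-- for x in ( select * from t ) loop
--     select t2.data into l_data from t2 where t2.code = x.code;
--     ... process x together with l_data ...
-- end loop;

-- the join: one statement, one pass, no per-row context switch
select t.*, t2.data
  from t, t2
 where t.code = t2.code;
```

Use an outer join, t.code = t2.code(+), if some codes may have no match in t2 (the per-row SELECT INTO would have raised NO_DATA_FOUND for those).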
Have a look at this, and tell us what you have to say
Famy, September 26, 2002 - 3:43 pm UTC
this is from quest-pipelines, a very well respected resource...
July's Tip of the Month
Loading a Lookup Table in Memory to Speed Up Queries
Compliments of Dan Clamage, PL/SQL Pipeline SYSOP (danielj@clamage.com)
Recently, I worked with a developer who faced the challenge of speeding up his Pro*C application. It was taking about 2 hours to run. After examining his application, I discovered that he had one table on which every row returned in the driving loop required a random lookup. Furthermore, he could expect nearly every row in this lookup table to be accessed at some point over the life of the program. I instructed him to preload this moderately sized table (~50k rows) into a persistent PL/SQL table. In order to be persistent, the PL/SQL table had to be declared in the package body, outside of any routine:
CREATE OR REPLACE
PACKAGE BODY lookup
IS
TYPE typ_val_data IS TABLE OF tbl_lookup.val_data%TYPE
INDEX BY BINARY_INTEGER;
tab_val_data typ_val_data;
I explained to him that the numeric primary key could be used as the index offset, and the column value he needed could then be stored at that offset. For example:
Lookup Table
OID NUMBER(10) VAL_DATA VARCHAR2(4)
3 'GHGL'
27 'ACBF'
101 'LEFP'
157 'KPOI'
PL/SQL Table (also known as index-by table or array)
Array Offset Array Value
3 'GHGL'
**no data in array elements 4-26**
27 'ACBF'
**no data in array elements 28-100**
101 'LEFP'
**no data in array elements 158+**
157 'KPOI'
This would result in a sparsely populated array. He populated the array up front, as part of his program initialization, using a packaged procedure. This procedure simply read every row in his lookup table in a cursor FOR loop, and used the primary key as the element offset into the PL/SQL table. For example:
PROCEDURE populate_lookup
IS
CURSOR get_all_data IS
SELECT oid, val_data
FROM tbl_lookup;
BEGIN
FOR rec IN get_all_data LOOP
tab_val_data(rec.oid) := rec.val_data;
END LOOP;
END populate_lookup;
Then he provided a packaged function to interface with this array. He could use this function in PL/SQL or in SQL. For example:
FUNCTION get_val_data(p_oid IN tbl_lookup.oid%TYPE)
RETURN tbl_lookup.val_data%TYPE
IS
BEGIN
RETURN(tab_val_data(p_oid));
EXCEPTION
WHEN NO_DATA_FOUND THEN
RETURN(NULL);
END get_val_data;
Inside his main loop, he coded something like this:
INSERT INTO tbl_target (val_data, ...)
VALUES (lookup.get_val_data(fk_oid), ...);
He had to fully qualify the function, since it's being called from SQL. He reported that his program now runs in 20 minutes, a 5x speed improvement.
September 26, 2002 - 4:40 pm UTC
And he could have sped it up 10x more by simply coding:
insert into tbl_target ( val_data, .... )
select max(t.val_data), :bv1, :bv2, :bv3, :b4, ....
from tbl_lookup t
where oid = :bv0;
or 80x faster by using array binds and the above!!!!
Here is the output from my testing (totally consistent, ran it many many times):
ops$tkyte@ORA920.US.ORACLE.COM> !./test
Connected to ORACLE as user: /
Using SQL 2002-09-26 16:38:11.041532
+000000000 00:00:02.851382000
Using plsql 2002-09-26 16:38:13.902126
+000000000 00:00:04.002790000
Using SQL 2002-09-26 16:38:17.915989
+000000000 00:00:00.534158000
ops$tkyte@ORA920.US.ORACLE.COM> select count(*) from t2;
COUNT(*)
----------
9000
ops$tkyte@ORA920.US.ORACLE.COM>
so, SQL -- no PLSQL = 2.8 seconds.
PLSQL = 4 seconds
SQL with array binds of 100 = 0.58 seconds......
Here is my code:
drop table t;
create table t ( oid primary key, val_data )
organization index
as
select rownum oid, object_name val_data from all_objects;
drop table t2;
create table t2 ( val_data varchar2(30), data varchar2(400) );
create or replace package pkg
as
function f( n in number ) return varchar2;
end;
/
create or replace package body pkg
as
type lookup is table of varchar2(30) index by binary_integer;
g_data lookup;
function f( n in number ) return varchar2
is
begin
return g_data(n);
exception
when no_data_found then return null;
end;
begin
for x in ( select * from t )
loop
g_data(x.oid) := x.val_data;
end loop;
end;
/
#include <stdio.h>
#include <string.h>
static char * USERID = "/";
#define SQLCA_INIT
EXEC SQL INCLUDE sqlca;
static void sqlerror_hard( int lineno)
{
EXEC SQL WHENEVER SQLERROR CONTINUE;
printf("\nORACLE error detected (%d):", lineno);
printf("\n% .70s \n", sqlca.sqlerrm.sqlerrmc);
EXEC SQL ROLLBACK WORK RELEASE;
exit(1);
}
static void process1( void )
{
EXEC SQL BEGIN DECLARE SECTION;
varchar data[400];
varchar time[125];
int i;
EXEC SQL END DECLARE SECTION;
EXEC SQL WHENEVER SQLERROR DO sqlerror_hard(__LINE__);
memset( data.arr,'*',400 );
strncpy( data.arr, "process1", strlen("process1") );
data.len = 400;
EXEC SQL SELECT to_char(systimestamp,'YYYY-mm-dd hh24:mi:ss.ff')
INTO :time
FROM DUAL;
printf( "Using SQL %.*s\n", time.len, time.arr );
for( i = 0; i < 3000; i++ )
{
EXEC SQL insert into t2
select max(val_data), :data
from t
where oid = :i;
}
EXEC SQL SELECT SYSTIMESTAMP-to_timestamp(:time,'yyyy-mm-dd hh24:mi:ss.ff') INTO :time FROM DUAL;
printf( "%.*s\n", time.len, time.arr );
EXEC SQL COMMIT;
}
static void process2( void )
{
EXEC SQL BEGIN DECLARE SECTION;
varchar data[400];
varchar time[125];
int i;
EXEC SQL END DECLARE SECTION;
EXEC SQL WHENEVER SQLERROR DO sqlerror_hard(__LINE__);
memset( data.arr,'*',400 );
strncpy( data.arr, "process2", strlen("process2") );
data.len = 400;
EXEC SQL SELECT to_char(systimestamp,'YYYY-mm-dd hh24:mi:ss.ff')
INTO :time
FROM DUAL;
printf( "Using plsql %.*s\n", time.len, time.arr );
for( i = 0; i < 3000; i++ )
{
EXEC SQL insert into t2
values ( pkg.f(:i), :data );
}
EXEC SQL SELECT SYSTIMESTAMP-to_timestamp(:time,'yyyy-mm-dd hh24:mi:ss.ff') INTO :time FROM DUAL;
printf( "%.*s\n", time.len, time.arr );
EXEC SQL COMMIT;
}
static void process3( void )
{
EXEC SQL BEGIN DECLARE SECTION;
varchar data[100][400];
varchar time[125];
int oids[100];
int i;
EXEC SQL END DECLARE SECTION;
EXEC SQL WHENEVER SQLERROR DO sqlerror_hard(__LINE__);
for( i = 0; i < 100; i++ )
{
memset( data[i].arr,'*',400 );
strncpy( data[i].arr, "process3", strlen("process3") );
data[i].len = 400;
}
EXEC SQL SELECT to_char(systimestamp,'YYYY-mm-dd hh24:mi:ss.ff')
INTO :time
FROM DUAL;
printf( "Using SQL with Array binds %.*s\n", time.len, time.arr );
for( i = 0; i < 3000; i++ )
{
oids[ i%100 ] = i;
if ( i%100 == 99 )
{
EXEC SQL insert into t2
select max(val_data), :data
from t
where oid = :oids;
}
}
EXEC SQL SELECT SYSTIMESTAMP-to_timestamp(:time,'yyyy-mm-dd hh24:mi:ss.ff') INTO :time FROM DUAL;
printf( "%.*s\n", time.len, time.arr );
EXEC SQL COMMIT;
}
main( argc, argv )
int argc;
char * argv[];
{
EXEC SQL BEGIN DECLARE SECTION;
VARCHAR oracleid[50];
EXEC SQL END DECLARE SECTION;
strcpy( oracleid.arr, USERID );
oracleid.len = strlen(oracleid.arr);
EXEC SQL WHENEVER SQLERROR DO sqlerror_hard(__LINE__);
EXEC SQL CONNECT :oracleid;
printf("\nConnected to ORACLE as user: %s\n\n", oracleid.arr);
process1();
process2();
process3();
exit(0);
}
Then why use pl/sql tables?
Kashif, October 04, 2002 - 1:14 pm UTC
Hi Tom,
Great discussion, thanks for the invaluable input. Just a quick couple of questions. Firstly, you mention:
"
What I see CONSTANTLY, continously and all of the time is code like
for x in ( select * from t )
loop
select ... into ... from t2 where t2.code = t.code;
......
"
and you mention this is not the correct way of doing lookups. Does this apply when there is the need to outer join the driving table in the main query, i.e.
select t.*
from t, t2
where t.code (+) = t2.code
Is it not more efficient to read the main table and then read the lookup table, and return nulls (using exception when no_data_found then return null) when the lookup table doesn't retrieve anything?
Secondly, your response to the original poster seemed to indicate that pl/sql tables aren't always the best solution when it comes to lookup tables. Can you give examples of when they are the best solution? Thanks in advance.
Kashif
October 04, 2002 - 6:54 pm UTC
I would outer join or select a select (eg:
select dept.*, (select count(*) from emp where emp.deptno = dept.deptno )
from dept
never procedurally loop running a select based on other selects if I can.
A plsql table to do some lookups might make sense when doing a flat file data load and you don't have external tables (eg: pre 9i). In 9i and beyond, maybe NEVER.
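The reason external tables change the picture: the flat file itself becomes joinable, so even the data-load case collapses into a single SQL statement. A hedged sketch (directory path, file name, and column layout are all made up for illustration):

```sql
-- map the flat file as a table (9i+), then simply JOIN to it
create or replace directory load_dir as '/tmp';

create table ext_codes
( code varchar2(5),
  data varchar2(40)
)
organization external
( type oracle_loader
  default directory load_dir
  access parameters
  ( records delimited by newline
    fields terminated by ','
  )
  location ( 'codes.dat' )
);

-- the "load with lookup" is now just:
--   insert into target select e.code, t.data, ... from ext_codes e, t ...
```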
Whoops...
Kashif, October 04, 2002 - 4:32 pm UTC
I meant the outer join should be on the lookup table (t2), i.e. the query would be something like
select t.*
from t, t2
where t.code = t2.code (+)
Wouldn't it be more efficient to simply read the rows needed from the driving table (t) and then do inner selects on the lookup (t2) table as needed? THanks.
Kashif
October 04, 2002 - 7:20 pm UTC
if you can do it in a single query -- do it, it's almost always faster, easier, more maintainable, faster to type, etc etc etc
Decode on select (select)?
Kashif, October 11, 2002 - 12:04 pm UTC
Hi Tom,
Thanks for your tip, it works wonders in my query! Quick question, if I can control when I execute the 'select a select' somehow, for example using a decode, will that be more efficient versus executing the 'select a select' for each row? Example:
select deptno, decode (deptno, null, null, (select deptname from dept where dept.deptno = emp.deptno))
from emp;
versus
select deptno, (select deptname from dept where dept.deptno = emp.deptno)
from emp
assuming the deptno in emp is not always populated. Thanks for your feedback in advance.
Kashif
P.S. Hope you're having fun at the VOUG conference!
October 11, 2002 - 8:37 pm UTC
Well, lets see....
DOC>create table emp ( deptno int );
DOC>
DOC>create table dept as select * from scott.dept;
DOC>alter table dept add constraint dept_pk primary key(deptno);
DOC>
DOC>insert into emp
DOC>select decode( mod(rownum,5), 0, null, mod(rownum,5)*10 )
DOC>from all_objects;
DOC>*/
ops$tkyte@ORA920.US.ORACLE.COM>
ops$tkyte@ORA920.US.ORACLE.COM> set autotrace traceonly
ops$tkyte@ORA920.US.ORACLE.COM> select deptno, (select dname from dept where dept.deptno = emp.deptno)
2 from emp
3 /
29284 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 TABLE ACCESS (FULL) OF 'EMP'
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2012 consistent gets
0 physical reads
0 redo size
570620 bytes sent via SQL*Net to client
21971 bytes received via SQL*Net from client
1954 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
29284 rows processed
ops$tkyte@ORA920.US.ORACLE.COM> select deptno, decode( deptno, null, null, (select dname from dept where dept.deptno = emp.deptno) )
2 from emp
3 /
29284 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 TABLE ACCESS (FULL) OF 'EMP'
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2012 consistent gets
0 physical reads
0 redo size
570645 bytes sent via SQL*Net to client
21971 bytes received via SQL*Net from client
1954 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
29284 rows processed
ops$tkyte@ORA920.US.ORACLE.COM> select deptno from emp
2 /
29284 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 TABLE ACCESS (FULL) OF 'EMP'
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2006 consistent gets
0 physical reads
0 redo size
347959 bytes sent via SQL*Net to client
21971 bytes received via SQL*Net from client
1954 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
29284 rows processed
ops$tkyte@ORA920.US.ORACLE.COM> set autotrace off
ops$tkyte@ORA920.US.ORACLE.COM>
I threw in that last query to show you that there is "magic" happening with select of selects. See the number of consistent gets on that last query? Very very close to the others
There are optimizations happening here that go beyond what we can see.
exec gen_data( 'T', 5000 );
A reader, October 12, 2002 - 4:23 am UTC
Hi Tom,
I've seen this procedure executed in the response above,
can you tell us what the procedure is doing? Of course it is populating the table with test data, but how?
thanks in advance.
October 12, 2002 - 9:14 am UTC
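The gen_data utility itself is not listed anywhere in this thread. Purely as a sketch of how such a generic test-data generator could work -- the dictionary-driven approach, the dbms_random expressions, and every detail below are assumptions, not the actual utility -- one way is to build a single INSERT ... SELECT from USER_TAB_COLUMNS:

```sql
create or replace procedure gen_data( p_tname   in varchar2,
                                      p_records in number )
as
    l_cols long;
    l_vals long;
begin
    -- walk the dictionary, building a random expression per column
    for x in ( select column_name, data_type, data_length
                 from user_tab_columns
                where table_name = upper(p_tname)
                order by column_id )
    loop
        l_cols := l_cols || ',' || x.column_name;
        if x.data_type in ('NUMBER','FLOAT') then
            l_vals := l_vals || ',trunc(dbms_random.value(1,999999))';
        elsif x.data_type = 'DATE' then
            l_vals := l_vals || ',sysdate-dbms_random.value(0,3650)';
        else
            l_vals := l_vals || ',dbms_random.string(''a'',' ||
                               least(x.data_length,40) || ')';
        end if;
    end loop;

    -- one insert..select; all_objects just supplies p_records rows
    execute immediate
        'insert into ' || p_tname ||
        '( ' || ltrim(l_cols,',') || ' ) ' ||
        'select ' || ltrim(l_vals,',') ||
        '  from all_objects where rownum <= :n'
        using p_records;
end;
/
```

Caveats of this sketch: random strings can (rarely) collide on a unique key, and all_objects must contain at least p_records rows.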
Comments on this solution?
andrew, October 25, 2002 - 4:14 pm UTC
October 26, 2002 - 12:12 pm UTC
its the same as the other discussions here.
if you CANNOT DO IT in a single sql statement and must loop procedurally (which many people do when they SHOULD NOT), this can help.
If you can do it in a join in the first place, rather than lookup lookup lookup all of the time -- do it that way.
Whoops - nevermind...
Andrew, October 25, 2002 - 4:28 pm UTC
Sorry Tom - I just saw your comments on the link above.
Lookups in SGA?
Clark, July 31, 2003 - 8:22 am UTC
Tom, my first time, so please be gentle! And please excuse (and explain) any misunderstandings I may have of Oracle.
To summarise the whole thread, I think, and a common point you make in all discussions, do it in SQL unless it HAS to be PLSQL - so do a(n sql) JOIN to a lookup table and not an x-ref to a (plsql) memory structure pre-loaded with a lookup. (Tho, exception from 9iR2 noted, but I'm on 817.)
However, I often have the need to perform a lookup within plsql but without a join to another table, e.g to validate input. Right, now I think I've got this. This is an OLTP system, so it should be considered as a data loader and therefore keeping these lookups in plsql memory is an ok way of doing it. Yes?
So -- is there not some way of sticking them in some part of the SGA and avoiding duplicating UGA for each session? Something along the lines of a SERIALLY_REUSABLE package that doesn't lose state between calls would do the trick -- it would only ever need a SINGLE copy of the package, globals wouldn't need re-initialising on every call, etc.
Or am I worrying too much about memory/processing usage? Maybe there's so little overhead in doing a genuine sql lookup, from plsql, and letting Oracle store the lookups in the KEEP pool (thus managing SGA usage for me too) that I should just do it that way -- but comments about sql-plsql context switching lead me to believe that this SHOULD be avoided.
I could/can do straight performance tests for speed, but I am not experienced enough to understand and find out the wider implications regarding SGA usage etc. Speed is not always the primary concern in an OLTP system -- an extra 2 millisecs per user query, say, would not be of concern if you can shave 20% from the memory you need.
Many thanks for your time.
Cheers,
Clark.
July 31, 2003 - 8:56 am UTC
give me a concrete example whereby plsql would have to perform a validation PRIOR to inserting the data to work with. Me, I would let SQL do the validation itself.
If I needed to lookup some data,
insert into t
select plsql_variable_a, plsql_variable_b, plsql_variable_c, column_x_from_table
from lookup_table
where lookup_table.key = plsql_variable_d;
that would do my "lookup" for me straight away. SQL%rowcount would tell me if the lookup "failed"
If this was a data validation -- well, then I would use declarative RI and let the database tell me when the data was garbage.
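A sketch of that declarative approach (the table and constraint names here are made up):

```sql
-- let the database reject any code not present in the lookup table;
-- no PL/SQL pre-validation, no lookup query, just a constraint violation
alter table detail_t
  add constraint detail_code_fk
  foreign key ( code ) references lookup_t ( code );
```

An invalid code then fails the INSERT itself with ORA-02291, which the application can catch.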
lookup x-ref
Clark, July 31, 2003 - 9:43 am UTC
Sorry, I put it wrongly. I do use RI constraints; it wasn't for validation, it was for x-ref.
The example I have just implemented uses a table of response codes (ids) with associated meanings and error-levels, eg.
create table response_codes
( id NUMBER
, descr VARCHAR2(100)
, errlevel NUMBER
);
Crucially, MULTIPLE "detail" response codes are possible, which get stored in a child table to an "overall" response -- the overall being dictated by the worst-case detail, the highest response_codes.errlevel.
So, during validation, I log a detail response when a problem is found and keep a global g_worst_err variable which reflects the worst response_codes.errlevel that has been found.
I could, of course, run a SQL query each time I log a detail response, but I have implemented the worst-case lookup by pre-storing the errlevels in a plsql table indexed by the id column from response_codes. Here's the key proc:
-- - - - - - - - - - - - - - - - - - - - - - - -
PROCEDURE pr_detail_response
-- - - - - - - - - - - - - - - - - - - - - - - -
-- Acts on a 'detail' response
( pi_detail_response IN s_o_TYPE )
IS
BEGIN
g_detresps_TAB( pi_detail_response ) := NULL ;
g_worst_err :=
GREATEST
( g_worst_err
, g_det_errlvls_TAB( pi_detail_response ) );
END ;
g_detresps_TAB stores all detail responses that I've had**.
g_det_errlvls_TAB is the pre-stored lookup to the response_codes.errlevel col.
**In case you're wondering, it's not sensible to INSERT these at this point because the overall response-ID is the parent -- but I don't know my overall response until I've found out all my detail responses.
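The initialization Clark alludes to (pre-storing the errlevels) might look something like this -- a sketch only; the package-level declarations and the initializer's name are assumed, not shown in the post:

```sql
-- assumed package-level state (not shown in the original post)
TYPE t_errlvl_TAB IS TABLE OF response_codes.errlevel%TYPE
  INDEX BY BINARY_INTEGER;
g_det_errlvls_TAB t_errlvl_TAB;

-- hypothetical initializer: pre-store each code's errlevel, indexed by id
PROCEDURE pr_init_lookups
IS
BEGIN
  FOR r IN ( SELECT id, errlevel FROM response_codes )
  LOOP
    g_det_errlvls_TAB( r.id ) := r.errlevel;
  END LOOP;
END pr_init_lookups;
```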
Thanks,
Clark
Suri, July 31, 2003 - 4:32 pm UTC
Loading Data into PL/SQL table
Ketan. Bengali, July 31, 2003 - 4:57 pm UTC
Hi Tom,
My data is like this:
Level 1.1:
Level 2.1 (Parent = Level1.1, Previous Sibling=NULL)
Level 3.1.1(Parent = Level2.1, Prev Sibling=NULL)
Level 3.1.2(Parent = Level2.1, Prev Sibling=Level 3.1.1)
Level 2.2(Parent = Level1.1, Previous Sibling=Level2.1)
Level 3.2.1(Parent = Level2.2, Prev Sibling=NULL)
Level 3.2.2(Parent = Level2.2, Prev Sibling=Level 3.2.1)
Basically, it is hierarchical and ordered data. To display this data to the users, I have created all the records in a pl/sql table and provided a view on top of it.
Package Test_Pack is
-- rec tree_hierarchy_array;
function get_tree return tree_hierarchy_array;
--
end test_pack;
--
View is :
SELECT *
FROM TABLE(CAST(test_pack.get_tree() AS tree_hierarchy_array))
Today, I recreate the pl/sql table whenever a user selects from the view, because the variable which stores the pl/sql table is a local variable declared inside the get_tree function. I could store this variable at the session level by declaring it in the package body.
If I store it at the session level then the PL/SQL table has to be recreated only when the data is changed in the database. This will improve performance. How do I find out whether the data has changed since the last run?
Is creating a package and displaying it using a view the right approach?
Thanks,
Ketan.
July 31, 2003 - 7:20 pm UTC
to discover that the data has changed is "hard" to do in a scalable fashion, unless you add auditing to the base table.
If you added a trigger that
create table last_update( x int primary key ) organization index;
create sequence s;
create trigger t_trigger after insert or update or delete on t
begin
insert into last_update values ( s.nextval );
end;
your routine can now keep a package variable that was the max(x) from last_update and every time it is called, it simply selects max(x) from last_update again to see if it changed -- if so, it must requery.
if modifications are really really "light" on this table, you could update a single row instead of inserting a new row each time. that will cause all modifications of the table to serialize by session though.
is creating a package and displaying as view the right approach? depends -- if it CANNOT be done in pure sql, maybe. if it could be done in pure sql, probably not.
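A sketch of how the package might use that audit table (the function name and package variable are made up; this assumes the last_update table from the trigger above):

```sql
-- package body state: the highest last_update.x value seen so far
g_last_x number := null;

-- returns TRUE when the base table has changed since the last call,
-- meaning the cached pl/sql table should be rebuilt
FUNCTION data_changed RETURN BOOLEAN
IS
  l_x number;
BEGIN
  SELECT max(x) INTO l_x FROM last_update;
  IF g_last_x IS NULL OR l_x IS NULL OR l_x <> g_last_x
  THEN
    g_last_x := l_x;
    RETURN TRUE;   -- requery the base table
  END IF;
  RETURN FALSE;    -- cache is still current
END data_changed;
```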
The Questions.
Clark, August 01, 2003 - 8:38 am UTC
Hi Tom,
wrt my comments yesterday - (a) sorry for the low review grade, that was mainly because I asked the wrong q.. and (b) the questions I still have are:
1.
Is there anything inherently wrong/ bad practice with using a plsql index-by table to store values from the db this way - would you personally do it with a query each and every time you needed to x-ref the errlevel? If so, what are your reasons?
2.
Is there any way of storing common data (available to plsql) in the SGA to avoid duplicating 'application-static' data across all sessions? Are you aware of any plans to introduce such a concept if not?
Cheers,
Clark.
P.S. Keep up the good work!
August 01, 2003 - 9:45 am UTC
1) if I had procedural code that I could not turn into a query (a single query, not a single query executed 500 times, a single query executed once) I would be looking at using lookup tables. Yes, I do that.
2) there have been talks about it -- it is a hard problem to solve actually. introduces a new address space, concurrency issues (session 1 updates a variable at the same time session 2 does) and so on.
right now, you would have to do it "per session"
Perfect
Clark, August 01, 2003 - 10:21 am UTC
Thanks Tom, at least I know I'm not missing something.
Point 2 though - my concept was simply somewhere to store absolutely-static data, which you know absolutely will not change; or alternatively, if it does change, every session needs the current values. (Why? To provide for different installations of the same application with, say, different application parameters.)
Thanks again.
Clark.
August 01, 2003 - 11:13 am UTC
and we call that a database table ;)
it'll be in the cache if
a) it is small
b) it is used.
insert select for load (best practices)
Joe, September 12, 2003 - 4:00 pm UTC
Hi tom,
best practices question related to this thread
during a load into a normalized schema, i have to determine if the dimension row of the values I am loading already exists in the dim table. if it does, then use the key assigned to that set of values in my fact table.
did I understand all the posts correctly that I should
insert into t
select pk_from_dimension_table,plsql_var_a,plsql_var_b
FROM dimension_table
where dim_col_a = plsql_variable_d
and dim_col_b = plsql_variable_e
and if I get no rows inserted, assume that combo didn't exist and go ahead and insert it into the dim table?
This single line approach potentially followed by an insert would be better than a lookup query, followed by one or two inserts. (fact row and dim row)
In this case, after loading starts, I would assume that almost all combinations will exist after a fairly short time.
September 12, 2003 - 7:55 pm UTC
how would you populate the dim table? if the dim table is just "number" to "value" and nothing else -- it is not very useful?
give me a real case to work with.
good, but one 'problem'
joachim mayer, September 22, 2003 - 11:05 am UTC
Hi tom,
i spent some time with this article before i started to use lookup tables myself -- which turned out to be very helpful and improved the performance of my applications a lot. But there is still one "problem" -- or rather, an unpleasant thing -- in it.
I will try to describe it as well as I can. Sorry, I am not a native English speaker!
Most of my oracle applications are called by other programs which stay connected to the database and therefore keep their sessions. The first time they call one of my packages, it reads (initializes) the lookup table(s) for the package, and the session keeps going as long as it takes -- sometimes for hours or longer. Very rarely, but it happens, I would like to change one value in the lookup table and want this change to show effect at once. But I can only change the value in the database table the process originally read its data from, and of course it won't get reread.
So, my question is -- is there a way to inform the package in that session that it should now reread the table and renew the lookup table (function call, signal?)? At the moment we must restart the process which is connected to the DB and uses my packages -- which is sometimes cumbersome.
Thx
hannibal
September 22, 2003 - 11:56 am UTC
you would have to poll the database using something like dbms_alert, or AQ or just a simple single row table that contains the date/time of last update to see if you needed to "refresh"
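As a sketch of the dbms_alert route (the alert name is made up; the updating session signals, the long-running session polls with a zero timeout after having called dbms_alert.register once):

```sql
-- in the session that changes the lookup table:
begin
  update my_lookup set val = 'new' where id = 1;
  dbms_alert.signal( 'LOOKUP_CHANGED', 'reread' );
  commit;  -- the alert is only delivered on commit
end;

-- in the long-running session, after dbms_alert.register('LOOKUP_CHANGED'):
declare
  l_message varchar2(1800);
  l_status  integer;
begin
  -- a timeout of 0 makes this a non-blocking poll
  dbms_alert.waitone( 'LOOKUP_CHANGED', l_message, l_status, 0 );
  if l_status = 0 then
    null;  -- alert received: reread the table and rebuild the lookup here
  end if;  -- l_status = 1 means timeout, i.e. nothing changed
end;
```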
Associative arrays
Zoran Martic, December 01, 2003 - 12:10 pm UTC
Tom,
This is very nice discussion.
I want to add one more case in this story to have complete picture.
I want to cover this situation:
You have the application (forms, c, c++, ...) and you have all lookup tables in Oracle database.
It is commonly known that doing lookups one by one from the client side is going to be slow (or, say, not fast enough) in some cases because of the SQL*Net round trip (TCP, IPC or even BEQ). On the newest hardware you can do, say, up to a few thousand lookups from C or C++. If you do this with some kind of in-process memory structure (associative arrays, hash tables, ...) you can get 10 times more lookups. In this case the biggest overhead in doing Oracle lookups is the round trips (including context switches), not executing the lookup SELECT.
Of course this is only the case if you cannot load the data into Oracle with bulk inserts and afterwards do the magic in PL/SQL. You need to do validation and other things on the client side, and in that case you perform lookups one by one.
In this scenario you need some kind of client cache.
It looks to me that Oracle filled the gap with Oracle Forms, if Forms can use associative arrays in its client-side PL/SQL engine.
Associative arrays are much faster because they are in-process memory, with no need for locking, ...
For example, if you use an associative array on the Forms client side (if possible) you can get much better performance than doing lookups on a table.
This is, from my perspective, the biggest gap that I cannot solve with Oracle yet.
What about C, C++, Java, ..., Oracle 9iAS?
Is there any way that Oracle is going to address this issue on the client side (as in PL/SQL with associative arrays)?
What can you tell us about the OCI object cache? Does it have anything to do with this story?
Thanks for any response.
Regards,
Zoran
December 02, 2003 - 7:55 am UTC
question -- why are you doing lookups on the client.
Why to do lookups on the client?
Zoran Martic, December 02, 2003 - 10:52 am UTC
In some special cases we need (or want) to do lookups on the client because of the needed speed.
The main idea is that it is faster because you do not have the networking overhead (as SQL does in this case); everything is in your process memory (as with associative arrays).
If you need to respond to some "real-time" (if something like that exists) event faster than the overhead of communicating between the client and the database (OCI) allows, you need to consider some kind of client cache.
All other cases I cover by doing a bulk insert and later on joining with other tables and processing that.
You saw how much faster an associative array is than a SQL lookup. I tested a simple 2-column IOT lookup with 1M rows versus an associative array lookup based on the same table. The difference was 25 times.
Of course this is limited to a single-value lookup, and a unique one.
Also I tested some so-called in-process or in-memory databases (Berkeley DB from Sleepycat) and they are about 5 times faster for a simple 1-column lookup than PL/SQL, and much faster still than an OCI call from the client, even on the same server (SQL*Net time).
This is the only case where I cannot find a solution for the problem in Oracle.
From the results I have, associative arrays are faster than the B*Tree implementations in those fastest in-memory databases (which are simple and small, not reliable and robust like Oracle) -- even 5 times faster -- but I cannot use PL/SQL from C or C++.
I want to use Oracle 100% because I am an Oracle DBA/developer, but I am facing problems being fast enough in these special cases.
The OCI object cache looks like some kind of caching mechanism, but it is hard to apply to normal tables and it is probably not like associative arrays (B*Tree) in PL/SQL.
Also 9iAS needs to be fast in dealing with data. They do not want to go to the server for every validation.
I found network round trips plus type conversion between C or Java and Oracle types very expensive.
I am asking you because I know that you are as curious about this as I am.
I was thinking a few months ago that there is nothing faster than a standard SQL lookup from an IOT (or hash table or normal table). But as you saw with associative arrays this is not true (also shown by similar implementations in some small in-process databases like Sleepycat and TimesTen).
One-task mode is not supported by Oracle anymore, which means there is no way to be as fast as PL/SQL if you are using OCI.
What do you think about all of this?
Anything new in 10g with associative arrays or something similar on the OCI client side?
Regards,
Zoran
December 02, 2003 - 11:07 am UTC
I would encourage you to post your simple test -- I'll betcha we can change the algorithm to go the OTHER WAY (iot creams associative array ;)
it is all about thinking in sets, if you go slow by slow (oops -- meant row by row), the assoc array might win but if you do set based processing (single sql), the DB will cream you.
I blame the "procedural mindset", not the database here (i prototype this in my latest book effective oracle by design if you are interested).
I don't do lookups on the client side. to me -- the database IS the client side (mod_plsql).
For OCI, you'd be looking at a C subroutine (binary tree, avl tree, something like that).
Java's got hash tables...
for the 10g database -- its all about "the server", wouldn't really do "the client"
Client lookups
Zoran Martic, December 02, 2003 - 11:42 am UTC
Tom,
Thanks for your answer.
I have read all your books.
I agree that one SQL join per 50,000 rows in a set is better than doing 50,000 lookups.
I am using that idea for a long time (as you described it in your books):
1. simple SQL
2. PL/SQL
3. everything else
I tested stupid associative lookups here because I wanted to know the difference between these fastest algorithms (or databases: TimesTen, Sleepycat, ...) and Oracle's pure B*Tree implementation.
I was happy to see that even doing lookups on associative arrays was 5 times faster in my case than using these "fast" databases (algorithms).
You know that most app clients are written in something different than PL/SQL, and because of that (in my case C) you need to perform some kind of lookup (database, C hash table or whatever).
For you the client is PL/SQL (and for me too, if not for the company I am working for :); for many of us it is something else.
Just that example about these lookups and associative arrays (to confirm that maybe my test is not wrong):
SQL> desc x2
Name                                      Null?    Type
----------------------------------------- -------- --------------
X                                         NOT NULL VARCHAR2(100)
Y                                                  VARCHAR2(100)
Populated with 1M rows with this script:
declare
x varchar2(100);
y varchar2(100);
i number;
begin
for j in 1..1000000 loop
insert into x2 values(to_char(j)||'abcdffgghhghg','abcdffgghhghg');
if mod(j,10000) = 0 then
commit;
end if;
end loop;
commit;
end;
/
*************** Test ************************************
declare
type lookupTable is table of varchar2(100) index by varchar2(100);
type indexByTable is table of varchar2(100) index by binary_integer;
l_lookupTable lookupTable;
l_codes indexByTable;
l_values indexByTable;
l_value varchar2(100);
l_start number;
l_run1 number;
l_run2 number;
begin
dbms_output.put_line(dbms_utility.get_time);
select x, y BULK COLLECT into l_codes, l_values from x2;
dbms_output.put_line(dbms_utility.get_time);
for i in 1 .. l_codes.count
loop
l_lookupTable( l_codes(i) ) := l_values(i);
end loop;
dbms_output.put_line(l_codes.count);
dbms_output.put_line(dbms_utility.get_time);
for i in 1 .. 1000000
loop
l_value := l_lookupTable( '501abcdffgghhghg' );
end loop;
dbms_output.put_line(dbms_utility.get_time);
for i in 1 .. 1000000
loop
select y into l_value from x2
where x = '501abcdffgghhghg';
end loop;
dbms_output.put_line(dbms_utility.get_time);
end;
/
Result:
411879663
411886578
1000000
411898013
411898682
411912955
You can see the difference between last 3 numbers.
I fixed the string I am searching for because I did not want any possibility of hitting the disk (the table is about 100M I think, and PGA was 180M for this test).
I did it for 100000 also. It was also much faster.
I did not try more columns or anything else.
I know that could be a way different.
Let me know if my test is wrong.
If not, please give us some example where the table lookup is faster than the associative array (no big JOIN technique please).
I agree, again, that in PL/SQL it is probably stupid to do lookups when you have joins, but sometimes you need to process event by event and not a bunch of events together (imagine processing a connect request or something similar in the real world -- you cannot batch that).
Thanks,
Zoran
December 02, 2003 - 12:31 pm UTC
i would have just joined, once again, that original bulk collect -- 0% chance I would have looked up on the client, 1000% chance I would have looked up on server (even if I had to create a global temporary table and shoot the data over first!)
In PLSQL, just use sql.
In C, -- well -- call PLSQL :)
(or look to that LANGUAGE to provide you a lookup tool -- C has that nifty bsearch routine)
Join
Zoran Martic, December 03, 2003 - 4:41 am UTC
Tom,
If you read carefully, there are not N records. There is not set of rows.
You are dealing with one event you need to respond as fast as possible.
Imagine the situation where you need to respond to some network event immediately, or even to some database event, you cannot use joins, you need to respond immediately. You cannot batch it.
That is the point, no way to have a batch.
You have "real-time" event you need to respond to.
In this case this server call (from C) is, from my perspective, very expensive.
In the same way, from PL/SQL you sometimes do not want to go and look at the table; you will use local variables or associative arrays. That is the point.
My example was to show that looking into an associative array is much faster than looking into an IOT for direct-hit lookups.
You owe me an example where an IOT is faster than an associative array, as you said previously (just for PK matches).
I am pretty confident that even if you have a multicolumn primary key (doing concatenation in the associative array to make the index), the lookup will be faster.
That means if you need to do something faster than your tables can do, you will opt for associative arrays.
At the end of the day, why would Oracle put associative arrays in the product if you can do everything with a join :)
Are you saying to us that associative arrays are not needed?
I am sure that you want to use them in some special cases where you need to do things record by record (or event by event).
Oracle put them in to make PL/SQL fast.
I asked you in the first place: is Oracle going to make something on the client side similar to the OCI object cache but more advanced?
I want to rely on Oracle; I do not want to use other products.
I saw (tested) that Oracle's B*Tree search in associative arrays is faster than the corresponding "fastest" in-process in-memory databases.
Thanks for all your time.
Zoran
December 03, 2003 - 7:07 am UTC
then, as I keep saying, you need to look to your LANGUAGE to help you out. be it c, java, whatever. the LANGUAGE will supply this feature
then you have to figure out how to make it so that when the database changes the client knows to refresh its cache.
you would not be using a plsql stored procedure in your case -- there would be no need to have it do a lookup as plsql stored procedures are running in the database and have no UI.
you would not be using forms for a real time interface either (i wouldn't imagine you would)
we put lots of stuff in the database cause people ask for it. its not always the best. i've demonstrated the best, most performant way.
I've yet to use an associative array in code, for real.
Associative arrays
Zoran Martic, December 03, 2003 - 10:41 am UTC
Tom,
Thanks a lot for your time.
That is what I am doing at the moment.
I am impressed with the speed of associative arrays.
I mean, 1M lookups in 6 seconds against a 1M-row table (simple, two columns, but...) is very good. The server is just a 450MHz Sparc.
That proves that Oracle has a crazy good B*Tree algorithm; it is just that all the code on top of it (optimizer, SQL, locking, ...) adds an initial overhead per SQL statement.
In the end this is why joins are fast: that overhead is small in comparison to the memory operations performed to read database blocks from the buffer cache (about 5k - 10k LIOs per 100MHz of CPU).
Regards,
Zoran
This was incredible!
Roland G. Bender, December 10, 2003 - 12:25 pm UTC
validation and auditing
dxl, June 30, 2004 - 5:35 am UTC
Tom
We have a requirement to bulk load some data from a legacy system into oracle. The data needs to be validated. The validation is not particularly complicated but we do have to audit any rows failing validation.
So first the data is passed from a java web app via jdbc using oracle batch updating into a staging table (possible gtt) that just contains a load of varchar2 fields in order to get the data into oracle.
Then we need to move the data from the staging table into the real table with the correct datatypes, validating the data as we go and logging any rows which don't pass. Sometimes we must prevent these rows from getting into the real table; in other cases, where the validation failure is only a warning, we let the row into the table but log it as a warning.
Hence we have something like: (this is a very simplified view of our tables)
create table stage (
forename varchar2(100),
surname varchar2(100),
dateofbirth varchar2(100),
postcode varchar2(100),
codeA varchar2(100));
create table myTable (
forename varchar2(30),
surname varchar2(30),
dateofbirth date,
postcode varchar2(7),
codeA number(2));
My question is what will be the most efficient method of loading into myTable? I would like to use
INSERT INTO myTable select ..validate.. from stage;
because using this in the past has proved to be amazingly quick.
However a lot of our fields require validation using lookup tables from the database. I believe the validation is bulky and awkward enough not to do it using CASE and decodes etc, plus it also uses some lookup tables, so I was thinking that i could write lots of functions to do the validation, hence we would have:
begin
INSERT INTO myTable
select function1(forename),
function2(surname),
function3(dateofbirth),
function4(codeA),
..
.. etc
..
from stage;
end;
1) would this involve a lot of context switching? How efficient could this be?
2) Doing it this way i can put the auditing of each field into the functions BUT how do i prevent the rows failing validation from getting into the table without the whole insert statement failing? Should i let them get in and then run another procedure afterwards to delete them??
3) Maybe my whole approach is wrong, should i just go for a cursor loop so that i can validate as i go along and easily audit any errors and warnings. This would give me complete control over which rows get into myTable, but i don't like the thought of row by row processing since we may be dealing with a lot of rows.
4) Would loading up plsql lookup tables be quicker instead of using the real db tables and then doing something like
INSERT INTO myTable
select (select a from plsql_tble1 where b = forename),
(select a from plsql_tble2 where b = surname),
(select a from plsql_tble3 where b = dateofbirth),
(select a from plsql_tble4 where b = codeA),
..
.. etc
..
from stage;
I would still have the same problem of logging the errors and warnings and controlling which rows get into myTable.
Any advice here would be greatly appreciated.
Thanks
June 30, 2004 - 10:12 am UTC
is this an ongoing thing or one time?
how big is the input data set (rows and megabytes)?
why are you loading from a middle tier like that? seems like external tables would be more than sufficient.
thanks for reply
dxl, June 30, 2004 - 12:27 pm UTC
Forgot to say we are on 8.1.7.4 so no external tables.
This is an ongoing process which will be created in a stored procedure.
The data is coming from a java app on a middle tier app server, although we may at some point try and connect direct to the legacy system using HS, but not yet, it is coming from the java app for now.
As for size, it will be done on a batch basis and i'm expecting it to be on average in the order of 10,000 to 50,000 rows. i know this is vague but that's all i know right now; i've been told it "shouldn't" ever be more than 100,000 rows but that isn't definite either. There are probably about 40 - 50 columns to validate and the requirement is to make it go as fast as we can!
June 30, 2004 - 1:15 pm UTC
sqlldr then.. but anyway.
bulk process the data 100 rows at a time. bulk collect limit 100 into plsql arrays, process the data, forall i insert the "good" data and log the bad data (eg: process the one array you bulk collected into into two sets of arrays -- one good and one bad)
please give me some more details
dxl, July 01, 2004 - 5:34 am UTC
I'm not sure that i totally understand the whole process you've suggested. Please can you give me more details with an example.
At the moment i'm thinking that you mean something like:
1) bulk collect table (of varchar columns) into a scalar object of varchars with a 100 limit
2) then loop through each element of the array and do the validation
3) if any elements pass validation then add to the "good" array else add to the "bad" array
4) load each good and bad array into myTable or the error table using forall
For example:
(please excuse the pseudo code, i'm just interested in the process rather than correcting syntax at the moment, although any comments on whether to use scalartypes, arraytypes or rowtypes etc would also be helpful)
create or replace type myScalarType as object
(forename varchar2(100),
surname varchar2(100),
dateofbirth varchar2(100),
postcode varchar2(100),
codeA varchar2(100))
/
create or replace type myArrayType as table of myScalarType
/
declare
cursor c is select myScalarType( forename, surname, dateofbirth, postcode, codeA ) from stage;
l_data myArrayType;
l_good_data myTable%rowtype;
l_bad_data myTable%rowtype;
type good_Array is table of myTable%rowtype index by binary_integer;
type bad_Array is table of myTable%rowtype index by binary_integer;
begin
open c;
loop
l_data := myArrayType();
fetch c bulk collect into l_data limit 100;
for i in 1 .. l_data.count loop
-- then should i do the validation here??
-- I need to do things like, for example :
select code into v1 from lookup table where code = l_data(i).codeA;
if sql%rowcount = 1 then
-- pass_validation
else
-- fail_validation
end if;
..
..
.. etc (lots more validation like this)
..
..
-- After the validation decide if it is a good row
if pass validation then
--
l_good_data := l_data(i);
else
l_bad_data := l_data(i);
end if;
..
.. etc
..
-- now build good_array and bad_array??
if good row then
good_array(n) := l_good_data;
n := n + 1;
else
bad_array(m) := l_bad_data;
m := m + 1;
end if;
end loop;
if ( good_array.count > 0 )
then
forall i in 1 .. good_Array.count
insert into myTable values
( good_array(i) );
end if;
if ( bad_array.count > 0 )
then
forall i in 1 .. bad_Array.count
insert into myErrorTable values
( bad_array(i) );
end if;
bad_array.delete;
good_array.delete;
exit when c%notfound;
end loop;
close c;
end;
Is this the kinda of processing that you meant?? or have i got the wrong end of the stick!?
July 01, 2004 - 10:59 am UTC
in this example -- mod() is my function to determine goodness or badness.
this is 9ir2.
ops$tkyte@ORA9IR2> create table t as select * from big_table.big_table where rownum < 200;
Table created.
ops$tkyte@ORA9IR2> create table good as select * from t where 1=0;
Table created.
ops$tkyte@ORA9IR2> create table bad as select * from t where 1=0;
Table created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> declare
2 type array is table of t%rowtype index by binary_integer;
3 l_source array;
4 l_good array;
5 l_bad array;
6
7 cursor c is select * from t;
8 begin
9 open c;
10 loop
11 fetch c bulk collect into l_source limit 100;
12 l_good.delete;
13 l_bad.delete;
14 for i in 1 .. l_source.count
15 loop
16 if (mod( l_source(i).id,2 ) = 0 )
17 then
18 l_good(l_good.count+1) := l_source(i);
19 else
20 l_bad(l_bad.count+1) := l_source(i);
21 end if;
22 end loop;
23 forall i in 1 .. l_good.count
24 insert into good values l_good(i);
25 forall i in 1 .. l_bad.count
26 insert into bad values l_bad(i);
27 exit when c%notfound;
28 end loop;
29 end;
30 /
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select count(*) from t;
COUNT(*)
----------
199
ops$tkyte@ORA9IR2> select count(*) from good;
COUNT(*)
----------
100
ops$tkyte@ORA9IR2> select count(*) from bad;
COUNT(*)
----------
99
excellent
dxl, July 02, 2004 - 4:15 am UTC
Thanks very much for that.
I am on 8.1.7.4 so will your method work or do i just need to replace the record types with object types??
July 02, 2004 - 9:51 am UTC
with an array per column I would suggest. You didn't have record operations in forall in 8i -- you need an array per column.
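A sketch of that 8i-compatible shape (one array per column; the table and column names follow the stage/myTable example earlier in the thread, and the validation shown is only a placeholder):

```sql
declare
  type vc_array is table of varchar2(100) index by binary_integer;
  -- one fetch array per column (only two shown here)
  l_forename vc_array;
  l_surname  vc_array;
  -- "good" rows: again one array per column
  g_forename vc_array;
  g_surname  vc_array;
  cursor c is select forename, surname from stage;
begin
  open c;
  loop
    fetch c bulk collect into l_forename, l_surname limit 100;
    for i in 1 .. l_forename.count
    loop
      if l_forename(i) is not null  -- placeholder validation
      then
        g_forename( g_forename.count+1 ) := l_forename(i);
        g_surname ( g_surname.count+1 )  := l_surname(i);
      end if;
    end loop;
    -- 8i forall: list each column's array explicitly, no record binds
    forall i in 1 .. g_forename.count
      insert into myTable ( forename, surname )
      values ( g_forename(i), g_surname(i) );
    g_forename.delete;
    g_surname.delete;
    exit when c%notfound;
  end loop;
  close c;
end;
```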
Bulk collecting into an associative array?
A reader, July 23, 2004 - 7:14 pm UTC
How can I bulk collect into an associative array?
I have a lookup table containing code (vc 10) and value (vc2 100).
Can I
type assoc_t is table of varchar2(100) index by varchar2(10);
mytab assoc_t;
select code,value BULK COLLECT into mytab from lookup_table;
How can I do this without resorting to "slow-by-slow" processing?
Thanks
[Sorry, I dont have my database in front of me right now to try this]
July 23, 2004 - 8:19 pm UTC
you cannot right now, you bulk collect the "subscripts" into array1, the "subscriptee" into array2 and then you have to build array3 by assigning array2 to array3:
for i in 1 .. array1.count
loop
associative_array3( array1(i) ) := array2(i);
end loop;
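Spelled out as a complete block (a sketch only; lookup_table and its code/value columns are taken from the question above):

```sql
declare
  -- array3: the associative array we actually want to search
  type assoc_t  is table of varchar2(100) index by varchar2(10);
  -- array1/array2: plain binary_integer-indexed bulk-collect targets
  type codes_t  is table of varchar2(10)  index by binary_integer;
  type values_t is table of varchar2(100) index by binary_integer;

  l_codes  codes_t;
  l_values values_t;
  mytab    assoc_t;
begin
  select code, value bulk collect into l_codes, l_values from lookup_table;

  for i in 1 .. l_codes.count
  loop
    mytab( l_codes(i) ) := l_values(i);
  end loop;

  -- a lookup is now simply: mytab( some_code )
end;
```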
Magic?
A reader, August 02, 2004 - 11:14 pm UTC
Refer to one of your earlier followups...
"I threw in that last query to show you that there is "magic" happening with select of selects. See the number of consistent gets on that last query? Very very close to the others. There are optimizations happening here that go beyond what we can see"
Would you please explain what is going on here?
How is it possible for a FTS of EMP to have the same consistent gets as
select deptno, (select dname from dept where
dept.deptno = emp.deptno)
from emp
Basically, the DEPT access is free!! How is this possible?
Thanks
August 03, 2004 - 8:09 am UTC
it is not free -- it is however cached.
the scalar subquery -- the (select dname from dept where dept.deptno=emp.deptno) is "stripped out" and turned into:
select dname from dept where dept.deptno = :bv
the server then starts fetching data -- gets the first row from EMP, sees "DEPTNO 10" and runs that scalar subquery. It then "saves this result in a small cache" -- so in a small cache we have "bv=10, dname=<xxxxx>".
The next time, it fetched 20, then 30, then 40, then bv gets NULL. Then back to 10, 20, 30, 40, null and so on (because of the way I loaded the data in this example).
This small cache is able to hold the results of the 10, 20, 30, 40, null results so the scalar subquery is fired off just 5 times -- resulting in few logical IO's.
with scalar subqueries -- the "order of the data" can be very relevant. consider:
ops$tkyte@ORA9IR2> set echo on
ops$tkyte@ORA9IR2> set linesize 121
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> define nvalues=100
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create or replace function f(x in number) return number
2 as
3 begin
4 dbms_application_info.set_client_info( userenv('client_info')+1 );
5 return 42;
6 end;
7 /
Function created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> drop table t;
Table dropped.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create table t
2 as
3 select mod(rownum,&nvalues) r from all_objects;
old 3: select mod(rownum,&nvalues) r from all_objects
new 3: select mod(rownum,100) r from all_objects
Table created.
<b>in this table, the data will be fetched 1,2,3,4,...99,0,1,2,3...,99,0,1...
that is, the r values do not repeat from row to row but rather every 100 rows. Now:</b>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_application_info.set_client_info(0);
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> set autotrace traceonly
ops$tkyte@ORA9IR2> select r, (select f(r) from dual) from t;
32160 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 TABLE ACCESS (FULL) OF 'DUAL'
2 0 TABLE ACCESS (FULL) OF 'T'
Statistics
----------------------------------------------------------
22 recursive calls
0 db block gets
25556 consistent gets
49 physical reads
0 redo size
516751 bytes sent via SQL*Net to client
24072 bytes received via SQL*Net from client
2145 SQL*Net roundtrips to/from client
5 sorts (memory)
0 sorts (disk)
32160 rows processed
ops$tkyte@ORA9IR2> set autotrace off
ops$tkyte@ORA9IR2> select userenv('client_info') from dual;
USERENV('CLIENT_INFO')
----------------------------------------------------------------
7783
<b>we can see our function was called 7,783 times there. Taking the same exact set of data but "having it reside on disk in sorted order":</b>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> truncate table t;
Table truncated.
ops$tkyte@ORA9IR2> insert into t select mod(rownum,&nvalues) r from all_objects order by mod(rownum,&nvalues);
old 1: insert into t select mod(rownum,&nvalues) r from all_objects order by mod(rownum,&nvalues)
new 1: insert into t select mod(rownum,100) r from all_objects order by mod(rownum,100)
32160 rows created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_application_info.set_client_info(0);
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> set autotrace traceonly
ops$tkyte@ORA9IR2> select r, (select f(r) from dual) from t;
32160 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 TABLE ACCESS (FULL) OF 'DUAL'
2 0 TABLE ACCESS (FULL) OF 'T'
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2506 consistent gets
0 physical reads
0 redo size
420930 bytes sent via SQL*Net to client
24072 bytes received via SQL*Net from client
2145 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
32160 rows processed
ops$tkyte@ORA9IR2> set autotrace off
ops$tkyte@ORA9IR2> select userenv('client_info') from dual;
USERENV('CLIENT_INFO')
----------------------------------------------------------------
103
<b>only 103 calls (the cache worked really well this time as we basically only needed to call the function pretty much once per bind variable value).
question would then be "does data have to be sorted on disk for this to work?" and the answer is no, an inline view that sorts the data by the values you will be binding into the scalar subquery can do this:</b>
ops$tkyte@ORA9IR2> truncate table t;
Table truncated.
ops$tkyte@ORA9IR2> insert into t
2 select mod(rownum,&nvalues) r from all_objects;
old 2: select mod(rownum,&nvalues) r from all_objects
new 2: select mod(rownum,100) r from all_objects
32160 rows created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_application_info.set_client_info(0);
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> set autotrace traceonly
ops$tkyte@ORA9IR2> select r, (select f(r) from dual)
2 from (select r from t order by r);
32160 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 TABLE ACCESS (FULL) OF 'DUAL'
2 0 VIEW
3 2 SORT (ORDER BY)
4 3 TABLE ACCESS (FULL) OF 'T'
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
353 consistent gets
0 physical reads
0 redo size
420891 bytes sent via SQL*Net to client
24072 bytes received via SQL*Net from client
2145 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
32160 rows processed
ops$tkyte@ORA9IR2> set autotrace off
ops$tkyte@ORA9IR2> select userenv('client_info') from dual;
USERENV('CLIENT_INFO')
----------------------------------------------------------------
100
<b>and here is the challenge du-jour..... why did the consistent gets go way way down on this last query as opposed to the one right before :)</b>
Challenge du-jour!
A reader, August 03, 2004 - 9:32 am UTC
Because the 'select r from t order by r' filled up the cache? Argh, I give up!
This was simply brilliant. You are single-handedly changing the world in how Oracle technologies are effectively used.
Keep up the good work (and do post the solution to your challenge!)
Challenge Du-jour
Bill, August 03, 2004 - 9:51 am UTC
Is it possible the consistent gets go way down because of the order by clause in the subquery? Still trying to get the hang of how this works, but very educational!
August 03, 2004 - 10:06 am UTC
Here is a "hint" of sorts:
ops$tkyte@ORA9IR2> create table t as select rownum r from all_objects;
Table created.
ops$tkyte@ORA9IR2> set autotrace traceonly statistics;
ops$tkyte@ORA9IR2> set arraysize 15
<b> /* the default... */</b>
ops$tkyte@ORA9IR2> select * from t;
32174 rows selected.
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets<b>
2197 consistent gets</b>
49 physical reads
0 redo size
448960 bytes sent via SQL*Net to client
24083 bytes received via SQL*Net from client
2146 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
32174 rows processed
ops$tkyte@ORA9IR2> select * from t <b>order by r;</b>
32174 rows selected.
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets<b>
52 consistent gets</b>
0 physical reads
0 redo size
448960 bytes sent via SQL*Net to client
24083 bytes received via SQL*Net from client
2146 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
32174 rows processed
<b>
ops$tkyte@ORA9IR2> set arraysize 5000</b>
ops$tkyte@ORA9IR2> select * from t;
32174 rows selected.
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets<b>
59 consistent gets</b>
0 physical reads
0 redo size
203788 bytes sent via SQL*Net to client
565 bytes received via SQL*Net from client
8 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
32174 rows processed
ops$tkyte@ORA9IR2> set autotrace off
order by
A reader, August 03, 2004 - 10:26 am UTC
So, the 'order by' automagically turns on array processing on the server side? Even so, wouldn't the client (sqlplus) arraysize (15) still limit the throughput?
Still not getting it!
August 03, 2004 - 10:32 am UTC
Nope... not quite.
think about the definition of a consistent get -- where we get that data from vs what might happen with "temp data"
Challenge Du-jour - take two
Bill, August 03, 2004 - 10:47 am UTC
Hmmm....further up in this thread, you stated the following:
"I threw in that last query to show you that there is "magic" happening with
select of selects. See the number of consistent gets on that last query? Very
very close to the others".
So, maybe it has to do with the select of the select?
Challenge Du-jour Take Two yet again
Bill, August 03, 2004 - 10:56 am UTC
Oops - just realized that that would not explain the last example you gave. Back to the drawing board...
Challenge Du-jour : I think I am on the right track now!
Bill, August 03, 2004 - 11:05 am UTC
It has to do with the array fetches. They reduce the consistent gets.
Challenge du-jour!
A reader, August 03, 2004 - 1:30 pm UTC
"think about the definition of a consistent get -- were we get that data from vs what might happen with "temp data""
I give up. consistent get is a point-in-time current-mode read. It comes from the buffer cache, it may have to incur a PIO in order to get the block into the cache in the first place.
temp data comes from tempfiles (if you are using them).
i still don't see how adding an order by to the inline view had the same effect as increasing the client's arraysize! or was that just a red herring?!
August 03, 2004 - 5:46 pm UTC
when we fetched 15 rows by 15 rows without sorting.... we would
a) get a block
b) get 15 rows from block
c) give up block
d) goto a)
problem was -- there were about 500 or 600 rows per block here! We would get block "1" 40 times, block "2" 40 times and so on.
setting the arraysize to 5000 meant we would get block 1 once, block 2 once and so on (in order to get the first 5000 rows). We might have gotten block 10 or 11 twice -- but in essence, each block would be "consistent gotten" once -- instead of 40 or more times.
that is why the difference with array fetch of 15 vs 5000.
So, how did the sort affect that? In order to get the FIRST row, we had to get ALL rows and sort them. So the first row got all of the blocks -- put them into temp and sorted them. We returned the rows from there without having to use consistent gets on each and every fetch call -- since they were copied on the first fetch into temp.
Great!
A reader, August 03, 2004 - 8:46 pm UTC
Hm, great, that was very useful, thanks, but that raises a few more questions
"put them into temp and sorted them. We returned the rows from there without having to use consistent gets on each and every fetch call"
1. So the table had about 59 blocks and they were all pulled into the buffer cache by the sort operation. the "fetch" phase of the query simply returned them from there instead of doing more consistent gets for them. So the sort essentially "subsidized" the cost of the fetch operation! Nice.
This is a (good) side-effect of all sort-operations, right?
i.e. whenever you sort, you already get all the data you need before starting to output the first row.
"that is why the difference with array fetch of 15 vs 5000"
2. you have shown numerous times the benefit of using higher arraysizes.
If higher arraysizes are always beneficial, why doesn't Oracle make that the default behaviour for all "fetches", sqlplus or otherwise? (well, let's stick to sqlplus for this discussion).
I mean, getting a block, returning 15 rows and "throwing away" the block isn't terribly smart. If the user has set arraysize 15, he doesn't know what is good for him. Just bump it up without telling the user. It's not exactly like "Open the pod bay doors Hal. I'm sorry Dave, I'm afraid I can't do that", is it?
3. Why does the example with the 59 consistent gets have
203788 bytes sent via SQL*Net to client
I can understand that higher arraysizes will reduce the "SQL*Net roundtrips to/from client" number, but the answer is the same, so the "bytes sent via SQL*Net to client" should be the same, right?
Thanks
August 04, 2004 - 8:37 am UTC
1) the sort pulled all 50 some blocks from the cache into the "sort area size" allocated in the UGA -- once in the sort area, they never needed another "consistent get", they were already there in the sessions memory. If the sort area exceeded the sort area retained OR the sort area was not large enough -- we would have swapped to disk.
So, caveat emptor - do not run out and throw bogus order by's on queries that do not need them. Also -- a sort performed by an index access (skipping the sort actually) won't have this effect.
2) they are not always beneficial. I find 100 to be sort of a sweet spot. sqlplus is "just a very very simple tool" and has the method to change the default. The real problem is code like VB or Java -- there the default is "i don't know" for VB and 10 for jdbc - 10 is probably too small and the preFetch should change. For PLSQL, in 10g they made a default of 100 for prefetching for implicit cursors (was 1 -- unless you used bulk collect)
3) every round trip incurs protocol layers (metadata so we understand what we just sent to ourselves), less round trips, less protocol. Here because the data was so darn "small" (a simple number) the protocol was as large as the data itself
PGA usage by assoc arrays
A reader, December 13, 2004 - 2:09 pm UTC
Following the thread on this page... Let's say I prove that my system would benefit from the associative array approach, i.e. use package instantiation code to bulk collect 2 arrays, load up your assoc array and then simply use that instead of making database calls.
What are the memory implications of doing this?
Would all sessions using this package use as much memory as needed by the assoc array? Or do some optimizations kick in?
If I am using dedicated server and pga_aggregate_target, where is all this memory used? In each session's PGA? Would that be on the server or on the client?
Any way to reduce the memory usage?
Thanks
December 13, 2004 - 2:40 pm UTC
all sessions have their own data segment, they would each get a copy of the array in their address space.
it would be in the UGA in the PGA under dedicated server and in the UGA in the SGA under shared server and is not subject to the pga_aggregate_target which only really affects dynamic workareas like sort areas, hash areas.
the only way to reduce memory usage is to not use memory ;)
if at all possible, incorporate the "lookup" right in your SQL itself, that'll be faster still and not consume ram in the dedicated servers.
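A minimal sketch of what "incorporate the lookup right in your SQL" means, using the standard SCOTT demo tables (the PL/SQL function name here is hypothetical, purely for contrast):

```sql
-- Instead of a per-row PL/SQL lookup such as:
--   select empno, my_pkg.lookup_dname(deptno) from emp;
-- fold the lookup into the SQL as a join -- no per-row context
-- switch to PL/SQL, no per-session copy of a lookup array:
select e.empno, e.ename, d.dname
  from emp e, dept d
 where d.deptno = e.deptno;
```

The associative array only wins when procedural code is unavoidable; once the lookup is a join, the optimizer can pick a hash join and the cached blocks are shared by all sessions instead of each session holding its own array.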
Sorry I have a large backlog right now, please ask a question later
Tanweer, December 22, 2004 - 6:38 am UTC
Sorry, I am not able to send a fresh question because I got the message "I have a large backlog right now, please ask a question later".
How do I execute a package, procedure or function at a given system time?
Suppose I want to execute (1) the package at 3:00 A.M., (2) the procedure at 4:00 A.M. and (3) the function at 4:30 A.M.
Can you give me the syntax/code, using my procedure/package names?
I have given the package and procedure names below as examples:
(1)create or replace package lob_client_operations as
function get_lob_size(p_tablename in varchar2,
p_pkcolname in varchar2,
p_textcolname in varchar2,
p_docid in varchar2) return number;
procedure get_lob_chunk ( p_tablename in varchar2,
p_pkcolname in varchar2,
p_textcolname in varchar2,
p_docid in varchar2,
amount IN OUT NUMBER,
offset IN NUMBER,
buff OUT RAW);
procedure insert_new_record ( p_tablename in varchar2,
p_pkcolname in varchar2,
p_textcolname in varchar2,
p_docid in varchar2);
end;
/
-- ---------------------------- ---
-- ---------------------------- ---
(2)
(a)CREATE or REPLACE PROCEDURE SimpleTextMessage IS
(b) create or replace procedure html_email( p_to in varchar2, p_from in varchar2, p_subject in varchar2,
p_text in varchar2 default null) is
-- ---------------------------- ---
-- ---------------------------- ---
(3) create or replace function send(
p_from in varchar2,
p_to in varchar2,
p_subject in varchar2,
p_body in varchar2) return number as
-- ---------------------------- ---
-- ---------------------------- ---
Thanks in Adv
December 22, 2004 - 9:53 am UTC
hmm, do you understand the meaning of that message?
search this site for dbms_job.
doesn't matter if it is a package, procedure, function, whatever -- you just need a job scheduler.
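A minimal dbms_job sketch of the approach Tom points to (the procedure name is hypothetical; any PL/SQL block can be the "what"):

```sql
variable jobno number

begin
   -- queue a job that runs every day at 3:00 AM; a packaged
   -- procedure, standalone procedure or function call is all
   -- the same to dbms_job -- the "what" is just a PL/SQL block
   dbms_job.submit(
      job       => :jobno,
      what      => 'my_nightly_proc;',             -- hypothetical procedure
      next_date => trunc(sysdate) + 1 + 3/24,      -- tomorrow at 3:00 AM
      interval  => 'trunc(sysdate) + 1 + 3/24' );  -- and daily thereafter
   commit;  -- the job only becomes visible to the job queue on commit
end;
/
```

(In 10g and later, dbms_scheduler supersedes dbms_job, but dbms_job is what exists in the 8i/9i releases discussed in this thread.)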
When is it necessary to use PL/SQL tables? Version Oracle 8i
Stivi, December 23, 2004 - 1:56 pm UTC
Tom, hello!
As I understood it, in procedural code it is more efficient to use SQL statements with joins (because they work faster) than lookups in arrays -- but when is it good to use PL/SQL tables instead of SQL? Can you show an example where a PL/SQL table is preferable to a SQL statement?
Thank you.
Stivi
December 23, 2004 - 2:43 pm UTC
I don't have any :)
If you feel you MUST code procedurally, that you MUST do it in procedural code -- then a PLSQL lookup table is pretty darn efficient.
my mantra:
if you can do it in a single sql, do it.
if not do it in as little plsql as possible...
QUESTION ABOUT INSERT IN SQL STATEMENT
DAV, December 23, 2004 - 2:55 pm UTC
Thank you Tom,
I have an insert statement that i do in procedural code:
insert into table_try (select 1,cstmr_no,func(1,ctmr_no)day
from customers
where bsnss_no=:param_code
and retuirning_no!=1);
i prefer sql for the insert and not plsql code because you said it is faster. But I don't want to insert a record if func(1,ctmr_no) returns null, or if the select doesn't return anything. How can I do that?
Thank you very much.
DAV
December 23, 2004 - 3:48 pm UTC
if select doesn't return anything -- what could possibly be inserted? :) that is OK.
add a where clause to select only those rows you actually want to insert.
Insert_continue
A reader, December 23, 2004 - 3:57 pm UTC
Tom,thank you very much,
in (select 1,cstmr_no,func(1,ctmr_no)day
from customers
where bsnss_no=:param_code
and retuirning_no!=1);
FUNC(1,cstmr_no) accesses another table, and if func(1,cstmr_no) is null I don't want the row to be inserted. But if I add "func(1,cstmr_no) is not null" to the where clause, the function will be executed twice -- once in the select and once in the where. Am I right?
if yes,how to improve it?
Have a nice day.
December 23, 2004 - 4:36 pm UTC
might be, might not be
select 1, cstmr_no, day
from (select 1,cstmr_no,func(1,ctmr_no)day, ROWNUM
from customers
where bsnss_no=:param_code
and retuirning_no!=1)
where day is not null;
ops$tkyte@ORA9IR2> create or replace function f( x in number, y in number ) return number
2 as
3 begin
4 dbms_application_info.set_client_info(userenv('client_info')+1);
5 return x;
6 end;
7 /
Function created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_application_info.set_client_info(0)
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> select empno,f(sal,comm)
2 from emp
3 where f(sal,comm) is not null;
EMPNO F(SAL,COMM)
---------- -----------
7369 800
7499 1600
7521 1250
7566 2975
7654 1250
7698 2850
7782 2450
7788 3000
7839 5000
7844 1500
7876 1100
7900 950
7902 3000
7934 1300
14 rows selected.
ops$tkyte@ORA9IR2> select userenv('client_info') from dual;
USERENV('CLIENT_INFO')
----------------------------------------------------------------
28
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_application_info.set_client_info(0)
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> select empno, f
2 from (
3 select empno,f(sal,comm) f, rownum r
4 from emp
5 )
6 where f is not null;
EMPNO F
---------- ----------
7369 800
7499 1600
7521 1250
7566 2975
7654 1250
7698 2850
7782 2450
7788 3000
7839 5000
7844 1500
7876 1100
7900 950
7902 3000
7934 1300
14 rows selected.
ops$tkyte@ORA9IR2> select userenv('client_info') from dual;
USERENV('CLIENT_INFO')
----------------------------------------------------------------
16
ops$tkyte@ORA9IR2>
Procedural mindset
Bob, January 07, 2005 - 12:10 pm UTC
Tom,
Very helpful and insightful information. I've been reading your book and I am now trying to breakout of the procedural mindset. The one question this thread doesn't answer for me is this; I SQL*Load a table, then I need to validate each record from that table against two other tables to determine if it should be inserted into the main table. If the validation fails, an error needs to be logged in a seperate table. The structure of all tables is different and it's version 8.1.7. The idea of using a good and bad table covers most of it, but how do you log the error associated with the bad records?
January 08, 2005 - 3:41 pm UTC
what is the validation? I always try to let the database do my work for me, why can't it do it here?
Procedural mindset - resolved
Bob, January 08, 2005 - 8:30 am UTC
After further reading and testing it's starting to make sense.
Thanks for the excellent site.
You can't think of one reason?
bob mourning, February 19, 2005 - 3:57 pm UTC
You could not come up with a single reason to ever use a PL/SQL table over the SQL?
How about when the table one must read does not belong to the reader and it is poorly indexed. Then it might make sense to do the select one time and efficiently access the data where possible via PL/SQL binary_integer.
There are other solutions of course, but poorly performing SQL run 10M times would not work.
Also, it may be possible to limit the candidate values up front. For example, if my transaction data has only 12 lookup keys and the lookup table has 2M rows, it might make sense to go against 12 rows in a PL/SQL table versus use a non-unique index.
BTW, I think your posts are generally excellent, but sometimes you are just arrogantly dogmatic.
February 19, 2005 - 5:26 pm UTC
Hmm, lets see, dogmatic is a recursively defined thing
Main Entry: dogmatic
Pronunciation: dog-'ma-tik, dg-
Variant(s): also dogmatical /-ti-k&l/
Function: adjective
1 : characterized by or given to the use of dogmatism <a dogmatic critic>
2 : of or relating to dogma
so, we must dig deeper to find the meaning:
Main Entry: dogmatism
Pronunciation: 'dog-m&-"ti-z&m, 'dg-
Function: noun
1 : positiveness in assertion of opinion especially when unwarranted or arrogant
2 : a viewpoint or system of ideas based on insufficiently examined premises
If you choose number 1, then your use of arrogantly would be redundant. It cannot be "unwarranted", for I tend to put the proof of what I talk about out there for all to see. It definitely cannot be #2.
but actually, wait, it cannot be #1, because that includes "opinion". All I asked was "where is the proof here". I'm not offering opinion, but rather hard facts -- with numbers, with code.
Anyway, I did give a bunch of them in "Effective Oracle by Design" as well -- spent the better part of a chapter on this topic. The answer came down to "just join" (that would make a nice bumper sticker I think -- it would mean different things to different people but it has a nice ring to it. Maybe "databases were born to join", but then the print would have to be pretty small to fit it all)
but hey -- I even gave a case, an example of using a plsql based lookup table versus querying repeatedly -- I agree, that can be faster, HOWEVER, I also (and will keep doing so over and over) point out that you've only gone a tiny fraction of the way to the right answer which in most cases is "just join darn it, databases were born to join"
Maybe you did not read the entire answer, the entire discussion, the give and take.
So, you can drop the arrogantly -- it is sort of "inferred" in the definition of the term you are using, but I do believe the term was used incorrectly, especially if you take into consderation the definition of
Main Entry: dogma
Pronunciation: 'dog-m&, 'dg-
Function: noun
Inflected Form(s): plural dogmas also dogmata /-m&-t&/
Etymology: Latin dogmat-, dogma, from Greek, from dokein to seem -- more at DECENT
1 a : something held as an established opinion; especially : a definite authoritative tenet b : a code of such tenets <pedagogical dogma> c : a point of view or tenet put forth as authoritative without adequate grounds
2 : a doctrine or body of doctrines concerning faith or morals formally stated and authoritatively proclaimed by a church
"established opinion" -- nope.
"without adequate grounds" -- nope
and while database conversations sometimes take on a religious fervor -- i don't think #2 applies either.
Just my 2cents -- and yes, I'll keep saying the same things over and over and over and maybe they'll stick (use binds, use binds, just join, databases are really good at inserting too -- audit trails are good, every other line of code should be DEBUG (trace, instrumentation) code -- all of these things people don't do, i'll just keep plugging away at it)
(yes, this answer is tongue in cheek -- actually, I just read "arrogantly dogmatic" as "consistent and committed to provide the information with proof")
Which would be faster
hawk, July 06, 2005 - 12:03 pm UTC
Dear Tom
Please advise which of the 2 methods would be efficient and faster.
Aim: I need to insert data into a table (P) on a daily basis, and also insert that data, in a different form, into another table (t).
I would be using data from table p for certain other purposes
method 1:
---------
For curref in (select a.*, b.* from a, b, c, d, e, f
where a.column = b.column
and some join conditions among all 6 tables)
loop
insert into p values (abv selected)
insert into t values (abv selected some cals)
insert into t values (abv selected but some diff cals)
end loop;
Method 2:
---------
insert into p (select a.*, b.* from a, b, c, d, e, f
where a.column = b.column
and some join conditions among all 6 tables);
for cur_ref in (select * from p where date_time = sysdate)
loop
insert into t values (abv selected)
insert into t values (abv selected but some diff cals)
end loop;
Please advise which of the abv 2 methods is efficient and faster.
Thankyou.
July 06, 2005 - 12:46 pm UTC
method three, multi-table insert, no code of course.
insert
when (1=1) then into p ( ... ) values ( ... )
when (whatever) then into t ( ... ) values ( ... )
.....
select a.*, b.* from a,b, c,d.......
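Fleshing the outline above out into a runnable sketch (table and column names are hypothetical, not taken from the question):

```sql
-- conditional multi-table insert: every row goes into P,
-- only rows satisfying the WHEN condition also go into T
insert all
  when 1 = 1 then
    into p ( id, amount )      values ( id, amount )
  when amount > 0 then
    into t ( id, calc_amount ) values ( id, amount * 2 )
select a.id, a.amount
  from a, b
 where a.id = b.id;
```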
A note
Hawk, July 06, 2005 - 1:36 pm UTC
Tom,
Great! But one thing: every row in the select needs to go to all the tables, hence there wouldn't be any when clause.
I hope your suggested method 3 is still the best way to do it.
Please comment.
Thankyou
July 06, 2005 - 2:30 pm UTC
sure is -- read about multi-table inserts; there is an INSERT ALL ..... form as well, with no when clauses.
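The unconditional form sends every selected row to every target -- a sketch shaped like Hawk's requirement (names hypothetical):

```sql
-- INSERT ALL with no WHEN clauses: each source row produces
-- one row in P and two differently-calculated rows in T
insert all
  into p ( id, amount ) values ( id, amount )
  into t ( id, amount ) values ( id, amount )
  into t ( id, amount ) values ( id, amount * -1 )
select a.id, a.amount
  from a, b
 where a.id = b.id;
```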
Thankyou very much TOM
Hawk, July 06, 2005 - 3:03 pm UTC
very interesting
Muhammad Ibrahim, July 06, 2005 - 10:43 pm UTC
Dear Tom,
I have a situation as below
create table t ( id number, from_date date, to_date date, amount number, expired varchar2(1));
I want to take a copy of the table t by id(column).
Process this data in the memory then write back
the same/new set of data to the table t and mark the
previous set as old. The processing includes many for
loops and validations, comparisons, lookups, etc.
Here is my question: to process the data, which is better -- a PL/SQL table or a global temporary table?
If I use a temp table I need to do lots of inserts and updates for the processing, and sometimes deletes too, inside nested for loops. Inserts are fine, but I am worried about the updates and deletes. Or is a PL/SQL table better for this kind of situation?
Regards,
Ibrahim.
July 07, 2005 - 9:04 am UTC
you missed the most important bit of data
how big.
and databases are really good at doing things -- lookups, for example, are called joins (they do them particularly well), as are validations and comparisons. You may well find you can actually do a ton of stuff in sql.
plsql table or global temp table
Muhammad Ibrahim, July 08, 2005 - 5:27 am UTC
eg:
----
main table which keeps the transactions:
---------------------------------------
create table cessions ( cession_id number(20) not null,
start_date date not null,
mode varchar2(10) not null,
end_date date not null,
pol_group_id number(10),
class_id number(10),
arrangement_id number(10),
sequence_number number(10),
premium number,
exch_rate number,
overwritten varchar2(1),
apply_copy varchar2(1));
alter table cessions add ( constraint cessions_pk primary key(cession_id,start_date,mode));
global temp table for processing:
--------------------------------
create global temporary table cessions_tmp ( cession_id number(20) not null,
start_date date not null,
mode varchar2(10) not null,
end_date date not null,
pol_group_id number(10),
class_id number(10),
arrangement_id number(10),
sequence_number number(10),
premium number,
exch_rate number,
overwritten varchar2(1),
apply_copy varchar2(1))
on commit preserve rows;
alter table cessions_tmp add ( constraint cessions_tmp_pk
primary key(cession_id,start_date,mode));
procedure merge_old_situation(pol_group_id in number)
is
begin
delete cessions_tmp;
for rpol in ( select pol_group_id
from main_pol
where ..) -- this can be one or more
--(select criteria may return min 1000 or more)
loop
insert into cessions_tmp( .... )
select ...
from cessions
where pol_group_id = rpol.pol_group_id;
end loop;
end merge_old_situation;
procedure process_re(pol_group_id in number
Is
begin
-- i don't want to manipulate the transaction table directly,
-- so i take a mirror image of the original table into the
-- temp table by some conditions (not all the data).
merge_old_situation(pol_group_id);
for rintervals in ( select start_date, end_date
from intervals )-- this is also a global temp table which
-- will have set of intervals
--(one or more, maybe 365 intervals per
-- year).
for rexisting_pol in ( select ...
from another_table
where rintervals.start_date between start_date
and end_date )
-- this another_table is a big table but the select
-- criteria may return min 100 or more
loop
-- here i take the new data which comes from rexisting_pol then
-- compare with cessions_tmp table
-- insert or update depending on necessity
update cessions_tmp
...
if sql%rowcount = 0 then
insert into cessions_tmp
...
end if;
for rsomething in (select ..
someother_bigtable -- more than a million records
where rintervals.start_date between start_date
and end_date )
loop
update cessions_tmp
...
if sql%rowcount = 0 then
insert into cessions_tmp
...
end if;
end loop;
end loop;
end loop;
-- finally i do merge for the new one with the old one based
-- on the primary key columns
merge into cessions a
using cessions_tmp
on ( a.cession_id = b.cession_id
and a.start_date = b.start_date
and a.mode = b.mode )
when matched then
update set a.end_date = b.end_date
...
when not matched then
insert (a.cession_id, a.....)
values (b.cession_id,b.... );
end process_re;
-- so my question is using global temp table(cessions_tmp) in this above
-- scenario is better or plsql table is better? please tell us your
-- opinion.
-- note: this is online transaction!
July 08, 2005 - 7:54 am UTC
for rpol in ( select pol_group_id
from main_pol
where ..) -- this can be one or more
--(select criteria may return min 1000 or more)
loop
insert into cessions_tmp( .... )
select ...
from cessions
where pol_group_id = rpol.pol_group_id;
end loop;
that should be a single sql statement -- why write code?
I've no clue if you have 1 row or 1000000000 rows or something in between.
My main concern for performance would be "how to remove the procedural slow by slow code and do as much in a single sql statement as possible"
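The single-statement rewrite of the merge_old_situation loop, sketched with the names from the post (selection criteria and column lists elided, as in the original):

```sql
-- one INSERT ... SELECT replaces the cursor loop entirely:
-- let the database join cessions to the qualifying pol_group_ids
insert into cessions_tmp ( cession_id, start_date /* , ... */ )
select c.cession_id, c.start_date /* , ... */
  from cessions c
 where c.pol_group_id in ( select m.pol_group_id
                             from main_pol m
                            where 1 = 1 /* same criteria as the loop */ );
```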
using seq can cause issues in insert all
Hawk, July 08, 2005 - 11:02 am UTC
Tom,
I am using Insert all as suggested and looks great but....
insert all
into t1 values(a,b,c...)
into t2 values (seq.nextval, d, f)
into t2 values (seq.nextval, d* -1 , f+1)
(select a, b, c, d, f
from
pp, qq
where ,....conditions)
Above, the issue is that I am using the same sequence twice for the same table, so the same seq.nextval goes into both rows -- which makes them duplicates for my table t2.
Limitation: I cannot use 2 different sequences.
Please suggest.
Thankyou.
July 08, 2005 - 12:55 pm UTC
I can only suggest you revisit your "limitation" or create the sequence as
create sequence seq increment by 2;
and use seq.nextval, seq.nextval+1
nextval is defined to be consistent that way -- constant -- if you refer to it a dozen times in a sql statement -- it'll be the same value a dozen times (same with currval). It is as if it were "bound" in -- like it was a bind variable row by row.
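The increment-by-2 suggestion, sketched against the hypothetical t1/t2/pp/qq example from the question:

```sql
create sequence seq increment by 2;

-- seq.nextval is constant within a single row of the statement,
-- so nextval and nextval+1 give the two T2 rows distinct keys
insert all
  into t1 values ( a, b, c )
  into t2 values ( seq.nextval,     d,      f     )
  into t2 values ( seq.nextval + 1, d * -1, f + 1 )
select a, b, c, d, f
  from pp, qq
 where 1 = 1 /* ...join conditions as in the question... */;
```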
This works
Hawk, July 08, 2005 - 12:44 pm UTC
Dear Tom,
I hope this works
function seq_nextval(seqname in varchar2)
return number is
seq_next number;
begin
-- select 'seqname'||.nextval into seq_next from dual;
execute immediate 'select '||seqname||'.nextval from dual'
into seq_next ;
return seq_next ;
end; -- function seq_nextval
July 08, 2005 - 1:06 pm UTC
not reliably -- a plan change can and will change the number of times a PL/SQL function called from SQL is invoked. Not recommended; you are relying on the function being called once per invocation, and that behavior can change over time.
the dynamic sql would not be necessary.
how's this?
Tyler, July 08, 2005 - 3:26 pm UTC
a possible solution to Hawk's problem?
SQL> drop sequence test_seq ;
Sequence dropped.
SQL> create sequence test_seq start with 1 increment by 1;
Sequence created.
SQL>
SQL> create or replace function test_the_seq return number as
2 hold_it number;
3 begin
4 select test_seq.nextval into hold_it from dual;
5 return hold_it;
6 end;
7 /
Function created.
SQL>
SQL>
SQL> SELECT test_the_seq, test_the_seq from dual;
TEST_THE_SEQ TEST_THE_SEQ
------------ ------------
1 2
SQL> SELECT test_the_seq, test_the_seq from dual;
TEST_THE_SEQ TEST_THE_SEQ
------------ ------------
3 4
SQL> SELECT test_the_seq, test_the_seq from dual;
TEST_THE_SEQ TEST_THE_SEQ
------------ ------------
5 6
SQL> SELECT test_the_seq, test_the_seq from dual;
TEST_THE_SEQ TEST_THE_SEQ
------------ ------------
7 8
SQL> SELECT test_seq.nextval from dual;
NEXTVAL
---------
9
July 08, 2005 - 6:03 pm UTC
today that might work
yesterday it might not
tomorrow it might stop
the day after it could work again.
there is nothing saying how many times a function written in plsql will be called from SQL.
Nothing.
answer is same as answer immediately above
?
Tyler, July 08, 2005 - 6:39 pm UTC
Hi Tom,
Could you please elaborate on this behavior? I'm not questioning your statement, but it does confuse me. Personally, I'd expect SQL to execute a function call once per reference, always and forever.
Thanks!
July 08, 2005 - 6:53 pm UTC
select 1+2, 1+2
from t;
how many times do we need to evaluate 1+2?
now, add in different plans, joins, a nested loop versus hash join, a subquery materialized to temp, etc etc etc etc.
Please do not ascribe to the non-procedural language SQL anything procedural at all.
Consider:
select * from t where x = f();
How many times would you expect f() to be called?
ops$tkyte@ORA10G> create table t ( x int );
Table created.
ops$tkyte@ORA10G> insert into t select rownum from all_users ;
38 rows created.
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> create or replace function f return number
2 as
3 begin
4 dbms_application_info.set_client_info(userenv('client_info')+1);
5 return 42;
6 end;
7 /
Function created.
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> exec dbms_application_info.set_client_info(0);
PL/SQL procedure successfully completed.
ops$tkyte@ORA10G> select * from t where x = f();
no rows selected
ops$tkyte@ORA10G> select userenv('client_info') from dual;
USERENV('CLIENT_INFO')
----------------------------------------------------------------
38
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> create index t_idx on t(x);
Index created.
ops$tkyte@ORA10G> exec dbms_application_info.set_client_info(0);
PL/SQL procedure successfully completed.
ops$tkyte@ORA10G> select * from t where x = f();
no rows selected
ops$tkyte@ORA10G> select userenv('client_info') from dual;
USERENV('CLIENT_INFO')
----------------------------------------------------------------
2
why 38? (there are 38 rows in there)
but why 2?
I can show you tricks to MINIMIZE the number of times a function is called, but you would be hard pressed to make it so that a function is called exactly as many times as you want -- USING SQL.
I would not rely on this behavior (at the very least I would dummy up parameters to this function and make sure it gets called with unique inputs -- eg: pass it rowid and something else -- so that the optimizer could never say "ah hah, don't need to keep calling this puppy over and over")
Why 2?
VA, July 08, 2005 - 7:14 pm UTC
That's very interesting -- why does f() get called 2 times when you add the index on t(x)? What does the index on t(x) have to do with evaluating the predicate where t.x=f()?
Also, you confused me with your double negative in your last paragraph: "so that the optimizer could never say 'ah hah, don't need to keep calling this puppy over and over'".
If I add a dummy parameter to the function like f(rowid,...), the optimizer will be forced to call the function for every row, right? Isn't this exactly what we were trying to avoid?
Thanks
July 09, 2005 - 8:47 am UTC
not really a double negative. The optimizer could say:
"ah hah, we do not have to call F() over and over, We'll call F() once and be done with it"
If you make the inputs into F() 'unique' (by adding inputs), the optimizer could never make this optimization.
IF you call a function AND every time you call it the inputs are unique, then the database would be forced to invoke it AT LEAST ONCE. Eg: you have your function f() that selects sequence.nextval and you want to call it AT LEAST twice per row:
select f( rowid, 1 ), f(rowid, 2 ), ....
from table;
rowid is unique in the table, so (rowid,1) is unique and (rowid,2) is unique -- there would be no way NOT to call f() at least twice per row. Stress the words "at least": it could call it lots more if it feels like it.
Why did it call it twice? Just because it does. It could call it once -- maybe in the future it will. Or maybe in the future it'll call it 3 times.
PL/SQL associative arrays
Quadro, September 21, 2005 - 12:07 am UTC
If you need to do simple lookups against small tables, PL/SQL associative arrays far outperform even hash clusters, both in speed (3x-10x better) and in scalability (latch activity drops by orders of magnitude).
So, if you need to do lookups in procedural code -- by all means use them. We use them broadly in our PL/SQL tariffication routines (too much logic to do it in SQL joins alone).
September 21, 2005 - 6:54 pm UTC
prove it.
if they are small, they hash in memory.
if they are small, they cannot be causing much lio.
I'll stick with Just Join, until shown otherwise.
Prove it
Quadro, September 22, 2005 - 3:04 am UTC
"prove it."
Sure. In our simple test case we will use PL/SQL table and hash cluster to lookup object_name by object_id.
SQL> select count(*) from user_objects;
COUNT(*)
----------
418
So, a hash cluster with 1024 keys will be more than enough.
SQL> create cluster hc_test
2 (
3 object_id number
4 ) size 512 hashkeys 1024;
Cluster created.
SQL> create table hash_lookup
2 (
3 object_id number,
4 object_name varchar2(30)
5 ) cluster hc_test (object_id);
Table created.
SQL> insert /*+ append */ into hash_lookup select object_id, object_name from user_objects;
420 rows created.
SQL> commit;
Commit complete.
SQL> declare
2 --type we will use for lookup table
3 type varchar2_table is table of varchar2(30) index by binary_integer;
4 l_lookup_table varchar2_table;
5
6 --table for storing object identifiers
7 type number_table is table of number index by binary_integer;
8 l_lookup_values number_table;
9
10 l_object_name varchar2(30);
11 begin
12 --Fill our lookup table
13 for cur in (select * from user_objects where object_id is not null)
14 loop
15 l_lookup_table(cur.object_id):=cur.object_name;
16 end loop;
17
18 --fill table with lookup values
19 select object_id bulk collect into l_lookup_values
20 from user_objects
21 where object_id is not null;
22
23 --first - lookup in hash cluster
24 runstats.rs_run1;
25 for i in 1 .. 100
26 loop
27 for j in 1 .. l_lookup_values.count
28 loop
29 select object_name into l_object_name
30 from hash_lookup
31 where object_id=l_lookup_values(j);
32 end loop;
33 end loop;
34 --second - lookup in associative array
35 runstats.rs_run2;
36 for i in 1 .. 100
37 loop
38 for j in 1 .. l_lookup_values.count
39 loop
40 l_object_name:=l_lookup_table(l_lookup_values(j));
41 end loop;
42 end loop;
43 runstats.rs_stop(1000);
44 end;
45 /
Run1 run in 213 hsecs
Run2 run in 3 hsecs
Run1 run in 7100% of run 2
*
STATISTIC RUN 1 RUN 2 DIFF
----------------------------------- ----------- ----------- -----------
STAT...no work - consistent read ge 41,900 0 -41,900
STAT...cluster key scans 41,900 0 -41,900
STAT...buffer is not pinned count 41,900 0 -41,900
STAT...cluster key scan block gets 41,900 0 -41,900
LATCH..library cache pin 83,810 2 -83,808
STAT...consistent gets from cache 41,907 8 -41,899
STAT...execute count 41,901 1 -41,900
LATCH..simulator lru latch 2,300 0 -2,300
STAT...recursive calls 41,902 1 -41,901
STAT...consistent gets 41,907 8 -41,899
STAT...session logical reads 41,922 24 -41,898
LATCH..cache buffers chains 83,883 84 -83,799
LATCH..simulator hash latch 2,300 0 -2,300
LATCH..library cache 83,812 2 -83,810
STAT...Cached Commit SCN referenced 40,900 0 -40,900
STAT...calls to get snapshot scn: k 41,902 1 -41,901
*
LATCH totals RUN 1 RUN 2 PCT
----------------------------------- ----------- ----------- -----------
Latch totals versus runs 256,381 110 233,073.64%
PL/SQL procedure successfully completed.
So - the difference is tremendous?
September 22, 2005 - 1:44 pm UTC
you missed my point.
"JUST JOIN"
you would not be doing this sql in a loop, you would have JOINED to this data once.
You have some query you are processing, doing a slow by slow process apparently and doing a lookup on each row.
My point is JUST JOIN, don't do a lookup for each row, just JOIN and let SQL do the look up.
I agree, if you are going to run a little single row lookup over and over (slow by slow), this helps -- however, JUST JOIN in the first place.
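Using the tables from the benchmark above, the "just join" version collapses the 41,900 single-row lookups into one statement:

```sql
-- One pass over both tables: the join resolves every object_name
-- at once -- no per-row recursive SQL, no loop in PL/SQL
select u.object_id, h.object_name
from user_objects u, hash_lookup h
where u.object_id = h.object_id;
```

(The point being: the lookup disappears into the query that needed the data in the first place.)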
Need help in filtering values from pl/sql Table
Mave, September 29, 2005 - 5:09 pm UTC
I have an array [Associative array] with information like
Type rMyArray is record (empno integer,
category integer);
Type tMyArray is table of rMyArray index by pls_integer;
Myarray tMyArray;
-- Values in this array
MyArray(1):={1234,1}
MyArray(2):={1234,2}
MyArray(3):={1234,3}
MyArray(4):={1234,4}
MyArray(5):={1234,5}
MyArray(6):={1234,6}
MyArray(7):={2345,1}
MyArray(8):={2345,2}
MyArray(9):={2345,3}
MyArray(10):={2345,4}
MyArray(11):={2345,5}
My requirement is to filter this array by each empno [in this case 2 employees]
and dump them in another array [optional] called emparray and do some processing:
loop over the first empno [=1234], fetch all the values for this empno and process them, then
loop over the second empno [=2345] and fetch all the values for that empno.
How can I achieve this?
Thanks for your time and help,
September 30, 2005 - 8:43 am UTC
you just answered your own question with the pseudo code?
Filter PL/SQL array by value
Quadro, September 29, 2005 - 8:08 pm UTC
This will have linear complexity -- pretty much unscalable and slow.
Instead of iterating, use the lookup value as the key. In your case you can nest two PL/SQL tables: the first, keyed by empno, points to a second table holding the categories.
declare
type number_table is table of number /* categories */ index by binary_integer;
type number_table2 is table of number_table index by binary_integer /*empno*/;
Myarray number_table2;
begin
Myarray(1234)(1):=1;
Myarray(1234)(2):=2;
Myarray(1234)(3):=3;
Myarray(1234)(4):=4;
Myarray(1234)(5):=5;
Myarray(1234)(6):=6;
Myarray(2345)(1):=1;
Myarray(2345)(2):=2;
Myarray(2345)(3):=3;
Myarray(2345)(4):=4;
Myarray(2345)(5):=5;
--So, when you need to filter by empno, you just use it as key
if (Myarray.exists(1234)) -- see, if such empno exists
then
for i in 1 .. Myarray(1234).count
loop
--process them here
end loop;
end if;
end;
FIlter PL/SQL array
Quadro, September 29, 2005 - 8:17 pm UTC
Or wait for a while -- in a moment Tom himself will be here, pointing out that the best way to get filtered data is to use SQL and *JUST a WHERE* clause :-)
Thanks for the replies..
A reader, September 30, 2005 - 9:38 am UTC
Tom, I got the pseudo code but am not able to put it into code. I could do it like Quadro mentioned here [thanks Quadro], but he said myarray.exists(1234) -- how do I get that value 1234?
if i have two arrays
myarray [with all the values] (empno,category)
myarray1 (empno)
can i say like this ?
for i in 1..myarray1.count
loop
if myarray.exists(myarray1(i).empno)
then
do process --
end if;
end loop
Thanks a bunch.
September 30, 2005 - 11:07 am UTC
He is saying to use an entirely different structure -- do you want to do that?
You should be able to split this into two arrays easily yourself: if you can fill ONE array, you can read that array and put its contents into two arrays?
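One way to bucket the single array by empno, along the lines Quadro suggested (type and variable names here are illustrative):

```sql
declare
  type rRow     is record (empno integer, category integer);
  type tRows    is table of rRow index by pls_integer;
  type tCatList is table of number index by pls_integer;    -- categories
  type tByEmp   is table of tCatList index by pls_integer;  -- keyed by empno
  myarray tRows;
  by_emp  tByEmp;
  n       pls_integer;
begin
  myarray(1).empno := 1234; myarray(1).category := 1;
  myarray(2).empno := 1234; myarray(2).category := 2;
  myarray(3).empno := 2345; myarray(3).category := 1;

  -- one pass: append each category to the list for its empno
  for i in 1 .. myarray.count
  loop
    if by_emp.exists(myarray(i).empno) then
      n := by_emp(myarray(i).empno).count + 1;
    else
      n := 1;
    end if;
    by_emp(myarray(i).empno)(n) := myarray(i).category;
  end loop;

  -- process one employee's categories directly by key -- no scanning
  if by_emp.exists(1234) then
    for j in 1 .. by_emp(1234).count
    loop
      dbms_output.put_line(by_emp(1234)(j));
    end loop;
  end if;
end;
/
```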
I can split into two arrays but..
A reader, September 30, 2005 - 11:16 am UTC
How do I check the value in an array? I am not quite following. Could you please elaborate with a simple example?
That would be really helpful..
September 30, 2005 - 11:58 am UTC
if ( myarray(i).empno = something )
??
you created the array? have you checked out the plsql guides, they have many examples.
Falling in infinite loop??
A reader, September 30, 2005 - 1:13 pm UTC
Ok, this is what i did [just testing ]
I am trying to loop over just one empno and it's falling into an infinite loop??
It shows me when it's not equal to v_empno [1234], but it's not exiting.
Can you point out the reason?
Thanks for all your help..
declare
Type Emparray is record (empno integer,
ename varchar2(20));
Type temparray is table of Emparray index by pls_integer;
emp_array temparray;
v_empno integer:=0;
begin
-- select empno,ename bulk collect into emp_array from emp;
emp_array(1).empno:=1234;
emp_array(1).ename:='name1';
emp_array(2).empno:=1234;
emp_array(2).ename:='name2';
emp_array(3).empno:=1234;
emp_array(3).ename:='name3';
emp_array(4).empno:=1234;
emp_array(4).ename:='name4';
emp_array(5).empno:=2345;
emp_array(5).ename:='name5';
for i in 1..emp_array.count
loop
--v_empno:=emp_array(i).empno;
v_empno:=1234;
dbms_output.put_line(emp_array(i).empno);
loop
if emp_array(i).empno=v_empno
then
dbms_output.put_line(emp_array(i).empno);
else
--dbms_output.put_line('Not equal '||emp_array(i).empno);
exit ;
end if;
end loop;
end loop;
end;
September 30, 2005 - 2:22 pm UTC
I don't get the inner loop. It is an infinite loop for sure whenever they are equal.
Why do you have an inner loop at all?
Just testing
A reader, September 30, 2005 - 2:32 pm UTC
The inner loop is just for testing. I am trying to insert each value back into a log table for all those empnos that are equal to the value v_empno.
So, why is it an infinite loop? I am exiting if they are not equal.
Thanks,
September 30, 2005 - 2:38 pm UTC
but when they are equal.... infinite loop. You are not incrementing anything; it is the same condition over and over and over again.
loop
if emp_array(i).empno=v_empno
then
dbms_output.put_line(emp_array(i).empno);
else
--dbms_output.put_line('Not equal '||emp_array(i).empno);
exit ;
end if;
end loop;
no matter how many billions of times you execute that - (i) isn't changing.
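For what the block apparently intends, the inner loop can simply be dropped -- the outer FOR loop already advances i (same names as the block above):

```sql
for i in 1 .. emp_array.count
loop
  -- print only the rows matching v_empno; no inner loop needed
  if emp_array(i).empno = v_empno
  then
    dbms_output.put_line(emp_array(i).empno);
  end if;
end loop;
```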
I think i is changing from outer loop
A reader, September 30, 2005 - 3:01 pm UTC
Tom, i is coming from the outer loop [for i in 1..], isn't it?
October 01, 2005 - 8:15 pm UTC
yes, but the inner loop is inside the outer loop -- once the inner loop is entered, forget about the outer loop, there is no escape!
Sorry. I got it
A reader, September 30, 2005 - 3:02 pm UTC
Sorry, I got what you are trying to say.
Thanks.
Rocketship vs PL/SQL lookup
Ian, October 04, 2005 - 1:47 pm UTC
Tom,
nice discussion and some very useful info.
Here's the skinny -- I have inherited some code often referred to as "the rocketship" because of the shape of its DECODE lookups. Unfortunately, the rocketship could use a few boosters. Basically it creates an 8 million row flat file from a SQL query that looks up rates to apply to a sales amount. I initially rewrote this to do the lookups in a PL/SQL cursor and then tried using PL/SQL tables, but both were slower, which agrees with this thread.
To simplify things I stripped out 90% of the code from the query below... it's pretty ugly and has 4 DECODE rockets.
The query attempts 8 different scenarios to acquire an accrual rate from the same rate table... for example, it first tries to get the rate from S1 if it finds a rate based on CUST_ID and UPC_CODE... then it will try S2 based on CUST_ID and BRAND, and so on through S8.
My question is: does using DECODEs joined to 8 scenario tables seem like a reasonable approach to doing lookups, or is there perhaps another way to skin this cat? Thanks for your time.
select account_num,
trim(to_char(sum(
decode(S1.ACCRUAL_RATE,NULL,
decode(S2.ACCRUAL_RATE,NULL,
decode(S3.ACCRUAL_RATE,NULL,
decode(S4.ACCRUAL_RATE,NULL,
decode(S5.ACCRUAL_RATE,NULL,
decode(S6.ACCRUAL_RATE,NULL,
decode(S7.ACCRUAL_RATE,NULL,
decode(S8.ACCRUAL_RATE,NULL,0,S8.ACCRUAL_RATE * sales_amt ),
S7.ACCRUAL_RATE * sales_amt ),
S6.ACCRUAL_RATE * sales_amt ),
S5.ACCRUAL_RATE * sales_amt ),
S4.ACCRUAL_RATE * sales_amt ),
S3.ACCRUAL_RATE * sales_amt ),
S2.ACCRUAL_RATE * sales_amt ),
S1.ACCRUAL_RATE * sales_amt )
),'9999999990.00'))
from weely_sales_summary x,
-- Scenario 1
( SELECT CUST_ID, UPC_CODE, ACCRUAL_RATE
FROM RATE_TABLE WHERE CUST_ID != '0000000000' and UPC_CODE != 'XXXXXXXXXXXX') S1,
-- Scenario 2
(select CUST_ID, brand_code, family_code, ACCRUAL_RATE
from RATE_TABLE
where CUST_ID != '0000000000' and brand_code != 'XXX' and family_code != 'XXX') S2,
-- Scenario 3
(select CUST_ID,brand_code, ACCRUAL_RATE
from RATE_TABLE where CUST_ID != '0000000000' and brand_code != 'XXX' and family_code = 'XXX') S3,
-- Scenario 4
(select CUST_ID,CAT_CD, ACCRUAL_RATE
from RATE_TABLE where CUST_ID != '0000000000' and brand_code = 'XXX' and family_code = 'XXX') S4,
-- Scenario 5
( SELECT UPC_CODE, ACCRUAL_RATE
FROM RATE_TABLE WHERE CUST_ID = '0000000000' and UPC_CODE != 'XXXXXXXXXXXX') s5,
-- Scenario 6
(select brand_code, family_code, ACCRUAL_RATE
from RATE_TABLE where CUST_ID = '0000000000' and brand_code != 'XXX'
and family_code != 'XXX' and UPC_CODE = 'XXXXXXXXXXXX' and CS_UPC_VRSN_NR_CD = 'XX') S6,
-- Scenario 7
(select CAT_CD,brand_code, ACCRUAL_RATE
from RATE_TABLE where CUST_ID = '0000000000' and brand_code != 'XXX' and family_code = 'XXX') S7,
-- Scenario 8
(select CAT_CD, ACCRUAL_RATE
from RATE_TABLE where CUST_ID = '0000000000' and brand_code = 'XXX' AND family_code = 'XXX') S8
where x.CUST_ID = S1.CUST_ID (+)
and x.UPC_CODE = S1.UPC_CODE(+)
--
AND x.CUST_ID = S2.CUST_ID (+)
and x.BRND_CD = S2.brand_code (+)
and x.family_code = S2.family_code (+)
--
and x.CUST_ID = S3.CUST_ID (+)
and x.BRND_CD = S3.brand_code (+)
--
and x.CUST_ID = S4.CUST_ID (+)
and x.cat_cd = S4.CAT_CD (+)
--
and x.UPC_CODE = S5.UPC_CODE (+)
--
and x.BRND_CD = S6.brand_code (+)
and x.family_code = S6.family_code (+)
--
and x.BRND_CD = S7.brand_code (+)
--
and x.CAT_CD = S8.CAT_CD (+)
group by account_num
;
October 04, 2005 - 6:15 pm UTC
ops$tkyte@ORA9IR2> create table t ( x int, y int, z int );
Table created.
ops$tkyte@ORA9IR2> insert into t values ( null, null, 1 );
1 row created.
ops$tkyte@ORA9IR2> insert into t values ( null, 2, 1 );
1 row created.
ops$tkyte@ORA9IR2> insert into t values ( 3, 2, 1 );
1 row created.
ops$tkyte@ORA9IR2> insert into t values ( null, null, null );
1 row created.
ops$tkyte@ORA9IR2> select coalesce(x,y,z,0) from t;
COALESCE(X,Y,Z,0)
-----------------
1
2
3
0
(nicer than decode...)
coalesce
Ian, October 05, 2005 - 12:19 pm UTC
thanks for the input.
Looks like a useful function; unfortunately I forgot to mention that we are running on 8.1.7, and COALESCE isn't available until 9.0.1... my bad.
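On 8.1.7, nested NVL calls give the same null-skipping behavior as COALESCE (shown against the t(x,y,z) table from Tom's example above):

```sql
-- coalesce(x, y, z, 0) rewritten for 8.1.7:
select nvl(x, nvl(y, nvl(z, 0))) from t;
```

One difference worth knowing: NVL evaluates both of its arguments, so unlike COALESCE it does not short-circuit -- harmless for plain columns, but something to watch if the fallbacks are expensive function calls.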
Procedural vs set-based
A reader, October 25, 2005 - 4:36 pm UTC
Tom has proven many times that rewriting procedural code into set-based code is faster.
I was wondering... how does Oracle implement/process the set-based approach? Any chance that Oracle internally breaks it up into procedural code? As far as I know, you need special hardware (e.g. a vector computer) to actually process arrays as if they were scalars. How does Oracle process a statement such as:
SELECT * FROM TABLE_A
MINUS
SELECT * FROM TABLE_B
This construct is much more efficient than trying to do it procedurally. Still, how can Oracle execute the query without looking at the tables row by row?
Hopefully, the answer doesn't violate any proprietary laws. :-)
October 26, 2005 - 11:31 am UTC
The point is - everything is procedural code at the lowest level.
We can just optimize the overall SQL / data processing request - if you give us a single sql statement.
We cannot optimize your procedure code that executes thousands or billions of sql statements.
Ask us to do all of the work - instead of you trying to do the work in your own code.
Using dbms_alert to signal data changes
malcolm, July 18, 2006 - 6:10 am UTC
Hi Tom,
I'm using pl/sql associative arrays to perform lookups on nearly static data - perhaps an update every hour or so. On 9iR2.
I completely agree with you that it's better to join to any lookup tables and let the optimiser do its stuff, but that's not really relevant to what I am doing here. The data I am caching in arrays is just config data -- just to control the flow of processing.
The (age-old) problem is that I need to detect when the cached data has changed, and reload. This is especially relevant as some sessions could be members of connection pools and almost never disconnect.
In a followup a third of the way down the page, you suggest using a trigger and populating an index-organized table. What do you think of the trigger using dbms_alert.signal(alert), and each time the cache is checked, using dbms_alert.waitone(alert, message, status, 0) instead?
I've traced dbms_alert.waitone and it uses a table called sys.dbms_alert_info under the covers. I thought this might be more efficient, because if the table is owned by sys, it goes into the dictionary cache, and the dictionary cache must be more efficient than the buffer cache, because it caches formatted rows, not blocks.
Or is my assumption that all tables owned by sys go into the dictionary cache incorrect? In which case, exactly which tables are cached in the dictionary cache?
July 18, 2006 - 8:33 am UTC
it does not go into the dictionary cache. It is just a table. Into the dictionary cache goes our metadata -- not every block from sys-owned tables.
Kumar, December 05, 2006 - 2:00 am UTC
There is a limitation in using PL/SQL arrays. In my case the maximum subscript an associative array can have is 2147483647 (the PLS_INTEGER maximum) on my database version (9.2.0.7).
I would like to know how I can overcome this limitation.
I am trying to use a PL/SQL table lookup for loading fact records. I have 10 dimensions. These dimension tables will act as lookup tables. Each dimension will have a maximum of 10,000 records (even though I have only 10,000 records, a few of my subscript values exceed 2^31). So when loading the fact records I would like to avoid reading the dimension tables for each record and instead use a PL/SQL table as the lookup. Please advise.
== My script
declare
type NumberType is table of number index by pls_integer;
l_doper_nkid NumberType;
l_doper_skid NumberType;
l_doper_lkup NumberType;
v number :=8888888888;
Begin
select sk_destination_operator_id,destination_operator_id_nk bulk collect
into l_doper_skid,l_doper_nkid
from dt_destination_operator_v;
for i in 1 .. l_doper_nkid.count
loop
dbms_output.put_line('test '||l_doper_skid(i));
l_doper_lkup(l_doper_nkid(i)) := l_doper_skid(i);
end loop;
dbms_output.put_line('sk_operator_id value for '||v||' :'||l_doper_lkup(v));
Exception
when no_data_found then
dbms_output.put_line('Operator Surrogate Key not found for '||v);
End;
=================
December 05, 2006 - 9:55 pm UTC
you know, if you have more than 2gb of subscripts - you really ought to....
well.....
rethink?
that would not be a "smart" thing.
(databases JOIN really good, you might consider using them to do that?)
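The join Tom is hinting at might look like this -- the dimension view and its columns come from the script above; the staging and fact table names and the measure column are hypothetical:

```sql
-- One set-based statement resolves every surrogate key:
-- no per-row dimension reads, no 2GB-of-subscripts arrays
insert into fact_table (sk_destination_operator_id, measure)
select d.sk_destination_operator_id, s.measure
from staging_fact s, dt_destination_operator_v d
where s.destination_operator_id_nk = d.destination_operator_id_nk;
```

With 10 dimensions, the same pattern extends to a 10-way join from the staging table to each dimension view.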
Hard-coded CASE vs SQL lookup
Sunj, March 09, 2007 - 11:06 am UTC
Compared with a table lookup and a PL/SQL lookup, I find hard-coded CASE statements are the fastest. I do not like the idea of hard-coding, but I have to pick performance over maintainability.
My example is a country lookup for a web-service application. Although the per-search timing is not significant, in case a user runs my web service in a batch I need my service to be as lean as possible.
My country table has been relatively static.
I generated this block of CASE statement of 250 conditions.
CREATE OR REPLACE FUNCTION map_iso_country_cd (iso_code VARCHAR2)
RETURN NUMBER
IS
vcountryid PLS_INTEGER;
BEGIN
CASE iso_code
WHEN 'VCT' THEN vcountryID := 63;
WHEN 'SAU' THEN vcountryID := 64;
WHEN 'SCO' THEN vcountryID := 65;
WHEN 'USA' THEN vcountryID := 1;
WHEN 'CAN' THEN vcountryID := 2;
WHEN 'AFG' THEN vcountryID := 7;
-- about 200 more country codes
WHEN 'ZWE' THEN vcountryid := 250;
END CASE;
RETURN vcountryid;
END;
This is my lookup version of function:
CREATE OR REPLACE FUNCTION map_iso_country_cd_TAB (iISO_code VARCHAR2)
RETURN NUMBER
IS
vcountryid PLS_INTEGER;
BEGIN
SELECT country_id
INTO vcountryid
FROM country c
WHERE iso_code = iISO_code;
RETURN vcountryid;
END;
A million executions came back in .06 seconds for the first one, whereas the second one came back in 50 seconds.
Could you suggest a more civilized approach, please? :)
thanks
March 12, 2007 - 1:46 pm UTC
My idea is.....
do away with the lookup all together....
JUST JOIN
Databases where BORN TO JOIN, just join
see my comments above, for example:
... I betcha using a single INSERT -- removing the procedural code all together -- would get rid of the
rest of the time. .....
REMOVE the CODE

More food for thought.... YAWTDI
yet another way to do it.......
ops$tkyte%ORA10GR2> /*
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> drop table lookup;
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create table lookup
ops$tkyte%ORA10GR2> ( c_code varchar2(3) primary key,
ops$tkyte%ORA10GR2> c_id number
ops$tkyte%ORA10GR2> )
ops$tkyte%ORA10GR2> organization index
ops$tkyte%ORA10GR2> /
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into lookup
ops$tkyte%ORA10GR2> select *
ops$tkyte%ORA10GR2> from (
ops$tkyte%ORA10GR2> select substr( object_name, 1, 3 ), max(object_id)
ops$tkyte%ORA10GR2> from all_objects
ops$tkyte%ORA10GR2> group by substr(object_name,1,3)
ops$tkyte%ORA10GR2> order by dbms_random.random
ops$tkyte%ORA10GR2> )
ops$tkyte%ORA10GR2> where rownum <= 225
ops$tkyte%ORA10GR2> /
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> declare
ops$tkyte%ORA10GR2> l_case long := 'case p_iso_code ' || chr(10);
ops$tkyte%ORA10GR2> begin
ops$tkyte%ORA10GR2> for x in ( select * from lookup )
ops$tkyte%ORA10GR2> loop
ops$tkyte%ORA10GR2> l_case := l_case || ' when ''' || x.c_code || ''' then l_c_id := ' || x.c_id || ';' || chr(10);
ops$tkyte%ORA10GR2> end loop;
ops$tkyte%ORA10GR2> l_case := l_case || 'end case;' || chr(10);
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> execute immediate q'|
ops$tkyte%ORA10GR2> create or replace function map_iso_country_cd( p_iso_code in varchar2 )
ops$tkyte%ORA10GR2> return number
ops$tkyte%ORA10GR2> as
ops$tkyte%ORA10GR2> l_c_id number;
ops$tkyte%ORA10GR2> begin |' || l_case || '
ops$tkyte%ORA10GR2> return l_c_id;
ops$tkyte%ORA10GR2> end;';
ops$tkyte%ORA10GR2> end;
ops$tkyte%ORA10GR2> /
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create or replace function map_iso_country_cd_tab( p_c_code in varchar2 )
ops$tkyte%ORA10GR2> return number
ops$tkyte%ORA10GR2> as
ops$tkyte%ORA10GR2> l_c_id number;
ops$tkyte%ORA10GR2> begin
ops$tkyte%ORA10GR2> select c_id into l_c_id
ops$tkyte%ORA10GR2> from lookup
ops$tkyte%ORA10GR2> where c_code = p_c_code;
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> return l_c_id;
ops$tkyte%ORA10GR2> end;
ops$tkyte%ORA10GR2> /
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create or replace package my_pkg
ops$tkyte%ORA10GR2> as
ops$tkyte%ORA10GR2> type array is table of number index by varchar2(3);
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> g_lookup array;
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> function map_iso_country_cd( p_c_code in varchar2 ) return number;
ops$tkyte%ORA10GR2> end;
ops$tkyte%ORA10GR2> /
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create or replace package body my_pkg
ops$tkyte%ORA10GR2> as
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> function map_iso_country_cd( p_c_code in varchar2 )
ops$tkyte%ORA10GR2> return number
ops$tkyte%ORA10GR2> is
ops$tkyte%ORA10GR2> begin
ops$tkyte%ORA10GR2> return g_lookup(p_c_code);
ops$tkyte%ORA10GR2> end;
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> begin
ops$tkyte%ORA10GR2> for x in (select * from lookup)
ops$tkyte%ORA10GR2> loop
ops$tkyte%ORA10GR2> g_lookup(x.c_code) := x.c_id;
ops$tkyte%ORA10GR2> end loop;
ops$tkyte%ORA10GR2> end;
ops$tkyte%ORA10GR2> /
ops$tkyte%ORA10GR2> */
ops$tkyte%ORA10GR2> connect /
Connected.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> set timing on
ops$tkyte%ORA10GR2> declare
2 type array is table of varchar2(3) index by binary_integer;
3 l_data array;
4 l_num number;
5 begin
6 select c_code bulk collect into l_data from lookup;
7
8 for i in 1 .. 1000
9 loop
10 for i in 1 .. l_data.count
11 loop
12 l_num := map_iso_country_cd( l_data(i) );
13 end loop;
14 end loop;
15 end;
16 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:01.66
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> declare
2 type array is table of varchar2(3) index by binary_integer;
3 l_data array;
4 l_num number;
5 begin
6 select c_code bulk collect into l_data from lookup;
7
8 for i in 1 .. 1000
9 loop
10 for i in 1 .. l_data.count
11 loop
12 l_num := map_iso_country_cd_tab( l_data(i) );
13 end loop;
14 end loop;
15 end;
16 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:08.00
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> declare
2 type array is table of varchar2(3) index by binary_integer;
3 l_data array;
4 l_num number;
5 begin
6 select c_code bulk collect into l_data from lookup;
7
8 for i in 1 .. 1000
9 loop
10 for i in 1 .. l_data.count
11 loop
12 l_num := my_pkg.map_iso_country_cd( l_data(i) );
13 end loop;
14 end loop;
15 end;
16 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.25
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>
Hard-coded CASE vs SQL lookup
Sunj, March 21, 2007 - 5:42 pm UTC
Thanks Tom, using a PL/SQL array/table is a cleaner alternative.
But the .25 seconds for the array look-ups still does not come close to the hard-coded CASE statement's timing of .06 seconds for a million lookups.
Also, each time a new user session calls the package, wouldn't the array be loaded all over again? I agree this will definitely be efficient while you stay within the same Oracle session, especially in PL/SQL loops.
March 21, 2007 - 8:06 pm UTC
you seem to have missed the crux here.
case was SLOWER than associative array.
My machine is not your machine
My times are not your times.
CASE:
12 l_num := map_iso_country_cd( l_data(i) );
Elapsed: 00:00:01.66
TABLE:
12 l_num := map_iso_country_cd_tab( l_data(i) );
Elapsed: 00:00:08.00
ASSOCIATIVE ARRAY:
12 l_num := my_pkg.map_iso_country_cd( l_data(i) );
Elapsed: 00:00:00.25
And how many sessions does it take to process this?!?! If the overhead of loading the table becomes an issue -- you have a seriously bad architecture problem that we need to fix first!!!
Memory Lookups vs. Database Lookups
Towhidun Nabi, May 09, 2007 - 8:49 am UTC
Hello,
I was wondering if there is anything wrong with memory lookups.
I must say I am a novice programmer compared to some of the participants in the discussions above.
I am working at a selling company where we now have 13 depots around the country. We have about 250+ items to sell, about 20,000+ customers per depot, and 1000-1500 invoices per depot per day. We have a defined price for each product, some special prices for a few customers, some bonus products, various discount policies and a few other tidbits.
I am developing an invoicing program with Oracle Forms 6i, a normal form (not web based, given my lack of knowledge of web-based programs). On form init (new form instance), I load the item table, regular price table, bonus products table, special party and sale-stop products tables... all into the client's memory, as several form-level populated record groups that hold the data.
When the user enters an item code, it is verified (looked up) against the memory-based record groups. I developed a binary search module to search the record group, as the group is sorted on item codes. Since the number of products is small (< 300), it is working quite OK.
My point is, in the whole program I never look up the database for the name of a product, as everything is in memory already. Memory is getting cheaper; we have 256MB of RAM in each client PC, so pre-loading some 10-20k rows into memory does not seem to cause much trouble. The only trouble is when one or two new products are entered and the user gets an Invalid Item Code; they have a button in the corner of the form saying "Refresh Code", and they are OK again.
I do all the lookups for product names, bonus info and special prices in memory with binary search, as most of this is static for a period of time (like a month); it is refreshed when the user runs the program, and the user rarely needs to refresh manually.
The only thing I look up in the database during data entry is the STOCK of a product, of course.
As I see it, I do the least required database access and the fewest network round trips, for faster response at the user end... blah blah blah. I do call a stored procedure to get the STOCK, by the way.
I was wondering if there's anything seriously wrong with the approach.
Towhidun Nabi.
IT Executive
Aristopharma Ltd.
Dhaka
Bangladesh.
May 11, 2007 - 9:49 am UTC
you still NEED to have the database constraints
so you have data integrity.
there is nothing inherently wrong with client side validation - it is just a fact that when you post to the database you MUST validate the data, using the database
else you will have data integrity problems.
Re: Memory Lookups vs. Database Lookups
Towhidun Nabi, May 12, 2007 - 5:47 am UTC
Dear Sir,
Thank you for your response. Don't get me wrong, I am very much a fan of this site and of you. This is by far one of the best and most genuine sites; it is just not easy for newcomers.
I do have database constraints in the database as foreign keys, so it is not possible for the user to enter an invalid item code into the database; when we post the data, it is validated against the foreign keys.
I only try to minimize the chance of invalid data before even posting it, as the pre-loaded client-side record groups (sort of arrays) are usually valid; the chance of the data becoming invalid after load is very, very minimal. There is the chance of a missing code, but that is minimized too, as the clients usually get a message beforehand when new products are launching. All they have to do is refresh the codes, or re-enter the program, and this happens only once or twice a month.
So I confidently use memory lookups for all the client-level validation, knowing it will be validated against the database upon data POST / insert / update.
For queries and reports, most of the time I try my level best NOT to join in complex queries, but to supply the data from memory.
So Instead of
Select i.prod_code, p.prod_name, c.customer_name,
.......
..
..
..
from
Issue_Table i,
Product_table p,
Customer_table c,
Market_table m,
Territorie_table t,
sr_names sr, -- delivery personal
..
.
where i.inv_date >= to_date('01-MAY-2007', 'dd-MON-rrrr')
and i.inv_date < to_date('01-JUN-2007', 'dd-MON-rrrr')
and i.prod_code = p.prod_code
and i.cust_code = c.cust_code
and c.market_code = m.market_code
and m.territorie_code = t.territorie_code
and i.sr_code = sr.sr_code
...
...
I simply do
show_status('Initializing...');
rg_id := find_group('Query_Data');
if not id_null(rg_id) then
delete_group(rg_id);
end if;
rg_id := create_group('Query_Data');
gc_id := add_group_column(rg_id, 'Prod_Code', CHAR_COLUMN, 9);
gc_id := add_group_column(rg_id, 'Prod_Name', CHAR_COLUMN, 50);
gc_id := add_group_column(rg_id, 'Customer_Name', CHAR_COLUMN, 100);
gc_id := add_group_column(rg_id, 'Market_Name', CHAR_COLUMN, 100);
gc_id := add_group_column(rg_id, 'Territory_MR_Name', CHAR_COLUMN, 100);
gc_id := add_group_column(rg_id, 'SR_Name', CHAR_COLUMN, 100);
..
..
..
all the fields necessary.....
..
..
show_status('Loading Issue Data...');
rg_issue := find_group('issue');
if not id_null(rg_issue) then
delete_group(rg_issue);
end if;
M_Issue_Table := 'Issue_Table';
-- change issue table name for history data
if history_data then
M_Issue_Table := 'history_Issue_Table_' || month_year;
end if;
rg_issue := create_group_from_query('issue',
'select * from ' || M_Issue_Table || ' '
|| 'where inv_date >= to_date(''' || M_StartDateStr || ''', ''dd-MON-rrrr'') '
|| 'and inv_date < to_date(''' || M_EndDateStr2 || ''', ''dd-MON-rrrr'') '
|| 'order by invoiceno', FORM_SCOPE, 100);
if id_null(rg_issue) then
message('Error: ' || dbms_error_text);
..
raise form_trigger_failure;
end if;
M_ErrCode := populate_group(rg_issue);
M_NoOfIssue := get_group_row_count(rg_issue);
M_No_OF_Query_Data := 0;
for i in 1..M_NoOfIssue loop
if mod(i, 100) = 0 then
show_status('Processing Issue Data...' || i || ' of ' || M_NoOfIssue);
end if;
M_Prod_Code := get_group_char_cell('issue.prod_code', i);
M_Prod_Name := ' ';
M_Pos := bin_search_char('Products', M_NoOfProducts, M_Prod_Code);
if M_Pos <> 0 then
M_Prod_Name := get_group_char_cell('Products.prod_name', M_Pos);
else
show_error ('Product Code: ' || M_Prod_Code || ' NOT found, may need to refresh');
return;
end if;
.. Territories
...
... SR Name
.. Market Name..
M_No_OF_Query_Data := M_No_OF_Query_Data + 1;
add_group_row('Query_Data', End_OF_Group);
set_group_char_cell('query_data.prod_code', M_No_OF_Query_Data, M_Prod_Code);
set_group_char_cell('query_data.prod_name', M_No_OF_Query_Data, M_Prod_Name);
.. market name
.. territorie name
.. sr name
.. delivery schedule
....
end loop;
-- finally we've got query_data to work with
-- just send it to the report as data_parameter
A temporary non-query client-side record group is created, and the data is populated through a loop with binary searches. I never saw that error ("Prod code ... NOT found, may need to refresh"), as the groups are refreshed on form init (when-new-form-instance). I asked the users to sue me (or click refresh) if they ever see it.
I could have done it with a global temporary table, but again, that would disturb the database, as we would have to select from it again. Inserting into a temporary table still requires SQL <=> PL/SQL switching, rollback segments (I don't know for sure, maybe), and network round trips (of course).
As I see it, I am issuing a very simple, straightforward
'select * from Issue_Table where inv_date between start_date and end_date' for the whole report (inv_date is indexed), whereas it would otherwise have been a very heavily loaded query with 9 to 10 joins, most of them with the (+) sign for safety against losing rows, since I need all the data in issue_table in the date range and cannot afford to drop invoices just because of a missing sr_code, as not all invoices will have a sr_code.
The 'Issue_Table' name is variable: the user may want to view history data, and every month we move the data into a history table with MONTH_YEAR as a suffix.
And also, I try to avoid
for rec in (select * from table) loop
...
end loop;
as I have seen that if the table has 1,000 rows, there will be 1,000 fetches, 1,000 PL/SQL-to-SQL switches, and eventually 1,000 network round trips.
So I use record groups fetching 100 records at a time, so there will be about 1 PL/SQL-to-SQL switch and roughly 10 network round trips (more or less, depending on the size of the records) for 1,000 records.
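The batching idea (fetch rows in groups rather than one at a time) can be sketched outside Forms; here is a Python/sqlite3 analogue using a made-up issue table, where `fetchmany(100)` plays the role of the 100-row record group fetch:

```python
import sqlite3

# A made-up issue table with 1,000 rows, standing in for Issue_Table.
conn = sqlite3.connect(":memory:")
conn.execute("create table issue_table (invoiceno integer, prod_code text)")
conn.executemany("insert into issue_table values (?, ?)",
                 [(i, "P%03d" % (i % 7)) for i in range(1000)])

cur = conn.execute("select invoiceno, prod_code from issue_table order by invoiceno")
batches = 0
rows_seen = 0
while True:
    batch = cur.fetchmany(100)   # 100 rows per fetch, like a 100-row record group
    if not batch:
        break
    batches += 1
    rows_seen += len(batch)
print(batches, rows_seen)   # 10 1000: ten bulk fetches instead of 1,000 single-row ones
```

Same row count either way; only the number of fetch calls (and, over a network, round trips) changes.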
I also have a show_status(msg) to see where the run is: 99.99% of the time it shows Loading data, and only a tiny fraction of the time (< 1%) does it show Processing...
I don't know whether you guys will laugh or make fun of what I am doing, doing it the complex way..
At first glance, I am doing a lot of work just to speed things up.
But I am happy with the outcome, as I think my hard work pays off when I see a report run fast. All my life I have tried to minimize network round trips; I don't want to select the same table twice in the same report, and whenever I can pre-load a table, I do it.
I am not an Oracle/DBA expert, but I learned my lessons through practical experience. If Oracle can do everything, what will I do????
Towhidun Nabi.
IT Executive
Aristopharma Ltd.
Dhaka
Bangladesh.
naeeym@hotmail.com
May 14, 2007 - 12:56 pm UTC
and as I said - there is nothing wrong with doing client side validation
you just have to do it in the database as well - and you seem to be saying you do that
so - you are all good.
load small tables to memory
Alex Callaghan, July 08, 2007 - 2:28 pm UTC
Hi
In your first response (almost 5 years ago), you said
"Did you know that databases were born to be queried? It is one of the things they do best."
Is that still true now days?
Look at what the Oracle Performance Tuning Guide suggests (both 9i and 10g):
If possible, avoid requerying the same data, by caching frequently accessed data in the client program or middle tier.
Oracle 10g Performance tuning Course (Oracle University):
Data blocks and header blocks can be accessed by only one process at a time. Several processes trying to access the same blocks will create contention.
I am pointing this out because I have had several experiences with the database approach where packages query small lookup tables (from as small as 10 rows to a couple hundred rows) all the time, in batch jobs and for online users (more than 5,000 connected users, 100 to 200 of them active). There were quite a few problems with hot blocks; when the tables were loaded into PL/SQL arrays, the problems went away.
Maybe I come from the C programming world, but I do think that if you can load things into memory (and not shared memory), it is a good way to improve things....
July 08, 2007 - 6:00 pm UTC
how about this:
JUST JOIN
why are you procedurally looking stuff up when you can JUST JOIN, JUST JOIN
databases were born to query
they were born to join
and they permit you to write to them as well (and are very good at doing that)
stop writing procedural code, start writing SQL.
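The "just join" point can be demonstrated in miniature with Python's sqlite3 (table names invented for the sketch): the procedural version issues one lookup query per row, while the join version asks the database once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table lookup (code integer primary key, descr text);
    create table fact (id integer primary key, code integer);
    insert into lookup values (1, 'one'), (2, 'two');
    insert into fact values (10, 1), (11, 2), (12, 1);
""")

# Procedural style: one lookup query per fact row (N extra round trips).
slow = []
for id_, code in list(conn.execute("select id, code from fact order by id")):
    descr = conn.execute("select descr from lookup where code = ?", (code,)).fetchone()[0]
    slow.append((id_, descr))

# Set style: one statement; the join does the lookup.
fast = list(conn.execute(
    "select f.id, l.descr from fact f join lookup l on l.code = f.code order by f.id"))

print(slow == fast)   # True: same result, one statement instead of N+1 queries
```

Both produce identical rows; the difference is that the join leaves the lookup work to the database, where it belongs.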
PL/SQL Table question
A reader, August 02, 2007 - 11:42 am UTC
I hope you can help. I am trying to load a PL/SQL table with some data and I want it to be ordered by one of the columns. My example is below that does not work.
create table pet_preference (sid varchar2(1), person_name varchar2(10), pet_type varchar2(10));
ALTER TABLE pet_preference
ADD (CONSTRAINT person_name_PK PRIMARY KEY (person_name));
insert into pet_preference values ('1', 'lisa', 'cat');
insert into pet_preference values ('2', 'ken', 'dog');
insert into pet_preference values ('3', 'jeff', 'fish');
insert into pet_preference values ('4', 'steve', 'bird');
commit;
set serveroutput on size 1000000
declare
TYPE pet_t IS TABLE OF pet_preference%ROWTYPE
INDEX BY pet_preference.person_name%type;
pet pet_t;
BEGIN
for pet_rec in (select * from pet_preference)
loop
pet (pet_rec.sid) := pet_rec;
end loop;
for indx1 in pet.first .. pet.last
loop
dbms_output.put_line (indx1||' '||pet(indx1).person_name|| ' '||pet(indx1).pet_type);
end loop;
end;
The output is:
1 lisa cat
2 ken dog
3 jeff fish
4 steve bird
but what I want is:
3 jeff fish
2 ken dog
1 lisa cat
4 steve bird
Can you please tell me what I am doing wrong? I have searched the site and Oracle documentation but can't seem to find anything specific to this issue. Any help would be greatly appreciated. Thank you in advance. This is a great site and you are a great help to many people!
August 05, 2007 - 12:35 pm UTC
umm, arrays are - arrays
I don't see the logic in having "jeff fish" assigned to slot number 3 - why not slot #1
use "select * from pet_preferences ORDER BY SOMETHING"
to retrieve the data sorted and then print it out in order.
a reader, August 05, 2007 - 6:46 pm UTC
The ORDER BY did not work as I tried it. I printed the output that I wanted. The first column is an id that gets loaded. The point is that I want them sorted by the name, not the id, which is what happens no matter what I try.
August 06, 2007 - 11:38 am UTC
they are in an array.
if you want them sorted in that array - you SELECT THEM OUT ORDERED AND PLACE THEM INTO THE ARRAY SORTED
Or, you SORT THE ARRAY.
Look at your code:
declare
TYPE pet_t IS TABLE OF pet_preference%ROWTYPE
INDEX BY pet_preference.person_name%type;
pet pet_t;
BEGIN
for pet_rec in (select * from pet_preference)
loop
pet (pet_rec.sid) := pet_rec;
end loop;
for indx1 in pet.first .. pet.last
loop
dbms_output.put_line (indx1||' '||pet(indx1).person_name|| ' '||pet(indx1).pet_type);
end loop;
end;
you retrieve the data in an UTTERLY RANDOM order - so just put an ORDER BY on that query and you are *done*
I'm not getting it.
The order by *does work*, but you'll get as output:
1 jeff fish
2 ken dog
3 lisa cat
4 steve bird
which is just dandy, since the 1,2,3,4 are array indices and rather *meaningless* (since you don't have an order by, the data is returned to you from the database in any order we feel like - not in the order of insertion or anything like that - so your assignment of 1,2,3,4 to the rows is just that - rather random)
Just use ORDER BY and you are *done*
If you claim "order by doesn't work" we'll need to see your attempt so we can correct it.
the first column you are printing IS NOT AN ID, it is an index into an array, randomly assigned to the data each time it is fetched; it is not persistent data, it is rather meaningless.
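The same point in a Python sketch: a slot number assigned at load time carries no ordering, so order the retrieval (the ORDER BY, simulated here with a sort) and then number the slots.

```python
# Rows as an unordered query might return them: arbitrary order.
rows = [("1", "lisa", "cat"), ("2", "ken", "dog"),
        ("3", "jeff", "fish"), ("4", "steve", "bird")]

# Keying a slot by id just freezes that arbitrary order; it sorts nothing.
by_id = {sid: (name, pet) for sid, name, pet in rows}

# To print by name, order the retrieval (the ORDER BY), then number the slots.
ordered = sorted(rows, key=lambda r: r[1])   # ORDER BY person_name
for slot, (sid, name, pet) in enumerate(ordered, 1):
    print(slot, name, pet)   # 1 jeff fish ... 4 steve bird
```

The slot numbers 1..4 are assigned after sorting, which is exactly what "select ... ORDER BY, then place into the array" achieves in PL/SQL.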
arrays of different sizes
Kevin, February 02, 2008 - 9:26 pm UTC
Nice thread, I keep referring to it. Recently this question came along and I don't have the in-depth knowledge to answer it. Here it is:
What are the differences to expect (memory use?) between these two code snippets, recognizing that the only difference between them is the size of the varchar2 item at the root of each associative array?
declare
type ta is table of varchar2(2000) index by binary_integer;
aa ta;
rc1 sys_refcursor;
begin
open rc1 for select object_name from dba_objects;
fetch rc1 bulk collect into aa;
close rc1;
end;
declare
type ta is table of varchar2(32000) index by binary_integer;
aa ta;
rc1 sys_refcursor;
begin
open rc1 for select object_name from dba_objects;
fetch rc1 bulk collect into aa;
close rc1;
end;
I am wondering if there is benefit in defining multiple arrays in a particular utility package I am working on, and then selecting the array that best fits the size of the data element I am working with.
OR
If one large array would be good for all purposes. The code is simpler to write with just one array definition, but not worth it if there are extreme memory requirements and/or performance changes when using arrays defined with wide string declarations.
Thanks, Kevin
February 04, 2008 - 3:47 pm UTC
large varchar2 sizes are dynamically allocated to "best fit"; they are not preallocated.
this shows they are just about the same:
ops$tkyte%ORA10GR2> connect /
Connected.
ops$tkyte%ORA10GR2> set echo on
ops$tkyte%ORA10GR2> declare
2 type ta is table of varchar2(2000) index by binary_integer;
3 aa ta;
4 rc1 sys_refcursor;
5 begin
6 open rc1 for select object_name from dba_objects;
7 fetch rc1 bulk collect into aa;
8 close rc1;
9 end;
10 /
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> @mystat pga
ops$tkyte%ORA10GR2> set echo off
NAME VALUE
---------------------- ----------
session pga memory 978516
session pga memory max 5762644
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> connect /
Connected.
ops$tkyte%ORA10GR2> set echo on
ops$tkyte%ORA10GR2> declare
2 type ta is table of varchar2(32000) index by binary_integer;
3 aa ta;
4 rc1 sys_refcursor;
5 begin
6 open rc1 for select object_name from dba_objects;
7 fetch rc1 bulk collect into aa;
8 close rc1;
9 end;
10 /
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> @mystat pga
ops$tkyte%ORA10GR2> set echo off
NAME VALUE
---------------------- ----------
session pga memory 978516
session pga memory max 5697108
Just what I was looking for
Kevin, February 05, 2008 - 7:36 am UTC
Sorry, I should have checked the memory allocations myself (doh!).
Kevin
Christian, February 14, 2008 - 2:07 pm UTC
I like to process everything in a loop in PL/SQL - is this a good programming habit?
Christian
February 15, 2008 - 7:44 am UTC
not if it could have been done in a single sql statement.
bad:
for x in (select * from t)
loop
x.c1 := lower(x.c1);
update t set c1 = x.c1 where pk = x.pk;
end loop;
good:
update t set c1 = lower(c1);
for a really simple example.
Never write procedural code
Until you have to
Hard-coded CASE vs SQL lookup
Sunj, March 21, 2008 - 7:00 pm UTC
Tom,
A very belated thanks (a year late) for the response on March 21, 2007.
Your Associative Array solution was definitely the best.
I guess it was faster than the SQL solution because of the lack of context switching between the SQL and PL/SQL engines. Am I right?
Here's my experience:
1. SELECT col1 FROM my_table;
2. SELECT col1, my_function(col1) FROM my_table;
3. Enclose SQL#2 within dynamic SQL.
#1 was obviously the fastest
#2 was slowest
#3, even though it is the same as #2, performed much faster, because it was running fully within the PL/SQL engine.
March 24, 2008 - 11:08 am UTC
I do not understand #3 at all
but I would add #4
select my_table.col1, my_lookup.col2 from my_table, my_lookup where <join>
as being the fastest of them all (as stated by me time and time again above :) )
Fetch values from a PL/SQL table using a column index instead of column names
Maverick, April 07, 2008 - 4:22 pm UTC
Tom, I have been trying to do something with pl/sql Tables and got stuck at one place.
Here is the scenario:
declare
Type mytype is table of emp%rowtype;
mytype1 mytype;
k integer:=0;
v_sql varchar2(4000):='Select * from emp ';
begin
execute immediate v_sql bulk collect into mytype1;
for i in 1..mytype1.count
loop
dbms_output.put_line(mytype1(i).empno);
end loop;
end;
My problem: while outputting the data using DBMS_OUTPUT, is there any way to reference the array using a column index instead of the actual column name,
like
dbms_output.put_line(mytype1(i).1stcolumn||','||mytype1(i).2ndcolumn);
?
Please don't ask me why I want to do it this way, but I am interested to see if there is a way to do it.
oh..btw, I am using Oracle 10g [10.2.0] if it helps.
Thanks
April 09, 2008 - 9:32 am UTC
Please don't ask me why I want to do it this way
fine, I won't - but that just means I won't be able to do anything useful for you. If I knew what your GOAL was, I might be able to suggest something useful.
But since you just want me to be an "answer my question and nothing but the question" guy - so be it.
The answer is "nope"
I was just comparing different languages
Maverick, April 09, 2008 - 9:59 am UTC
Tom, I was just trying to do something in PL/SQL with arrays that I had done using Perl. In Perl I can loop through a row, separate each column of the row, and manipulate it any way I want; I don't have to know what the actual columns are.
I was just wondering if we can do the same in PL/SQL.
I was trying to see the differences between PL/SQL and Perl and other programming languages.
So, I can still take the answer as no, correct?
April 09, 2008 - 2:54 pm UTC
correct, the answer is no, plsql is a static compiled language.
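For contrast, the positional access Maverick describes is trivial in a dynamic language. A Python/sqlite3 sketch (with a made-up emp table) that discovers column names at run time and indexes row columns by position:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table emp (empno integer, ename text, sal integer);
    insert into emp values (7369, 'SMITH', 800), (7499, 'ALLEN', 1600);
""")

cur = conn.execute("select * from emp order by empno")
cols = [d[0] for d in cur.description]   # column names discovered at run time
first_row = cur.fetchone()               # rows are plain tuples: index by position
print(cols[0], first_row[0], first_row[1])   # empno 7369 SMITH
```

This works because the cursor metadata is inspected at run time; a statically compiled record type, as in PL/SQL, has its field names fixed at compile time, which is exactly why the answer above is "no".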
RE: just comparing different languages
Duke Ganote, April 09, 2008 - 11:09 am UTC
I think Tom's asking "what is your goal" because PL/SQL is an auxiliary to SQL. Hence his mantra:
if you can do it in a single sql, do it.
if not do it in as little plsql as possible...
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:5408706816128#30690569723524 Just as there's an "impedance mismatch" between "object-oriented design" and "relational design", there's a similar mismatch between Perl (which seems to be procedural and string-oriented) and SQL (which is set- and database-oriented).
There are similar mismatches for folks who say "In SQL Server I use local temp tables a lot, so how do I use local temp tables in Oracle?" The answer is:
temp tables are a crutch, don't use local temp tables in Oracle, there are better Oracle constructs for achieving your goal; e.g.
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1164655862293
Maverick, April 09, 2008 - 1:00 pm UTC
Dear Duke Ganote, I think we are talking about two different things here. You are talking about the database or SQL (Structured Query Language), but I am talking about the programming language (PL/SQL), so we can compare PL/SQL with Perl or C or C#, etc. [I might be wrong there!!]
But I was just trying to see if there is a way to do this in PL/SQL, and Tom kind of mentioned we cannot [I guess].
Thanks
RE: compare Pl/SQL with PERL or C or C# etc
Duke Ganote, April 09, 2008 - 2:02 pm UTC
Maverick-- It just reminds me of one of my early software classes where I learned LISP, SNOBOL, SIMSCRIPT, and a mathematical symbol-manipulation package (it would manipulate differential equations) in one quarter. Each had its strengths and weaknesses, but they weren't usually comparable in functionality.
I did annoy my professor by using LISP (a list processing language used for artificial intelligence) to calculate some outrageously big Fibonacci number. LISP did it, but not very efficiently, and I think he was annoyed at how much CPU I'd chewed up overnight on the university's IBM 360 mainframe :)
Lookup Data From Associative Arrays
A reader, June 12, 2015 - 1:40 am UTC
Hi Tom,
I need to do a production data patch which could be done by a simple SQL statement, but this is a highly regulated environment (it can be audited by the FDA), and the simple update will not suffice, as we need to produce a listing of the case data with before and after values for each case impacted by the update. So the bottom line, unfortunately, is that we cannot use a plain update and have to use PL/SQL to loop through each case. To improve performance, I am caching the lookup data in an associative array and just looking the values up during the SQL bulk collect.
I am not giving the original code, but similar code that produces the same error.
DELETE COUNTRIES;
DROP TABLE COUNTRIES;
CREATE TABLE COUNTRIES
(
COUNTRY VARCHAR2(100) NOT NULL,
COUNTRY_ID NUMBER CONSTRAINT COUNTRIES_PK PRIMARY KEY
);
INSERT INTO COUNTRIES VALUES ('USA', 1);
INSERT INTO COUNTRIES VALUES ('United Kingdom', 2);
INSERT INTO COUNTRIES VALUES ('India', 3);
COMMIT;
DROP TABLE TOURIST;
CREATE TABLE TOURIST
(
FIRST_NAME VARCHAR2(100),
LAST_NAME VARCHAR2(100),
TOURIST_ID NUMBER CONSTRAINT TOURIST_PK PRIMARY KEY,
COUNTRY_VISIT1_ID NUMBER,
COUNTRY_VISIT2_ID NUMBER,
CONSTRAINT TOURIST_FK1
FOREIGN KEY(COUNTRY_VISIT1_ID)
REFERENCES COUNTRIES(COUNTRY_ID)
ON DELETE CASCADE,
CONSTRAINT TOURIST_FK2
FOREIGN KEY(COUNTRY_VISIT2_ID)
REFERENCES COUNTRIES(COUNTRY_ID)
ON DELETE CASCADE
);
INSERT INTO TOURIST VALUES ('ABC1', 'XYZ1', 1, 1,2);
INSERT INTO TOURIST VALUES ('ABC2', 'XYZ2', 2, 2,NULL);
INSERT INTO TOURIST VALUES ('ABC3', 'XYZ3', 3, NULL,NULL);
INSERT INTO TOURIST VALUES ('ABC4', 'XYZ4', 4, NULL,3);
COMMIT;
The above small DML/DDL script will set up the objects needed for the example. The PL/SQL anonymous block below is the code that was written.
DECLARE
TYPE TOURIST_TYPE_REC IS RECORD (
TNAME VARCHAR2(300),
COUNTRY_VISIT1 COUNTRIES.COUNTRY%TYPE,
COUNTRY_VISIT2 COUNTRIES.COUNTRY%TYPE
);
TYPE TOURIST_TYPE_TAB IS TABLE OF TOURIST_TYPE_REC;
TYPE COUNTRY_CACHE_TYPE IS RECORD(COUNTRY COUNTRIES.COUNTRY%TYPE);
TYPE COUNTRY_CACHE_TAB IS TABLE OF COUNTRY_CACHE_TYPE INDEX BY BINARY_INTEGER;
TOURIST_DATA TOURIST_TYPE_TAB;
COUNTRY_CACHE COUNTRY_CACHE_TAB;
L_COUNTRY_REC COUNTRY_CACHE_TYPE;
BEGIN
/* Cace the country details so that you dont have to look them up from table again*/
FOR ALL_COUNTRIES_REC IN (SELECT COUNTRY, COUNTRY_ID FROM COUNTRIES UNION SELECT 'Unknown', -1 FROM DUAL)
LOOP
L_COUNTRY_REC.COUNTRY := ALL_COUNTRIES_REC.COUNTRY;
COUNTRY_CACHE(ALL_COUNTRIES_REC.COUNTRY_ID) := L_COUNTRY_REC;
END LOOP;
SELECT LAST_NAME||', '||FIRST_NAME,
COUNTRY_CACHE(NVL(COUNTRY_VISIT1_ID,-1)).COUNTRY,
COUNTRY_CACHE(NVL(COUNTRY_VISIT2_ID,-1)).COUNTRY
BULK COLLECT INTO TOURIST_DATA
FROM TOURIST;
END;
The error coming from this program is as noted below.
Error starting at line : 1 in command -
DECLARE
TYPE TOURIST_TYPE_REC IS RECORD (
TNAME VARCHAR2(300),
COUNTRY_VISIT1 COUNTRIES.COUNTRY%TYPE,
COUNTRY_VISIT2 COUNTRIES.COUNTRY%TYPE
);
TYPE TOURIST_TYPE_TAB IS TABLE OF TOURIST_TYPE_REC;
TYPE COUNTRY_CACHE_TYPE IS RECORD(COUNTRY COUNTRIES.COUNTRY%TYPE);
TYPE COUNTRY_CACHE_TAB IS TABLE OF COUNTRY_CACHE_TYPE INDEX BY BINARY_INTEGER;
TOURIST_DATA TOURIST_TYPE_TAB;
COUNTRY_CACHE COUNTRY_CACHE_TAB;
L_COUNTRY_REC COUNTRY_CACHE_TYPE;
BEGIN
/* Cace the country details so that you dont have to look them up from table again*/
FOR ALL_COUNTRIES_REC IN (SELECT COUNTRY, COUNTRY_ID FROM COUNTRIES UNION SELECT 'Unknown', -1 FROM DUAL)
LOOP
L_COUNTRY_REC.COUNTRY := ALL_COUNTRIES_REC.COUNTRY;
COUNTRY_CACHE(ALL_COUNTRIES_REC.COUNTRY_ID) := L_COUNTRY_REC;
END LOOP;
SELECT LAST_NAME||', '||FIRST_NAME,
COUNTRY_CACHE(NVL(COUNTRY_VISIT1_ID,-1)).COUNTRY,
COUNTRY_CACHE(NVL(COUNTRY_VISIT2_ID,-1)).COUNTRY
BULK COLLECT INTO TOURIST_DATA
FROM TOURIST;
END;
Error report -
ORA-06550: line 25, column 30:
PLS-00201: identifier 'COUNTRY_VISIT2_ID' must be declared
ORA-06550: line 25, column 30:
PLS-00201: identifier 'COUNTRY_VISIT2_ID' must be declared
ORA-06550: line 25, column 12:
PL/SQL: ORA-00904: : invalid identifier
ORA-06550: line 23, column 5:
PL/SQL: SQL Statement ignored
06550. 00000 - "line %s, column %s:\n%s"
*Cause: Usually a PL/SQL compilation error.
*Action:
Can you please tell me what I am doing wrong? Could you also recommend any other modifications to the anonymous block (except for a straight update :). It's been very frustrating :( )? My idea was to minimize the context switches. As this is a data patch which will be executed once and once only, and during the deployment nobody will be logged into the database, I have approval to use the maximum memory :). This is the reason I did not LIMIT my bulk-collected rows.
Updated
Ravi, June 12, 2015 - 1:42 am UTC
The oracle version is Oracle 11GR2.
Thanks and Regards
Ravi