March/April 2007
I must make a new table based on the data of, for example, three tables. I think that I should use a single query, something like this:
create table t as select * from t1 left join t2 on t1.field1 = t2.field1
But my boss said that it is most effective to use a cursor like this:
for cursor in (select * from t1)
loop
  begin
    select * into record
      from t2
     where t2.field1 = cursor.field1;
    insert into t (...) values (cursor, record);
  exception
    when no_data_found then
      insert into t (...) values (cursor, nulls);
  end;
end loop;
Which is the correct approach?
Your boss has provided the single most inefficient approach to loading data. I have a name for this type of approach: I call it SLOW by SLOW processing. Avoid it. In fact, this procedural approach goes beyond normal SLOW by SLOW processing and takes it to a new level—it has introduced a "do-it-yourself nested loops join." I would seriously consider getting a new mentor for yourself; you'll be better for it long-term!
I have a pretty simple mantra when it comes to developing database software, and I have written this many times over the years:
You should do it in a single SQL statement if at all possible.
If you cannot do it in a single SQL statement, do it in PL/SQL.
If you cannot do it in PL/SQL, try a Java stored procedure.
If you cannot do it in Java, do it in a C external procedure.
If you cannot do it in a C external procedure, you might want to seriously think about why it is you need to do it.
If you can do it in a single SQL statement, by all means do it in a single SQL statement. Do not waste time, energy, and CPU cycles writing procedural code that will run slower than regular SQL.
This question comes up frequently on Ask Tom, and my answer has always been very consistent. If you don't have to write procedural code, don't. If you can erase many lines of procedural code and replace it with a single SQL statement, do so. Less code = fewer bugs; more code = more bugs. Remove bugs by utilizing SQL.
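Applied to the original question, the whole cursor loop collapses into one statement. The following is a sketch, not a drop-in answer: the third table t3, its join column, and the selected column aliases are assumptions, since the question only shows t1 and t2.

```sql
-- A single set-based statement replaces the row-by-row loop.
-- The LEFT JOINs reproduce the NO_DATA_FOUND handling for free:
-- t1 rows with no match in t2 or t3 simply get NULLs.
create table t
as
select t1.*,
       t2.field2 as t2_field2,           -- assumed column names
       t3.field2 as t3_field2
  from t1
  left join t2 on t1.field1 = t2.field1
  left join t3 on t1.field1 = t3.field1; -- assumed join column
```

The database can then hash or merge join the tables in bulk instead of issuing one lookup per row.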
Analytics Versus Aggregates
Are analytical functions a replacement for aggregate functions? Some of the developers here are hurrying to replace the GROUP BY aggregate functions with analytical functions, just because they learned something new. Are analytical functions always faster than aggregate functions?
I am a huge fan of analytic functions—I've often said that analytics are the greatest thing to happen to SQL since the SELECT keyword! However, it does not make any sense at all to replace aggregate functions with analytics—none at all—because
Aggregates take many rows and collapse them into one
Analytics do not collapse many rows into one—there is no row collapsing
Analytics and aggregates are orthogonal concepts, really. So, no, it is not sensible to replace aggregates with analytics. I don't even know how you would do it (short of using a DISTINCT after computing the analytic result for lots of rows, which would be a bad idea indeed). For example, to show the SUM of SAL by JOB from the EMP table, you would code
SQL> select job, sum(sal)
  2    from emp
  3   group by job
  4   order by job
  5  /

JOB          SUM(SAL)
---------- ----------
ANALYST          6000
CLERK            4150
MANAGER          8275
PRESIDENT        5000
SALESMAN         5600
Now, if you were to use analytics to accomplish that, you would have
SQL> select DISTINCT
  2         job,
  3         sum(sal) over (partition by job) sumsal
  4    from emp
  5   order by job
  6  /

JOB          SUM(SAL)
---------- ----------
ANALYST          6000
CLERK            4150
MANAGER          8275
PRESIDENT        5000
SALESMAN         5600
It gives the same answer, but at a price. If you compare the plans for each query, shown in Listing 1, you'll see that the analytic function performs
Code Listing 1: Plan performance of analytics and aggregation
SQL> select DISTINCT
  2         job,
  3         sum(sal) over (partition by job) sumsal
  4    from emp
  5   order by job
  6  /

Execution Plan
-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |   14 |   168 |     4  (25)| 00:00:01 |
|   1 |  SORT UNIQUE         |      |   14 |   168 |     3   (0)| 00:00:01 |
|   2 |   WINDOW SORT        |      |   14 |   168 |     4  (25)| 00:00:01 |
|   3 |    TABLE ACCESS FULL | EMP  |   14 |   168 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------

SQL> select job, sum(sal)
  2    from emp
  3   group by job
  4   order by job
  5  /

Execution Plan
-----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |    5 |    60 |     4  (25)| 00:00:01 |
|   1 |  SORT GROUP BY      |      |    5 |    60 |     4  (25)| 00:00:01 |
|   2 |   TABLE ACCESS FULL | EMP  |   14 |   168 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------
A full scan of EMP
A window sort—to partition the data by job, so it can add the SUM(SAL) to each and every row in the entire EMP table
A unique sort to remove the many duplicate rows, to return a single row per JOB
The SQL statement using aggregation, on the other hand, simply performs
A full scan of EMP
A SORT GROUP BY to add up the salaries by JOB and output the results
So please use aggregates to aggregate and do not use analytics with DISTINCT to accomplish aggregation!
Varying IN Lists
"How do I handle varying IN lists?" is a frequently asked question on asktom.oracle.com, so I've decided to write up the various ways to do this.
First the problem statement: You have a string, and it looks like
1, 2, 3, 4
or
'A', 'B', 'C'
or something similar. You would like to retrieve all rows from some table/query such that some column is in that string. That is, you would like to execute
SELECT * FROM t WHERE x IN (1,2,3,4)
or
SELECT * FROM t WHERE y IN ('A','B','C')
You would like to use bind variables (because you've heard through the grapevine that bind variables are good). However, using bind variables as shown does not seem to work for you:
SQL> variable txt varchar2(25)
SQL> exec :txt := ' ''SYS'', ''SYSTEM'' '

PL/SQL procedure successfully completed.

SQL> print txt

TXT
-----------------
'SYS', 'SYSTEM'

SQL> select *
  2    from all_users
  3   where username in (:txt);

no rows selected

SQL> select *
  2    from all_users
  3   where username in ('SYS','SYSTEM');

USERNAME      USER_ID CREATED
---------- ---------- ---------
SYS                 0 30-JUN-05
SYSTEM              5 30-JUN-05
When you bind the IN list, the query returns no data; when you hard-code the IN list, however, the query returns data. The reason should be clear: the previous example, using the bind variable, is equivalent to the following query with literals:
SQL> select *
  2    from all_users
  3   where username in ( ' ''SYS'', ''SYSTEM'' ' );

no rows selected
There is a single string, a single value in that IN list. What you need to do is turn that into a set. Here are some approaches you can take:
If you have a finite number of items in the IN list. By all means, just bind the individual elements. That is, suppose you let the user pick as many as 10 items in a pick list. I strongly encourage you to use the query
select * from all_users where username in ( :bv1, :bv2, :bv3, ... :bv10 );
And you would not use a single string; you would bind 10 inputs to this query (binding NULLs for any bind variables for which values are not set). This works well for small lists, but it would be quite tedious, obviously, for dozens or hundreds of items.
If you are in Oracle8i. You can use a function that returns a collection. You make it so that you can "query a string"—use the string as if it were a table. You'll need a collection type. I'll just use a table of VARCHAR2(4000) and a PL/SQL function that will parse a delimited string and return it as a collection, as shown in Listing 2.
Code Listing 2: Oracle8i function solution for varying IN lists
SQL> create or replace type str2tblType as table of varchar2(4000)
  2  /

Type created.

SQL> create or replace
  2  function str2tbl
  3  ( p_str   in varchar2,
  4    p_delim in varchar2 default ',' )
  5  return str2tblType
  6  as
  7      l_str   long default p_str || p_delim;
  8      l_n     number;
  9      l_data  str2tblType := str2tblType();
 10  begin
 11      loop
 12          l_n := instr( l_str, p_delim );
 13          exit when (nvl(l_n,0) = 0);
 14          l_data.extend;
 15          l_data(l_data.count) := ltrim(rtrim(substr(l_str,1,l_n-1)));
 16          l_str := substr( l_str, l_n+1 );
 17      end loop;
 18      return l_data;
 19  end;
 20  /

Function created.

SQL> column column_value format a10
SQL> exec :txt := 'SYS, SYSTEM'

PL/SQL procedure successfully completed.

SQL> select *
  2    from TABLE( cast( str2tbl(:txt) as str2TblType ) )
  3  /

COLUMN_VAL
----------
SYS
SYSTEM

SQL> select *
  2    from all_users
  3   where username in
  4         (select *
  5            from TABLE( cast( str2tbl(:txt) as str2TblType ) )
  6         )
  7  /

USERNAME      USER_ID CREATED
---------- ---------- ---------
SYS                 0 30-JUN-05
SYSTEM              5 30-JUN-05
If you are in Oracle9i Release 2 and above. You can skip the function altogether and just use DUAL to generate rows and parse the string. Consider
SQL> with data
  2  as
  3  (select level l
  4     from dual
  5  connect by level <= 5)
  6  select *
  7    from data;

      L
-------
      1
      2
      3
      4
      5
So, you can use DUAL to generate rows, and then—using SUBSTR/INSTR—you can effectively parse the bind variable and return the ith element from it. Listing 3 shows this process, followed by the complete process with the final SELECT statement with IN (SELECT * FROM data).
Code Listing 3: Oracle9i Release 2 and later solution for varying IN lists
SQL> exec :txt := 'SYS, SYSTEM'

PL/SQL procedure successfully completed.

SQL> with data
  2  as
  3  (
  4  select
  5    trim( substr (txt,
  6          instr (txt, ',', 1, level  ) + 1,
  7          instr (txt, ',', 1, level+1)
  8             - instr (txt, ',', 1, level) -1 ) )
  9      as token
 10    from (select ','||:txt||',' txt
 11            from dual)
 12  connect by level <=
 13      length(:txt)-length(replace(:txt,',',''))+1
 14  )
 15  select * from data;

TOKEN
----------------
SYS
SYSTEM

SQL> with data
  2  as
  3  (
  4  select
  5    trim( substr (txt,
  6          instr (txt, ',', 1, level  ) + 1,
  7          instr (txt, ',', 1, level+1)
  8             - instr (txt, ',', 1, level) -1 ) )
  9      as token
 10    from (select ','||:txt||',' txt
 11            from dual)
 12  connect by level <=
 13      length(:txt)-length(replace(:txt,',',''))+1
 14  )
 15  select *
 16    from all_users
 17   where username in (select * from data);

USERNAME      USER_ID CREATED
---------- ---------- ---------
SYSTEM              5 30-JUN-05
SYS                 0 30-JUN-05
Now, some people look at that "WITH DATA" bit and say, "That's too much; it's too hard to code that every time." You can use a view to hide the complexity here, use a stored procedure as a way to bind to the view (in effect, a parameterized view), and bind your query's IN list by calling MY_CTX_PROCEDURE—and the view does the rest, as shown in Listing 4.
Code Listing 4: View and procedure solution for varying IN lists
SQL> create or replace context my_ctx using my_ctx_procedure
  2  /

Context created.

SQL> create or replace
  2  procedure my_ctx_procedure
  3  ( p_str in varchar2 )
  4  as
  5  begin
  6      dbms_session.set_context
  7      ( 'my_ctx', 'txt', p_str );
  8  end;
  9  /

Procedure created.

SQL> create or replace view IN_LIST
  2  as
  3  select
  4    trim( substr (txt,
  5          instr (txt, ',', 1, level  ) + 1,
  6          instr (txt, ',', 1, level+1)
  7             - instr (txt, ',', 1, level) -1 ) )
  8      as token
  9    from (select ','||sys_context('my_ctx','txt')||',' txt
 10            from dual)
 11  connect by level <=
 12      length(sys_context('my_ctx','txt'))
 13        - length(replace(sys_context('my_ctx','txt'),',',''))+1
 14  /

View created.

SQL> exec my_ctx_procedure( :txt )

PL/SQL procedure successfully completed.

SQL> select *
  2    from all_users
  3   where username in
  4         (select * from IN_LIST);

USERNAME      USER_ID CREATED
---------- ---------- ---------
SYSTEM              5 30-JUN-05
SYS                 0 30-JUN-05

Automatic PGA Memory Allocation
Can Oracle Database allocate more PGA memory than the PGA_AGGREGATE_TARGET parameter value?
Absolutely. The PGA_AGGREGATE_TARGET init.ora parameter is not a hard limit; it is a soft limit—a goal. Oracle Database attempts to keep PGA memory allocations over all sessions under this target, but there are times when it cannot.
There are two types of PGA memory: "tunable" and "untunable." Oracle Database can control the allocation size of tunable PGA memory, which includes things such as the sort and hash areas. Untunable PGA memory is memory allocated by a session in ways Oracle Database cannot control, such as memory for PL/SQL table variables.
Suppose you have 100 sessions connected, and you set the PGA target to 100MB. Now suppose each of those 100 sessions, using dedicated server connections, runs some code that fills up a PL/SQL table variable with data, so the PL/SQL table consumes 5MB of memory. You now have at least 500MB of PGA memory allocated, and Oracle Database can't do anything to keep you within the 100MB limit. This is one way to exceed the PGA target.
Next, suppose you have 100 sessions connected, the PGA target is set to 10MB, and each of those 100 sessions opens 10 cursors that sort (so you have 1,000 open SQL statements). Even if Oracle Database used a tiny sort area of 64K, that would be 64,000K of memory—or 62.5MB. You would exceed your setting once again.
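You can watch how far actual allocation drifts from the target by querying the V$PGASTAT view. This is a sketch; it assumes you have SELECT privilege on the V$ views, and the exact statistic names can vary slightly by release.

```sql
-- Compare the PGA target with what is actually allocated right now.
-- "maximum PGA allocated" is the high-water mark since instance startup,
-- so it shows whether the target has ever been exceeded.
select name, round(value/1024/1024, 1) as mb
  from v$pgastat
 where name in ( 'aggregate PGA target parameter',
                 'total PGA allocated',
                 'maximum PGA allocated' );
```

If "maximum PGA allocated" is well above the target, the workload includes untunable allocations (or many concurrent workareas) of the kind described above.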
If you are interested in seeing demonstrations of this, my latest book, Expert Oracle Database Architecture: 9i and 10g Programming Techniques (Apress, 2005), shows some.
DISCLAIMER: We've captured these popular historical magazine articles for your reference. Links contained within these articles may no longer work. These articles are provided as-is, with no guarantee that the information within them is the best advice for current versions of Oracle Database.