December 30, 2019
Almost every program Oracle Database developers write includes PL/SQL as well as SQL statements. PL/SQL statements are run by the PL/SQL statement executor, and SQL statements are run by the SQL statement executor. When the PL/SQL runtime engine encounters a SQL statement, it stops and passes the SQL statement over to the SQL engine. The SQL engine executes the SQL statement and returns information back to the PL/SQL engine. This transfer of control is called a context switch, and each one of these switches incurs overhead that slows down the overall performance of your programs.
This article kicks off a multipart series that drills deep into the bulk-processing features provided in PL/SQL to minimize the overhead of context switches when executing SQL statements from PL/SQL blocks. The material in this article is drawn from a tutorial I created on LiveSQL, and the article contains two types of exercises:
Fill in the blanks. I provide a lot of the code, and you fill in the missing part(s). This way, you can concentrate on the bulk-processing-related syntax.
Write all the code. This is an option for the most dedicated readers! You write the entire solution to the challenge. Run it in your own database or on LiveSQL.
All examples and exercises are based on the traditional Human Resources schema, containing the employees and departments tables. This schema is available on GitHub (link at the end of the article). I encourage you to read this article (and even do the exercises here) and also check out the tutorial and take full advantage of LiveSQL, where you can easily and instantly execute all the code you write.
Now let’s look at a concrete example to explore context switches more thoroughly and identify the reasons why the bulk-processing features of PL/SQL—BULK COLLECT and FORALL—can have such a dramatic impact on performance. Suppose my manager asked me to write a PL/SQL procedure that accepts a department ID and a salary percentage increase and gives everyone in that department a specified-percentage raise. Taking advantage of PL/SQL’s elegant cursor FOR loop and the ability to call SQL statements natively in PL/SQL, I can implement this requirement easily:
CREATE OR REPLACE PROCEDURE increase_salary (
   department_id_in   IN employees.department_id%TYPE,
   increase_pct_in    IN NUMBER)
IS
BEGIN
   FOR employee_rec
      IN (SELECT employee_id
            FROM employees
           WHERE department_id = increase_salary.department_id_in)
   LOOP
      UPDATE employees emp
         SET emp.salary =
                  emp.salary
                + emp.salary * increase_salary.increase_pct_in
       WHERE emp.employee_id = employee_rec.employee_id;

      DBMS_OUTPUT.PUT_LINE ('Updated ' || SQL%ROWCOUNT);
   END LOOP;
END increase_salary;
Suppose there are 10,000 employees in department 50. When I execute this block...
BEGIN
   increase_salary (50, .10);

   ROLLBACK;   -- to leave the table in its original state
END;
...the PL/SQL engine will “switch” over to the SQL engine 10,000 times, once for each row being updated. Tom Kyte, of Ask TOM fame, refers to row-by-row switching like this as “slow-by-slow processing,” and it is definitely something to be avoided.
Generally, the way to improve performance over row-by-row context switching is to not perform row-by-row data manipulation language (DML) operations. This can be accomplished in one of two ways:
Implement the functionality in “pure” SQL—no PL/SQL loop
Use the bulk-processing features of PL/SQL
If you can change your implementation to avoid a loop and instead simply execute a single DML statement, you should do so. For example, I can do this with the increase_salary procedure:
CREATE OR REPLACE PROCEDURE increase_salary (
   department_id_in   IN employees.department_id%TYPE,
   increase_pct_in    IN NUMBER)
IS
BEGIN
   UPDATE employees emp
      SET emp.salary =
               emp.salary
             + emp.salary * increase_salary.increase_pct_in
    WHERE emp.department_id = increase_salary.department_id_in;
END increase_salary;
Of course, it is not always this easy. You might be doing some very complex processing of each row before doing the INSERT, UPDATE, or DELETE that would be hard to do in SQL. You might need to do error management that’s more nuanced than “all or nothing” SQL will allow. Or you might simply not have sufficient knowledge of SQL to do what’s needed.
LiveSQL offers 24/7 free access to the latest version of Oracle Database; a SQL “scratchpad” where you can write SQL and PL/SQL; and a library of hundreds of scripts and tutorials.
In an ideal world, you would stop programming and take an advanced SQL class (the Oracle Dev Gym offers a free one on SQL analytic functions). In the real world, however, you need to get the program up and running ASAP.
Whatever your situation, the bulk-processing features of PL/SQL offer a straightforward solution—although there will be a lot to consider as you implement your conversion to BULK COLLECT and FORALL.
Let’s first take a look at BULK COLLECT, which improves the performance of multirow querying and is relatively simple. In my next article, I’ll move on to FORALL, which is used to execute the same nonquery DML statement repeatedly, with different bind variables. That feature has a lot more moving parts and issues you need to take into account (which should come as no surprise, because you are changing data, not simply querying it).
To take advantage of bulk processing for queries, you simply put BULK COLLECT before the INTO keyword of your fetch operation and then provide one or more collections after the INTO keyword.
Here are some things to know about how BULK COLLECT works:
It can be used with all three types of collections: associative arrays, nested tables, and VARRAYs.
You can fetch into individual collections (one for each expression in the SELECT list) or a single collection of records (a sketch of fetching into individual collections appears after the list of forms below).
The collection is always populated densely, starting from index value 1.
If no rows are fetched, the collection will be emptied of all elements.
You can use BULK COLLECT in all of these forms:
SELECT column(s) BULK COLLECT INTO collection(s)
FETCH cursor BULK COLLECT INTO collection(s)
EXECUTE IMMEDIATE query_string BULK COLLECT INTO collection(s)
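As noted above, instead of a single collection of records, you can fetch into individual collections, one for each expression in the SELECT list. Here is a minimal sketch of that approach (the department ID and the columns chosen are just for illustration):

DECLARE
   TYPE ids_t IS TABLE OF employees.employee_id%TYPE;
   TYPE names_t IS TABLE OF employees.last_name%TYPE;

   l_ids     ids_t;
   l_names   names_t;
BEGIN
   -- One collection per expression in the SELECT list,
   -- listed in the same order after BULK COLLECT INTO.
   SELECT employee_id, last_name
     BULK COLLECT INTO l_ids, l_names
     FROM employees
    WHERE department_id = 50;

   -- Both collections are filled densely from index 1;
   -- if no rows are found, COUNT is simply 0 (no NO_DATA_FOUND).
   DBMS_OUTPUT.PUT_LINE (l_ids.COUNT || ' rows fetched');
END;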
Here’s a block of code that fetches all the rows for department 50 in the employees table with a single context switch and loads the data into a collection of records that are based on the table.
DECLARE
   TYPE employee_info_t IS TABLE OF employees%ROWTYPE;

   l_employees   employee_info_t;
BEGIN
   SELECT *
     BULK COLLECT INTO l_employees
     FROM employees
    WHERE department_id = 50;

   DBMS_OUTPUT.PUT_LINE (l_employees.COUNT);
END;
If you do not want to retrieve all the columns in a table, create your own user-defined record type and use that to define your collection. All you have to do is make sure the list of expressions in the SELECT statement matches the record type’s fields.
DECLARE
   TYPE two_cols_rt IS RECORD
   (
      employee_id   employees.employee_id%TYPE,
      salary        employees.salary%TYPE
   );

   TYPE employee_info_t IS TABLE OF two_cols_rt;

   l_employees   employee_info_t;
BEGIN
   SELECT employee_id, salary
     BULK COLLECT INTO l_employees
     FROM employees
    WHERE department_id = 50;

   DBMS_OUTPUT.PUT_LINE (l_employees.COUNT);
END;
You can avoid the nuisance of declaring a record type to serve as the type for the collection by using a “template cursor.” This cursor should have the same SELECT list as the BULK COLLECT query. You can, however, leave off the WHERE clause and anything else after the FROM clause, because that code will never be used for anything but a %ROWTYPE declaration. Here’s an example:
DECLARE
   CURSOR employee_info_c
   IS
      SELECT employee_id, salary FROM employees;

   TYPE employee_info_t IS TABLE OF employee_info_c%ROWTYPE;

   l_employees   employee_info_t;
BEGIN
   SELECT employee_id, salary
     BULK COLLECT INTO l_employees
     FROM employees
    WHERE department_id = 10;
END;
In the block below, replace the #FINISH# tags with code so that the last names of all employees in department 50 are displayed.
DECLARE
   #FINISH#

   l_names   names_t;
BEGIN
   SELECT last_name
     #FINISH#
     FROM employees
    WHERE department_id = 50
    ORDER BY last_name;
END;
Write a stored procedure that accepts a department ID, uses BULK COLLECT to retrieve all employees in that department, and displays their first names and salaries. Then write an anonymous block to run that procedure for department 100.
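If you want to check your work, here is one possible solution sketch. The procedure name show_dept_employees is my own; the original exercise does not specify one.

CREATE OR REPLACE PROCEDURE show_dept_employees (
   department_id_in IN employees.department_id%TYPE)
IS
   TYPE emp_info_t IS TABLE OF employees%ROWTYPE;

   l_employees   emp_info_t;
BEGIN
   -- Fetch all rows for the department with a single context switch.
   SELECT *
     BULK COLLECT INTO l_employees
     FROM employees
    WHERE department_id = department_id_in;

   FOR indx IN 1 .. l_employees.COUNT
   LOOP
      DBMS_OUTPUT.PUT_LINE (
         l_employees (indx).first_name || ': ' || l_employees (indx).salary);
   END LOOP;
END show_dept_employees;
/

BEGIN
   show_dept_employees (100);
END;
/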
As with almost all other types of variables and constants you use in your code, collections consume Program Global Area (PGA) memory. If your collection gets too large, your users might encounter an error, such as ORA-04030 (out of process memory). To see this happen, run the code below. (Note: varchar2a is a collection type of strings defined in the DBMS_SQL package.)
DECLARE
   l_strings   DBMS_SQL.varchar2a;
BEGIN
   FOR indx IN 1 .. 2 ** 31 - 1
   LOOP
      l_strings (indx) := RPAD ('abc', 32767, 'def');
   END LOOP;
END;
When using BULK COLLECT, you could attempt to retrieve too many rows in one context switch and run out of PGA memory. To help you avoid such errors, Oracle Database offers a LIMIT clause for BULK COLLECT. Indeed, when using BULK COLLECT, it is recommended that you never, or at least rarely, use an “unlimited” BULK COLLECT, which is what you get with a SELECT BULK COLLECT INTO (an implicit query)—and what you saw in the examples above.
Instead, declare a cursor (or a cursor variable); open that cursor; and then, in a loop, retrieve up to N rows with each fetch.
In the block below, I set my fetch limit to just 10 rows to demonstrate how this feature works. You will likely never want to set the limit to less than 100—this topic is explored further below.
DECLARE
   c_limit   PLS_INTEGER := 10;

   CURSOR employees_cur
   IS
      SELECT employee_id
        FROM employees
       WHERE department_id = 50;

   TYPE employee_ids_t IS TABLE OF employees.employee_id%TYPE;

   l_employee_ids   employee_ids_t;
BEGIN
   OPEN employees_cur;

   LOOP
      FETCH employees_cur
         BULK COLLECT INTO l_employee_ids
         LIMIT c_limit;

      DBMS_OUTPUT.PUT_LINE (l_employee_ids.COUNT || ' fetched');

      EXIT WHEN l_employee_ids.COUNT = 0;
   END LOOP;
END;
One thing to watch out for when switching to LIMIT with BULK COLLECT (in a loop) is continuing to follow the loop-termination pattern you used for single-row fetching. I demonstrate this issue below, but first, a reminder: there are 107 rows in the employees table.
SELECT COUNT(*) FROM employees
Here’s the common way to terminate a loop in which you fetch row-by-row from an explicit cursor:
DECLARE
   CURSOR emps_c IS SELECT * FROM employees;

   l_emp     emps_c%ROWTYPE;
   l_count   INTEGER := 0;
BEGIN
   OPEN emps_c;

   LOOP
      FETCH emps_c INTO l_emp;

      EXIT WHEN emps_c%NOTFOUND;

      DBMS_OUTPUT.put_line (l_emp.employee_id);
      l_count := l_count + 1;
   END LOOP;

   DBMS_OUTPUT.put_line ('Total rows fetched: ' || l_count);
END;
In other words: fetch a row, and stop if the cursor has retrieved all rows. Now let’s switch to using BULK COLLECT and LIMIT, fetching 10 rows at a time, using the same approach to exiting the loop.
DECLARE
   CURSOR emps_c IS SELECT * FROM employees;

   TYPE emps_t IS TABLE OF emps_c%ROWTYPE;

   l_emps    emps_t;
   l_count   INTEGER := 0;
BEGIN
   OPEN emps_c;

   LOOP
      FETCH emps_c BULK COLLECT INTO l_emps LIMIT 10;

      EXIT WHEN emps_c%NOTFOUND;

      DBMS_OUTPUT.put_line (l_emps.COUNT);
      l_count := l_count + l_emps.COUNT;
   END LOOP;

   DBMS_OUTPUT.put_line ('Total rows fetched: ' || l_count);
END;
Wait, what? Is that right? Do I see Total rows fetched: 100? Yes, I do. And therein lies the trap. You cannot continue to use the same EXIT WHEN statement in the same place in your loop when you switch to BULK COLLECT with LIMIT.
The very last fetch performed retrieved the last seven rows but also exhausted the cursor. So %NOTFOUND returns TRUE, but the collection has those seven elements in it.
To terminate a loop that uses BULK COLLECT with LIMIT, you should either move the EXIT WHEN statement to the bottom of the loop body or change the EXIT WHEN condition to check whether the collection is empty (EXIT WHEN l_emps.COUNT = 0). These two approaches are shown below.
DECLARE
   CURSOR emps_c IS SELECT * FROM employees;

   TYPE emps_t IS TABLE OF emps_c%ROWTYPE;

   l_emps    emps_t;
   l_count   INTEGER := 0;
BEGIN
   OPEN emps_c;

   LOOP
      FETCH emps_c BULK COLLECT INTO l_emps LIMIT 10;

      l_count := l_count + l_emps.COUNT;

      EXIT WHEN emps_c%NOTFOUND;
   END LOOP;

   DBMS_OUTPUT.put_line ('Total rows fetched: ' || l_count);
END;

DECLARE
   CURSOR emps_c IS SELECT * FROM employees;

   TYPE emps_t IS TABLE OF emps_c%ROWTYPE;

   l_emps    emps_t;
   l_count   INTEGER := 0;
BEGIN
   OPEN emps_c;

   LOOP
      FETCH emps_c BULK COLLECT INTO l_emps LIMIT 10;

      EXIT WHEN l_emps.COUNT = 0;

      l_count := l_count + l_emps.COUNT;
   END LOOP;

   DBMS_OUTPUT.put_line ('Total rows fetched: ' || l_count);

   CLOSE emps_c;
END;
The employees table has 107 rows. In the block below, replace the #FINISH# tags with code so that when the block is executed, the following text is displayed:
Rows fetched: 25
Rows fetched: 25
Rows fetched: 25
Rows fetched: 25
Rows fetched: 7
Rows fetched: 0

DECLARE
   CURSOR ids_c IS SELECT employee_id FROM employees;

   #FINISH#

   l_count   INTEGER;
BEGIN
   OPEN ids_c;

   LOOP
      FETCH ids_c #FINISH#;

      DBMS_OUTPUT.put_line ('Rows fetched: ' || l_count);

      EXIT WHEN #FINISH#;
   END LOOP;

   CLOSE ids_c;
END;
Write an anonymous block that (using BULK COLLECT) fetches only the last name and the salary from the employees table, 5 rows at a time, and then displays that information. Make sure 107 names and salaries are shown!
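Here is one way you might approach it, a sketch rather than the only correct answer:

DECLARE
   CURSOR emps_c IS
      SELECT last_name, salary FROM employees;

   TYPE emps_t IS TABLE OF emps_c%ROWTYPE;

   l_emps   emps_t;
BEGIN
   OPEN emps_c;

   LOOP
      FETCH emps_c BULK COLLECT INTO l_emps LIMIT 5;

      -- Exit only when nothing was fetched, so the final,
      -- partially filled collection is not skipped.
      EXIT WHEN l_emps.COUNT = 0;

      FOR indx IN 1 .. l_emps.COUNT
      LOOP
         DBMS_OUTPUT.PUT_LINE (
            l_emps (indx).last_name || ': ' || l_emps (indx).salary);
      END LOOP;
   END LOOP;

   CLOSE emps_c;
END;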
When should you convert a nonbulk query to one using BULK COLLECT? More specifically, should you convert a cursor FOR loop to an explicit cursor and a FETCH BULK COLLECT with LIMIT? Here are some things to keep in mind:
As long as your PL/SQL optimization level is set to 2 (the default) or higher, the compiler will automatically optimize cursor FOR loops to retrieve 100 rows with each fetch. You cannot modify this number.
If your cursor FOR loop is “read only” (it does not execute nonquery DML), you can probably leave it as is. That is, fetching 100 rows with each fetch will usually give you sufficient improvements in performance over row-by-row fetching.
Only cursor FOR loops are optimized this way, so if you have a simple or WHILE loop that fetches individual rows, you should convert to BULK COLLECT—with LIMIT!
If you are fetching a very large number of rows, such as might happen with data warehouse processing or a nightly batch process, you should experiment with larger LIMIT values to see what kind of bang for the buck you will get (a sketch for timing different LIMIT values appears after the test procedure below).
If your cursor FOR loop (or any other kind of loop, for that matter) contains one or more nonquery DML statements (INSERT, UPDATE, DELETE, MERGE), you should convert to BULK COLLECT and FORALL.
Run the following code to see how optimization affects cursor FOR loop performance.
CREATE OR REPLACE PROCEDURE test_cursor_performance (approach IN VARCHAR2)
IS
   CURSOR cur IS
      SELECT * FROM all_source WHERE ROWNUM < 100001;

   one_row   cur%ROWTYPE;

   TYPE t IS TABLE OF cur%ROWTYPE INDEX BY PLS_INTEGER;

   many_rows     t;
   last_timing   NUMBER;
   cntr          NUMBER := 0;

   PROCEDURE start_timer
   IS
   BEGIN
      last_timing := DBMS_UTILITY.get_cpu_time;
   END;

   PROCEDURE show_elapsed_time (message_in IN VARCHAR2 := NULL)
   IS
   BEGIN
      DBMS_OUTPUT.put_line (
            '"'
         || message_in
         || '" completed in: '
         || TO_CHAR (
               ROUND ((DBMS_UTILITY.get_cpu_time - last_timing) / 100, 2)));
   END;
BEGIN
   start_timer;

   CASE approach
      WHEN 'implicit cursor for loop'
      THEN
         FOR j IN cur
         LOOP
            cntr := cntr + 1;
         END LOOP;

         DBMS_OUTPUT.put_line (cntr);
      WHEN 'explicit open, fetch, close'
      THEN
         OPEN cur;

         LOOP
            FETCH cur INTO one_row;

            EXIT WHEN cur%NOTFOUND;
            cntr := cntr + 1;
         END LOOP;

         DBMS_OUTPUT.put_line (cntr);
         CLOSE cur;
      WHEN 'bulk fetch'
      THEN
         OPEN cur;

         LOOP
            FETCH cur BULK COLLECT INTO many_rows LIMIT 100;

            EXIT WHEN many_rows.COUNT () = 0;

            FOR indx IN 1 .. many_rows.COUNT
            LOOP
               cntr := cntr + 1;
            END LOOP;
         END LOOP;

         DBMS_OUTPUT.put_line (cntr);
         CLOSE cur;
   END CASE;

   show_elapsed_time (approach);
END test_cursor_performance;
/

/* Try different approaches with optimization disabled. */

ALTER PROCEDURE test_cursor_performance
   COMPILE plsql_optimize_level=0
/

BEGIN
   DBMS_OUTPUT.put_line ('No optimization...');
   test_cursor_performance ('implicit cursor for loop');
   test_cursor_performance ('explicit open, fetch, close');
   test_cursor_performance ('bulk fetch');
END;
/

/* Try different approaches with default optimization. */

ALTER PROCEDURE test_cursor_performance
   COMPILE plsql_optimize_level=2
/

BEGIN
   DBMS_OUTPUT.put_line ('Default optimization...');
   test_cursor_performance ('implicit cursor for loop');
   test_cursor_performance ('explicit open, fetch, close');
   test_cursor_performance ('bulk fetch');
END;
/
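To experiment with different LIMIT values yourself, you might use a small timing block like the following sketch. The nested time_limit procedure and the specific limit values tried are my own additions, not part of the original tutorial; the right value for your program depends on row size and available PGA memory.

DECLARE
   TYPE src_t IS TABLE OF all_source%ROWTYPE;

   l_rows    src_t;
   l_start   NUMBER;

   PROCEDURE time_limit (limit_in IN PLS_INTEGER)
   IS
      CURSOR cur IS SELECT * FROM all_source WHERE ROWNUM < 100001;
   BEGIN
      l_start := DBMS_UTILITY.get_cpu_time;

      OPEN cur;

      LOOP
         FETCH cur BULK COLLECT INTO l_rows LIMIT limit_in;

         EXIT WHEN l_rows.COUNT = 0;
      END LOOP;

      CLOSE cur;

      DBMS_OUTPUT.put_line (
            'LIMIT '
         || limit_in
         || ': '
         || ROUND ((DBMS_UTILITY.get_cpu_time - l_start) / 100, 2)
         || ' seconds');
   END;
BEGIN
   -- Larger limits mean fewer context switches but more PGA consumed.
   time_limit (100);
   time_limit (1000);
   time_limit (10000);
END;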
This exercise has two parts (and for it, assume that the employees table has one million rows, with data distributed equally among departments): (1) Write an anonymous block that contains a cursor FOR loop that does not need to be converted to using BULK COLLECT, and (2) write an anonymous block that contains a cursor FOR loop that does need to use BULK COLLECT (assume that it cannot be rewritten in “pure” SQL).
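For reference, here is one possible pair of answers, sketched rather than definitive. The first loop only reads and displays data, so the automatic 100-row optimization is good enough; the second issues an UPDATE on every iteration, which is exactly the row-by-row DML that BULK COLLECT (plus FORALL, covered in the next article) is meant to eliminate. The department ID is arbitrary.

-- (1) Read-only cursor FOR loop: probably fine as is.
BEGIN
   FOR emp_rec IN (SELECT last_name, salary
                     FROM employees
                    WHERE department_id = 50)
   LOOP
      DBMS_OUTPUT.PUT_LINE (emp_rec.last_name || ': ' || emp_rec.salary);
   END LOOP;
END;
/

-- (2) DML inside the loop: a candidate for conversion to
--     BULK COLLECT and, in the next article, FORALL.
BEGIN
   FOR emp_rec IN (SELECT employee_id, salary
                     FROM employees
                    WHERE department_id = 50)
   LOOP
      UPDATE employees
         SET salary = salary * 1.1
       WHERE employee_id = emp_rec.employee_id;
   END LOOP;

   ROLLBACK;   -- keep the data unchanged for later exercises
END;
/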
The RETURNING clause is a wonderful thing. If you are inserting, updating, or deleting data and you need to get some information (such as the primary key of the newly inserted row) back after the statement completes, RETURNING is the thing for you! Here’s an example:
CREATE TABLE t (
   id   NUMBER GENERATED ALWAYS AS IDENTITY,
   n    NUMBER)
/

DECLARE
   l_id   t.id%TYPE;
BEGIN
   INSERT INTO t (n)
        VALUES (100)
     RETURNING id INTO l_id;

   DBMS_OUTPUT.PUT_LINE (l_id);
END;
/
Suppose, however, that I am changing more than one row. Can I use RETURNING then? Let’s see...
DECLARE
   l_id   employees.employee_id%TYPE;
BEGIN
   UPDATE employees
      SET last_name = UPPER (last_name)
   RETURNING employee_id INTO l_id;

   ROLLBACK;
END;
/

ORA-01422: exact fetch returns more than requested number of rows
Oh no!
But wait, that’s the sort of error you can get with a SELECT-INTO that returns more than one row. Why is it showing up here?
Because the RETURNING clause is essentially translated into a SELECT-INTO, it gets one value and stuffs it into l_id. But in this case, the UPDATE statement is returning many IDs. How do you get this to work?
BULK COLLECT to the rescue! I need to take multiple values and put them into something. What could that be? How about a collection? So, yes, if you are changing one or more rows, change INTO to BULK COLLECT INTO and provide a collection to hold the values.
DECLARE
   TYPE ids_t IS TABLE OF employees.employee_id%TYPE;

   l_ids   ids_t;
BEGIN
   UPDATE employees
      SET last_name = UPPER (last_name)
    WHERE department_id = 50
   RETURNING employee_id
        BULK COLLECT INTO l_ids;

   FOR indx IN 1 .. l_ids.COUNT
   LOOP
      DBMS_OUTPUT.PUT_LINE (l_ids (indx));
   END LOOP;

   ROLLBACK;
END;
/
Now RETURNING works like a charm. Thanks, BULK COLLECT!
In the block below, replace the #FINISH# tag with code so that Deleted = 3 is displayed after execution.
DECLARE
   TYPE ids_t IS TABLE OF employees.employee_id%TYPE;

   l_ids   ids_t;
BEGIN
   DELETE FROM employees
         WHERE salary > 15000
   #FINISH#

   DBMS_OUTPUT.PUT_LINE ('Deleted = ' || l_ids.COUNT);

   ROLLBACK;
END;
/
Write an anonymous block that deletes all the rows in the employees table for department 50 and returns the employee IDs and last names of the deleted rows. Then display those values, using DBMS_OUTPUT.PUT_LINE. Finally, you might want to roll back. That will make it easier to test your code—and continue with the tutorial.
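Here is one possible solution sketch. RETURNING can send back values from more than one column; just provide one collection per expression in the RETURNING list. (Depending on which copy of the schema you have installed, foreign key constraints on other tables may block the delete.)

DECLARE
   TYPE ids_t IS TABLE OF employees.employee_id%TYPE;
   TYPE names_t IS TABLE OF employees.last_name%TYPE;

   l_ids     ids_t;
   l_names   names_t;
BEGIN
   DELETE FROM employees
         WHERE department_id = 50
   RETURNING employee_id, last_name
        BULK COLLECT INTO l_ids, l_names;

   FOR indx IN 1 .. l_ids.COUNT
   LOOP
      DBMS_OUTPUT.PUT_LINE (l_ids (indx) || ': ' || l_names (indx));
   END LOOP;

   -- Roll back so the data is still there for the rest of the tutorial.
   ROLLBACK;
END;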
You can also use BULK COLLECT with native dynamic SQL queries that might return more than one row. As with SELECT-INTO and FETCH-INTO, just stick BULK COLLECT before the INTO and provide a collection (or multiple collections) to hold the multiple values returned.
DECLARE
   TYPE ids_t IS TABLE OF employees.employee_id%TYPE;

   l_ids   ids_t;
BEGIN
   EXECUTE IMMEDIATE
      'SELECT employee_id FROM employees WHERE department_id = :dept_id'
      BULK COLLECT INTO l_ids
      USING 50;

   FOR indx IN 1 .. l_ids.COUNT
   LOOP
      DBMS_OUTPUT.put_line (l_ids (indx));
   END LOOP;
END;
You can even get fancy and use BULK COLLECT in the RETURNING clause of a dynamic DML statement:
DECLARE
   TYPE ids_t IS TABLE OF employees.employee_id%TYPE;

   l_ids   ids_t;
BEGIN
   EXECUTE IMMEDIATE
      'UPDATE employees SET last_name = UPPER (last_name)
        WHERE department_id = 100
       RETURNING employee_id INTO :ids'
      RETURNING BULK COLLECT INTO l_ids;

   FOR indx IN 1 .. l_ids.COUNT
   LOOP
      DBMS_OUTPUT.PUT_LINE (l_ids (indx));
   END LOOP;
END;
Write the rest of the procedure whose signature is shown below. Use BULK COLLECT to fetch all the last names from employees identified by that WHERE clause and return the collection. Then write an anonymous block to test your procedure: pass different WHERE clauses and display the names retrieved.
PROCEDURE get_names (
   where_in    IN     VARCHAR2,
   names_out      OUT DBMS_SQL.VARCHAR2_TABLE)
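Here is one possible implementation sketch, using native dynamic SQL to append the caller's WHERE clause. It is only a sketch: a real program would need to guard against SQL injection, which this code does not address.

CREATE OR REPLACE PROCEDURE get_names (
   where_in    IN     VARCHAR2,
   names_out      OUT DBMS_SQL.VARCHAR2_TABLE)
IS
BEGIN
   -- Append the caller-supplied WHERE clause (if any) to the query,
   -- then fetch every matching last name in one pass.
   EXECUTE IMMEDIATE
         'SELECT last_name FROM employees '
      || CASE WHEN where_in IS NOT NULL THEN 'WHERE ' || where_in END
      BULK COLLECT INTO names_out;
END get_names;
/

DECLARE
   l_names   DBMS_SQL.VARCHAR2_TABLE;
BEGIN
   get_names ('department_id = 100', l_names);

   FOR indx IN 1 .. l_names.COUNT
   LOOP
      DBMS_OUTPUT.PUT_LINE (l_names (indx));
   END LOOP;
END;
/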
So now you know: You can use BULK COLLECT to improve the performance of multirow queries, returning multiple rows with each fetch. Unfortunately, this feature is probably not enough to turn you into a hero. That’s because when it comes to row-by-row processing, the real performance pain comes not from fetching data but, rather, from changing data through inserts, updates, and deletes.
Consequently, the next article moves on to cover the FORALL statement, which dramatically reduces the number of context switches needed to execute the same nonquery DML statement multiple times (within a loop of some sort).
Changing data is more complicated in several ways than querying data, so I’ll cover FORALL in multiple articles.
In the meantime, check out the tutorial.
TAKE the “Bulk Processing with PL/SQL” tutorial.
READ more about BULK COLLECT.
WORK OUT at the Oracle Dev Gym.
INSTALL the Human Resources schema.
Illustration by Wes Rowell