Database, SQL and PL/SQL

A Perfect Plan

Learn how adaptive plans in Oracle Database 12c compute the perfect plan—even when statistics are imperfect.

By Arup Nanda

November/December 2014

Tom, the lead DBA at Acme Bank, has more than his usual share of irate visitors today. Dave, the application DBA, typically collects database table statistics once a day during off-peak hours. Acme’s application loads a lot of data into tables and truncates them throughout the day, or—in other words—the data patterns in the tables change dramatically. However, because the statistics are collected only once a day, the database’s query optimizer knows about the table data patterns only when statistics are collected, and those data patterns may be very different from the current data patterns in the tables. Therefore, the query execution plans the optimizer comes up with may not be the most efficient and the query performance often drops as a result. Victoria, the lead developer, is frustrated that query performance is dropping without warning, and she demands to know why the DBAs are not doing anything to prevent it. In his defense, Dave—equally upset that his competence is being questioned—says he can’t just continue to collect statistics throughout the day, because that will definitely have a negative impact on the overall database performance. The visitors all look to Tom for a solution.

The solution, Tom announces, is to use adaptive plans in Oracle Database 12c.

Change in Plans

To explain the adaptive plan concept, Tom points to GPS devices found in cars. “You punch in the destination address on your GPS device, and it plots out a route you can take to reach your destination fast,” he describes. “But when you enter the highway, you notice a huge traffic backup due to an accident ahead. Instead of the expected 55 miles per hour, your actual speed is about 5. The GPS device, sensing that your actual speed is much less than expected, draws a conclusion about traffic congestion and suggests a modified route to the destination. Sound familiar?”

Yes, it does, the group agrees. It’s exactly how the optimizer in Oracle Database 12c works, Tom explains. When the optimizer computes a query execution plan, it relies on the latest statistics on a table but it does not assume that those statistics are accurate. During query execution, if the optimizer finds that the actual conditions are different from the statistics—just as the GPS finds that the actual traffic conditions are different from what’s expected—it pursues a better plan than the originally computed one.

“Wait a minute,” Victoria interrupts. “Are you saying that the optimizer may actually follow a different plan after the query starts executing?” That’s correct, Tom confirms. The optimizer computes alternative plans and chooses the best plan based on the data obtained during query execution. And that is how the optimizer can choose a better plan based on actual conditions.

Before Adaptive Plans

To demonstrate the adaptive plan concept to his hopeful but skeptical audience, Tom starts a “before” demonstration of how the optimizer works without adaptive plans. He first creates a set of test objects, using the script shown in Listing 1. This script inserts 10 and 100,000 records into tables T1 and T2, respectively; creates indexes; and then collects statistics.

Code Listing 1: Setting up of the test tables

drop table t1 purge
/
drop table t2 purge
/
create table t1 (
   col1    number not null primary key,
   col2    number
);
begin
   for i in 1..10 loop
           insert into t1 values (i,trunc(dbms_random.value(100,10000)));
   end loop;
end;
/
commit
/
select * from t1
/
create table t2 (
   col1    number not null primary key,
   col2    number
);
declare
   ctr     pls_integer :=0;
begin
   for crec in (select col2 from t1) loop
   ctr := ctr + 1;
           for i in 1..10000 loop
                   insert into t2 values
                   (10000*ctr+i, crec.col2);
           end loop;
   end loop;
end;
/
commit
/
create index in_t1_col2 on t1 (col2)
/
create index in_t2_col2 on t2 (col2)
/
begin
	dbms_stats.gather_table_stats (
		ownname	=> 'ARUP',
		tabname	=> 'T1',
		cascade => true
	);
end;
/
begin
	dbms_stats.gather_table_stats (
		ownname	=> 'ARUP',
		tabname	=> 'T2',
		cascade => true
	);
end;
/

Next Tom examines the execution plan the optimizer comes up with by relying on the statistics alone. To check the execution plan, he issues the SQL shown in Listing 2. Tom uses a hint—gather_plan_statistics—that gathers statistics such as how many rows were retrieved, how much CPU was consumed, and so on during the actual execution of the query. He displays the execution statistics by setting the format parameter to allstats in the dbms_xplan.display function. (Tom places the gather_plan_statistics hint in the query now so that when he executes it later, the SQL_ID value for the query will be the same.) He also adds a comment—control1—to the query, which helps him identify this query later. The latter part of Listing 2, Tom explains, shows the execution plan the optimizer thinks is the correct one and will start with. There are only 10 rows, and the predicate includes COL1 > 9—which will match only 1 row out of 10, making the selectivity 10 percent. Hence the optimizer wisely decides on an index scan, shown in step 4 of the plan. Step 2 shows a nested-loop join between the two tables, based on the statistics present.

Code Listing 2: Setting the default execution plan

explain plan for
select /*+ gather_plan_statistics control1 */ min(t2.col1)
from t1, t2
where t2.col2 = t1.col2
and t1.col1 > 9
/
select * from table(dbms_xplan.display(format=>'allstats'))
/
PLAN_TABLE_OUTPUT
————————————————————————————————————————————————————————————————————————
Plan hash value: 4098285983
————————————————————————————————————————————————————————————————————————
| Id  | Operation                              | Name         | E-Rows |
————————————————————————————————————————————————————————————————————————
|   0 | SELECT STATEMENT                       |              |      1 |
|   1 |  SORT AGGREGATE                        |              |      1 |
|   2 |   NESTED LOOPS                         |              |        |
|   3 |    NESTED LOOPS                        |              |  11111 |
|   4 |     TABLE ACCESS BY INDEX ROWID BATCHED| T1           |      1 |
|*  5 |      INDEX RANGE SCAN                  | SYS_C0010628 |      1 |
|*  6 |     INDEX RANGE SCAN                   | IN_T2_COL2   |  10000 |
|   7 |    TABLE ACCESS BY INDEX ROWID         | T2           |  10000 |
————————————————————————————————————————————————————————————————————————
Predicate Information (identified by operation id):
———————————————————————————————————————————————————
   5 - access("T1"."COL1">9)
   6 - access("T2"."COL2"="T1"."COL2")

Next Tom updates the data in the tables in such a way that T1 has all the rows of T2 and T2 has the rows of T1, using the following SQL:

drop table t3 purge
/
create table t3 as select * from t2;
truncate table t2;
insert into t2 select * from t1;
commit;
truncate table t1;
insert into t1 select * from t3;
commit;

Then he checks the actual number of rows in these two tables:

select count(1) from t1;
  COUNT(1)
——————————
    100000
select count(1) from t2;
  COUNT(1)
——————————
        10

Tom deliberately does not gather statistics on the tables T1 and T2 after this operation. He re-evaluates the execution plan the optimizer will come up with by executing the SQL shown in Listing 2, and he observes that the plan has not changed (it is the same as what’s shown in the plan table output of Listing 2). But, Victoria observes, the data has become dramatically different, especially for the predicate COL1 > 9. She asks Tom to check the actual number of rows this predicate will satisfy, and Tom executes the following query:

select count(1) from t1 where col1>9;
  COUNT(1)
——————————
    100000

The predicate, she points out, returns all 100,000 rows of the table T1, so the selectivity of the predicate is no longer 10 percent. With the modified rows, it’s now 100 percent, which should have prompted the optimizer to choose a full table scan instead of an index scan, but it didn’t. Why? she wants to know.

To answer, Tom uses the following SQL to check how many rows the optimizer thinks are in each table:

select table_name, num_rows
from user_tables
where table_name in ('T1','T2')
order by 1;
TABLE_NAME NUM_ROWS
—————————— ————————
T1               10
T2           100000

The actual number of rows in the tables, Tom points out, does not match the number of rows recorded in the statistics. T1 and T2 actually have 10 and 100,000 rows, respectively, but because Tom did not collect statistics, the optimizer still thinks T1 and T2 have 100,000 and 10 rows, respectively—an incorrect assumption. For the optimizer, nothing has changed, so it still comes up with the same query plan.

Lack of accurate statistics is the most frequent cause of bad execution plans. However, Dave observes, it would be impractical to collect statistics after every change in the data pattern, but without statistics, the optimizer will come up with an inefficient plan. Tom’s visitors all want to know whether there is a way to ensure that the optimizer always uses the most efficient plan, regardless of the accuracy of the collected statistics.

Adaptive Plans

This, Tom explains, is where adaptive plans come to the rescue. During the parse phase of a query, the optimizer computes an execution plan based on the information it gathers from the statistics—just as in previous Oracle Database releases. But, in Oracle Database 12c, the optimizer does something more: it computes alternative plans called subplans. During query execution, a statistics collector collects runtime statistics, and if a subplan proves to be better than the originally computed plan, the optimizer will make a course correction and choose that subplan.

Reminding his visitors that the perfect subplan is chosen at runtime, Tom executes the query and then checks the plan the optimizer comes up with by using the dbms_xplan.display_cursor() function, as shown in Listing 3. This time he uses a special clause for the format parameter—+ADAPTIVE—which shows the decisions the optimizer has made regarding adaptive plans. The latter part of Listing 3 shows the query execution plan.

Code Listing 3: The actual execution plan, different due to adaptive plans

select /*+ gather_plan_statistics control1 */ min(t2.col1)
from t1, t2
where t2.col2 = t1.col2
and t1.col1 > 9
/
select * from table(dbms_xplan.display_cursor(format=>'allstats +adaptive'))
/
PLAN_TABLE_OUTPUT
—————————————————————————————————————————————————————————————————————————————
SQL_ID  gxubjt1p8qscf, child number 0
—————————————————————————————————————
select /*+ gather_plan_statistics control1 */ t2.col1 from t1, t2 where
t2.col2 = t1.col2 and t1.col1 > 9
Plan hash value: 3456437299
——————————————————————————————————————————————————————————————————————————————
|  Id | Operation                   | Name        | Starts | E-Rows | A-Rows |
——————————————————————————————————————————————————————————————————————————————
|    0| SELECT STATEMENT            |             |      1 |        |    100K|
| *  1|  HASH JOIN                  |             |      1 |  11111 |    100K|
|-   2|   NESTED LOOPS              |             |      1 |        |    100K|
|-   3|    NESTED LOOPS             |             |      1 |  11111 |    100K|
|-   4|     STATISTICS COLLECTOR    |             |      1 |        |    100K|
|    5|      TABLE ACCESS BY IND... | T1          |      1 |      1 |    100K|
| *  6|       INDEX RANGE SCAN      | SYS_C0010628|      1 |      1 |    100K|
|-*  7|     INDEX RANGE SCAN        | IN_T2_COL2  |      0 |  10000 |      0 |
|-   8|    TABLE ACCESS BY INDEX ...| T2          |      0 |  10000 |      0 |
|    9|   TABLE ACCESS FULL         | T2          |      1 |  10000 |     10 |
——————————————————————————————————————————————————————————————————————————————
Predicate Information (identified by operation id):
———————————————————————————————————————————————————
   1 - access("T2"."COL2"="T1"."COL2")
   6 - access("T1"."COL1">9)
   7 - access("T2"."COL2"="T1"."COL2")
Note
—————
   - this is an adaptive plan (rows marked '-' are inactive)

Tom directs everyone’s attention to some important data in the output. First, the Note section at the end of the output shows “this is an adaptive plan,” which confirms that the optimizer has evaluated the conditions at runtime to finalize the execution plan. Step 4 (Id value is 4) shows statistics collector, a step inserted by the optimizer to examine the actual number of rows returned in the previous step. The statistics collector step is crucial, Tom explains, because it lets the optimizer know the actual current pattern of data, which in this case happens to be just the opposite of what the statistics say. This new information forces the optimizer to switch to a different subplan that uses a full table scan on T2 (as indicated in step 9). Similarly, the new data calls for the use of a different type of join—a hash join instead of nested loops (as indicated in step 1). The optimizer computes that subplan as well.

All the different subplan steps are listed in the execution plan, Tom points out, but the optimizer eventually chooses one subplan. “How do you know which subplan the optimizer chooses?” asks Dave.

The answer lies in the execution plan output as well, Tom explains. The plan output shows all the steps in all the subplans, but the steps not chosen are shown as inactive and the inactive steps are marked by a “-” before the step number (in the Id column). Citing the output in Listing 3, Tom shows his visitors that steps 2, 3, 4, 7, and 8 are inactive; they were considered by the optimizer but were not chosen. For example, step 8 shows an index scan for the T2 table, which was the original plan computed by the optimizer, but inspecting the actual data (which is only 10 rows instead of the estimated 100,000), the optimizer chose a full table scan instead. Similarly, the optimizer initially computed a nested loop in step 3 but did not choose it during execution. Instead, the optimizer chose a hash join (as indicated in step 1). The optimizer started off with the execution plan computed based on the statistics present but adjusted the plan based on the actual data in the tables—just as a GPS adjusts the calculated route based on the actual traffic conditions it encounters on the road.

Tom’s visitors, visibly excited now, have a few more questions. “Do we need to enable something in the database for the optimizer to adapt a plan?” asks Dave. No, the optimizer does it automatically, responds Tom. “Well, then why is it not happening in my database now?” asks a puzzled Dave.

Tom points to an initialization parameter in the database—optimizer_adaptive_reporting_only—that controls the behavior. If the parameter is set to true, the optimizer will not adapt the execution plans (the default is false). Dave’s database was initially created before Oracle Database 12c and was later upgraded to Oracle Database 12c. The parameter was somehow set to true during the upgrade, so the optimizer plans are not being adapted. In fact, continues Tom, setting the parameter value to true is an excellent way to inspect the behavior of the optimizer when it is not using adaptive plans. To demonstrate that behavior, he re-executes the query shown in Listing 3, but before doing that, he sets the parameter to true in his session, using the following SQL:

alter session set optimizer_adaptive_reporting_only = true;

Listing 4 shows the resulting execution plan. It still shows—reports—all the subplans the optimizer considers, Tom points out. However, the inactive steps are steps 2, 5, and 10. This new plan, which did not use adaptive plans, uses nested loops instead of hash joins and an index scan instead of a full table scan. This choice, Tom explains, is what the optimizer would have come up with from the statistics present, and it would have been disastrous. But thanks to adaptive plans, the optimizer computed a better plan based on the actual data, avoiding the disaster.

Code Listing 4: The execution plan without adaptive plans

Plan hash value: 4098285983
——————————————————————————————————————————————————————————————————————————————
|  Id | Operation                   | Name        | Starts | E-Rows | A-Rows |
——————————————————————————————————————————————————————————————————————————————
|    0| SELECT STATEMENT            |             |      1 |        |      1 |
|    1|  SORT AGGREGATE             |             |      1 |      1 |      1 |
|- * 2|   HASH JOIN                 |             |      1 |  11111 |    100K|
|    3|    NESTED LOOPS             |             |      1 |        |    100K|
|    4|     NESTED LOOPS            |             |      1 |  11111 |    100K|
|-   5|      STATISTICS COLLECTOR   |             |      1 |        |    100K|
|    6|       TABLE ACCESS BY IND...| T1          |      1 |      1 |    100K|
|  * 7|        INDEX RANGE SCAN     | SYS_C0010628|      1 |      1 |    100K|
|  * 8|      INDEX RANGE SCAN       | IN_T2_COL2  |    100K|  10000 |    100K|
|    9|     TABLE ACCESS BY INDEX...| T2          |    100K|  10000 |    100K|
|-  10|    TABLE ACCESS FULL        | T2          |      0 |  10000 |      0 |
——————————————————————————————————————————————————————————————————————————————

Adaptive plans, Tom explains, also apply to parallel queries. Adaptive plans can change the method of distributing the rows to various parallel query servers by the query coordinator during execution.

Monitoring

Because adaptive plans occur automatically during the execution of the query—not during the parse phase—Dave wants to know how he can figure out which SQL statements have benefited from adaptive plans. It’s rather simple, Tom explains. A column named is_resolved_adaptive_plan in the v$sql view shows whether a SQL statement execution used an adaptive plan. He uses the control1 comment he put in the query earlier to identify the SQL statement and queries the v$sql view:

select sql_id, is_resolved_adaptive_plan
from v$sql
where sql_text like ‘% control1 */ %’
/
SQL_ID        I
————————————— —
gxubjt1p8qscf Y

The output shows Y in the IS_RESOLVED_ADAPTIVE_PLAN column (I), which indicates that the SQL statement used an adaptive plan. To know how exactly it was adapted, Tom uses the following query (whose output is similar to that in Listing 3):

select * from  
 table(dbms_xplan.display_cursor(
  sql_id=>' gxubjt1p8qscf',
  format=>'allstats +adaptive'));

Conclusion

The optimizer computes the execution plan for a query, and that plan determines the query’s performance. In the prior releases of Oracle Database, statistics drove the quality of the decisions made by the optimizer and bad or stale statistics were often the cause of performance issues. In Oracle Database 12c, the optimizer always re-evaluates the effectiveness of a plan during query execution by checking the actual data and adjusts the plan accordingly, producing more-efficient plans as a result.

Tom’s visitors, talking among themselves about the benefits of adaptive plans, thank Tom for the information and efficiently file out of his office.

Next Steps

LEARN more about adaptive plans and tuning
bit.ly/1ATCLaC
bit.ly/1r3Kcuq

WATCH Tom Kyte explain adaptive plans

DOWNLOAD Oracle Database 12c

Share

A Perfect Plan

Learn how adaptive plans in Oracle Database 12c compute the perfect plan—even when statistics are imperfect.