By Connor McDonald
In my previous article on SQL tuning, I discussed the mechanisms for identifying SQL statements that are executing poorly or, more accurately, SQL statements that have a high cost in terms of either CPU or I/O. But this identification for those SQL statements also defines “high cost” as having a negative impact on the delivery of core business functions. As I continue to stress in this series, the delivery of business functionality must be the driving force for tuning efforts. Although the focus of this article is how to proceed once a problematic SQL statement has been identified, I am assuming that the appropriate prerequisite tasks of liaising with the users, mapping a critical business function to this SQL statement, and validating that the SQL statement is syntactically and functionally correct to meet that business function have all been completed.
By way of example, I will assume that the SQL statement in Listing 1 has been identified as problematic. It provides a report to management of total employee salaries per department, based on a range of criteria from each employee’s job history.
Listing 1: SQL statement that requires analysis and improvement
SQL> select e.department_id, sum(salary) 2 from employees e, 3 job_history j 4 where e.employee_id = j.employee_id 5 and extract(year from e.hire_date) > 1985 6 and j.end_date > j.start_date + 1 7 and j.start_date >= e.hire_date 8 group by e.department_id; DEPARTMENT_ID SUM(SALARY) ————————————— ——————————— 50 7900 90 17000 30 11000 20 13000 80 17200
Let’s also assume that the task of validating that the SQL meets the business requirement correctly has been completed.
Now let’s look at a critical piece of information: the execution plan the optimizer derives to execute the SQL statement. Modern tools make it easy to run an EXPLAIN PLAN
command on the SQL statement to derive the execution plan. Listing 2 shows an example of this in Oracle's SQL*Plus, using the SET AUTOTRACE
facility to defer running the actual SQL statement and only execute EXPLAIN PLAN
on it.
Listing 2: Using AUTOTRACE TRACEONLY EXPLAIN in Oracle SQL*Plus
SQL> set autotrace traceonly explain SQL> select e.department_id, sum(salary) 2 from employees e, 3 job_history j 4 where e.employee_id = j.employee_id 5 and extract(year from e.hire_date) > 1985 6 and j.end_date > j.start_date + 1 7 and j.start_date >= e.hire_date 8 group by e.department_id; Execution Plan —————————————————————————————————————————————————————————— SQL_ID: 789950bkkyhcu Plan hash value: 2697813438 ———————————————————————————————————————————————————————————————————————————————————————— | Id | Operation | Name | Rows| Bytes| Cost (%CPU)| Time | ———————————————————————————————————————————————————————————————————————————————————————— | 0|SELECT STATEMENT | | 1| 39| 5 (20)| 00:00:01 | | 1|HASH GROUP BY | | 1| 39| 5 (20)| 00:00:01 | | 2|NESTED LOOPS | | 1| 39| 4 (0)| 00:00:01 | | 3|NESTED LOOPS | | 5| 39| 4 (0)| 00:00:01 | |* 4|TABLE ACCESS FULL |EMPLOYEES | 5| 95| 3 (0)| 00:00:01 | |* 5|INDEX RANGE SCAN |JHIST_EMPLOYEE_IX| 1| | 0 (0)| 00:00:01 | |* 6|TABLE ACCESS BY INDEX ROWID|JOB_HISTORY | 1| 20| 1 (0)| 00:00:01 | ———————————————————————————————————————————————————————————————————————————————————————— Predicate Information (identified by operation id): ——————————————————————————————————————————————————— 4 - filter(EXTRACT(YEAR FROM INTERNAL_FUNCTION("E"."HIRE_DATE"))>1985) 5 - access("E"."EMPLOYEE_ID"="J"."EMPLOYEE_ID") 6 - filter("J"."START_DATE">="E"."HIRE_DATE" AND
Unfortunately, using EXPLAIN PLAN
for tuning is the single most common mistake made by database developers.
Even in the earliest releases of the Oracle Database cost-based optimizer, there was the potential to not get to the true plan for the SQL statement execution, because
EXPLAIN PLAN
command in an optimizer environment not identical to the application runtime environment. (The optimizer modes, National Language Support [NLS] settings, optimizer parameter settings, and so on may be different.)EXPLAIN PLAN
command.As the optimizer becomes more powerful with each release of Oracle Database, many more scenarios where EXPLAIN PLAN
does not get the true execution plan are possible, such as when previous executions of the same SQL statement are “remembered” by the database and help determine the best execution plan for future executions. A full treatment of these enhancements is beyond the scope of this SQL tuning series, but the core message here is
Do not rely on the EXPLAIN PLAN
command result to be an indicator of the true execution plan.
This may seem like a contradiction, but this is only because, historically, many Oracle professionals have used the terms “execution plan” and “explain plan” interchangeably. They are not actually the same, though, and as the Oracle Database optimizer continues to be enhanced and improved, there will be more and more differences between the two.
To make the difference between the result of an EXPLAIN PLAN
command and the actual execution plan clear, the execution plan is the plan that was used at runtime during the execution of the SQL statement. Having this execution plan is critical to the SQL tuning process, so how does a developer get access to it? V$SQL_PLAN
is the performance view shown in Listing 3 that exposes the true execution plan that was used, and this can be queried for the plan of a SQL statement identified by its SQL_ID
and CHILD_NUMBER
values.
Listing 3: The V$SQL_PLAN performance view, holding execution plans for each SQL statement in the library cache
SQL> desc V$SQL_PLAN Name Null? Type ———————————————————————————— ——————— ———————————————— ADDRESS RAW(8) HASH_VALUE NUMBER SQL_ID VARCHAR2(13) PLAN_HASH_VALUE NUMBER FULL_PLAN_HASH_VALUE NUMBER CHILD_ADDRESS RAW(8) CHILD_NUMBER NUMBER TIMESTAMP DATE OPERATION VARCHAR2(30) OPTIONS VARCHAR2(30) OBJECT_NODE VARCHAR2(40) OBJECT# NUMBER OBJECT_OWNER VARCHAR2(128) OBJECT_NAME VARCHAR2(128) OBJECT_ALIAS VARCHAR2(261) OBJECT_TYPE VARCHAR2(20) OPTIMIZER VARCHAR2(20) ID NUMBER PARENT_ID NUMBER DEPTH NUMBER POSITION NUMBER SEARCH_COLUMNS NUMBER COST NUMBER CARDINALITY NUMBER BYTES NUMBER OTHER_TAG VARCHAR2(35) PARTITION_START VARCHAR2(64) PARTITION_STOP VARCHAR2(64) PARTITION_ID NUMBER OTHER VARCHAR2(4000) DISTRIBUTION VARCHAR2(20) CPU_COST NUMBER IO_COST NUMBER TEMP_SPACE NUMBER ACCESS_PREDICATES VARCHAR2(4000) FILTER_PREDICATES VARCHAR2(4000) PROJECTION VARCHAR2(4000) TIME NUMBER QBLOCK_NAME VARCHAR2(128) REMARKS VARCHAR2(4000) OTHER_XML CLOB CON_ID NUMBER
Fortunately, there is an API to act as a wrapper around the information exposed in the V$SQL_PLAN
performance view, which can simplify the process of getting a nicely formatted execution plan. That wrapper, DBMS_XPLAN.DISPLAY_CURSOR
, is a table function—that is, it can be queried as if it were a database table to return the true execution plan from an execution of a SQL statement. By default, it will output the execution plan of the most recently executed SQL statement in the current database session. Listing 4 shows an example of this (some sections of the output have been omitted for simplicity).
Execution Plan NotesHere are a couple of additional notes about the contents of execution plans:
|
Listing 4: The true execution plan for the problematic SQL statement
SQL> select e.department_id, sum(salary) 2 from employees e, 3 job_history j 4 where e.employee_id = j.employee_id 5 and extract(year from e.hire_date) > 1985 6 and j.end_date > j.start_date + 1 7 and j.start_date >= e.hire_date 8 group by e.department_id; DEPARTMENT_ID SUM(SALARY) ————————————— ——————————— 50 7900 90 17000 30 11000 20 13000 80 17200 SQL> select * from dbms_xplan.display_cursor(); ————————————————————————————————————————————————————————————————————————————————— | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ————————————————————————————————————————————————————————————————————————————————— | 0 | SELECT STATEMENT | | | | 5 (100)| | | 1 | HASH GROUP BY | | 1 | 39 | 5 (20)| 00:00:01 | |* 2 | HASH JOIN | | 1 | 39 | 4 (0)| 00:00:01 | |* 3 | TABLE ACCESS FULL| JOB_HISTORY | 1 | 20 | 3 (0)| 00:00:01 | |* 4 | TABLE ACCESS FULL| EMPLOYEES | 1 | 19 | 1 (0)| 00:00:01 | ————————————————————————————————————————————————————————————————————————————————— Predicate Information (identified by operation id): ——————————————————————————————————————————————————— 2 - access("E"."EMPLOYEE_ID"="J"."EMPLOYEE_ID") filter("J"."START_DATE">="E"."HIRE_DATE") 3 - filter("J"."END_DATE">INTERNAL_FUNCTION("J"."START_DATE")+1) 4 - filter(EXTRACT(YEAR FROM INTERNAL_FUNCTION("E"."HIRE_DATE"))>1985) Note ————— - this is an adaptive plan
Note that the execution plan that was used in Listing 4 is not the same as the plan described in Listing 2. As previously mentioned, there could be many reasons for this, but the “Note” section in Listing 4 gives a hint about one possible cause: Adaptive query optimization played a part in determining the execution plan. For more information on adaptive plans, consult the “About Adaptive Query Optimization” section in SQL Tuning Guide.
The cause of the difference between the EXPLAIN PLAN
output in Listing 2 and the true execution plan in Listing 4 is less important than knowing that the true plan has now been discovered. Armed with the true plan, we can begin analysis of the problematic SQL statement for tuning.
Since the cost-based optimizer became available, in Oracle Database 7, there have been many articles, books, and blog posts published with “best practices” for evaluating an execution plan to tune the performance of a SQL statement. Without naming and shaming sources, common advisories were
TABLE ACCESS FULL
means the plan is bad.”COST
mean the plan is bad.”MERGE JOIN CARTESIAN
means the plan is bad.”and other similar sound bites. Like many such advisories, they may occasionally be correct, but they are incorrect just as often. Worse still, these advisories remove the focus from the user experience. There is no such thing as a “good” plan or a “bad” plan—there is only a SQL statement that meets performance requirements (“good SQL”) or fails to meet them (“bad SQL”).
When it comes to analyzing an execution plan, rather than looking for particular keywords or patterns in the plan, I prefer a more human metaphor to understand the optimizer’s role, because that will explain how to dissect the plan and get more information about it in order to convert a bad SQL statement into a good one. Each week I go to my local supermarket to get food and other necessities, but like many other people, I make an occasional midweek trip to grab a few missed items or replenish a staple such as milk or bread. On such trips, I’ll have a small basket containing just a few items when I get to the checkout counters to pay and exit. At this point, I do not simply go to the closest checkout aisle. As I suspect most other people do, I’ll take a quick scan of the checkout aisles to see which has the least amount of customer traffic and then I’ll walk to that one. With any luck, it will either be empty or the people being served will have just a few items remaining to scan. That is just a commonsense approach to paying and getting my shopping completed as quickly as possible. With that strategy, I am being exactly like the database optimizer. There are several potential execution plans for running a SQL statement (the checkout aisles); the optimizer will estimate the effort each plan would entail (see how many people are in each checkout line); and the optimizer will decide to use the plan that will yield the quickest result (pick the shortest checkout line).
Returning to the shopping metaphor, sometimes things don’t work out quite as I expect. I’ll scan the checkout aisles and see a candidate aisle that has only one person in line. I’ll race down there with my small basket, confident that I’ve beaten the system, only to find that when I get there, that one person has the shopping cart that trumps all other shopping carts! Hundreds of items in the cart, compounded by many being fresh produce, each of which will need to be weighed and priced by the cashier. From my perspective, this is a disaster. I have just two or three things to purchase, and now I am going to be stuck waiting for 30 minutes. This too is the same difficulty that may befall the database optimizer. It may choose an execution plan that appears to be optimal (just one lone person in the checkout aisle), but when the plan is actually used to execute a SQL statement, that plan or parts of it that the optimizer estimated would be quick and efficient turn out to be more costly than expected (like the killer cart!).
This metaphor drives the strategy for the SQL tuning database developer. If the tuner/developer can locate where the optimizer estimates differed significantly from the reality of the SQL execution, that is the place to best target the tuning effort.
Oracle Database provides an optimizer hint to provide this estimate-versus-reality information directly to the developer. Whereas most optimizer hints instruct the optimizer to use a certain operation within the execution plan, the GATHER_PLAN_STATISTICS
hint differs, instructing the database engine to record execution plan statistics when the SQL statement is executed. Thus both the optimizer estimates and the runtime actuals are available once the execution completes. Displaying these involves calling DBMS_XPLAN.DISPLAY_CURSOR
with a modified format parameter, as shown in Listing 5.
Listing 5: Obtaining estimated versus actual statistics
SQL> select /*+ gather_plan_statistics */ e.department_id, sum(salary) 2 from employees e, 3 job_history j 4 where e.employee_id = j.employee_id 5 and extract(year from e.hire_date) > 1985 6 and j.end_date > j.start_date + 1 7 and j.start_date >= e.hire_date 8 group by e.department_id; DEPARTMENT_ID SUM(SALARY) ————————————— ——————————— 50 7900 90 17000 30 11000 20 13000 80 17200 SQL> select * from table(dbms_xplan.display_cursor(format=>'ALLSTATS LAST')); PLAN_TABLE_OUTPUT ———————————————————————————————————————————————————————————————————— ———————————————————————————————————————————————————————————————————————————————————————— |Id|Operation |Name |Str|E-Rows|A-Rows|A-Time|Buffers| OMem|1Mem |Used-Mem| ———————————————————————————————————————————————————————————————————————————————————————— | 0|SELECT STATEMENT | | 1| | 5|:00.01| 12| | | | | 1|HASH GROUP BY | | 1| 1| 5|:00.01| 12|1200K|1200K|897K (0)| |*2|HASH JOIN | | 1| 1| 6|:00.01| 12|1572K|1572K|959K (0)| |*3|TABLE ACCESS FULL|JOB_HISTORY| 1| 1| 10|:00.01| 6| | | | |*4|TABLE ACCESS FULL|EMPLOYEES | 1| 1| 107|:00.01| 6| | | | ———————————————————————————————————————————————————————————————————————————————————————— Predicate Information (identified by operation id): ——————————————————————————————————————————————————— 2 - access("E"."EMPLOYEE_ID"="J"."EMPLOYEE_ID") filter("J"."START_DATE">="E"."HIRE_DATE") 3 - filter("J"."END_DATE">INTERNAL_FUNCTION("J"."START_DATE")+1) 4 - filter(EXTRACT(YEAR FROM INTERNAL_FUNCTION("E"."HIRE_DATE"))>1985) Note ————— - this is an adaptive plan
Two critical columns for the developer here are E-Rows
and A-Rows
. These are the rows (E)stimated by the optimizer for each line of the plan and the (A)ctual rows that resulted at SQL statement runtime, respectively. In the plan output, line 4 of the “Predicate Information” section shows that the scan of the EMPLOYEES
table would be filtering those rows where the HIRE_DATE
value was more recent than 1985. The E-Rows
and A-Rows
information in the last line of the SELECT
result shows that the optimizer expected to find only a single row from EMPLOYEES
for this predicate (E-Rows
= 1) but in fact 107 rows (A-Rows
= 107) were found. This does not necessarily mean that the execution plan is a poor one, but it does help guide the developer to focus on the part of the SQL statement that most probably needs attention. The information leads to hypotheses the developer can explore, such as
EMPLOYEE
table might be incorrect.HIRE_DATE
is skewed and requires a statistics histogram to provide the optimizer with more information.HIRE_DATE
column.extract(year from e.hire_date) > 1985
needs to be altered so that there is no expression around the HIRE_DATE
column.The key thing is that now the developer has a more granular area to focus on to tune the SQL.
Sometimes it is not possible to modify the candidate SQL statement to add the GATHER_PLAN_STATISTICS
hint. In such instances, if you have access to the database session itself—for example, via a login trigger—you will be able to temporarily set the parameter STATISTICS_LEVEL
value to ALL for that session and that will also collect the additional statistics.
DBMS_XPLAN.DISPLAY_CURSOR
can then be used in a similar fashion to obtain the estimated-versus-actual SQL statement runtime comparison. Because it is unlikely in this circumstance that DBMS_XPLAN
will be run in the same session as the SQL statement, the SQL_ID
can be used to interrogate the database to determine the runtime statistics. For the problematic SQL statement run earlier, Listing 6 shows how to get the runtime statistics by locating the SQL_ID
from V$SQLSTATS
.
Listing 6: Obtaining estimated versus actual statistics for a known SQL_ID
SQL> select sql_id, sql_text from v$sqlstats 2 where sql_text like '%extract(year%'; SQL_ID SQL_TEXT ————————————— ———————————————————————————————————————————————————————————————— 1afzpsbuadkbs select sql_id, sql_text from v$sqlstats where sql_text like '%ex tract(year%' ct38j4c0rbhnj select e.department_id, sum(salary) from employees e, j ob_history j where e.employee_id = j.employee_id and extract (year from e.hire_date) > 1985 and j.end_date > j.start_date + 1 and j.start_date >= e.hire_date group by e.department_id SQL> select * 2 from table(dbms_xplan.display_cursor(sql_id=>'cx025dqycvcmy', format=>'ALLSTATS LAST')); PLAN_TABLE_OUTPUT ———————————————————————————————————————————————————————————————————— ———————————————————————————————————————————————————————————————————————————————————————— |Id|Operation |Name |Str|E-Rows|A-Rows|A-Time|Buffers| OMem|1Mem |Used-Mem| ———————————————————————————————————————————————————————————————————————————————————————— | 0|SELECT STATEMENT | | 1| | 5|:00.01| 12| | | | | 1|HASH GROUP BY | | 1| 1| 5|:00.01| 12|1200K|1200K|897K (0)| |*2|HASH JOIN | | 1| 1| 6|:00.01| 12|1572K|1572K|959K (0)| |*3|TABLE ACCESS FULL|JOB_HISTORY| 1| 1| 10|:00.01| 6| | | | |*4|TABLE ACCESS FULL|EMPLOYEES | 1| 1| 107|:00.01| 6| | | | ———————————————————————————————————————————————————————————————————————————————————————— Predicate Information (identified by operation id): ——————————————————————————————————————————————————— 2 - access("E"."EMPLOYEE_ID"="J"."EMPLOYEE_ID") filter("J"."START_DATE">="E"."HIRE_DATE") 3 - filter("J"."END_DATE">INTERNAL_FUNCTION("J"."START_DATE")+1) 4 - filter(EXTRACT(YEAR FROM INTERNAL_FUNCTION("E"."HIRE_DATE"))>1985) Note ————— - this is an adaptive plan
Even richer functionality is available with the Real-Time SQL Monitoring facility in the optional Oracle Tuning Pack for Oracle Database, Enterprise Edition. The same information that can be obtained from DBMS_XPLAN.DISPLAY_CURSOR
can be extracted and presented in graphical form with Oracle Enterprise Manager or via the DBMS_SQL_MONITOR.REPORT_SQL_MONITOR
package. For SQL statements that run for more than 10 seconds, the Real-Time SQL Monitoring infrastructure automatically collects runtime statistics without the need for special hints or alterations to the STATISTICS_LEVEL
parameter. Listing 7 shows the reporting captured and presented via HTML with Real-Time SQL Monitoring. (The reporting output can also be spooled to a file that presents a graphical view of the runtime execution.) Figure 1 shows the Real-Time SQL Monitoring report output for SQL_ID from Listing 7.
Figure 1: Real-Time SQL Monitoring report output for SQL_ID from Listing 7
Listing 7: Real-Time SQL Monitoring report
SQL> select dbms_sql_monitor.report_sql_monitor 2 (sql_id =>'77z2xc198jzda', 3 report_level =>'all', 4 type =>'HTML') report 5 from dual; <html> <head> <title> SQL Monitor Report </title> <style type="text/css"> body, table, input, select, textarea {font:normal normal 8pt Verdana,Arial;text-decoration:none; color:#000000; empty-cells:show;} .s8 {font-size:8pt;color:#006699} ...
If you can edit the SQL text directly, you can compel any SQL statement to activate monitoring—even if it runs for less than 10 seconds—using the MONITOR
hint. The big advantage of using Real-Time SQL Monitoring over gathering runtime statistics for a SQL statement in a standard way is the “Real-Time” part of the feature’s name. Real-Time SQL Monitoring can be invoked on a currently executing SQL statement so that the progress of each phase of the execution plan can be monitored. Figure 2 shows an example of a simple COUNT
query against a 20-million-row table being monitored as it executes. The “Global Information” section shows that the SQL is currently executing, and the "Rows
" column shows the rows processed so far during the execution.
Figure 2: Real-Time SQL Monitoring report output for active SQL execution
For more information on Real-Time SQL Monitoring, refer to the SQL tuning documentation.
What if all the lines in the execution plan are nicely aligned in terms of estimated versus actual? This would indicate that the statistics provided to the optimizer closely aligned with the reality of the data in the tables within the query and that the plan is very likely the best plan possible for that SQL statement. That is little comfort to the database developer who is still faced with the task of tuning the SQL statement. But knowing that the plan is optimal means that the tuning focus can switch to influencing the performance of the SQL statement outside the sphere of the optimizer. This is the time to consider structural changes to the physical database, and the changes could include options such as
So even when the optimizer plan is optimal, there is still value in the DBMS_XPLAN.DISPLAY_CURSOR
or Real-Time SQL Monitoring output. The other reporting columns, such as “Database Time” and “IO Requests,” can help you make decisions on where structural changes will decrease the execution time the most. For example, returning to the problematic EMPLOYEE
query, if most of the execution time is spent scanning the JOB_HISTORY
table, there will be little possible benefit to compressing the EMPLOYEE
table.
At this point in this series on SQL tuning, problematic SQL statements have been identified and their true execution plans have been extracted with DBMS_XPLAN.DISPLAY_CURSOR
. This is not to discount the value of a simple EXPLAIN PLAN
command during the application development process as a means of getting an indicative measure of how a SQL statement may perform, but always remember that the plan observed via EXPLAIN PLAN
is never guaranteed to be the same plan that is used at execution time when the application is released “into the wild.”
Once the true execution plan has been identified by analysis of the estimated versus actual rows for each phase of the execution plan, the correctness of the optimizer decisions can be evaluated. This helps the SQL tuner narrow the focus to those elements of the plan that will likely need attention. If the optimizer was mostly correct in its estimates, the tuning effort may need to consider structural changes such as adding indexes or altering the database design to better serve the business functional requirements.
In the next, final article in this series, I’ll explore some typical issues that lead to poorly performing SQL and how to handle them.
LEARN more about SQL tuning.
READ
“A Higher-Level Perspective on SQL Tuning.”
“A Higher-Level Perspective on SQL Tuning, Part 2.”
DOWNLOAD Oracle Database 19c .
Illustration by Wes Rowell
DISCLAIMER: We've captured these popular historical magazine articles for your reference. Links etc contained within these article possibly won't work. These articles are provided as-is with no guarantee that the information within them is the best advice for current versions of the Oracle Database.