Database, SQL and PL/SQL

Powerful Impact Analysis

Easily identify program units affected by database object changes with PL/Scope in Oracle Database 12c Release 2.

By Steven Feuerstein

January/February 2017

PL/Scope is a powerful code analysis tool for PL/SQL program units, first introduced in Oracle Database 11g. Now with Oracle Database 12c Release 2 (Oracle Database 12.2), that analysis includes the SQL statements inside your PL/SQL code base. This significant enhancement enables developers to perform comprehensive impact analysis of changes to tables, such as changing a column’s name.

Turning on PL/Scope

Prior to Oracle Database 12.2, you would start PL/Scope by executing this statement in your session:

ALTER SESSION SET plscope_settings='identifiers:all'
/

After that, whenever you compiled a PL/SQL program unit, the compile would gather information about all PL/SQL identifier usages and make that data available via the ALL_IDENTIFIERS view.

In Oracle Database 12c Release 2, to start PL/Scope and gather information about SQL statements, you execute this statement:

ALTER SESSION SET plscope_settings='identifiers:all, statements:all'
/

Then, after compilation of a program unit, you will find information about SQL statements in the ALL_STATEMENTS view, and the ALL_IDENTIFIERS view contains information about SQL identifiers, such as table and column names.

When you compile a program unit with PL/Scope enabled for statement gathering, the PL/SQL compiler places a row in ALL_STATEMENTS for each SQL statement in the program unit. The following information is recorded for those statements:

Owner. This is the owner of the program unit.
Object name. This is the name of the program unit.
Statement type. This is the type of statement: SELECT, INSERT, UPDATE, DELETE, MERGE, EXECUTE IMMEDIATE, or OPEN.
Line number. This is the line number in the program unit where the statement begins.
Column number. This is the column position on that line.
Usage ID. This is an identifier unique to this statement in this program unit.
Usage context ID. Roughly speaking, this is the parent of the usage ID. It provides the context in which the usage ID is found in the program unit.
Signature. This is the identifier for the statement. It is unique across all program units.

The Database Objects for Analysis

Before I go further into my exploration of PL/Scope, I need to create some database objects. For this article, I am going to keep track of endangered species, their location, and where research is done on those species.

So I create three tables, as shown in Listing 1.

Code Listing 1: Creating three tables

CREATE TABLE species_locations
(
   id         NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
   location   VARCHAR2 (100) UNIQUE
)
/
CREATE TABLE endangered_species
(
   id            NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
   name          VARCHAR2 (100) UNIQUE,
   location_id   NUMBER    REFERENCES species_locations (id)
)
/
CREATE TABLE research_locations
(
   id            NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
   name          VARCHAR2 (100) UNIQUE,
   location_id   NUMBER    REFERENCES species_locations (id)
)
/

In the table creation, note that GENERATED AS IDENTITY is an Oracle Database 12c Release 1 feature.

After enabling PL/Scope, I then create a package, species_mgr, to help manage data in these tables, as shown in Listing 2.

Code Listing 2: Creating the species_mgr package

CREATE OR REPLACE PACKAGE species_mgr
   AUTHID DEFINER
IS
   FUNCTION location_id_from_name (
      location_in   IN species_locations.location%TYPE)
      RETURN species_locations.id%TYPE;
END;
/
CREATE OR REPLACE PACKAGE BODY species_mgr
IS
   FUNCTION location_id_from_name (
      location_in   IN species_locations.location%TYPE)
      RETURN species_locations.id%TYPE
   IS
      l_return   species_locations.id%TYPE;
   BEGIN
      SELECT sl.id
        INTO l_return
        FROM species_locations sl
       WHERE sl.location = location_id_from_name.location_in;
      RETURN l_return;
   END;
END;
/

Next, I populate my tables, as shown in Listing 3.

Code Listing 3: Populating the three tables

BEGIN
   /* Amazon Data */
   INSERT INTO species_locations (location)
        VALUES ('Amazon');
   INSERT INTO endangered_species (name, location_id)
           VALUES (
                     'Black Spider Monkey',
                     species_mgr.location_id_from_name ('Amazon'));
   INSERT INTO endangered_species (name, location_id)
        VALUES ('Sloth', species_mgr.location_id_from_name ('Amazon'));
   INSERT INTO endangered_species (name, location_id)
           VALUES (
                     'Amazon River Dolphin',
                     species_mgr.location_id_from_name ('Amazon'));
   INSERT INTO endangered_species (name, location_id)
           VALUES (
                     'Poison Dart Frog',
                     species_mgr.location_id_from_name ('Amazon'));
   INSERT INTO endangered_species (name, location_id)
        VALUES ('Macaw', species_mgr.location_id_from_name ('Amazon'));
   INSERT INTO endangered_species (name, location_id)
        VALUES ('Jaguar', species_mgr.location_id_from_name ('Amazon'));
   INSERT INTO research_locations (name, location_id)
           VALUES (
                     'Smithsonian Tropical Research Institute',
                     species_mgr.location_id_from_name ('Amazon'));
   INSERT INTO research_locations (name, location_id)
        VALUES ('IBAMA', species_mgr.location_id_from_name ('Amazon'));
   /* Galapagos Data */
   INSERT INTO species_locations (location)
        VALUES ('Galapagos');
   INSERT INTO endangered_species (name, location_id)
           VALUES (
                     'Sea Turtle',
                     species_mgr.location_id_from_name ('Galapagos'));
   INSERT INTO endangered_species (name, location_id)
           VALUES (
                     'Leatherback Turtle',
                     species_mgr.location_id_from_name ('Galapagos'));
   INSERT INTO endangered_species (name, location_id)
        VALUES ('SEI Whale', species_mgr.location_id_from_name ('Galapagos'));
   INSERT INTO endangered_species (name, location_id)
           VALUES (
                     'Green Turtle',
                     species_mgr.location_id_from_name ('Galapagos'));
   INSERT INTO endangered_species (name, location_id)
           VALUES (
                     'Giant Tortoise',
                     species_mgr.location_id_from_name ('Galapagos'));
   INSERT INTO endangered_species (name, location_id)
           VALUES (
                     'Galapagos Penguin',
                     species_mgr.location_id_from_name ('Galapagos'));
   INSERT INTO research_locations (name, location_id)
           VALUES (
                     'Charles Darwin Foundation',
                     species_mgr.location_id_from_name ('Galapagos'));
   INSERT INTO research_locations (name, location_id)
           VALUES (
                     'Galapagos Science Center',
                     species_mgr.location_id_from_name ('Galapagos'));
   COMMIT;
END;
/

Finally, I create two additional PL/SQL procedures—add_species and show_location_info—that work with these three tables, as shown in Listing 4.

Code Listing 4: Creating the add_species and show_location_info procedures

CREATE OR REPLACE PROCEDURE add_species (NAME_IN       IN VARCHAR2,
                                         location_in   IN VARCHAR2)
   AUTHID DEFINER
IS
   l_info   VARCHAR2 (32767);
BEGIN
   INSERT INTO endangered_species (name, location_id)
        VALUES (NAME_IN, species_mgr.location_id_from_name (location_in));
   SELECT s.id || '-' || s.name || '-' || l.location
     INTO l_info
     FROM endangered_species s, species_locations l
    WHERE s.location_id = l.id;
   DBMS_OUTPUT.put_line ('Inserted ' || l_info);
END;
/
CREATE OR REPLACE PROCEDURE show_location_info (
   location_id_in   IN species_locations.id%TYPE)
   AUTHID DEFINER
IS
BEGIN
   FOR rec IN (  SELECT r.location_id, r.name
                   FROM research_locations r
                  WHERE r.location_id = show_location_info.location_id_in
               ORDER BY r.name)
   LOOP
      DBMS_OUTPUT.put_line (rec.location_id || '-' || rec.name);
   END LOOP;
END;
/

I’ve got my code base. Now I’d like to do some analysis of it.

Program Unit “Dump” of Identifiers and Statements

Let’s take a look at what PL/Scope can tell us about the add_species procedure. And let’s start with the views that will tell us what’s going on.

The two views with PL/Scope-generated information, ALL_IDENTIFIERS and ALL_STATEMENTS, share many columns, such as USAGE_ID and SIGNATURE. And there are columns unique to each, such as STATEMENT_TYPE in ALL_STATEMENTS.

The USAGE_ID value is unique across the ALL_IDENTIFIERS and ALL_STATEMENTS views for a specific program unit. That makes it easier to understand how statements and identifiers relate to each other.

The query in Listing 5 executes a union of these two views for all rows related to add_species. It uses the CONNECT BY syntax to display identifiers and statements intuitively.

Code Listing 5: Looking at ALL_IDENTIFIERS and ALL_STATEMENTS for references to add_species

WITH one_obj_name AS (SELECT 'ADD_SPECIES' object_name FROM DUAL)
    SELECT plscope_type,
           usage_id,
           usage_context_id,
           LPAD (' ', 2 * (LEVEL - 1)) || usage || ' ' || name usages,
           line,
           col
      FROM (SELECT 'ID' plscope_type,
                   ai.object_name,
                   ai.usage usage,
                   ai.usage_id,
                   ai.usage_context_id,
                   ai.TYPE || ' ' || ai.name name,
                   ai.line,
                   ai.col
              FROM all_identifiers ai, one_obj_name
             WHERE ai.object_name = one_obj_name.object_name
            UNION ALL
            SELECT 'ST',
                   st.object_name,
                   st.TYPE,
                   st.usage_id,
                   st.usage_context_id,
                   'STATEMENT',
                   st.line,
                   st.col
              FROM all_statements st, one_obj_name
             WHERE st.object_name = one_obj_name.object_name)
START WITH usage_context_id = 0
CONNECT BY PRIOR usage_id = usage_context_id

Figure 1 displays the output of the Listing 5 query from Oracle SQL Developer.

o17plsql-f1

Figure 1. Result of query for references to add_species in ALL_IDENTIFIERS and ALL_STATEMENTS

Here are some important takeaways from this report:

A USAGE_CONTEXT_ID value in ALL_IDENTIFIERS could well reference a row in the ALL_STATEMENTS table (and vice versa).
Column identifiers are not declared in the ALL_IDENTIFIERS view, as, say, a variable would be (see line 7 in Figure 1 for a variable declaration). There are simply references.
Invocations of user-defined functions in SQL statements are now tracked by PL/Scope (see line 15 in Figure 1).

Which Program Units Use ENDANGERED_SPECIES.LOCATION_ID?

Here’s a very common developer question: where and how is a table used in my PL/SQL code? With ALL_DEPENDENCIES, I can find out if a table is referenced, but that reference could be from a %ROWTYPE declaration or a SELECT or a DELETE. There’s no way of telling which. And I certainly can’t find out where and how a particular column is being used.

In addition, I could easily enough write a query against ALL_SOURCE to find all the program units that contain the string “LOCATION_ID”:

SELECT name, line, text
  FROM ALL_SOURCE
 WHERE upper (text) LIKE '%LOCATION_ID%'

Unfortunately, that query will return all kinds of false positives, including

That phrase showing up inside a comment
That phrase as part of a larger identifier, such as GET_LOCATION_ID
Columns with the same name but in other tables (such as RESEARCH_LOCATIONS)

With PL/Scope in Oracle Database 12.2, you can now definitively locate all references to the LOCATION_ID column of the ENDANGERED_SPECIES table and nothing else.

Of course, it would be nice to not have to write a special-purpose query every time you want to locate references to another column, so I’ve written the procedure in Listing 6 for just that purpose.

Code Listing 6: Package for looking for column references in ALL_IDENTIFIERS and ALL_STATEMENTS

CREATE OR REPLACE PROCEDURE show_column_usages (owner_in    IN VARCHAR2,
                                                table_in    IN VARCHAR2,
                                                column_in   IN VARCHAR2)
   AUTHID DEFINER
IS
BEGIN
   DBMS_OUTPUT.put_line (
      'References to ' || owner_in || '.' || table_in || '.' || column_in);
   DBMS_OUTPUT.put_line ('');
   FOR rec
      IN (  SELECT idt.line,
                   idt.owner || '.' || idt.object_name code_unit,
                   RTRIM (src.text, CHR (10)) text
              FROM all_identifiers idt, all_source src
             WHERE     idt.usage = 'REFERENCE'
                   AND idt.signature =
                          (SELECT idt_inner.signature
                             FROM all_identifiers idt_inner
                            WHERE     idt_inner.object_name =
                                      show_column_usages.table_in
                                  AND idt_inner.TYPE = 'COLUMN'
                                  AND idt_inner.name =
                                      show_column_usages.column_in
                                  AND idt_inner.usage = 'DECLARATION'
                                  AND idt_inner.owner = owner_in)
                   AND idt.line = src.line
                   AND idt.object_name = src.name
                   AND idt.owner = src.owner
          ORDER BY code_unit, line)
   LOOP
      DBMS_OUTPUT.put_line (
         'In ' || rec.code_unit || ' on ' || rec.line || ': ' || rec.text);
   END LOOP;
END;
/

When I execute the show_column_usages procedure as follows:

BEGIN
   show_column_usages (owner_in    => USER,
                       table_in    => 'ENDANGERED_SPECIES',
                       column_in   => 'LOCATION_ID');
END;
/

I see the following results:

STEVEN.ENDANGERED_SPECIES.LOCATION_ID References
In STEVEN.ADD_SPECIES on 7:
    INSERT INTO endangered_species (name, location_id)
In STEVEN.ADD_SPECIES on 13:
    WHERE s.location_id = l.id;
In STEVEN.SHOW_SPECIES on 8:
    INSERT INTO endangered_species (id, name, location_id)
In STEVEN.SHOW_SPECIES on 14:
    WHERE s.location_id = l.id;

Note that there are no references to the RESEARCH_LOCATIONS table. Nice!

Let’s take a look at how the show_column_usages procedure gets its job done.

First, interestingly, note that I am not even querying the new-to-Oracle Database 12.2 ALL_STATEMENTS view. That’s because I am not interested in SQL statements, per se—simply which SQL identifiers are used, and where. So ALL_IDENTIFIERS contains all the information I need.

Next, because I am looking for all occurrences of a specific identifier, I am working with the SIGNATURE column. PL/Scope assigns a unique identifier to each distinct identifier in my code base, and that’s what is stored in the SIGNATURE column.

So the first thing I need to do is find the signature of the LOCATION_ID column. The inner query inside the show_column_usages procedure does this for me:

SELECT idt_inner.signature
  FROM all_identifiers idt_inner
 WHERE     idt_inner.object_name = show_column_usages.table_in
       AND idt_inner.owner = owner_in
       AND idt_inner.name = show_column_usages.column_in
       AND idt_inner.TYPE = 'COLUMN'
       AND idt_inner.usage = 'DECLARATION'

This query says: get the signature of the declaration of the column identifier, for the names specified by the procedure parameters. I then use that inner query for an equality match on idt.signature. In the query below, for readability purposes, I use <my_column_signature> in place of the full query. Then the outer query becomes

SELECT idt.line,
       idt.owner || '.' || idt.object_name code_unit,
       RTRIM (src.text, CHR (10)) text
  FROM all_identifiers idt, all_source src
 WHERE     idt.usage = 'REFERENCE'
       AND idt.signature = <my_column_signature>
       AND idt.line = src.line
       AND idt.object_name = src.name
       AND idt.owner = src.owner

This query finds all the references to my_column_signature in ALL_IDENTIFIERS. I then join those identifier rows to the ALL_SOURCE table, so that I can display the line of source code.

Hey, that’s not so hard!

How Are My Tables Used?

The primary purpose of PL/SQL is to build APIs around data structures, controlling access to tables and ensuring data integrity and consistency. This means that PL/SQL code is usually full of SQL statements, and that can make it difficult to keep track of which tables are used where and how. With PL/Scope in Oracle Database 12.2, it is now easy to get answers to questions such as “Which program units perform an INSERT into a particular table?” The show_dml_statements_on procedure in Listing 7 provides a way to answer this and other data manipulation language (DML) use questions.

Code Listing 7: The show_dml_statements_on procedure

CREATE OR REPLACE PROCEDURE show_dml_statements_on (
   owner_in            IN VARCHAR2,
   table_in            IN VARCHAR2,
   statement_type_in   IN VARCHAR2 DEFAULT 'ALL')
IS
BEGIN
   DBMS_OUTPUT.put_line (
         'Locations of '
      || statement_type_in
      || ' DML Statements Against '
      || owner_in
      || '.'
      || table_in);
   DBMS_OUTPUT.put_line ('');
   FOR rec
      IN (SELECT idt.line,
                 idt.owner || '.' || idt.object_name code_unit,
                 RTRIM (src.text, CHR (10)) text
            FROM all_identifiers idt, all_source src, all_statements st
           WHERE     idt.usage = 'REFERENCE'
                 AND idt.TYPE = 'TABLE'
                 AND idt.name = table_in
                 AND idt.owner = owner_in
                 AND idt.line = src.line
                 AND idt.object_name = src.name
                 AND idt.owner = src.owner
                 AND idt.usage_context_id = st.usage_id
                 AND (   st.TYPE = statement_type_in
                      OR statement_type_in = 'ALL'))
   LOOP
      DBMS_OUTPUT.put_line (
         'In ' || rec.code_unit || ' on ' || rec.line || ': ' || rec.text);
   END LOOP;
END;
/

When I execute the show_dml_statements_on procedure with this block:

BEGIN
   show_dml_statements_on (USER, 'ENDANGERED_SPECIES', 'INSERT');
END;
/

I see the following output:

Locations of INSERT DML Statements Against STEVEN.ENDANGERED_SPECIES
In STEVEN.ADD_SPECIES on 7:
    INSERT INTO endangered_species (name, location_id)
In STEVEN.SHOW_SPECIES on 8:
    INSERT INTO endangered_species (id, name, location_id)

Identifying SQL Injection Vulnerabilities

SQL injection presents one of the greatest vulnerabilities for database developers, and dynamic SQL is the path followed by those who want to inject malicious code.

So it would be very helpful to be able to easily identify all program units that execute dynamic SQL or dynamic PL/SQL statements.

Prior to Oracle Database 12.2, the best you could do was to write queries against ALL_SOURCE to look for strings such as EXECUTE IMMEDIATE and OPEN <cursor> FOR and DBMS_SQL. It is, however, very difficult to make those queries cover all possible cases without, in effect, parsing PL/SQL code. For example, I could write code like this:

BEGIN
   EXECUTE /* a comment */ IMMEDIATE ...
END;

Now with the enhanced PL/Scope in Oracle Database 12.2, it’s a breeze to write a query to unambiguously locate all native dynamic SQL statements, because two possible values in the TYPE column of ALL_STATEMENTS are EXECUTE IMMEDIATE and OPEN.

In addition, if you are able to convince your DBA to recompile the SYS.DBMS_SQL package with PL/Scope turned on, you can then query the contents of ALL_IDENTIFIERS for program units that reference that built-in package in any way, as shown in Figure 2.

o17plsql-f2

Figure 2. A better search for SQL injection risks in ALL_IDENTIFIERS

In Figure 2, lines 1 through 9 show how easy it is to find all statements in your PL/SQL code base that take advantage of native dynamic SQL. Lines 11 through 25 show how to find all references to the DBMS_SQL package. Note in particular the subquery on lines 17 through 22. I get all the signatures of identifiers declared in the DBMS_SQL package specification. I then look, in line 16, for any matches for those signatures across my code base, excluding any code owned by SYS.

Getting a Handle on SQL Hints

As perhaps too many developers know, you can add hints to your SQL statements to direct the optimizer to alter the plan it comes up with for the execution of your statement. Tom Kyte has called hints a “path of last resort,” because (1) there are usually ways to change the SQL statement or update statistics to fix the problem, (2) your hints might block the optimizer from coming up with a better plan, and (3) data changes over time and could cause a hint that works today to cause big problems tomorrow.

Unfortunately, when developers are working hard on deadline, they will take the shortest, quickest path to a solution that works right now. Over time, applications can end up littered with hints that cause problems that are hard to track down.

But tracking down old hints is not so hard in Oracle Database 12.2. First, compile your program units with statement gathering enabled. Then run the following query against ALL_STATEMENTS to find all statements with one or more hints:

SELECT owner,
       object_name,
       line,
       full_text
  FROM all_statements
 WHERE has_hint = 'YES'

Know Thy Code, Improve Thy Code

Being able to slice and dice code in a variety of ways to analyze the impact of changes or understand more fully how that code is written is critical to your success. Enhancements in PL/Scope in Oracle Database 12c Release 2 give you powerful new tools for achieving a deep and thorough understanding of your code. This improved understanding will result in the development and delivery of higher-quality applications.

Next Steps

LEARN more about Oracle Database 12c.

MEET the Oracle Developer Advocates team.

TEST your SQL and PL/SQL knowledge.

Share

Powerful Impact Analysis

Easily identify program units affected by database object changes with PL/Scope in Oracle Database 12c Release 2.