Skip to Main Content
  • Questions
  • Definition of SQL statement execution

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Dan.

Asked: August 05, 2009 - 1:35 pm UTC

Last updated: August 05, 2009 - 4:25 pm UTC

Version: 10.2.0

Viewed 10K+ times! This question is

You Asked

When we run a trace and TKPROF on a query (a select statement), we see timing information for three phases:

1. Parse
2. Execute
3. Fetch

Can you clarify exactly what it means to "execute" a statement?

The manual entry for TKPROF:
http://download.oracle.com/docs/cd/B19306_01/server.102/b14211/sqltrace.htm#i4642

describes execution as "identifying the selected rows"

But this is somewhat vague:

1. Are the full query results materialized somewhere (in memory or on disk) during this phase, ready to be fetched? If this is not the case and we generate only pointers to rows in cache or disk, how do we "identify selected rows" when there are joins involved? i.e. what structures are created in memory or disk during execution?

2. I know that when we submit a query we are creating/opening a cursor - is it that a cursor just points to where the data is located whenever possible, and only when the nature of the query necessitates it (e.g. certain grouping, aggregation, joining) it is materialized to TEMP, and the cursor then points into TEMP somewhere?

This leads on to the next clarification:

3. Why is it that for some long-running queries with fairly complex execution plans (involving sorting, hashing etc.) we see an "execute" elapsed time of zero and all the time under the "parse" and "fetch" phase? i.e. exactly what operations are performed during "execution"? One would imagine that a significant amount of the work involved is in the sorting, hashing etc. which I assumed was part of the execution phase?

and Tom said...

think of it this way

1) parse - pretty well defined, that is prepareStatement - we do a soft or hard parse, compile the statement, figure out how to execute it.

2) execute - we OPEN the statement. For an update, for a delete, for an insert - that would be it, when you OPEN the statement, we execute it. All of the work happens here.

for select it is more complex. Most selects will do ZERO work during the execute. All we are doing is opening the cursor - the cursor is a pointer to the space in the shared pool where the plan is, your bind variable values, the SCN that represents the "as of" time for your query - in short the cursor at this point is your context, your virtual machine state, think of the SQL plan as if it were bytecode (it is) executed as a program (it is) in a virtual machine (it is). The cursor is your instruction pointer (where are you in the execution of this statement), your state (like registers), etc. Normally, a select does nothing here - it just "gets ready to rock and roll, the program is ready to go, but not yet really started".

However, there are exceptions to everything - turn on trace and do a select * from scott.emp FOR UPDATE. That is a select, but it is also an update. You would see work done during the execute as well as the fetch phase. The work done during the execute was that of going out and touching every row and locking it. The work done during the fetch phase was that of going out and retrieving the data back to the client.

3) fetch - this is where we see almost all of the work for SELECTS (and nothing really for the other DMLS as you do not fetch from an update).

There are two ways a SELECT might be processed. What I call a "quick return query" and a "slow return query"

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:275215756923#39255764276301

is an excerpt from Effective Oracle by Design describing this in depth, but suffice to say a query of the form:

select * from one_billion_row_table;

would not copy the data anywhere, would not need to access the last row before returning the first row. We would just read the data as you fetch it from the blocks it resides on.

However, a query of the form:

select * from one_billion_row_table order by unindexed_column;

that we would probably have to read the last row before returning the first row (since the last row read could well be the first row returned!) and we'd need to copy that somewhere (temp, sort area space) first.


In the case of the first query, if you:

parsed it (little work parsing)
opened it (no real world, just getting ready)
fetched 1 row and closed it

you would see VERY little work performed in the fetch phase, we'd just have to read one block probably to return the first record.

However, do the same steps against the second query and you would see the fetch of a single row do a TON of work - since we have to find the last row before the first can be returned.


see the above link, I demonstrate that.





Rating

  (4 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Clears it all up

Dan, August 05, 2009 - 3:28 pm UTC

Thanks Tom,

This has made the process abundantly clear - a cursor is in effect a pointer into an instruction stack/"program" compiled from an execution plan in the internal VM, and depending on the nature of the query a fetch operation may directly grab a row/rows, or perform more complex logic involving many rows, aggregations, temporary result sets etc, before returning the specified number of rows.

Would you agree that the entry in the documentation:

EXECUTE
 
... For SELECT statements, this identifies the selected rows.


is somewhat misleading? As really the selected rows are not "identified" here, but are a product of the bytecode generated during the parse and "executed" during the fetch phase.

Oracle isn't even aware of how many rows there are at this point, and hasn't "identified" any rows in any meaningful way
Tom Kyte
August 05, 2009 - 4:24 pm UTC

not really, it does identify the rows. That is what the cursor state is all about.

They are identified - just as a recruiter identifies their set of candidates (needs 5 years of experience, proficient in language X, understands technology Y - those constraints identifies their pool of candidates). The set has been identified, categorized, pre-ordained if you will.

It just hasn't been materialized - we know what the answer will be, we just haven't put it together yet. The execute fixes the set of binds that will be used, the SCN that will be used - it identifies the set of values that will be used.


Although...

Dan, August 05, 2009 - 3:31 pm UTC

Although I suppose in a SELECT FOR UPDATE, relevant rows are identified and locked..
Tom Kyte
August 05, 2009 - 4:25 pm UTC

they are more than identified... they are actually found, materialized


http://www.thefreedictionary.com/identified

1. To establish the identity of.
2. To ascertain the origin, nature, or definitive characteristics of.

3. Biology To determine the taxonomic classification of (an organism).
4. To consider as identical or united; equate.
5. To associate or affiliate (oneself) closely with a person or group.
v.intr.
To establish an identification with another or others.


OK

Dan, August 06, 2009 - 6:49 am UTC

OK, I'll accept that it's a matter of semantics - for me, it has the connotation of finding specific rows i.e identifying -> establishing the IDs (especially in a computing field where we often talk of IDs, identifiers etc) rather than defining the characteristics of rows that would be selected.

I would read a recruiter "identifying" candidates as doing work to establish the identity of specific candidates (creating a list), rather than defining the characteristics of the set of potential candidates.

An advertiser "identifying a target market", on the other hand... ...But here the phrase contains additional information which makes the distinction clear.

In court, a witness identifies the criminal in the crowd by pointing out a specific person ("it was him"), rather than describing the person ("it was the man who was at the 32 Fleet Street at 11pm on Saturday") even if the constraints technically give enough information to derive the individual in question.

So I don't want to get into an argument about semantics, and accept that the documentation is technically correct by some accepted definitions of the word, just that it is open to some interpretation and it isn't always immediately clear what is being said.

At last!

Pier, August 06, 2009 - 8:07 am UTC

These were AND should have been AND are the things to be explained in detail FIRST and FOREMOST