When we run a trace and TKPROF on a query (a select statement), we see timing information for three phases:
1. Parse
2. Execute
3. Fetch
Can you clarify exactly what it means to "execute" a statement?
The manual entry for TKPROF:
http://download.oracle.com/docs/cd/B19306_01/server.102/b14211/sqltrace.htm#i4642 describes execution as "identifying the selected rows"
But this is somewhat vague:
1. Are the full query results materialized somewhere (in memory or on disk) during this phase, ready to be fetched? If this is not the case and we generate only pointers to rows in cache or disk, how do we "identify selected rows" when there are joins involved? i.e. what structures are created in memory or disk during execution?
2. I know that when we submit a query we are creating/opening a cursor - is it that a cursor just points to where the data is located whenever possible, and only when the nature of the query necessitates it (e.g. certain grouping, aggregation, joining) it is materialized to TEMP, and the cursor then points into TEMP somewhere?
This leads on to the next clarification:
3. Why is it that for some long-running queries with fairly complex execution plans (involving sorting, hashing etc.) we see an "execute" elapsed time of zero and all the time under the "parse" and "fetch" phase? i.e. exactly what operations are performed during "execution"? One would imagine that a significant amount of the work involved is in the sorting, hashing etc. which I assumed was part of the execution phase?
think of it this way
1) parse - pretty well defined, that is prepareStatement - we do a soft or hard parse, compile the statement, figure out how to execute it.
2) execute - we OPEN the statement. For an update, for a delete, for an insert - that would be it, when you OPEN the statement, we execute it. All of the work happens here.
for select it is more complex. Most selects will do ZERO work during the execute. All we are doing is opening the cursor - the cursor is a pointer to the space in the shared pool where the plan is, your bind variable values, the SCN that represents the "as of" time for your query - in short
the cursor at this point is your context, your virtual machine state, think of the SQL plan as if it were bytecode (it is) executed as a program (it is) in a virtual machine (it is). The cursor is your instruction pointer (where are you in the execution of this statement), your state (like registers), etc. Normally, a select does nothing here - it just "gets ready to rock and roll, the program is ready to go, but not yet really started".
However, there are exceptions to everything - turn on trace and do a select * from scott.emp FOR UPDATE. That is a select, but it is also an update. You would see work done during the execute as well as the fetch phase. The work done during the execute was that of going out and touching every row and locking it. The work done during the fetch phase was that of going out and retrieving the data back to the client.
3) fetch - this is where we see almost all of the work for SELECTS (and nothing really for the other DMLS as you do not fetch from an update).
There are two ways a SELECT might be processed. What I call a "quick return query" and a "slow return query"
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:275215756923#39255764276301 is an excerpt from Effective Oracle by Design describing this in depth, but suffice to say a query of the form:
select * from one_billion_row_table;
would not copy the data anywhere, would not need to access the last row before returning the first row. We would just read the data as you fetch it from the blocks it resides on.
However, a query of the form:
select * from one_billion_row_table order by unindexed_column;
that we would probably have to read the last row before returning the first row (since the last row read could well be the first row returned!) and we'd need to copy that somewhere (temp, sort area space) first.
In the case of the first query, if you:
parsed it (little work parsing)
opened it (no real world, just getting ready)
fetched 1 row and closed it
you would see VERY little work performed in the fetch phase, we'd just have to read one block probably to return the first record.
However, do the same steps against the second query and you would see the fetch of a single row do a TON of work - since we have to find the last row before the first can be returned.
see the above link, I demonstrate that.