You Asked

I have two queries -

1. I see sorts(Disk) in the autotrace output for a query.What is actually sort(Disk) and how it works. Is the rowsets are brought in memory in chunks, sorted and written back to temp tablespace. After which the chunks are merged again on the sort key. I am bit confused.

2. Where is the intermediate results of the query is stored in PGA? I think not is Hash or Sort area. I could not find this on google.

Below is the query/plan on self join with a large table(more than million). Where is the result of operation 5 and 6 are stored before the hash join.

select /*+ full(t1) full(t2) full(t0) */ count(*) from TRANSACTIONTABLE t0,TRANSACTIONTABLE t1,TRANSACTIONTABLE t2 where t0.Col1 = t2.Col2 and t1.Col2 = t2.Col1

-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 24 | | 120K (1)| 00:00:05 |
| 1 | SORT AGGREGATE | | 1 | 24 | | | |
|* 2 | HASH JOIN | | 8455K| 193M| 134M| 120K (1)| 00:00:05 |
| 3 | TABLE ACCESS FULL | TRANSACTIONTABLE | 7823K| 44M| | 37309 (1)| 00:00:02 |
|* 4 | HASH JOIN | | 8133K| 139M| 134M| 78354 (1)| 00:00:04 |
| 5 | TABLE ACCESS FULL| TRANSACTIONTABLE | 7823K| 44M| | 37309 (1)| 00:00:02 |
| 6 | TABLE ACCESS FULL| TRANSACTIONTABLE | 7823K| 89M| | 37309 (1)| 00:00:02 |
-----------------------------------------------------------------------------------------------

and Connor said...

We *try* to sort stuff in memory (because obviously thats the most efficient means of doing so). This is called an "optimal" sort (ie, in memory)

If we can't, then we will *still* use memory, but because we can't sort the entire set, we sort a chunk of it, and then dump that to disk. Then we sort the next chunk, and so forth.

Now we have 'n' sorted chunks on disk - we then read them and merge the results to provide a single sorted result. This is called a 1-pass sort, ie, we had to pass through the result on disk once.

Sometimes the data to be sorted is so large, than when we come to merge the sorted chunks, we have to dump a merged chunk back down to disk. We'll then come back later and so *another* merge operation. This is called a multi-pass sort, and is obviously the slowest.

The "sorts (disk)" reflects the latter two, but you can see more granular detail in some of the performance views, eg

SQL> desc V$SQL_WORKAREA
 Name                                                                    Null?    Type
 ----------------------------------------------------------------------- -------- -----------------
 ADDRESS                                                                          RAW(8)
 HASH_VALUE                                                                       NUMBER
 SQL_ID                                                                           VARCHAR2(13)
 CHILD_NUMBER                                                                     NUMBER
 WORKAREA_ADDRESS                                                                 RAW(8)
 OPERATION_TYPE                                                                   VARCHAR2(40)
 OPERATION_ID                                                                     NUMBER
 POLICY                                                                           VARCHAR2(10)
 ESTIMATED_OPTIMAL_SIZE                                                           NUMBER
 ESTIMATED_ONEPASS_SIZE                                                           NUMBER
 LAST_MEMORY_USED                                                                 NUMBER
 LAST_EXECUTION                                                                   VARCHAR2(10)
 LAST_DEGREE                                                                      NUMBER
 TOTAL_EXECUTIONS                                                                 NUMBER
 OPTIMAL_EXECUTIONS                                                               NUMBER
 ONEPASS_EXECUTIONS                                                               NUMBER
 MULTIPASSES_EXECUTIONS                                                           NUMBER
 ACTIVE_TIME                                                                      NUMBER
 MAX_TEMPSEG_SIZE                                                                 NUMBER
 LAST_TEMPSEG_SIZE                                                                NUMBER
 CON_ID                                                                           NUMBER

Hash areas are not dissimilar - we put results of the hash in memory if we can, otherwise we'll dump chunks out to disk as we go.

Rating

(2 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Thanks for the first part of the queston

Sujit, February 09, 2017 - 5:09 am UTC

Hi Connor,
Thanks for the clear explanation for first part of the question. Please let me know answer to the second part of the question.

February 10, 2017 - 5:10 pm UTC

If it can fit in memory, the PGA.

If it cannot, it will be in temp.

Thanks Connor..Please let me know about the part 2 of the question

Sujit Pandey, February 09, 2017 - 7:01 am UTC

Thanks a lot Connor for the clear explanation..
Please let me know answer to part 2 of the question.

More to Explore

Hints

Need more information on hints? Check out the complete hint descriptions

Breadcrumb

Question and Answer

You Asked

and Connor said...

Rating

Comments

Thanks for the first part of the queston

Thanks Connor..Please let me know about the part 2 of the question

More to Explore

Hints

How Sorts (Disk) in query works

Breadcrumb

Question and Answer

You Asked

and Connor said...

Rating

Comments

Thanks for the first part of the queston

Thanks Connor..Please let me know about the part 2 of the question

More to Explore

Hints