
Question and Answer

Connor McDonald

Thanks for the question, sabari.

Asked: September 19, 2017 - 7:45 pm UTC

Last updated: September 20, 2017 - 6:35 am UTC

Version: Oracle 11g

Viewed 1000+ times

You Asked

Hi Tom,

Thanks for taking the time to read my question. I have a query that joins about 36 tables. The driving table has about 43 million records, 5 other tables have about 9 million records each, and the rest of the tables are small. It is a combination of inner and left outer joins, and I am selecting 950 columns across these 36 tables. When I do count(*) on this query, I get the count of 43 million in about 10-12 minutes. If I replace the count(*) with the 950 columns, it runs for about 5 hours to return the first few hundred rows. All the code after the "from" is the same in both cases.

Can you please help me understand why it takes so much longer when I select the 950 columns? I thought count(*) would take more time, since it has to fetch all the records.

Also, do you think there is a better approach than joining 36 tables and selecting 950 columns?

Thanks again for all your help.


and Connor said...

Changing the selected columns is of course a change to the query, so the optimizer may come up with a different way to run it. In particular, a count(*) can be satisfied from a narrow index, whereas selecting table columns may force a full scan of the table, e.g.

SQL> create table t as select * from dba_objects;

Table created.

SQL>
SQL> create index ix on t ( object_id );

Index created.

SQL>
SQL> set autotrace traceonly explain
SQL> select created
  2  from   t
  3  where  object_id > 0 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 78256 |   993K|   424   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    | 78256 |   993K|   424   (1)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OBJECT_ID">0)

SQL>
SQL> select count(*)
  2  from   t
  3  where  object_id > 0 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 110194766

------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |     1 |     5 |    50   (2)| 00:00:01 |
|   1 |  SORT AGGREGATE       |      |     1 |     5 |            |          |
|*  2 |   INDEX FAST FULL SCAN| IX   | 78256 |   382K|    50   (2)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("OBJECT_ID">0)


Now I assume you are using count(*) to avoid fetching your 43 million rows back. You could instead aggregate the columns themselves, e.g.

SQL> select created,
  2         owner,
  3         object_name
  4  from   t
  5  where  object_id > 0 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 78256 |  4050K|   424   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    | 78256 |  4050K|   424   (1)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OBJECT_ID">0)

SQL>
SQL> select max(created),
  2         max(owner),
  3         max(object_name)
  4  from   t
  5  where  object_id > 0 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 2966233522

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    53 |   424   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |    53 |            |          |
|*  2 |   TABLE ACCESS FULL| T    | 78256 |  4050K|   424   (1)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("OBJECT_ID">0)


This is not a guarantee of getting the same plan, but it does the work of retrieving all of the columns your true query will require, and hence gives you a good idea of the effort that will be involved. Alternatively, you can capture the plan outline for the existing query and nest the hinted query inside a WITH clause, e.g.

SQL> explain plan for
  2  select created,
  3         owner,
  4         object_name
  5  from   t
  6  where  object_id > 0 ;

Explained.

SQL>
SQL> SELECT * from table(dbms_xplan.display(format=>'typical +outline'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 78256 |  4050K|   424   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    | 78256 |  4050K|   424   (1)| 00:00:01 |
--------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      FULL(@"SEL$1" "T"@"SEL$1")
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      DB_VERSION('12.2.0.1')
      OPTIMIZER_FEATURES_ENABLE('12.2.0.1')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */


SQL> set autotrace traceonly explain
SQL> with x as
  2  (
  3  select
  4    /*+
  5        BEGIN_OUTLINE_DATA
  6        FULL(@"SEL$1" "T"@"SEL$1")
  7        OUTLINE_LEAF(@"SEL$1")
  8        ALL_ROWS
  9        DB_VERSION('12.2.0.1')
 10        OPTIMIZER_FEATURES_ENABLE('12.2.0.1')
 11        IGNORE_OPTIM_EMBEDDED_HINTS
 12        END_OUTLINE_DATA
 13    */
 14         created,
 15         owner,
 16         object_name
 17  from   t
 18  where  object_id > 0
 19  )
 20  select count(*)
 21  from  x;

Execution Plan
----------------------------------------------------------
Plan hash value: 3715821533

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |       |   424   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE     |      |     1 |       |            |          |
|   2 |   VIEW              |      | 78256 |       |   424   (1)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| T    | 78256 |   382K|   424   (1)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("OBJECT_ID">0)



But there are some obvious questions here...

1) A 36 table join sounds ... well ... pretty complex
2) 950 columns sounds ... well ... a lot
3) 43 million rows

Unless you're loading another table... I can't think of any requirement that would want a 43 million row x 950 column result.
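If the end goal really is to populate another table, one option is to let the database do the load directly rather than fetching 43 million wide rows back to a client program. A minimal sketch, reusing the demo table T from above (T2 is a hypothetical target, not something from the question):

```sql
-- Empty copy of T's structure to act as the load target.
create table t2 as select * from t where 1=0;

-- Direct-path (append) insert: the result set never leaves the database,
-- so there is no cost of shipping every row to a client.
insert /*+ append */ into t2
select * from t where object_id > 0;

commit;  -- direct-path loaded rows are visible only after commit
```

In your case the SELECT would be the full 36-table join; the point is that an INSERT ... SELECT keeps all 950 columns server-side.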


