
Question and Answer

Connor McDonald

Thanks for the question, sabari.

Asked: September 19, 2017 - 7:45 pm UTC

Last updated: September 20, 2017 - 6:35 am UTC

Version: Oracle 11g

Viewed 1000+ times

You Asked

Hi Tom,

Thanks for taking the time to read my question. I have a query that joins about 36 tables. The driving table has about 43 million records, 5 other tables have about 9 million records each, and the rest of the tables are small. It is a combination of inner and left outer joins, and I am selecting 950 columns across these 36 tables. When I do count(*) on this query, I get the count of 43 million in about 10-12 minutes. If I replace the count(*) with the 950 columns, it runs for about 5 hours to return the first few hundred rows. All the code after the "from" is the same in both cases.

Can you please help me understand why it takes so much longer when I select the 950 columns? I thought count(*) would take more time, since it has to fetch all the records.

Also, do you think there is a better approach than joining 36 tables and selecting 950 columns?

Thanks again for all your help.


and Connor said...

Changing the selected columns is of course a change to the query, so the optimizer may come up with a different way to run it. In particular, a count(*) can be satisfied from a narrow index, whereas selecting table columns may force a full scan of the table, e.g.

SQL> create table t as select * from dba_objects;

Table created.

SQL>
SQL> create index ix on t ( object_id );

Index created.

SQL>
SQL> set autotrace traceonly explain
SQL> select created
  2  from   t
  3  where  object_id > 0 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 78256 |   993K|   424   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    | 78256 |   993K|   424   (1)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OBJECT_ID">0)

SQL>
SQL> select count(*)
  2  from   t
  3  where  object_id > 0 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 110194766

------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |     1 |     5 |    50   (2)| 00:00:01 |
|   1 |  SORT AGGREGATE       |      |     1 |     5 |            |          |
|*  2 |   INDEX FAST FULL SCAN| IX   | 78256 |   382K|    50   (2)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("OBJECT_ID">0)


Now I assume you are using count(*) to avoid fetching your 43 million rows back. You could instead aggregate the columns themselves, e.g.

SQL> select created,
  2         owner,
  3         object_name
  4  from   t
  5  where  object_id > 0 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 78256 |  4050K|   424   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    | 78256 |  4050K|   424   (1)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OBJECT_ID">0)

SQL>
SQL> select max(created),
  2         max(owner),
  3         max(object_name)
  4  from   t
  5  where  object_id > 0 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 2966233522

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    53 |   424   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |    53 |            |          |
|*  2 |   TABLE ACCESS FULL| T    | 78256 |  4050K|   424   (1)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("OBJECT_ID">0)


This is not a guarantee of getting the same plan, but it does the work of retrieving all of the columns your true query will require, and hence gives you a good idea of the effort that will be involved. Alternatively, you can capture the plan outline for the existing query and nest the hinted query inside a WITH clause, e.g.

SQL> explain plan for
  2  select created,
  3         owner,
  4         object_name
  5  from   t
  6  where  object_id > 0 ;

Explained.

SQL>
SQL> SELECT * from table(dbms_xplan.display(format=>'typical +outline'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 78256 |  4050K|   424   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    | 78256 |  4050K|   424   (1)| 00:00:01 |
--------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      FULL(@"SEL$1" "T"@"SEL$1")
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      DB_VERSION('12.2.0.1')
      OPTIMIZER_FEATURES_ENABLE('12.2.0.1')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */


SQL> set autotrace traceonly explain
SQL> with x as
  2  (
  3  select
  4    /*+
  5        BEGIN_OUTLINE_DATA
  6        FULL(@"SEL$1" "T"@"SEL$1")
  7        OUTLINE_LEAF(@"SEL$1")
  8        ALL_ROWS
  9        DB_VERSION('12.2.0.1')
 10        OPTIMIZER_FEATURES_ENABLE('12.2.0.1')
 11        IGNORE_OPTIM_EMBEDDED_HINTS
 12        END_OUTLINE_DATA
 13    */
 14         created,
 15         owner,
 16         object_name
 17  from   t
 18  where  object_id > 0
 19  )
 20  select count(*)
 21  from  x;

Execution Plan
----------------------------------------------------------
Plan hash value: 3715821533

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |       |   424   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE     |      |     1 |       |            |          |
|   2 |   VIEW              |      | 78256 |       |   424   (1)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| T    | 78256 |   382K|   424   (1)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("OBJECT_ID">0)



But there are some obvious questions here...

1) A 36 table join sounds ... well ... pretty complex
2) 950 columns sounds ... well ... a lot
3) 43 million rows

Unless you're loading another table... I can't think of any requirement that would want a 43 million row x 950 column result.
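If the end goal really is to populate another table, one option is to let the database do the load directly rather than fetching 43 million wide rows back to a client program. A minimal sketch, reusing the demo table T from above (T2 is a hypothetical target, not something from the question):

```sql
-- Empty copy of T's structure to act as the load target.
create table t2 as select * from t where 1=0;

-- Direct-path (append) insert: the result set never leaves the database,
-- so there is no cost of shipping every row to a client.
insert /*+ append */ into t2
select * from t where object_id > 0;

commit;  -- direct-path loaded rows are visible only after commit
```

In your case the SELECT would be the full 36-table join; the point is that an INSERT ... SELECT keeps all 950 columns server-side.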


