Skip to Main Content

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, USer.

Asked: April 11, 2017 - 1:57 pm UTC

Last updated: April 13, 2017 - 1:59 am UTC

Version: 12.1.0.2

Viewed 1000+ times

You Asked

Hi,

Please have a look at the two SQL statements below, in one I am using RANK and in another I am using ROW_NUMBER.
Notice that the cardinality estimate is correct in case we use RANK and in case of ROW_NUMBER it changes to 1.
Am I doing something wrong?

Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production 0
PL/SQL Release 12.1.0.2.0 - Production 0
CORE 12.1.0.2.0 Production 0
TNS for Linux: Version 12.1.0.2.0 - Production 0
NLSRTL Version 12.1.0.2.0 - Production 0

SELECT a.renamed_1_id
     , a.renamed_2
     , a.name
     , a.statUS
  FROM (SELECT renamed_1_ID
             , renamed_2
             , NAME
             , STATUS
             , RANK ()
               OVER (
                  PARTITION BY NAME
                  ORDER BY
                     DECODE (STATUS
                           , 'ACTIVE', 1
                           , 'SUSPENDED', 2    
                           , 3)
                   , last_update_time DESC)
                  RN
          FROM nme_renamed_1) a
 WHERE a.RN = 1;
          
Plan
SELECT STATEMENT  ALL_ROWSCost: 4  Bytes: 7,200  Cardinality: 72     
 3 VIEW <<SchemaName>>. Cost: 4  Bytes: 7,200  Cardinality: 72    
  2 WINDOW SORT PUSHED RANK  Cost: 4  Bytes: 3,960  Cardinality: 72   
   1 TABLE ACCESS FULL TABLE <<SchemaName>>.nme_renamed_1 Cost: 3  Bytes: 3,960  Cardinality: 72  

SELECT a.renamed_1_ID
     , a.renamed_2
     , a.NAME
     , a.STATUS
  FROM (SELECT renamed_1_ID
             , renamed_2
             , NAME
             , STATUS
             , ROW_NUMBER ()
               OVER (
                  PARTITION BY NAME
                  ORDER BY
                     DECODE (STATUS
                           , 'ACTIVE', 1
                           , 'SUSPENDED', 2    
                           , 3)
                   , last_update_time DESC)
                  RN
          FROM nme_renamed_1) a
 WHERE a.RN = 1;


Plan
SELECT STATEMENT  ALL_ROWSCost: 4  Bytes: 100  Cardinality: 1     
 3 VIEW  <<SchemaName>>. Cost: 4  Bytes: 100  Cardinality: 1    
  2 WINDOW SORT PUSHED RANK  Cost: 4  Bytes: 3,960  Cardinality: 72   
   1 TABLE ACCESS FULL TABLE  <<SchemaName>>.nme_renamed_1 Cost: 3  Bytes: 3,960  Cardinality: 72  



Regards!!

and Chris said...

Row_number() assigns a new value for each row. Thus you're guaranteed to only have one row with the value 1.

Rank() assigns ties the same value. So you could have any number of rows with rank = 1:

create table t as
  select 1 x from dual connect by level <= 5;
  
exec dbms_stats.gather_table_stats(user, 't');

select row_number() over (order by x) rn ,
       rank() over (order by x) rk
from   t;

RN  RK  
1   1   
2   1   
3   1   
4   1   
5   1


Your query with row_number() can only return one row. So the cardinality estimate for this (1) is correct.

When using rank() you could have anywhere between 1-72 rows returned (assuming the estimate at step 2 is correct). The optimizer is going with the worst case scenario here:

set serveroutput off

with rns as (
  select row_number() over (order by x) rn ,
         rank() over (order by x) rk
  from t
)
  select * from rns
  where  rk = 1;

select * from table(dbms_xplan.display_cursor(null, null, 'BASIC +ROWS'));

PLAN_TABLE_OUTPUT                                                        
EXPLAINED SQL STATEMENT:                                                 
------------------------                                                 
with rns as (   select row_number() over (order by x) rn ,               
rank() over (order by x) rk   from t )   select * from rns   where  rk   
= 1                                                                      
                                                                         
Plan hash value: 2764225767                                              
                                                                         
-------------------------------------------------                        
| Id  | Operation                | Name | Rows  |                        
-------------------------------------------------                        
|   0 | SELECT STATEMENT         |      |       |                        
|   1 |  VIEW                    |      |     5 |                        
|   2 |   WINDOW SORT PUSHED RANK|      |     5 |                        
|   3 |    TABLE ACCESS FULL     | T    |     5 |                        
-------------------------------------------------

with rns as (
  select row_number() over (order by x) rn ,
         rank() over (order by x) rk
  from t
)
  select * from rns
  where  rn = 1;
  
select * from table(dbms_xplan.display_cursor(null, null, 'BASIC +ROWS'));

PLAN_TABLE_OUTPUT                                                        
EXPLAINED SQL STATEMENT:                                                 
------------------------                                                 
with rns as (   select row_number() over (order by x) rn ,               
rank() over (order by x) rk   from t )   select * from rns   where  rn   
= 1                                                                      
                                                                         
Plan hash value: 2764225767                                              
                                                                         
-------------------------------------------------                        
| Id  | Operation                | Name | Rows  |                        
-------------------------------------------------                        
|   0 | SELECT STATEMENT         |      |       |                        
|   1 |  VIEW                    |      |     1 |                        
|   2 |   WINDOW SORT PUSHED RANK|      |     5 |                        
|   3 |    TABLE ACCESS FULL     | T    |     5 |                        
-------------------------------------------------  

Rating

  (2 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

A slight different scenario

Rajeshwaran, April 11, 2017 - 4:47 pm UTC

A slight different scenario.

the object_id is unique across the entire table, and have 19 distict values in the OWNER column.

so given this query, during optimization using the available column statistics, why does't the optimizer come up with the estimated cardinality as 19 (for WINDOW SORT PUSHED RANK| step) rather than 60757 ?

(since within each partition the object_id is unique, because the object_id is unique across the entire table itself)
select *
from (
select t.* ,
  row_number() over( partition by owner 
    order by object_id ) rn 
from t
     )
where rn =1


demo@ORA12C> create table t as select * from all_objects;

Table created.

demo@ORA12C> create unique index t_idx on t(object_id);

Index created.

demo@ORA12C>
demo@ORA12C> column column_name format a10
demo@ORA12C> select column_name, num_distinct
  2  from user_tab_col_statistics
  3  where table_name ='T'
  4  and column_name ='OWNER';

COLUMN_NAM NUM_DISTINCT
---------- ------------
OWNER                19

demo@ORA12C>
demo@ORA12C> select * from table(dbms_xplan.display_cursor(format=>'IOstats last'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------
SQL_ID  af51v7qq567r1, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ * from ( select t.* ,
row_number() over( partition by owner   order by object_id ) rn from t
 ) where rn = 1

Plan hash value: 3047187157

----------------------------------------------------------------------------------------------------
| Id  | Operation                | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |      1 |        |     19 |00:00:00.21 |    1227 |   1207 |
|*  1 |  VIEW                    |      |      1 |  60757 |     19 |00:00:00.21 |    1227 |   1207 |
|*  2 |   WINDOW SORT PUSHED RANK|      |      1 |  60757 |     19 |00:00:00.21 |    1227 |   1207 |
|   3 |    TABLE ACCESS FULL     | T    |      1 |  60757 |  60757 |00:00:00.09 |    1227 |   1207 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("RN"=1)
   2 - filter(ROW_NUMBER() OVER ( PARTITION BY "OWNER" ORDER BY "OBJECT_ID")<=1)


23 rows selected.

demo@ORA12C>
demo@ORA12C> select * from table(dbms_xplan.display_cursor(format=>'IOstats last'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------
SQL_ID  230y1bp6bdydm, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ * from ( select t.* ,  rank()
over( partition by owner   order by object_id ) rn from t   ) where rn
= 1

Plan hash value: 3047187157

----------------------------------------------------------------------------------------------------
| Id  | Operation                | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |      1 |        |     19 |00:00:00.15 |    1227 |     54 |
|*  1 |  VIEW                    |      |      1 |  60757 |     19 |00:00:00.15 |    1227 |     54 |
|*  2 |   WINDOW SORT PUSHED RANK|      |      1 |  60757 |     19 |00:00:00.15 |    1227 |     54 |
|   3 |    TABLE ACCESS FULL     | T    |      1 |  60757 |  60757 |00:00:00.01 |    1227 |     54 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("RN"=1)
   2 - filter(RANK() OVER ( PARTITION BY "OWNER" ORDER BY "OBJECT_ID")<=1)


23 rows selected.

demo@ORA12C>

Connor McDonald
April 13, 2017 - 1:26 am UTC

I think might be asking a bit much of the optimizer :-)

Partition By clause

USer, April 12, 2017 - 1:17 pm UTC

Thanks Chris!!
So it would remain the same even if I add a partition by clause, have a look at the cardinality estimates when query returns 2 records

drop table t;

Table dropped.

create table t as
  select 1 x, CASE WHEN  level <=5 then 10 else 20 end y
  from dual connect by level <= 10;

Table created.
  
exec dbms_stats.gather_table_stats(user, 't');

PL/SQL procedure successfully completed.

select row_number() over (partition by y order by x) rn ,
       rank() over (partition by y  order by x) rk
from   t;

        RN         RK
---------- ----------
         1          1
         2          1
         3          1
         4          1
         5          1
         1          1
         2          1
         3          1
         4          1
         5          1

10 rows selected.

with rns as (
  select row_number() over (partition by y order by x) rn ,
         rank() over (partition by y order by x) rk
  from t
)
  select * from rns
  where  rk = 1;

        RN         RK
---------- ----------
         1          1
         2          1
         3          1
         4          1
         5          1
         1          1
         2          1
         3          1
         4          1
         5          1

10 rows selected.

select * from table(dbms_xplan.display_cursor(null, null, 'BASIC +ROWS'));

PLAN_TABLE_OUTPUT                                                               
--------------------------------------------------------------------------------
EXPLAINED SQL STATEMENT:                                                        
------------------------                                                        
with rns as (   select row_number() over (partition by y order by x) rn         
,          rank() over (partition by y order by x) rk   from t )                
select * from rns   where  rk = 1                                               
                                                                                
Plan hash value: 3047187157                                                     
                                                                                
-------------------------------------------------                               
| Id  | Operation                | Name | Rows  |                               
-------------------------------------------------                               
|   0 | SELECT STATEMENT         |      |       |                               
|   1 |  VIEW                    |      |    10 |                               
|   2 |   WINDOW SORT PUSHED RANK|      |    10 |                               
|   3 |    TABLE ACCESS FULL     | T    |    10 |                               
-------------------------------------------------                               
                                                                                

17 rows selected.

  
with rns as (
  select row_number() over (partition by y order by x) rn ,
         rank() over (partition by y order by x) rk
  from t
)
  select * from rns
  where  rn = 1;  
  
        RN         RK
---------- ----------
         1          1
         1          1

2 rows selected.
  
select * from table(dbms_xplan.display_cursor(null, null, 'BASIC +ROWS'));

PLAN_TABLE_OUTPUT                                                               
--------------------------------------------------------------------------------
EXPLAINED SQL STATEMENT:                                                        
------------------------                                                        
with rns as (   select row_number() over (partition by y order by x) rn         
,          rank() over (partition by y order by x) rk   from t )                
select * from rns   where  rn = 1                                               
                                                                                
Plan hash value: 3047187157                                                     
                                                                                
-------------------------------------------------                               
| Id  | Operation                | Name | Rows  |                               
-------------------------------------------------                               
|   0 | SELECT STATEMENT         |      |       |                               
<b>|   1 |  VIEW                    |      |     1 |  </b>                             
|   2 |   WINDOW SORT PUSHED RANK|      |    10 |                               
|   3 |    TABLE ACCESS FULL     | T    |    10 |                               
-------------------------------------------------                               
                                                                                

17 rows selected.


Ah so adding the partition by clause does not changes the estimates. Even when 2 rows are returned the carinality is estimated as 1.

<u>The same happens even if we have more rows.</u>

SQL> 
SQL> drop table t;

Table dropped.

SQL> 
SQL> create table t as
  2    select 1 x, MOD(level,5)+1 y
  3    from dual connect by level <= 50;

Table created.

SQL>   
SQL> exec dbms_stats.gather_table_stats(user, 't');

PL/SQL procedure successfully completed.

SQL> 
SQL> select row_number() over (partition by y order by x) rn ,
  2         rank() over (partition by y  order by x) rk
  3  from   t;

        RN         RK
---------- ----------
         1          1
         2          1
         3          1
         4          1
         5          1
         6          1
         7          1
         8          1
         9          1
        10          1
         1          1

        RN         RK
---------- ----------
         2          1
         3          1
         4          1
         5          1
         6          1
         7          1
         8          1
         9          1
        10          1
         1          1
         2          1

        RN         RK
---------- ----------
         3          1
         4          1
         5          1
         6          1
         7          1
         8          1
         9          1
        10          1
         1          1
         2          1
         3          1

        RN         RK
---------- ----------
         4          1
         5          1
         6          1
         7          1
         8          1
         9          1
        10          1
         1          1
         2          1
         3          1
         4          1

        RN         RK
---------- ----------
         5          1
         6          1
         7          1
         8          1
         9          1
        10          1

50 rows selected.

SQL> 
SQL> with rns as (
  2    select row_number() over (partition by y order by x) rn ,
  3           rank() over (partition by y order by x) rk
  4    from t
  5  )
  6    select * from rns
  7    where  rk = 1;

        RN         RK
---------- ----------
         1          1
         2          1
         3          1
         4          1
         5          1
         6          1
         7          1
         8          1
         9          1
        10          1
         1          1

        RN         RK
---------- ----------
         2          1
         3          1
         4          1
         5          1
         6          1
         7          1
         8          1
         9          1
        10          1
         1          1
         2          1

        RN         RK
---------- ----------
         3          1
         4          1
         5          1
         6          1
         7          1
         8          1
         9          1
        10          1
         1          1
         2          1
         3          1

        RN         RK
---------- ----------
         4          1
         5          1
         6          1
         7          1
         8          1
         9          1
        10          1
         1          1
         2          1
         3          1
         4          1

        RN         RK
---------- ----------
         5          1
         6          1
         7          1
         8          1
         9          1
        10          1

50 rows selected.

SQL> 
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'BASIC +ROWS'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
EXPLAINED SQL STATEMENT:
------------------------
with rns as (   select row_number() over (partition by y order by x) rn
,          rank() over (partition by y order by x) rk   from t )
select * from rns   where  rk = 1

Plan hash value: 3047187157

-------------------------------------------------
| Id  | Operation                | Name | Rows  |
-------------------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |       |
|   1 |  VIEW                    |      |    50 |
|   2 |   WINDOW SORT PUSHED RANK|      |    50 |
|   3 |    TABLE ACCESS FULL     | T    |    50 |
-------------------------------------------------


17 rows selected.

SQL>   
SQL> with rns as (
  2    select row_number() over (partition by y order by x) rn ,
  3           rank() over (partition by y order by x) rk
  4    from t
  5  )
  6    select * from rns
  7    where  rn = 1;  

        RN         RK
---------- ----------
         1          1
         1          1
         1          1
         1          1
         1          1

SQL>   
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'BASIC +ROWS'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
EXPLAINED SQL STATEMENT:
------------------------
with rns as (   select row_number() over (partition by y order by x) rn
,          rank() over (partition by y order by x) rk   from t )
select * from rns   where  rn = 1

Plan hash value: 3047187157

-------------------------------------------------
| Id  | Operation                | Name | Rows  |
-------------------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |       |
<b>|   1 |  VIEW                    |      |     1 |</b>
|   2 |   WINDOW SORT PUSHED RANK|      |    50 |
|   3 |    TABLE ACCESS FULL     | T    |    50 |
-------------------------------------------------

Is this correct?




Connor McDonald
April 13, 2017 - 1:59 am UTC

That would appear to be a limitation of the optimizer.

More to Explore

Analytics

Analytic SQL got you confused? Check out Connor McDonald's complete video course.