Question and Answer

Chris Saxon

Thanks for the question, Eyal.

Asked: January 05, 2016 - 12:31 pm UTC

Last updated: June 06, 2019 - 2:17 pm UTC

Version: Oracle Database 11g Release 11.2.0.3.0

Viewed 10K+ times

You Asked

I have the following 2 tables:

CREATE TABLE accounts( 
  id NUMBER unique not null, 
  account_name VARCHAR2(30)
  );

CREATE TABLE log_data( 
  account_id NUMBER not null, 
  log_type NUMBER, 
  log_time TIMESTAMP,
  msg CLOB
  );


and the following data (5 accounts and millions of log records; scaled down to 100,000 rows in this test case) and the following index:
insert into accounts
SELECT ROWNUM, dbms_random.string('L', 20)
FROM   ( SELECT 1 just_a_column
         FROM   dual
         CONNECT BY LEVEL <= 5
       );
insert /*+append */ into log_data 
SELECT mod(rownum,5)+1, 2, SYSDATE-(ROWNUM), dbms_random.string('L', 20)
FROM   ( SELECT 1 just_a_column
         FROM   dual
         CONNECT BY LEVEL <= 100000
       );
commit;

create index log_date on log_data (account_id, log_time);


My requirement is to list the last 5 log messages for each account. Is there a way to do that which will scan only 5 rows per account?
If I do something like the following:
SELECT /*+ gather_plan_statistics */
  *
FROM
  (
    SELECT 
      row_number() over (partition BY account_id order by log_time) rownumber,
      account_name,
      account_id,
      log_time,
      msg
    FROM
      accounts ac inner join 
      log_data logs on ac.id = logs.account_id
    order by account_id, log_time
  )
WHERE
  rownumber < 5;


Then the execution plan shows it goes through all the records:
----------------------------------------------------------------------------------------------------------------------------------  
| Id  | Operation                | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem | 
----------------------------------------------------------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT         |          |      1 |        |     20 |00:00:00.21 |    1346 |   1341 |       |       |          | 
|*  1 |  VIEW                    |          |      1 |    125K|     20 |00:00:00.21 |    1346 |   1341 |       |       |          | 
|*  2 |   WINDOW SORT PUSHED RANK|          |      1 |    125K|     30 |00:00:00.21 |    1346 |   1341 |    24M|  1808K|   21M (0)| 
|*  3 |    HASH JOIN             |          |      1 |    125K|    100K|00:00:00.11 |    1346 |   1341 |   963K|   963K|  754K (0)| 
|   4 |     TABLE ACCESS FULL    | ACCOUNTS |      1 |      5 |      5 |00:00:00.01 |       7 |      6 |       |       |          | 
|   5 |     TABLE ACCESS FULL    | LOG_DATA |      1 |    125K|    100K|00:00:00.04 |    1339 |   1335 |       |       |          | 
----------------------------------------------------------------------------------------------------------------------------------- 

Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 

   1 - filter("ROWNUMBER"<5)                                                                                                        
   2 - filter(ROW_NUMBER() OVER ( PARTITION BY "LOGS"."ACCOUNT_ID" ORDER BY "LOGS"."LOG_TIME")<5)                                   
   3 - access("AC"."ID"="LOGS"."ACCOUNT_ID")                                                                                        


Any idea how to optimize that? Basically, I'm looking for a way to get no more than about 30 in the A-Rows column all the way down the plan.

and Chris said...

Thanks for providing a complete test case :)

Sayan Malakshinov has an article describing how to do this:

http://orasql.org/2012/09/21/distinct-values-by-index-topn/

Using his solution, I've plugged in your tables and data. As you can see, there's no longer a full scan of log_data, and there are at most 20 A-Rows in any step. The buffers used have dropped to 27, orders of magnitude less:

Note: this *requires* the index on (account_id, log_time) to give correct results! Without this, it'll give incorrect results "silently". Use with caution!

with t_unique ( account_id ) as (
  -- Recursively fetch the distinct account_ids using only index
  -- min/max scans, instead of scanning the whole index or table.
  select min ( t1.account_id ) from log_data t1
  union all
  select ( select min ( t1.account_id )
           from   log_data t1
           where  t1.account_id > t.account_id )
  from   t_unique t
  where  account_id is not null
)
select /*+ gather_plan_statistics use_nl (rids tt) */ tt.*, a.account_name
from   t_unique v
       -- For each account_id, collect the rowids of its most recent log
       -- rows. The rownum stopkey is applied while walking the index
       -- backwards (index range scan descending), which is why the
       -- (account_id, log_time) index is essential for correct results.
      ,table(
            cast(
                 multiset(
                          select tt.rowid rid
                          from log_data tt
                          where tt.account_id=v.account_id
                            and rownum<5
                          order by tt.log_time desc
                         )
                 as sys.odcivarchar2list
                )
            ) rids
       -- Join the rowids back to the table to fetch the full rows.
      ,log_data tt
      ,accounts a
where  tt.rowid = rids.column_value
and    a.id = tt.account_id
order by tt.account_id, tt.log_time desc;

select * from table(dbms_xplan.display_cursor(null, null, 'BASIC +IOSTATS LAST'));

--------------------------------------------------------------------------------------------------------------------                                            
| Id  | Operation                                     | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |                                            
--------------------------------------------------------------------------------------------------------------------                                            
|   0 | SELECT STATEMENT                              |          |      1 |        |     20 |00:00:00.01 |      27 |                                            
|   1 |  SORT ORDER BY                                |          |      1 |     16M|     20 |00:00:00.01 |      27 |                                            
|*  2 |   HASH JOIN                                   |          |      1 |     16M|     20 |00:00:00.01 |      27 |                                            
|   3 |    TABLE ACCESS FULL                          | ACCOUNTS |      1 |      5 |      5 |00:00:00.01 |       7 |                                            
|   4 |    NESTED LOOPS                               |          |      1 |     16M|     20 |00:00:00.01 |      20 |                                            
|   5 |     NESTED LOOPS                              |          |      1 |  16336 |     20 |00:00:00.01 |      19 |                                            
|   6 |      VIEW                                     |          |      1 |      2 |      6 |00:00:00.01 |       9 |                                            
|   7 |       UNION ALL (RECURSIVE WITH) BREADTH FIRST|          |      1 |        |      6 |00:00:00.01 |       9 |                                            
|   8 |        SORT AGGREGATE                         |          |      1 |      1 |      1 |00:00:00.01 |       2 |                                            
|   9 |         INDEX FULL SCAN (MIN/MAX)             | LOG_DATE |      1 |      1 |      1 |00:00:00.01 |       2 |                                            
|  10 |        SORT AGGREGATE                         |          |      5 |      1 |      5 |00:00:00.01 |       7 |                                            
|  11 |         FIRST ROW                             |          |      5 |      1 |      4 |00:00:00.01 |       7 |                                            
|* 12 |          INDEX RANGE SCAN (MIN/MAX)           | LOG_DATE |      5 |      1 |      4 |00:00:00.01 |       7 |                                            
|  13 |        RECURSIVE WITH PUMP                    |          |      6 |        |      5 |00:00:00.01 |       0 |                                            
|  14 |      COLLECTION ITERATOR SUBQUERY FETCH       |          |      6 |   8168 |     20 |00:00:00.01 |      10 |                                            
|* 15 |       COUNT STOPKEY                           |          |      6 |        |     20 |00:00:00.01 |      10 |                                            
|* 16 |        INDEX RANGE SCAN DESCENDING            | LOG_DATE |      6 |   1023 |     20 |00:00:00.01 |      10 |                                            
|  17 |     TABLE ACCESS BY USER ROWID                | LOG_DATA |     20 |   1023 |     20 |00:00:00.01 |       1 |                                            
--------------------------------------------------------------------------------------------------------------------                                            
                                                                                                                                                                
Predicate Information (identified by operation id):                                                                                                             
---------------------------------------------------                                                                                                             
                                                                                                                                                                
   2 - access("A"."ID"="TT"."ACCOUNT_ID")                                                                                                                       
  12 - access("T1"."ACCOUNT_ID">:B1)                                                                                                                            
  15 - filter(ROWNUM<5)                                                                                                                                         
  16 - access("TT"."ACCOUNT_ID"=:B1)


Comments

Wow! :)

Eyal, January 05, 2016 - 2:25 pm UTC

Explain plan looks perfect! Got some learning to do... I was not aware of the table/multiset operators.
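(For anyone else seeing these for the first time - a minimal sketch of the table/multiset pattern in isolation: MULTISET turns a subquery's rows into a collection value, CAST gives it a concrete collection type, and TABLE turns it back into rows you can join to.)

-- Minimal sketch of TABLE(CAST(MULTISET(...))):
select t.column_value as account_name
from   table(
         cast(
           multiset(select account_name from accounts)
           as sys.odcivarchar2list
         )
       ) t;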

Fancy, but is it worth the added complexity?

Tubby, January 05, 2016 - 7:51 pm UTC

Neat solution to be sure, I'm just wondering if the complexity introduced is worth it.

I generally advocate SQL over PL/SQL, but there are instances where maintainability and understandability come into play as well (not every developer on your team is likely to be so well versed that this solution is understandable). And although I hate developing code for the lowest common denominator, sometimes it is warranted :)

Comparing the solution provided against something like

select /*+ gather_plan_statistics */
  *
from log_data l
where rowid in
(
  select
    the_rowid
  from
  (
    select
      rowid as the_rowid
    FROM
      log_data l1
    where
      l1.account_id = :THE_ID_PASSED_IN_TO_THIS
    order by log_time desc
  )
  where rownum <= 5
);


would likely be a good idea. You would of course have to loop through the ACCOUNTS table in order to use the code above (passing each id in to the currently declared bind :THE_ID_PASSED_IN_TO_THIS), but you should find that the actual IO required is less with this approach.
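A minimal sketch of that loop in PL/SQL (the row processing is just a placeholder):

begin
  for acc in (select id from accounts) loop
    for r in ( select *
               from   log_data l
               where  rowid in
                      ( select the_rowid
                        from   ( select rowid as the_rowid
                                 from   log_data l1
                                 where  l1.account_id = acc.id
                                 order by log_time desc )
                        where  rownum <= 5 ) )
    loop
      null; -- process each of the top-5 rows for this account here
    end loop;
  end loop;
end;
/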

Regardless of the solution the OP chooses to go with I would recommend that they investigate utilizing index key compression on the index for the LOG_DATA table.

create index log_date on log_data (account_id, log_time) compress 1;


This will reduce the number of leaf blocks in the index; given that account_id is such a repetitive value, we would expect pretty large savings. One caveat of course is that if this table is being hammered by many concurrent sessions, we probably wouldn't want to use compression, as it would increase contention for the index (fewer leaf blocks means more sessions fighting for them).
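One way to measure the saving (a sketch - note that ANALYZE ... VALIDATE STRUCTURE locks the table while it runs, so don't do this on a busy system):

analyze index log_date validate structure;
select name, lf_blks from index_stats;

alter index log_date rebuild compress 1;

analyze index log_date validate structure;
select name, lf_blks from index_stats;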

Cheers,
Chris Saxon
January 06, 2016 - 12:44 am UTC

Thanks for your input.

Help me to understand?

Rajeshwaran, Jeyabal, January 06, 2016 - 8:34 am UTC

my requirement is to list the last 5 log messages for each account.

Does the code below return the last 5 log messages, or does it return "any" 5 log messages and sort them in descending order?
cast(
 multiset(
    select tt.rowid rid
    from log_data tt
    where tt.account_id=v.account_id
      and rownum<5
    order by tt.log_time desc
   )
 as sys.odcivarchar2list
)

Chris Saxon
January 06, 2016 - 2:46 pm UTC

The query only selects values available in the log_date index (account_id, log_time and rowid). So Oracle can do an index range scan descending. Walking along the index like this will return the values in (descending) index order. Oracle will then stop when it has read 4 of these (the rownum < 5 stopkey).

If Oracle chooses a FTS, then yes, the results will be random. To guard against this I should really have included the index_desc hint:

select /*+ full (tt) */
       tt.rowid rid,
       tt.*
from   log_data tt
where  tt.account_id = 1
and    rownum < 5
order by tt.log_time desc;

RID                ACCOUNT_ID   LOG_TYPE LOG_TIME             MSG                                                                            
------------------ ---------- ---------- -------------------- ----------------------
AAAZQXAAEAAAAL7AAE          1          2 18-OCT-2015 10.00.12 qxeasqrimiitvxvagbve
AAAZQXAAEAAAAL7AAJ          1          2 13-OCT-2015 10.00.12 ozdefbeevltjshpdxwrh
AAAZQXAAEAAAAL7AAO          1          2 08-OCT-2015 10.00.12 qmzwmnvpgjhwhntrgrsd
AAAZQXAAEAAAAL7AAT          1          2 03-OCT-2015 10.00.12 insyepumqfgsnoksnqbn

select /*+ index_desc(tt log_date) */
       tt.rowid rid,
       tt.*
from   log_data tt
where  tt.account_id = 1
and    rownum < 5
order by tt.log_time desc;

RID                ACCOUNT_ID   LOG_TYPE LOG_TIME             MSG                                                                            
------------------ ---------- ---------- -------------------- ----------------------
AAAZQXAAEAAAAL/AAE          1          2 01-JAN-2016 10.00.12 wuppjcnqhguioetwzgwa
AAAZQXAAEAAAAL/AAJ          1          2 27-DEC-2015 10.00.12 lmiamhpanhfgcznfoshy
AAAZQXAAEAAAAL/AAO          1          2 22-DEC-2015 10.00.12 ipathrnhpijnusikjmjn
AAAZQXAAEAAAAL/AAT          1          2 17-DEC-2015 10.00.12 jdvomodbmghkeuygfyuf



Index_desc Hint

Rajeshwaran Jeyabal, January 06, 2016 - 11:43 am UTC

"To guard against this I should really have included the index_desc hint" - No, not really; a little bit of PL/SQL could help here.

I loaded some more data into the "log_data" table to make it bigger (3.2 million rows):

set feedback off
drop table accounts purge;
drop table log_data purge;
CREATE TABLE accounts( 
  id NUMBER unique not null, 
  account_name VARCHAR2(30)
  );
CREATE TABLE log_data( 
  account_id NUMBER not null, 
  log_type NUMBER, 
  log_time TIMESTAMP,
  msg varchar2(50)
  );
insert into accounts
SELECT ROWNUM, dbms_random.string('L', 20)
FROM   ( SELECT 1 just_a_column
         FROM   dual
         CONNECT BY LEVEL <= 5
       );
commit;    
alter session force parallel dml parallel 4   ;
insert /*+append */ into log_data 
SELECT mod(rownum,5)+1, 2, SYSDATE-(ROWNUM), dbms_random.string('L', 20)
FROM   ( SELECT 1 just_a_column
         FROM   dual
         CONNECT BY LEVEL <= 100000
       ) ;
commit;
begin
 for x in 1..5
 loop
  dbms_application_info.set_client_info('x ='||x);
  insert into log_data
  SELECT mod(rownum,5)+1, 2, SYSDATE-(ROWNUM), msg
  from log_data;
  commit;
 end loop;
end;
/ 
alter session disable parallel dml;
create index log_date on log_data (account_id, log_time);  
begin
 dbms_stats.gather_table_stats(user,'accounts');
 dbms_stats.gather_table_stats(user,'log_data',degree=>4,
  method_opt=>'for all indexed columns size 254');
end;
/
set feedback on


So for a single account_id, we get index access.

rajesh@ORA11G> set autotrace traceonly explain
rajesh@ORA11G> select *
  2  from (
  3  select to_char(log_time,'yyyymmddhhmiss')||msg txt
  4  from log_data
  5  where account_id = 1
  6  order by log_time desc
  7       )
  8  where rownum <= 5;

Execution Plan
----------------------------------------------------------
Plan hash value: 1871793034

-------------------------------------------------------------------------------------------
| Id  | Operation                      | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |          |     5 |   170 |     8   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                 |          |       |       |            |          |
|   2 |   VIEW                         |          |     6 |   204 |     8   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID | LOG_DATA |   629K|    22M|     8   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN DESCENDING| LOG_DATE |     6 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=5)
   4 - access("ACCOUNT_ID"=1)


When I try table un-nesting, it fails (because the correlated reference to T1 is too deeply nested):

rajesh@ORA11G> select t1.id,t1.account_name,
  2          to_date(substr(t2.column_value,1,14),'yyyymmddhhmiss') as log_date,
  3          substr(column_value,15) as msg
  4  from accounts t1,
  5    table( cast( multiset(select *
  6    from (
  7    select to_char(log_time,'yyyymmddhhmiss')||msg txt
  8    from log_data t2
  9    where t2.account_id = t1.id
 10    order by log_time desc
 11         )
 12    where rownum <= 5) as sys.odcivarchar2list)) ;
  where t2.account_id = t1.id
                        *
ERROR at line 9:
ORA-00904: "T1"."ID": invalid identifier



With a little bit of PL/SQL, I am able to achieve it.

rajesh@ORA11G> create or replace function foo(p_id number)
  2  return sys.odcivarchar2list
  3  as
  4    l_data sys.odcivarchar2list ;
  5  begin
  6     select txt
  7        bulk collect into l_data
  8      from (
  9      select to_char(log_time,'yyyymmddhhmiss')||msg txt
 10      from log_data
 11      where account_id = p_id
 12      order by log_time desc
 13           )
 14    where rownum <= 5 ;
 15    return l_data;
 16  end;
 17  /

Function created.

rajesh@ORA11G>
rajesh@ORA11G> select t1.id,t1.account_name,
  2      to_date(substr(column_value,1,14),'yyyymmddhhmiss') as log_date,
  3      substr(column_value,1,15) as msg
  4  from accounts t1,table(foo(t1.id)) ;

        ID ACCOUNT_NAME                   LOG_DATE    MSG
---------- ------------------------------ ----------- ---------------
         1 zcwfnjvypviddxbrmgtk           01-JAN-2016 20160101043324h
         1 zcwfnjvypviddxbrmgtk           01-JAN-2016 20160101043322h
         1 zcwfnjvypviddxbrmgtk           01-JAN-2016 20160101043321v
         1 zcwfnjvypviddxbrmgtk           01-JAN-2016 20160101043320u
         1 zcwfnjvypviddxbrmgtk           01-JAN-2016 20160101043320h
         2 hqujrgijcottwdryvpnl           05-JAN-2016 20160105043324d
         2 hqujrgijcottwdryvpnl           05-JAN-2016 20160105043322u
         2 hqujrgijcottwdryvpnl           05-JAN-2016 20160105043321r
         2 hqujrgijcottwdryvpnl           05-JAN-2016 20160105043320p
         2 hqujrgijcottwdryvpnl           05-JAN-2016 20160105043320r
         3 aecgkaueodjhxjtqjfpq           04-JAN-2016 20160104043324l
         3 aecgkaueodjhxjtqjfpq           04-JAN-2016 20160104043322b
         3 aecgkaueodjhxjtqjfpq           04-JAN-2016 20160104043321i
         3 aecgkaueodjhxjtqjfpq           04-JAN-2016 20160104043320j
         3 aecgkaueodjhxjtqjfpq           04-JAN-2016 20160104043320f
         4 ypsmyzcympcnkvrsrpof           03-JAN-2016 20160103043324j
         4 ypsmyzcympcnkvrsrpof           03-JAN-2016 20160103043322n
         4 ypsmyzcympcnkvrsrpof           03-JAN-2016 20160103043321p
         4 ypsmyzcympcnkvrsrpof           03-JAN-2016 20160103043320b
         4 ypsmyzcympcnkvrsrpof           03-JAN-2016 20160103043320r
         5 ksrzynkgjhptffxzeffx           02-JAN-2016 20160102043324v
         5 ksrzynkgjhptffxzeffx           02-JAN-2016 20160102043322v
         5 ksrzynkgjhptffxzeffx           02-JAN-2016 20160102043321v
         5 ksrzynkgjhptffxzeffx           02-JAN-2016 20160102043320t
         5 ksrzynkgjhptffxzeffx           02-JAN-2016 20160102043320g

25 rows selected.

rajesh@ORA11G> exit


And TKPROF shows me this - back to index-friendly access:

SELECT TXT 
FROM
 ( SELECT TO_CHAR(LOG_TIME,'yyyymmddhhmiss')||MSG TXT FROM LOG_DATA WHERE 
  ACCOUNT_ID = :B1 ORDER BY LOG_TIME DESC ) WHERE ROWNUM <= 5 


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      5      0.00       0.00          0          0          0           0
Fetch        5      0.01       0.00          0         40          0          25
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       11      0.01       0.00          0         40          0          25

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 90     (recursive depth: 1)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         5          5          5  COUNT STOPKEY (cr=8 pr=0 pw=0 time=315 us)
         5          5          5   VIEW  (cr=8 pr=0 pw=0 time=309 us cost=8 size=204 card=6)
         5          5          5    TABLE ACCESS BY INDEX ROWID LOG_DATA (cr=8 pr=0 pw=0 time=85 us cost=8 size=23936656 card=629912)
         5          5          5     INDEX RANGE SCAN DESCENDING LOG_DATE (cr=3 pr=0 pw=0 time=52 us cost=3 size=0 card=6)(object id 92783)



With 12c in place, it is a pure SQL solution:

rajesh@ORA12C> select t1.id,t1.account_name,log_time,msg
  2  from accounts t1 cross apply
  3    ( select *
  4      from (
  5      select *
  6      from log_data t2
  7      where t2.account_id = t1.id
  8      order by log_time desc
  9            )
 10      where rownum <= 5 )
 11  /
                                          
        ID ACCOUNT_NAME         LOG_TIME                     MSG
---------- -------------------- ---------------------------- --------------------
         1 jypsweynpwahnfgizfwm 01-JAN-16 05.00.49.000000 PM mmqkrqptrshvzbepuwqx
.....
.....
.....
25 rows selected.

rajesh@ORA12C> 
rajesh@ORA12C> select t1.id,t1.account_name,
  2          to_date(substr(column_value,1,14),'yyyymmddhhmiss') as log_date,
  3          substr(column_value,15) as msg
  4  from accounts t1,
  5    table( cast( multiset(select *
  6    from (
  7    select to_char(log_time,'yyyymmddhhmiss')||msg txt
  8    from log_data t2
  9    where t2.account_id = t1.id
 10    order by log_time desc
 11         )
 12    where rownum <= 5) as sys.odcivarchar2list)) ;

        ID ACCOUNT_NAME                   LOG_DATE
---------- ------------------------------ -----------

.....
.....
.....
25 rows selected.

Chris Saxon
January 06, 2016 - 3:07 pm UTC

If using the PL/SQL solution, you need to beware of read-consistency issues: each call to the function runs its query as of a different point in time to the main statement, so the combined results may not be consistent:

https://blogs.oracle.com/sql/entry/the_problem_with_sql_calling

While the 12c solutions work, they're back to the original problem of too many gets, which the poster wanted to avoid:

SQL> select t1.id,t1.account_name,log_time,msg
  2      from accounts t1 cross apply
  3        ( select *
  4          from (
  5          select *
  6          from log_data t2
  7          where t2.account_id = t1.id
  8          order by log_time desc
  9                )
 10         where rownum <= 5 );

25 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1867537541

-----------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name            | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |                 |    25 | 51125 |       |  3164   (1)| 00:00:01 |
|   1 |  NESTED LOOPS             |                 |    25 | 51125 |       |  3164   (1)| 00:00:01 |
|   2 |   TABLE ACCESS FULL       | ACCOUNTS        |     5 |   150 |       |     3   (0)| 00:00:01 |
|   3 |   VIEW                    | VW_LAT_2D0B8FC8 |     5 | 10075 |       |   632   (1)| 00:00:01 |
|*  4 |    COUNT STOPKEY          |                 |       |       |       |            |          |
|   5 |     VIEW                  |                 | 20000 |    38M|       |   632   (1)| 00:00:01 |
|*  6 |      SORT ORDER BY STOPKEY|                 | 20000 |  1132K|  1424K|   632   (1)| 00:00:01 |
|*  7 |       TABLE ACCESS FULL   | LOG_DATA        | 20000 |  1132K|       |   349   (1)| 00:00:01 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter(ROWNUM<=5)
   6 - filter(ROWNUM<=5)
   7 - filter("T2"."ACCOUNT_ID"="T1"."ID")

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       6318  consistent gets
          0  physical reads
        124  redo size
      10551  bytes sent via SQL*Net to client
       6050  bytes received via SQL*Net from client
         52  SQL*Net roundtrips to/from client
          5  sorts (memory)
          0  sorts (disk)
         25  rows processed

SQL>
SQL> select t1.id,t1.account_name,
  2           to_date(substr(column_value,1,14),'yyyymmddhhmiss') as log_date,
  3           substr(column_value,15) as msg
  4   from accounts t1,
  5     table( cast( multiset(select *
  6     from (
  7     select to_char(log_time,'yyyymmddhhmiss')||msg txt
  8     from log_data t2
  9     where t2.account_id = t1.id
 10     order by log_time desc
 11          )
 12     where rownum <= 5) as sys.odcivarchar2list)) ;

25 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 2518336803

------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                 | 40840 |  1276K|   140   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                            |                 | 40840 |  1276K|   140   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL                      | ACCOUNTS        |     5 |   150 |     3   (0)| 00:00:01 |
|   3 |   COLLECTION ITERATOR SUBQUERY FETCH     |                 |  8168 | 16336 |    27   (0)| 00:00:01 |
|*  4 |    COUNT STOPKEY                         |                 |       |       |            |       |
|   5 |     VIEW                                 | VW_LAT_1BBF5C63 |  1000 |  1955K|    29   (4)| 00:00:01 |
|*  6 |      SORT ORDER BY STOPKEY               |                 |  1000 | 55000 |    29   (4)| 00:00:01 |
|   7 |       TABLE ACCESS BY INDEX ROWID BATCHED| LOG_DATA        |  1000 | 55000 |    28   (0)| 00:00:01 |
|*  8 |        INDEX RANGE SCAN                  | LOG_DATE        |   400 |       |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter(ROWNUM<=5)
   6 - filter(ROWNUM<=5)
   8 - access("T2"."ACCOUNT_ID"="T1"."ID")

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)


Statistics
----------------------------------------------------------
          4  recursive calls
          0  db block gets
       6586  consistent gets
          0  physical reads
          0  redo size
       1517  bytes sent via SQL*Net to client
        500  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          5  sorts (memory)
          0  sorts (disk)
         25  rows processed

Solution?

Rajeshwaran Jeyabal, January 06, 2016 - 3:16 pm UTC

So in this case, would you advise us to go with the index_desc hint on an 11g database, rather than invoking a table function?
Chris Saxon
January 06, 2016 - 5:47 pm UTC

It depends :)

If the data rarely changes (say it's loaded daily), it's unlikely that you'll run the query while the data changes. So the problems/risks with read consistency are mitigated.

On the other hand, if it's changing constantly then I'd opt for the pure SQL method. Or you could modify the function to accept an SCN, enabling AS OF SCN flashback queries. That's getting fiddly though. At this point it's debatable whether the PL/SQL solution is simpler than the original SQL.
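A sketch of that SCN variant, adapting Rajeshwaran's function above (the caller captures one SCN, e.g. from dbms_flashback.get_system_change_number, and passes it to every call so all executions see the same point in time):

create or replace function foo(p_id number, p_scn number)
return sys.odcivarchar2list
as
  l_data sys.odcivarchar2list;
begin
  select txt
  bulk collect into l_data
  from (
    select to_char(log_time,'yyyymmddhhmiss')||msg txt
    from   log_data as of scn p_scn  -- every call reads as of the same SCN
    where  account_id = p_id
    order by log_time desc
  )
  where rownum <= 5;
  return l_data;
end;
/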

Depending on the performance requirements and how many different categories/accounts there are, I would also consider taking Tubby's solution and running the statement in N concurrent sessions (N being the number of accounts) - assuming of course you're able to spin up that many sessions at once.
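A sketch of that fan-out using the scheduler; the procedure load_top_logs is hypothetical - it would run the single-account query for its argument and store the results for the caller to collect:

begin
  for acc in (select id from accounts) loop
    dbms_scheduler.create_job(
      job_name   => 'TOP_LOGS_' || acc.id,
      job_type   => 'PLSQL_BLOCK',
      job_action => 'begin load_top_logs(' || acc.id || '); end;',  -- hypothetical procedure
      enabled    => true,
      auto_drop  => true);
  end loop;
end;
/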

2016 resolution?

Rajeshwaran Jeyabal, January 06, 2016 - 4:00 pm UTC

Would you mind sharing your 2016 resolution?
Chris Saxon
January 07, 2016 - 1:39 am UTC

I assume you are referring to Chris?

Mine is here:

https://connormcdonald.wordpress.com/2016/01/04/resolutions-for-2016/

12c Solution

Tubby, January 06, 2016 - 5:33 pm UTC

Just a comment about the solution provided for 12c and it being "back to the original problem of too many gets".

We can use a little manual optimization to get around that.

create index log_date on log_data (account_id, log_time desc) compress 1;

with data as
(
   select t1.id,t1.account_name,the_rowid
       from accounts t1
       cross apply
         ( select the_rowid
           from (
           select rowid as the_rowid
           from log_data t2
           where t2.account_id = t1.id
           order by log_time desc
                 )
          where rownum <= 5 )
)
select /*+ gather_plan_statistics */
   *
from data d, log_data l
where d.the_rowid = l.rowid;

select * from table(dbms_xplan.display_cursor(null, null, 'BASIC +IOSTATS LAST'));

---------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                 |      1 |        |     25 |00:00:00.01 |     325 |
|   1 |  NESTED LOOPS               |                 |      1 |     25 |     25 |00:00:00.01 |     325 |
|   2 |   NESTED LOOPS              |                 |      1 |     25 |     25 |00:00:00.01 |     300 |
|   3 |    TABLE ACCESS FULL        | ACCOUNTS        |      1 |      5 |      5 |00:00:00.01 |      11 |
|   4 |    VIEW                     | VW_LAT_18ABF8A7 |      5 |      5 |     25 |00:00:00.04 |     289 |
|*  5 |     COUNT STOPKEY           |                 |      5 |        |     25 |00:00:00.04 |     289 |
|   6 |      VIEW                   |                 |      5 |  20000 |     25 |00:00:00.04 |     289 |
|*  7 |       SORT ORDER BY STOPKEY |                 |      5 |  20000 |     25 |00:00:00.04 |     289 |
|*  8 |        INDEX RANGE SCAN     | LOG_DATE        |      5 |  20000 |    100K|00:00:00.22 |     289 |
|   9 |   TABLE ACCESS BY USER ROWID| LOG_DATA        |     25 |      1 |     25 |00:00:00.01 |      25 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - filter(ROWNUM<=5)
   7 - filter(ROWNUM<=5)
   8 - access("T2"."ACCOUNT_ID"="T1"."ID")


34 rows selected.

Elapsed: 00:00:00.47
TUBBY_ORCL?


So admittedly more work than the solution originally posted, but I would personally choose this less complex solution for a couple of reasons.

1) we don't need to hint it to get Oracle to "behave" (and the consequences if the hints are ignored because of a query rewrite become quite severe)
2) the likelihood of the optimizer choosing a bad execution plan should be lessened.

Cheers,
Chris Saxon
January 06, 2016 - 5:52 pm UTC

Thanks for the enhancement. I agree the simpler, more obvious solution is generally preferable - assuming it gives good enough performance.

actual rows scanned is still high

Eyal, January 06, 2016 - 5:59 pm UTC

Tubby, in your latest solution the index scan still accesses 100K rows (in the index, but still...)

In my case only pure SQL is an option and, unfortunately, because we're using an ORM even some of the advanced (multiset, table) operators are not available.

I was hoping the cross apply solution would work, since we're going to upgrade to 12c in the coming month.

(Thanks everyone for the help so far..)
Chris Saxon
January 07, 2016 - 1:35 am UTC

Whenever you have a problem, you can use an ORM....

... now you have two problems :-)


(Sorry, couldn't resist)

Connor

Full scan Vs Rowid approach

Rajeshwaran Jeyabal, January 07, 2016 - 6:19 am UTC

Thanks Tubby and Chris. I just compared the full table scan with the rowid approach, and the rowid solution seems to be faster.

When replacing rowid with log_time, msg in the select list, why does the plan change from index access to a FTS? Shouldn't it instead be an index range scan followed by table access by rowid? Is that something due to the index clustering factor (CF)?

Using Rowid

with datas as (
select t1.id, t1.account_name,rid
from accounts t1 cross apply
  ( select t2.rowid rid
    from log_data t2
    where t2.account_id = t1.id
    order by t2.log_time desc
    fetch first 5 rows only )
       )
select d.id,d.account_name,t2.log_time,t2.msg
from datas d, log_data t2
where d.rid = t2.rowid

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      1.48       3.41       8948       8987          0          25
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      1.48       3.41       8948       8987          0          25

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 111  
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
        25         25         25  NESTED LOOPS  (cr=8987 pr=8948 pw=0 time=627660 us cost=33009 size=2000 card=25)
        25         25         25   NESTED LOOPS  (cr=8962 pr=8948 pw=0 time=627696 us cost=32984 size=900 card=25)
         5          5          5    TABLE ACCESS FULL ACCOUNTS (cr=8 pr=0 pw=0 time=56 us cost=3 size=120 card=5)
        25         25         25    VIEW  VW_LAT_18ABF8A7 (cr=8954 pr=8948 pw=0 time=3417390 us cost=6596 size=60 card=5)
        25         25         25     SORT ORDER BY (cr=8954 pr=8948 pw=0 time=3417383 us cost=6596 size=190 card=5)
        25         25         25      VIEW  (cr=8954 pr=8948 pw=0 time=3417267 us cost=6596 size=190 card=5)
        25         25         25       WINDOW SORT PUSHED RANK (cr=8954 pr=8948 pw=0 time=3417247 us cost=6596 size=16640000 card=640000)
   3200000    3200000    3200000        INDEX RANGE SCAN LOG_DATE (cr=8954 pr=8948 pw=0 time=2891882 us cost=1867 size=16640000 card=640000)(object id 101600)
        25         25         25   TABLE ACCESS BY USER ROWID LOG_DATA (cr=25 pr=0 pw=0 time=117 us cost=1 size=44 card=1)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  db file sequential read                      8948        0.07          2.02
  SQL*Net message from client                     2        0.13          0.13


with FTS

select t1.id, t1.account_name,log_time,msg
from accounts t1 cross apply
  ( select t2.log_time,t2.msg
    from log_data t2
    where t2.account_id = t1.id
    order by t2.account_id,t2.log_time desc
    fetch first 5 rows only )

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      1.80       6.65      89085      89113          0          25
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      1.82       6.66      89085      89113          0          25

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 111  
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
        25         25         25  NESTED LOOPS  (cr=89113 pr=89085 pw=0 time=1255526 us cost=54323 size=1600 card=25)
         5          5          5   TABLE ACCESS FULL ACCOUNTS (cr=8 pr=0 pw=0 time=67 us cost=3 size=120 card=5)
        25         25         25   VIEW  VW_LAT_2D0B8FC8 (cr=89105 pr=89085 pw=0 time=6654820 us cost=10864 size=200 card=5)
        25         25         25    VIEW  (cr=89105 pr=89085 pw=0 time=6654802 us cost=10864 size=395 card=5)
        25         25         25     WINDOW SORT PUSHED RANK (cr=89105 pr=89085 pw=0 time=6654777 us cost=10864 size=22400000 card=640000)
   3200000    3200000    3200000      TABLE ACCESS FULL LOG_DATA (cr=89105 pr=89085 pw=0 time=10938968 us cost=4926 size=22400000 card=640000)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  direct path read                             2772        0.20          4.34
  SQL*Net message from client                     2        0.13          0.13

rajesh@ORA12C> column table_name format a10
rajesh@ORA12C> select t1.table_name,t1.blocks,t1.num_rows,t2.clustering_factor
  2  from user_tables t1,
  3    user_indexes t2
  4  where t1.table_name = t2.table_name
  5  and t1.table_name ='LOG_DATA';

TABLE_NAME     BLOCKS   NUM_ROWS CLUSTERING_FACTOR
---------- ---------- ---------- -----------------
LOG_DATA        18104    3200000           2425780

1 row selected.



Connor McDonald
January 07, 2016 - 11:46 am UTC

Using the index, Oracle accesses every row for each account. It then applies the filtering to get the top N for each account. As we're processing all the accounts, this is every row in the table.

As it's accessing every row, a full table scan is the way to go:

select /*+ gather_plan_statistics */t1.id, t1.account_name,log_time,msg
from   accounts t1 cross apply
  ( select /*+ index (t2 log_date) */t2.log_time,t2.msg
    from log_data t2
    where t2.account_id = t1.id
    order by t2.account_id,t2.log_time desc
    fetch first 5 rows only );
  
select * from table(dbms_xplan.display_cursor(null, null, '+IOSTATS +PREDICATE LAST'));

---------------------------------------------------------------------------------------------------------------------                                           
| Id  | Operation                               | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers |                                           
---------------------------------------------------------------------------------------------------------------------                                           
|   0 | SELECT STATEMENT                        |                 |      1 |        |     25 |00:00:00.03 |    6573 |                                           
|   1 |  NESTED LOOPS                           |                 |      1 |     25 |     25 |00:00:00.03 |    6573 |                                           
|   2 |   TABLE ACCESS FULL                     | ACCOUNTS        |      1 |      5 |      5 |00:00:00.01 |       7 |                                           
|   3 |   VIEW                                  | VW_LAT_2D0B8FC8 |      5 |      5 |     25 |00:00:00.13 |    6566 |                                           
|*  4 |    VIEW                                 |                 |      5 |      5 |     25 |00:00:00.13 |    6566 |                                           
|*  5 |     WINDOW SORT PUSHED RANK             |                 |      5 |  20001 |     25 |00:00:00.13 |    6566 |                                           
|   6 |      TABLE ACCESS BY INDEX ROWID BATCHED| LOG_DATA        |      5 |  20001 |    100K|00:00:00.35 |    6566 |                                           
|*  7 |       INDEX RANGE SCAN                  | LOG_DATE        |      5 |  20001 |    100K|00:00:00.09 |     313 |                                           
---------------------------------------------------------------------------------------------------------------------                                           
                                                                                                                                                                
Predicate Information (identified by operation id):                                                                                                             
---------------------------------------------------                                                                                                             
                                                                                                                                                                
   4 - filter("from$_subquery$_004"."rowlimit_$$_rownumber"<=5)                                                                                                 
   5 - filter(ROW_NUMBER() OVER ( ORDER BY "T2"."ACCOUNT_ID",INTERNAL_FUNCTION("T2"."LOG_TIME") DESC )<=5)                                                      
   7 - access("T2"."ACCOUNT_ID"="T1"."ID")    

Two scenarios - help me to understand

Rajeshwaran Jeyabal, January 07, 2016 - 12:46 pm UTC

rajesh@ORA12C> create index LOG_DATE on LOG_DATA
  2  (account_id,log_time DESC)
  3  compress 1;

Index created.

rajesh@ORA12C> set autotrace traceonly explain
rajesh@ORA12C> select t1.id, t1.account_name,log_time,msg
  2  from   accounts t1 cross apply
  3    ( select t2.log_time,t2.msg
  4      from log_data t2
  5      where t2.account_id = t1.id
  6      order by t2.account_id,t2.log_time desc
  7      fetch first 5 rows only );

Execution Plan
----------------------------------------------------------
Plan hash value: 1148481600

-----------------------------------------------------------------------------------
| Id  | Operation                  | Name            | Rows  | Bytes | Cost (%CPU)|
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                 |    25 |  1600 | 54323   (1)|
|   1 |  NESTED LOOPS              |                 |    25 |  1600 | 54323   (1)|
|   2 |   TABLE ACCESS FULL        | ACCOUNTS        |     5 |   120 |     3   (0)|
|   3 |   VIEW                     | VW_LAT_2D0B8FC8 |     5 |   200 | 10864   (1)|
|*  4 |    VIEW                    |                 |     5 |   395 | 10864   (1)|
|*  5 |     WINDOW SORT PUSHED RANK|                 |   640K|    21M| 10864   (1)|
|*  6 |      TABLE ACCESS FULL     | LOG_DATA        |   640K|    21M|  4926   (1)|
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("from$_subquery$_004"."rowlimit_$$_rownumber"<=5)
   5 - filter(ROW_NUMBER() OVER ( ORDER BY "T2"."ACCOUNT_ID",INTERNAL_FUNCTION("T2"."LOG_TIME"
              ) DESC )<=5)
   6 - filter("T2"."ACCOUNT_ID"="T1"."ID")


Chris - sorry to bother you again.

From the above explain plan, we understand that:

Approach#1
a) we get the first row from table "accounts" (let's say id = 1), which is passed as input to T2.
b) we do a full scan of table "log_data" and filter rows with t2.account_id = 1 (step id=6 in the plan)
c) we apply the analytic on top of those filtered rows and retain only 5 of them (steps id=5 and 4 in the plan)
d) these five rows are returned as the result for the input id = 1
e) steps (a) to (d) are repeated for each row present in table "accounts".

Instead of the above steps, what if the optimizer went this way?

Approach#2
a) we get the first row from table "accounts" (let's say id = 1), which is passed as input to T2.
b) before executing the query inside the cross apply, look at "all" of the WHERE clause, the ORDER BY and the FETCH FIRST 5 ROWS option.
c) the optimizer should understand that the user requires only five rows for the input account_id, sorted by LOG_TIME in descending order.
d) luckily an index exists with account_id on the leading edge and log_time sorted descending internally.
e) so go to "that" leaf block, walk down five index entries, then go back to the table and read those rows using the rowids from the index (if the CF is poor, we end up reading 5 different table blocks for each input; that should still be better than a FTS, where we scan all the table blocks).
f) repeat steps (a) to (e) for each input (in this case, each t1.id).

Do you think Approach#2 looks smarter than Approach#1?

Also, please correct me if I am wrong in my understanding.
Chris Saxon
January 07, 2016 - 4:28 pm UTC

Look at the predicates for step 5. Oracle uses the analytic row_number() to implement the fetch first 5 rows clause. Analytics are processed after the where clause.

So Oracle is:

- Finding all the rows for an account
- Assigning the row_number() for each of these
- Then filtering based on the row_number assigned

i.e. it's transformed the query to:

select * from (
 select row_number () over (order by log_time desc) rn, l.*
 from   log_data l
 where  account_id = 1
)
where  rn <= 5;


The order by log_time isn't applied until after the where clause is processed. So the optimizer doesn't know which five rows you want at the point it accesses the data for a given account.

She ain't pretty ....

Tubby, January 07, 2016 - 6:11 pm UTC

"
Tubby, in your latest solution the index scan still access 100K rows (in the index, but still..)
"

Gave it some thought and came up with an alternative, though we start getting back into the complexity introduced in the original reply by Chris.

Since you're working with an ORM you should be able to simply create a view to hide the TABLE functions required for this (or the original) solution.

create or replace view quick_test as
with
   quick_data as
(
   select
         t1.id
      ,  t1.account_name
      ,  (--accumulate the top 5 log_data records by ROWID
            select
               listagg(the_rowid , ',') within group (order by 1)
            from
            (
               select
                  rowidtochar(rowid) as the_rowid
               from
                  log_data t2
               where
                  t2.account_id = t1.id
               order by
                  log_time desc
            )
            where rownum <= 5
          )
            as row_ids
   from
      accounts t1
),
   exploded_data as  --break out the comma separated list back in to rows
(
   select
      chartorowid(trim(regexp_substr(row_ids, '[^,]+', 1, column_value))) as the_rowid
   from
        quick_data q
      , table(cast(multiset(select level from dual connect by level <= regexp_count(row_ids, ',') + 1) as sys.odcinumberlist))
)
select *
from exploded_data
;

select/*+ gather_plan_statistics */
   *
from log_data l
where l.rowid in (select the_rowid from quick_test)
;

select * from table(dbms_xplan.display_cursor(null, null, '+IOSTATS +PREDICATE LAST'));

-----------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name         | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |              |      1 |        |     25 |00:00:00.01 |      36 |
|   1 |  NESTED LOOPS                          |              |      1 |   1000 |     25 |00:00:00.01 |      36 |
|   2 |   SORT GROUP BY                        |              |      5 |      1 |      5 |00:00:00.01 |      10 |
|*  3 |    COUNT STOPKEY                       |              |      5 |        |     25 |00:00:00.01 |      10 |
|   4 |     VIEW                               |              |      5 |  20000 |     25 |00:00:00.01 |      10 |
|*  5 |      INDEX RANGE SCAN                  | LOG_DATE     |      5 |  20000 |     25 |00:00:00.01 |      10 |
|   6 |   VIEW                                 | VW_NSO_1     |      1 |  40840 |     25 |00:00:00.01 |      11 |
|   7 |    HASH UNIQUE                         |              |      1 |      1 |     25 |00:00:00.01 |      11 |
|   8 |     NESTED LOOPS                       |              |      1 |  40840 |     25 |00:00:00.01 |      11 |
|   9 |      VIEW                              |              |      1 |      5 |      5 |00:00:00.01 |      11 |
|  10 |       INDEX FULL SCAN                  | SYS_C0015952 |      1 |      5 |      5 |00:00:00.01 |       1 |
|  11 |      COLLECTION ITERATOR SUBQUERY FETCH|              |      5 |   8168 |     25 |00:00:00.01 |       0 |
|  12 |       CONNECT BY WITHOUT FILTERING     |              |      5 |        |     25 |00:00:00.01 |       0 |
|  13 |        FAST DUAL                       |              |      5 |      1 |      5 |00:00:00.01 |       0 |
|  14 |   TABLE ACCESS BY USER ROWID           | LOG_DATA     |     25 |   1000 |     25 |00:00:00.01 |      25 |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter(ROWNUM<=5)
   5 - access("T2"."ACCOUNT_ID"=:B1)


33 rows selected.

Elapsed: 00:00:00.54
TUBBY_ORCL?


So it comes back to the basic question of, is the performance gain worth the added complexity? Not a question anyone here can answer since you know your environment better than us :)

Cheers,
Chris Saxon
January 07, 2016 - 10:54 pm UTC

Nice touch

Sorted Hash Cluster

Žilvinas, January 08, 2016 - 10:47 am UTC

Have you tried a sorted hash cluster?
It seems that a sorted hash cluster is the perfect technology for your logging application.

https://docs.oracle.com/cd/E18283_01/server.112/e17120/hash003.htm#i1006536

Sorted Hash clusters ?

Rajeshwaran, Jeyabal, January 09, 2016 - 1:28 pm UTC

To build a hash cluster we should know the hash keys in advance.
Say in this case it is the account id from the "ACCOUNTS" table: how do we know the total number of distinct account_ids we will receive over time?

In the case of a warehouse/reporting application we can predict that before the load, but not in the case of OLTP (where the data volume keeps growing day by day).

create cluster shc
( ACCOUNT_ID number,
  LOG_TIME timestamp SORT )
Hashkeys 10000  <<<===============
size 8192 ; 
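For reference, a table stored in such a cluster would be declared along these lines (a sketch; the SORT column must match the cluster definition):

create table log_data_shc (
  account_id number,
  log_type   number,
  log_time   timestamp sort,  -- kept sorted within each hash key
  msg        varchar2(50)
)
cluster shc (account_id, log_time);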

Connor McDonald
January 10, 2016 - 4:21 am UTC

This is true, but often there is a known upper bound for such entities. E.g., for accounts and (say) their home addresses, you may have:

"I will never have more than 10,000 accounts, and they will never have more than 10 addresses over their lifetime"

and you can use that rule accordingly. If you eventually exceed the limit you still only have a few hash collisions, and you could plan a reorganisation if it *really* needed it.

But yes, in this case, where it appears to be more transactional history data (ie 'n' log transactions, where 'n' is unknown), a cluster probably isn't the best bet.


In response to "She ain't pretty .... | January 07, 2016 - 6:11 pm UTC "

A reader, January 12, 2016 - 10:12 am UTC


scott@orcltest> desc accounts
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 ID                                        NOT NULL NUMBER
 ACCOUNT_NAME                                       VARCHAR2(30)

scott@orcltest> create or replace view quick_test as
  2  with
  3     quick_data as
  4  (
  5     select
  6           t1.id
  7        ,  t1.account_name
  8        ,  (--accumulate the top 5 log_data records by ROWID
  9              select
 10                 listagg(the_rowid , ',') within group (order by 1)
 11              from
 12              (
 13                 select
 14                    rowidtochar(rowid) as the_rowid
 15                 from
 16                    log_data t2
 17                 where
 18                    t2.account_id = t1.id
 19                 order by
 20                    log_time desc
 21              )
 22              where rownum <= 5
 23            )
 24              as row_ids
 25     from
 26        accounts t1
 27  ),
 28     exploded_data as  --break out the comma separated list back in to rows
 29  (
 30     select
 31        chartorowid(trim(regexp_substr(row_ids, '[^,]+', 1, column_value))) as the_rowid
 32     from
 33         quick_data q
 34       , table(cast(multiset(select level from dual connect by level <= regexp_count(row_ids, ',') +
 35  1) as sys.odcinumberlist))
 36  )
 37  select *
 38  from exploded_data
 39  ;
   t2.account_id = t1.id
                   *
ERROR at line 18:
ORA-00904: "T1"."ID": invalid identifier


Elapsed: 00:00:02.17

Use of INDEX_DESC hint is recommended in such scenarios?

Achal Bansal, June 06, 2019 - 1:13 pm UTC

This is regarding the use of the INDEX_DESC hint in the reply dated 6 Jan 2016.

My question is with reference to the following AskTOM thread, where Tom recommends avoiding the INDEX_DESC technique in a similar type of scenario:

https://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:9843506698920#9896083461138

Just want to understand if I am missing something here. My understanding is that the solution will fail if Oracle doesn't do an INDEX RANGE SCAN DESCENDING for any reason.
Chris Saxon
June 06, 2019 - 2:17 pm UTC

You're right, it will fail - as the answer here says:

Note: this *requires* the index on (account_id, log_time) to give correct results! Without this, it'll give incorrect results "silently". Use with caution!

So you need to weigh up carefully whether this is the right solution for you.
