I have the following log table with millions of records and the following index:
CREATE TABLE log_data(
account_id NUMBER,
log_type NUMBER,
sys_name VARCHAR2(30),
log_time TIMESTAMP,
msg CLOB
);
create index log_date on log_data (account_id, log_type, NLSSORT(sys_name,'NLS_SORT=BINARY_CI'), log_time);
We're working with linguistic comparisons and case insensitive sort like this:
ALTER session SET nls_comp=linguistic;
ALTER session SET nls_sort=binary_ci;
I would like to get the latest 10 log events of a certain type for a specific system in a specific account:
explain plan for
SELECT *
FROM
(
SELECT log_time, msg
FROM log_data
WHERE
account_id=5
AND log_type=2
AND sys_name='system1'
ORDER BY
log_time DESC
)
WHERE
rownum<10;
SELECT * FROM table(DBMS_XPLAN.DISPLAY);
This shows:
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 18135 | 3 (0)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | VIEW | | 104 | 204K| 3 (0)| 00:00:01 |
|* 3 | SORT ORDER BY STOPKEY | | 104 | 220K| 3 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID | LOG_DATA | 104 | 220K| 3 (0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN DESCENDING| LOG_DATE | 1 | | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
1 - filter(ROWNUM<10)
3 - filter(ROWNUM<10)
5 - access("ACCOUNT_ID"=5 AND "LOG_TYPE"=2 AND
NLSSORT("SYS_NAME",'nls_sort=''BINARY_CI''')=HEXTORAW('73797374656D3100') )
Note on line 3 there is a sort operation which I'd like to avoid because this table contains millions of records.
The interesting thing is that this seems to be related to the linguistic comparison because if I change my session to use binary and change the index as follows, the plan looks good:
ALTER session SET nls_comp=binary;
create index log_date on log_data (account_id, log_type, sys_name, log_time);
explain plan for
SELECT *
FROM
(
SELECT log_time, msg
FROM log_data
WHERE
account_id=5
AND log_type=2
AND sys_name='system1'
ORDER BY
log_time DESC
)
WHERE
rownum<10;
SELECT * FROM table(DBMS_XPLAN.DISPLAY);
shows this:
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 18135 | 3 (0)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | VIEW | | 10442 | 20M| 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID | LOG_DATA | 10442 | 20M| 3 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN DESCENDING| LOG_DATE | 9 | | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<10)
4 - access("ACCOUNT_ID"=5 AND "LOG_TYPE"=2 AND "SYS_NAME"='system1')
Beautiful!
no sort.What am I missing? Is it possible to avoid the SORT when using a linguistic comparison?
SORT BY STOPKEY shouldnt such a large issue in this case. For example,
SQL> CREATE TABLE log_data(
2 account_id NUMBER,
3 log_type NUMBER,
4 sys_name VARCHAR2(30),
5 log_time TIMESTAMP,
6 msg VARCHAR2(30)
7 );
Table created.
SQL>
SQL> insert /*+ APPEND */ into log_Data
2 select trunc(dbms_Random.value(1,1000)), mod(rownum,5), 'blah', sysdate - 300 + rownum / 1000, 'x'
3 from dual
4 connect by level <= 1000000;
1000000 rows created.
SQL>
SQL> create index log_date on log_data (account_id, log_type,
2 NLSSORT(sys_name,'NLS_SORT=BINARY_CI'), log_time);
Index created.
SQL>
SQL> ALTER session SET nls_comp=linguistic;
Session altered.
SQL>
SQL> ALTER session SET nls_sort=binary_ci;
Session altered.
SQL>
SQL>
SQL> explain plan for
2 SELECT *
3 FROM
4 (
5 SELECT log_time, msg
6 FROM log_data
7 WHERE
8 account_id=5
9 AND log_type=2
10 AND sys_name='blah'
11 ORDER BY
12 log_time DESC
13 )
14 WHERE
15 rownum<10;
Explained.
SQL>
SQL> SELECT * FROM table(DBMS_XPLAN.DISPLAY);
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
Plan hash value: 2164520535
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 30 | 4 (0)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | VIEW | | 1 | 30 | 4 (0)| 00:00:01 |
|* 3 | SORT ORDER BY STOPKEY | | 1 | 25 | 4 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID BATCHED| LOG_DATA | 1 | 25 | 4 (0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN DESCENDING | LOG_DATE | 1 | | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<10)
3 - filter(ROWNUM<10)
5 - access("ACCOUNT_ID"=5 AND "LOG_TYPE"=2 AND
NLSSORT("SYS_NAME",'nls_sort=''BINARY_CI''')=HEXTORAW('626C616800'))
20 rows selected.
SQL>
SQL> set autotrace traceonly stat
SQL>
SQL> SELECT *
2 FROM
3 (
4 SELECT log_time, msg
5 FROM log_data
6 WHERE
7 account_id=5
8 AND log_type=2
9 AND sys_name='blah'
10 ORDER BY
11 log_time DESC
12 )
13 WHERE
14 rownum<10;
9 rows selected.
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
195 consistent gets
0 physical reads
0 redo size
766 bytes sent via SQL*Net to client
552 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
9 rows processed
SQL>
SQL> set autotrace off
SQL>
SQL> SELECT /*+ gather_plan_statistics */ *
2 FROM
3 (
4 SELECT log_time, msg
5 FROM log_data
6 WHERE
7 account_id=5
8 AND log_type=2
9 AND sys_name='blah'
10 ORDER BY
11 log_time DESC
12 )
13 WHERE
14 rownum<10;
LOG_TIME MSG
--------------------------------------------------------------------------- ------------------------------
19-NOV-17 04.55.17.000000 AM x
19-NOV-17 04.12.05.000000 AM x
15-NOV-17 10.12.05.000000 AM x
09-NOV-17 11.24.05.000000 PM x
06-NOV-17 03.57.41.000000 PM x
03-NOV-17 12.57.41.000000 PM x
01-NOV-17 12.14.29.000000 PM x
30-OCT-17 12.50.29.000000 AM x
08-OCT-17 10.40.53.000000 AM x
9 rows selected.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'))
PLAN_TABLE_OUTPUT
----------------------------------------------------------
SQL_ID 3hub5whtuc5d4, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics */ * FROM ( SELECT log_time,
msg FROM log_data WHERE account_id=5 AND log_type=2
AND sys_name='blah' ORDER BY log_time DESC ) WHERE
rownum<10
Plan hash value: 2164520535
---------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem |
---------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 9 |00:00:00.01 | 195 | |
|* 1 | COUNT STOPKEY | | 1 | | 9 |00:00:00.01 | 195 | |
| 2 | VIEW | | 1 | 1 | 9 |00:00:00.01 | 195 | |
|* 3 | SORT ORDER BY STOPKEY | | 1 | 1 | 9 |00:00:00.01 | 195 | 2048 | 2
| 4 | TABLE ACCESS BY INDEX ROWID BATCHED| LOG_DATA | 1 | 1 | 202 |00:00:00.01 | 195 | |
|* 5 | INDEX RANGE SCAN DESCENDING | LOG_DATE | 1 | 1 | 202 |00:00:00.01 | 3 | |
---------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<10)
3 - filter(ROWNUM<10)
5 - access("ACCOUNT_ID"=5 AND "LOG_TYPE"=2 AND "LOG_DATA"."SYS_NC00006$"=HEXTORAW('626C616800'))
27 rows selected.
We managed to descend the index, and we only sorted the rows AFTER the filter.