Thanks for the question, Biswajit.

Asked: January 14, 2019 - 10:35 am UTC

Last updated: January 15, 2019 - 3:12 pm UTC

Version: 11g

Viewed 10K+ times! This question is

You Asked

I have a table 'raw_traffic' with approximately 35 columns titled

id,
status,
insert_date,
change_date,
device_name,
device_port,
class,
prev_id,
col_1,
col_2,
col_3,
.
.
.
col_N

Whenever I receive fresh traffic data, I need to find the previous row id with a specific combination of device_name, device_port, class, status and some other columns
The approach that we currently have is to run the following query to get the last inserted id.

SELECT MAX(id) FROM raw_traffic WHERE device_name='___', device_port='___', class='___', col_1='___', col_2='___' and status='___'

After retrieving this id we are storing it in the prev_id column of the new row and doing an insert.

It is important to mention that this table at any given point of time has around half a million records and around 10 indexes (one on each of these searchable columns). Consequently the select and the insert operations have become intolerably slow and deteriorating by the day.

In order to improve it I was thinking of putting the retrieve last id function in a PL/SQL block or even a before_insert trigger in the db.
What would the best approach for handling such a scenario. Moreover what should be the most efficient index strategy for such a table ?

and Chris said...

If you want to find the largest value in a column for all the rows that equal some values in other columns, you can create an index on:

- all the columns in your where
- followed by the column you want the max value from

The database can then do a nice efficient min/max scan of the index.

For example, the following searches the table for all rows where the first three columns = 0. And gets the max value for the last one:

create table t as 
  select mod ( level, 2 ) c1,
         mod ( level, 4 ) c2,
         mod ( level, 8 ) c3,
         level c4
  from   dual
  connect by level <= 100;
  
create index i on t ( c1, c2, c3, c4 );
  
set serveroutput off

select /*+ gather_plan_statistics */ 
       max ( c4 )
from   t
where  c1 = 0 
and    c2 = 0 
and    c3 = 0;

select * 
from   table(dbms_xplan.display_cursor(null, null, 'IOSTATS LAST'));

PLAN_TABLE_OUTPUT                                                                                 
SQL_ID  1t8zzb3s0r1fr, child number 0                                                             
-------------------------------------                                                             
select /*+ gather_plan_statistics */         max ( c4 ) from   t where                            
c1 = 0  and    c2 = 0  and    c3 = 0                                                              
                                                                                                  
Plan hash value: 3623458378                                                                       
                                                                                                  
-----------------------------------------------------------------------------------------------   
| Id  | Operation                    | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |   
-----------------------------------------------------------------------------------------------   
|   0 | SELECT STATEMENT             |      |      1 |        |      1 |00:00:00.01 |       1 |   
|   1 |  SORT AGGREGATE              |      |      1 |      1 |      1 |00:00:00.01 |       1 |   
|   2 |   FIRST ROW                  |      |      1 |      1 |      1 |00:00:00.01 |       1 |   
|*  3 |    INDEX RANGE SCAN (MIN/MAX)| I    |      1 |      1 |      1 |00:00:00.01 |       1 |   
-----------------------------------------------------------------------------------------------   
                                                                                                  
Predicate Information (identified by operation id):                                               
---------------------------------------------------                                               
                                                                                                  
   3 - access("C1"=0 AND "C2"=0 AND "C3"=0)

This only accesses one entry in the index. So is about as efficient as you can get.

This only works if you've got equality (=) on all the leading columns of the index columns. If you have any inequalities (<, >=, etc.) you won't get this scan.

For example, the following changes the predicate on c2 to < 1. Instead of = 0.

Although for these data the effect is the same, the database does a normal range scan of the index. Which reads 12 entries instead of 1:

select /*+ gather_plan_statistics */ 
       max ( c4 )
from   t
where  c1 = 0 
and    c2 < 1 
and    c3 = 0;

select * 
from   table(dbms_xplan.display_cursor(null, null, 'IOSTATS LAST'));

PLAN_TABLE_OUTPUT                                                                      
SQL_ID  4sd3havabynyu, child number 1                                                  
-------------------------------------                                                  
select /*+ gather_plan_statistics */         max ( c4 ) from   t where                 
c1 = 0  and    c2 < 1  and    c3 = 0                                                   
                                                                                       
Plan hash value: 163676535                                                             
                                                                                       
------------------------------------------------------------------------------------   
| Id  | Operation         | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |   
------------------------------------------------------------------------------------   
|   0 | SELECT STATEMENT  |      |      1 |        |      1 |00:00:00.01 |       1 |   
|   1 |  SORT AGGREGATE   |      |      1 |      1 |      1 |00:00:00.01 |       1 |   
|*  2 |   INDEX RANGE SCAN| I    |      1 |     12 |     12 |00:00:00.01 |       1 |   
------------------------------------------------------------------------------------   
                                                                                       
Predicate Information (identified by operation id):                                    
---------------------------------------------------                                    
                                                                                       
   2 - access("C1"=0 AND "C3"=0 AND "C2"<1)                                            
       filter("C3"=0)

What would the best approach for handling such a scenario

Hard to say without understanding why you want to get the previous id for these values. What's the purpose of this column? How are you calculating the id values?

Rating

(1 rating)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Biswajit Shaw, January 15, 2019 - 12:11 pm UTC

Hi Chris,
Thanks for the response.

About your questions to get a clearer context of the problem,

1. How do we calculate the id values - we have a sequence in the db and a before_insert trigger which fetches the next value from the sequence and adds it to to-be-inserted data.

CREATE SEQUENCE raw_traffic_gen
START WITH    1
INCREMENT BY  1
MAXVALUE      9999999999
NOORDER
NOCACHE
NOCYCLE;

-- Trigger to set the value of the raw_traffic_gen to the 
-- id column of raw_traffic table for insert operation
CREATE OR REPLACE TRIGGER raw_traffic_on_insert
  BEFORE INSERT ON raw_traffic
FOR EACH ROW
  BEGIN
    IF (:NEW.ID = 0 OR :NEW.ID IS NULL) THEN
      SELECT raw_traffic_gen.NEXTVAL INTO :NEW.ID FROM dual;
  END IF;
END;
/

2. Why we want to get the previous id for these values - once the import of fresh traffic is completed with the prev_id values in place, we have another view which does a join on the same table and a couple of more tables to feed data to another batch job which calculates the difference between the values of certain column like COL1_new_value - COL1_prev_value

CREATE OR REPLACE FORCE VIEW "TEST"."RAW_TRAFFIC_COMPLETE"
( 
  "ID", "C1", "C2", ...
  "ID_PREV", "C1_PREV", "C2_PREV", ... ,
  "TABLE1_COLUMN", ...
) 
AS 
SELECT cur.id, cur.c1, cur.c2, ... ,
       prev.id, prev.c1, prev.c2, ... , 
       t1.COL_1, ...
FROM raw_traffic cur 
LEFT JOIN raw_traffic prev ON prev.id = cur.prev_utz_id  
LEFT JOIN table_1 t1 ON cur.CN = t1.CN 
WHERE C1.status in (__,__)
ORDER BY cur.C1, cur.C2
;

January 15, 2019 - 3:12 pm UTC

1. Although sequences generate incrementing values, they are NOT guaranteed to represent the order you insert rows to the table. RAC in particular will break this.

You CANNOT rely on these values to say which order rows were inserted in your database.

2. You'll have to give more details, but I suspect you can do this with analytic functions instead of a self-join.

Submit a new question with example data ( create table + inserts ) showing what you're trying to do.

More to Explore

Design

New to good database design? Check out Chris Saxon's full fundamentals class.

Breadcrumb

Question and Answer

You Asked

and Chris said...

Rating

Comments

More to Explore

Design

Retrieve the id of the last inserted row with some combination of values and store it in current row

Breadcrumb

Question and Answer

You Asked

and Chris said...

Rating

Comments

More to Explore

Design