Skip to Main Content
  • Questions
  • How does insert and delete use space regarding to indexes?

Breadcrumb

May 4th

Question and Answer

Connor McDonald

Thanks for the question, jian huang.

Asked: January 06, 2009 - 7:33 am UTC

Last updated: October 20, 2018 - 4:15 pm UTC

Version: 9.2.0

Viewed 10K+ times! This question is

You Asked

Hello Tom

I have such as test:
SQL> create table t as select * from dba_objects;

Table created.

SQL> desc t
Name

OWNER
OBJECT_NAME
SUBOBJECT_NAME
OBJECT_ID
DATA_OBJECT_ID
OBJECT_TYPE
CREATED
LAST_DDL_TIME
TIMESTAMP
STATUS
TEMPORARY
GENERATED
SECONDARY

SQL> create index t_ind on t(object_id);

Index created.

SQL> desc segment_space_a
PROCEDURE segment_space_a
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
P_SEGNAME VARCHAR2 IN
P_OWNER VARCHAR2 IN DEFAULT
P_TYPE VARCHAR2 IN DEFAULT
P_PARTITION VARCHAR2 IN DEFAULT

SQL> set serveroutput on size 1000000
SQL> exec segment_space_a(P_SEGNAME=>'T_IND',p_type=>'INDEX');
Unformatted Blocks ..................... 0
FS1 Blocks (0-25) ..................... 0
FS2 Blocks (25-50) ..................... 1
FS3 Blocks (50-75) ..................... 0
FS4 Blocks (75-100)..................... 0
Full Blocks ..................... 65
Total Blocks............................ 80
Total Bytes............................. 655,360
Total MBytes............................ 0
Unused Blocks........................... 7
Unused Bytes............................ 57,344
Last Used Ext FileId.................... 14
Last Used Ext BlockId................... 120
Last Used Block......................... 1

PL/SQL procedure successfully completed.

SQL> delete from t;

29599 rows deleted.

SQL> exec segment_space_a(P_SEGNAME=>'T_IND',p_type=>'INDEX');
Unformatted Blocks ..................... 0
FS1 Blocks (0-25) ..................... 0
FS2 Blocks (25-50) ..................... 66
FS3 Blocks (50-75) ..................... 0
FS4 Blocks (75-100)..................... 0
Full Blocks ..................... 0
Total Blocks............................ 80
Total Bytes............................. 655,360
Total MBytes............................ 0
Unused Blocks........................... 7
Unused Bytes............................ 57,344
Last Used Ext FileId.................... 14
Last Used Ext BlockId................... 120
Last Used Block......................... 1

PL/SQL procedure successfully completed.

SQL> insert into t select * from dba_objects;

29600 rows created.

SQL> exec segment_space_a(P_SEGNAME=>'T_IND',p_type=>'INDEX');
Unformatted Blocks ..................... 76
FS1 Blocks (0-25) ..................... 0
FS2 Blocks (25-50) ..................... 102
FS3 Blocks (50-75) ..................... 0
FS4 Blocks (75-100)..................... 0
Full Blocks ..................... 128
Total Blocks............................ 384
Total Bytes............................. 3,145,728
Total MBytes............................ 3
Unused Blocks........................... 64
Unused Bytes............................ 524,288
Last Used Ext FileId.................... 14
Last Used Ext BlockId................... 776
Last Used Block......................... 64

PL/SQL procedure successfully completed.

SQL>

First I create a table t and index it. then I check the index blocks to see the space consumed.
Then I delete all rows from T and check the index block and as expected , the blocks remain the same... But when I reinsert the *same* rows, I expect the deleted spaces to be reused, since index data has its own place to go and rows are same. but you can see that the blocks increase by almost 55 percent.

Why is that? Since the same rows are inserted ?

Thanks!

and Tom said...

space in an index cannot be reused in the SAME transaction. If you:


delete from t;
1,000,000 rows deleted
insert into t select * from t2;
1,000,000 rows inserted

you would expect to see the indexes effectively double in size. If on the other hand you:

delete from t;
1,000,000 rows deleted
COMMIT;
insert into t select * from t2;
1,000,000 rows inserted


then you would find the index to probably be about the same size - the space would be reused.


Now, if the job was to remove all rows and reload the table, I would recommend:
truncate t;
insert /*+ APPEND */ into t select * from t2;


That would result in the most efficient space utilization. If you cannot have the data "disappear" (eg, this reload must be transactional) you can use partitioning, for example assume you want to reload T from a staging table:

drop table t;
drop table stage;
create table stage as select * from all_objects;

create table t
(
part_id,
OWNER, OBJECT_NAME, SUBOBJECT_NAME, OBJECT_ID, DATA_OBJECT_ID, OBJECT_TYPE, CREATED,
LAST_DDL_TIME, TIMESTAMP, STATUS, TEMPORARY, GENERATED, SECONDARY
)
NOLOGGING
PARTITION BY list (part_id)
(
  PARTITION part1 VALUES (1),
  PARTITION part2 VALUES (2)
)
as 
select 1, stage.* 
  from stage;

create index t_idx on t(owner,object_type,object_name) local nologging;

create or replace procedure refresh_t
as
 l_empty   number;
 l_full    number;
begin
 /* figure out partition to insert into, that'll be the EMPTY one */
 select empty, full into l_empty, l_full
   from (
    select 2 empty, 1 full
      from t partition (part1) 
  where rownum=1 
  union all 
    select 1 empty, 2 full
   from t partition (part2)
  where rownum=1
        );
 insert /*+ APPEND */ into t
 select l_empty, stage.* 
      from stage;

 execute immediate 'alter table t truncate partition part' || to_char(l_full);
end;
/



that'll have space utilized as nicely as possible. Only one partition at a time will appear to have data as far as end users are concerned.

Rating

  (24 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

A reader, January 07, 2009 - 4:37 am UTC

Richard Foote has write a note about index which says the following.

When a index entry is marked as deleted and a subsequent insert occurs in the same block where the deleted entry reside then oracle automatically cleans up the deleted entry.

The note is here

http://richardfoote.wordpress.com/2008/06/18/deleted-index-entries-part-ii-v-2-schneider/

A resent post in OTN forum by him also says the same

http://forums.oracle.com/forums/message.jspa?messageID=3197213#3197213

My question would be, the OP has first created the table with dba_objects and after performing the delete on the table he once again inserted the same data that is from dba_object. So, will not all or at least most of the existing blocks containing the deleted entry be revisited during the insert, And hence forth cleaning up most of the deleted entries?

Have i understood the concept explained by Richard Foote in a wrong way. Please correct me.

Thanks,
Karthick.
Tom Kyte
January 07, 2009 - 9:34 am UTC

deleted index space *cannot* be reused in a single transaction.

If you throw a commit in there - then it can (and will) be reused.

consider:

ops$tkyte%ORA9IR2> create table t as select * from all_objects;

Table created.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> create index t_idx on t(object_name);

Index created.

ops$tkyte%ORA9IR2> analyze index t_idx validate structure;

Index analyzed.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> select name, blocks, lf_blks, br_blks, lf_rows, del_lf_rows, br_rows from index_stats;

NAME      BLOCKS    LF_BLKS    BR_BLKS    LF_ROWS DEL_LF_ROWS    BR_ROWS
----- ---------- ---------- ---------- ---------- ----------- ----------
T_IDX        256        149          1      30691           0        148

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> delete from t;

30691 rows deleted.

ops$tkyte%ORA9IR2> COMMIT;

Commit complete.

ops$tkyte%ORA9IR2> insert into t select * from all_objects;

30692 rows created.

ops$tkyte%ORA9IR2> COMMIT;

Commit complete.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> analyze index t_idx validate structure;

Index analyzed.

ops$tkyte%ORA9IR2> select name, blocks, lf_blks, br_blks, lf_rows, del_lf_rows, br_rows from index_stats;

NAME      BLOCKS    LF_BLKS    BR_BLKS    LF_ROWS DEL_LF_ROWS    BR_ROWS
----- ---------- ---------- ---------- ---------- ----------- ----------
T_IDX        256        149          1      30692           0        148

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> delete from t;

30692 rows deleted.

ops$tkyte%ORA9IR2> insert into t select * from all_objects;

30692 rows created.

ops$tkyte%ORA9IR2> COMMIT;

Commit complete.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> analyze index t_idx validate structure;

Index analyzed.

ops$tkyte%ORA9IR2> select name, blocks, lf_blks, br_blks, lf_rows, del_lf_rows, br_rows from index_stats;

NAME      BLOCKS    LF_BLKS    BR_BLKS    LF_ROWS DEL_LF_ROWS    BR_ROWS
----- ---------- ---------- ---------- ---------- ----------- ----------
T_IDX        384        310          1      61384       30692        309





it is all about the fact that index space cannot be reused in the same transaction that freed it

Thanks!

jian huang zheng, January 07, 2009 - 12:02 pm UTC

Nice explaination, Thanks!

Richard Foote, January 08, 2009 - 2:01 am UTC

Yes, a deleted index entry will not be cleaned out within the same transaction.

However, note that with a UNIQUE index, a deleted entry can be subsequently reused if re-inserted within the same transaction.

So with the orginal OP index on object_id, if it was created as a Unique index, and the deleted / insert applied, you would find the index size would remain the same.

Cheers
Richard Foote
http://richardfoote.wordpress.com/
Tom Kyte
January 08, 2009 - 9:20 am UTC

Richard -

thanks much for the followup tweak :)

Confused

Narendra, February 11, 2009 - 10:23 am UTC

Hello Tom/Richard,

I am bit confused here. In this post, you have demonstrated above that index space is not reused within the same transaction (in 9i). But in a very old review ( http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:2424495760219#5404349841171 ),
you have demonstrated how the deleted space is reused (in 8i). That example seems to achieve space-reuse within the same transaction.
Can you please help me in understanding the same ?
Tom Kyte
February 11, 2009 - 1:00 pm UTC

I see commits all over that example you pointed to - why do you think it was the same transaction, be more specific

Example

Narendra, February 12, 2009 - 4:22 am UTC

Tom,

Here is a copy-paste of the example I was talking about

You can test your theories about the empno 1-10 easily on your own systems -- give it a whirl. The
easiest way to bust a myth is to give a counter case! I'll do this one for you:


ops$tkyte@ORA817DEV.US.ORACLE.COM> create table emp ( empno int constraint emp_pk primary key,
ename varchar2(30) );
Table created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> insert into emp select rownum, username from all_users;
46 rows created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> analyze index emp_pk validate structure;
Index analyzed.

ops$tkyte@ORA817DEV.US.ORACLE.COM> select lf_rows, del_lf_rows from index_stats;

LF_ROWS DEL_LF_ROWS
---------- -----------
46 0

that shows there are 46 leaf row entries -- 46 index entries

ops$tkyte@ORA817DEV.US.ORACLE.COM> delete from emp where empno <= 10;
10 rows deleted.

ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> analyze index emp_pk validate structure;
Index analyzed.

ops$tkyte@ORA817DEV.US.ORACLE.COM> select lf_rows, del_lf_rows from index_stats;

LF_ROWS DEL_LF_ROWS
---------- -----------
46 10

That shows there are still 46 entries HOWEVER, 10 of them happened to be "deleted" -- not
pointing to an active row in the table anymore...


ops$tkyte@ORA817DEV.US.ORACLE.COM> insert into emp select rownum, lower(username) from all_users
where rownum <= 10;

10 rows created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> analyze index emp_pk validate structure;

Index analyzed.

ops$tkyte@ORA817DEV.US.ORACLE.COM> select lf_rows, del_lf_rows from index_stats;

LF_ROWS DEL_LF_ROWS
---------- -----------
46 0

Lo and behold, there are still only 46 leaf block entries (rows) and now there are 0 deleted
ones. That proves that we'll reuse space. Just to make it more interesting, we'll do the same
thing but we'll reuse the entries WITH DIFFERENT values (not 1..10)

Tom Kyte
February 12, 2009 - 12:16 pm UTC

well two things

a) a unique index can reuse space within the same transaction (that was just mentioned right above - Richard Foote comment)

b) analyze is ddl, ddl commits - the analyze are commits.

ops$tkyte%ORA11GR1> create table t ( x int constraint t_pk primary key );

Table created.

ops$tkyte%ORA11GR1> insert into t values ( 1 );

1 row created.

ops$tkyte%ORA11GR1> analyze index t_pk validate structure;

Index analyzed.

ops$tkyte%ORA11GR1> rollback;

Rollback complete.

ops$tkyte%ORA11GR1> select * from t;

         X
----------
         1



if analyze didn't commit, the rollback would undo the insert.

Thanks

Narendra, February 13, 2009 - 4:14 am UTC

Thanks Tom.
I missed out both the aspects, unique & analyze as DDL.

Michael Virnstein, February 27, 2009 - 3:08 am UTC

Hi Tom,

when i have a table, where i frequently delete and insert lots of rows and this table has a primary key populated by a sequence, will the deleteted space be reused on that primary key index, when new rows get inserted or will it just grow? I see lots of DEL_LF_ROWS in that index.
Is an reverse key index a good solution for such a primary key index to reuse space?

Regards, Michael
Tom Kyte
March 03, 2009 - 9:36 am UTC

do you delete MOST but not ALL of the old rows? If so, you might need to coalesce that index.


do you delete large range of contigous rows? for example: where id between 100 and 10000. Then the index will take care of itself over time. You would expect to see deleted leaf rows after a delete, but the subsequent inserts will cause fully empty leaf blocks to move over from the left to the right and get reused over time.

Michael Virnstein, March 03, 2009 - 12:14 pm UTC

Thanks for your answer, tom.

> do you delete MOST but not ALL of the old rows? If so, you might need to coalesce that index.

we do not delete all of the rows and we don't delete ranges. The pks of the rows involved are quite random, so i would have to go with the coalesce. This approach isn't possible at the moment, because we are using standard edition of 10g. I also wouldn't want to coalesce periodically. I read some of your posts regarding index rebuilding, where you said to better fix the problem once and not maintaining something over and over.
As i don't need to query the index with "pk > :x" or "max(pk)", i thought a reverse index would probably be a solution, because it would result in a more even data distribution and deleted space would be reused more quickly than waiting that an index block will be empty completely.
Is there something wrong with my approach?
Tom Kyte
March 03, 2009 - 9:04 pm UTC

given your constraints, it would be something to definitely consider - yes.

Index "Leaking" with regular DELETEs and INSERTs

Hemant K Chitale, March 04, 2009 - 1:20 am UTC

In a Peoplesoft application, I have a table which udergoes periodic DELETEs (ie in batch jobs / process requests) :
"DELETE FROM PS_COMB_EXP_TAO4 WHERE PROCESS_INSTANCE >= 0",
followed by an INSERT.
Every INSERT is then followed by a COMMIT.
(Agreed, that there may be no COMMIT between the DELETE and the INSERT and, since the two Indexes are not Unique, index leaf blocks may not be reused by the same transaction ... but the *next* transaction should be able to re-use the blocks ?)

However, I find that the Index segment keeps growing. Even queries (from a seperate session) also show very high consistent gets when using the Index :

SQL> set autotrace on
SQL> select count(*) from sysadm.ps_comb_exp_tao4 where process_instance > 0;
COUNT(*)
----------
0
Execution Plan
----------------------------------------------------------
0
SELECT STATEMENT Optimizer=ALL_ROWS (Cost=3 Card=1 Bytes=4)
1 0
SORT (AGGREGATE)
2 1
INDEX (FAST FULL SCAN) OF 'PSACOMB_EXP_TAO4' (INDEX) (Cost=3 Card=500 Bytes=2000)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
6775 consistent gets
6401 physical reads
0 redo size
335 bytes sent via SQL*Net to client
427 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

SQL> select /*+ FULL (c) */ count(*) from sysadm.ps_comb_exp_tao4 c;
COUNT(*)
----------
0
Execution Plan
----------------------------------------------------------
0
SELECT STATEMENT Optimizer=ALL_ROWS (Cost=4 Card=1)
1 0
SORT (AGGREGATE)
2 1
TABLE ACCESS (FULL) OF 'PS_COMB_EXP_TAO4' (TABLE) (Cost=4 Card=500)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
137 consistent gets
0 physical reads
0 redo size
335 bytes sent via SQL*Net to client
427 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed


The optimizer is, unfortunately, using the Index for even executing the DELETEs.  Since such DELETEs happen frequently through the day, from standard Peoplesoft code, I need to find why the Index keeps growing.
COALESCE/REBUILD would "fix" the issue but would have to be scheduled as an additional job.

(The current Execution Plan shows a CARD of 500 as that is from the "Locked" Statistics on the table).

Tom Kyte
March 04, 2009 - 1:11 pm UTC

table create
index create
more information on the number of rows deleted/added over time.


why are the statistics so bad on this? It cost the fast full scan at 3, the table scan at 4. If the index is BIGGER than the table, that would be 'wrong'. Are the statistics even remotely "correct" for this stuff?

Information about the table and "leaking" index

Hemant K Chitale, March 05, 2009 - 1:32 am UTC

SQL> desc sysadm.ps_comb_exp_tao4
 Name                                                                     Null?    Type
 ------------------------------------------------------------------------ -------- --------------------------
 PROCESS_INSTANCE                                                         NOT NULL NUMBER(10)
 SETID                                                                    NOT NULL VARCHAR2(5)
 TREE_EFFDT                                                                        DATE
 PROCESS_GROUP                                                            NOT NULL VARCHAR2(10)
 COMBINATION                                                              NOT NULL VARCHAR2(10)
 SEQUENCE_NBR_6                                                           NOT NULL NUMBER(38)
 CHARTFIELD                                                               NOT NULL VARCHAR2(18)
 RANGE_FROM_30                                                            NOT NULL VARCHAR2(30)
 EFFDT_FROM                                                                        DATE
 EFFDT_TO                                                                          DATE

SQL> @which_indexes
Enter value for table_name: PS_COMB_EXP_TAO4
Enter value for table_owner: SYSADM

Index                               Column                          Pos
----------------------------------- ------------------------------ ----
PSACOMB_EXP_TAO4                    PROCESS_INSTANCE                  1
                                    SETID                             2
                                    PROCESS_GROUP                     3
                                    CHARTFIELD                        4
                                    RANGE_FROM_30                     5
                                    EFFDT_FROM                        6
                                    EFFDT_TO                          7
                                    COMBINATION                       8
                                    SEQUENCE_NBR_6                    9


PSBCOMB_EXP_TAO4                    PROCESS_INSTANCE                  1
                                    SETID                             2
                                    PROCESS_GROUP                     3
                                    CHARTFIELD                        4
                                    COMBINATION                       5
                                    SEQUENCE_NBR_6                    6
                                    RANGE_FROM_30                     7
                                    EFFDT_FROM                        8
                                    EFFDT_TO                          9



18 rows selected.

SQL>

The Statistics, as I've said, have been "LOCKED" at a count of 500 rows.  The table undergoes INSERT and DELETE and INSERT (most likely with no COMMITs in between, only at the end of "processing") in sets of a few hundred to a max of 20,000 rows.

What seems strange to me is that an Index lookup has to read so very many blocks -- as if Index Leaf Block cleanup isn't occurring.

PROCESS_INSTANCE is the leading column of the index and every "set" of operations (DELETE INSERT) is for a *new* PROCESS_INSTANCE, so "older" PROCESS_INSTANCE values are not being re-used

Tom Kyte
March 05, 2009 - 1:47 pm UTC

unlock the statistics, you are doing this to yourself, we don't know what the structures look like - hence you will get 'bad' plans.


... as if Index Leaf Block cleanup isn't occurring.

...


the table is empty, until the table contains more rows, these leaf blocks have no place to go - you need fresh data in there to use them.


there are no create tables, no create indexes here. But anyway, this is happening (bad plans) because you have tricked us into thinking the index is very very very small - smaller than the table.

And right now, it isn't.


Concentrate on the Index Size

Hemant K Chitale, March 05, 2009 - 10:50 pm UTC

>the table is empty, until the table contains more rows, these leaf blocks have no place to go - you need fresh data in there to use them.

That is why I say that the table undergoes (very many) cycles of DELETE .. INSERT .. end transaction.

I can understand leaf blocks not being reused between the DELETE and INSERT in 1 transaction.

But why does the next transaction still result in the index growing further.

The next transaction that runs DELETE .. INSERT .. end transaction is not cleaning up the leaf blocks left by the first transaction. So, after the DELETE the INSERT in this second transaction allocates new leaf blocks again.

and so on for few thousand transactions every day.


My question isn't about the execution plan and I do know how the (incorrect) statistics can drive the wrong execution plan. My concern here is about the index growing.



Tom Kyte
March 06, 2009 - 10:11 am UTC

Can you show us the empirical information that shows this index is growing over time?

I know it grew, but you say "it keeps growing", can you provide us a history of the growth and what happened over time (I think it has grown, and here it is, you have what you have - it is done growing, but it is currently big).

You might need to schedule a coalesce on this periodically. The delete+insert will never reuse the space and if you have more than one going on concurrently......


If you don't have the empirical information over time - coalesce or rebuild it, and start a job that monitors the growth of this segment over time (eg; every six hours, have something come in an check out the index size and save the information in a table)

why unique index can while non-unique cant?

jian huang zheng, March 07, 2009 - 11:09 am UTC

Hello Tom

I just wonder why oracle permits in a single transaction to reuse the space for the unique index, while non-unique just cant?

what is the technical reason behind this?

Thanks!

Index Growth Statistics

Hemant K Chitale, March 08, 2009 - 10:25 pm UTC

Tom,  To monitor the "growth" (or "leakage") in the Index sizes :

We had enabled collection of size information only during Overnight Batch runs (11pm to 8am).  However, the table is also used by processes running for user requests.

Thus, I see "leakage" at different times (but no statistics captured between 8am and 11pm) :

(higlighting only the timestamps where I see an increase in the size)

SQL> l
  1  select segment_name, collection_datestamp, blocks
  2  from segments_size_history
  3  where segment_name in ('PSACOMB_EXP_TAO4','PSBCOMB_EXP_TAO4')
  4  and to_char(collection_datestamp,'DD-MON')  = '04-MAR'
  5  and (
  6  to_char(collection_datestamp,'MI') like '%2'
  7  )
  8* order by segment_name, collection_datestamp
SQL> /

SEGMENT_NAME              COLLECTION_D     BLOCKS
------------------------- ------------ ----------
PSACOMB_EXP_TAO4          04-MAR 00:02     215552

PSACOMB_EXP_TAO4          04-MAR 04:42     215552
PSACOMB_EXP_TAO4          04-MAR 04:52     215552

PSACOMB_EXP_TAO4          04-MAR 05:02     216576

PSACOMB_EXP_TAO4          04-MAR 05:12     219776
PSACOMB_EXP_TAO4          04-MAR 05:22     219776

PSACOMB_EXP_TAO4          04-MAR 05:32     223104
PSACOMB_EXP_TAO4          04-MAR 05:42     223104

PSACOMB_EXP_TAO4          04-MAR 07:02     223104
PSACOMB_EXP_TAO4          04-MAR 07:12     223104

PSACOMB_EXP_TAO4          04-MAR 23:42     264704
PSACOMB_EXP_TAO4          04-MAR 23:52     264704


PSBCOMB_EXP_TAO4          04-MAR 00:02     255232

PSBCOMB_EXP_TAO4          04-MAR 04:32     255232
PSBCOMB_EXP_TAO4          04-MAR 04:42     255232

PSBCOMB_EXP_TAO4          04-MAR 04:52     256256

PSBCOMB_EXP_TAO4          04-MAR 05:02     261120
PSBCOMB_EXP_TAO4          04-MAR 05:12     261120

PSBCOMB_EXP_TAO4          04-MAR 05:22     262144

PSBCOMB_EXP_TAO4          04-MAR 05:32     265216

PSBCOMB_EXP_TAO4          04-MAR 07:02     265216
PSBCOMB_EXP_TAO4          04-MAR 07:12     265216

PSBCOMB_EXP_TAO4          04-MAR 23:42     320128
PSBCOMB_EXP_TAO4          04-MAR 23:52     320128


SQL>

Tom Kyte
March 09, 2009 - 3:53 am UTC

this is a very short period of time, does it keep growing forever, or does it stop.


I'll have to ask again - does the delete delete everything?

how does the delete work? show the syntax of it with representative inputs

describe the column(s) this index is on - in detail. their types, their values.

does more than one session do a delete+insert at or near the same point in time.

We really need to understand all of the processing here.

Operations against the Table

Hemant K Chitale, March 10, 2009 - 2:54 am UTC

I have, earlier, provided the DELETE statement, the manner of usage and the Index definitions.

The DELETE always deletes *all* the rows.  

Nevertheless, here's a listing :
SQL> l
  1  select count(*), substr(sql_text,1,182), sum(executions), sum(buffer_gets), sum(cpu_time), sum(elapsed_time), sum(rows_processed)
  2  from v$sqlstats
  3  where upper(sql_text) like '%PS_COMB_EXP_TAO4%'
  4  and (upper(sql_text) like 'DELETE%' or upper(sql_text) like 'INSERT INTO %' )
  5  group by substr(sql_text,1,182)
  6* order by 2
SQL> /

  COUNT(*)                                                                                                                          
----------                                                                                                                          
SUBSTR(SQL_TEXT,1,182)                                                                                                              
------------------------------------------------------------------------------------------------------------------------------------
SUM(EXECUTIONS) SUM(BUFFER_GETS) SUM(CPU_TIME) SUM(ELAPSED_TIME) SUM(ROWS_PROCESSED)                                                
--------------- ---------------- ------------- ----------------- -------------------                                                
         1                                                                                                                          
DELETE FROM PS_COMB_EXP_TAO4 WHERE PROCESS_INSTANCE >= 0                                                                            
           5414        158238699     571747534         632733408            17060606                                                
                                                                                                                                    
         15                                                                                                                          
INSERT INTO PS_COMB_EXP_TAO4 (PROCESS_INSTANCE, SETID, TREE_EFFDT, PROCESS_GROUP, COMBINATION, SEQUENCE_NBR_6, CHARTFIELD, RANGE_FRO
M_30, EFFDT_FROM, EFFDT_TO) SELECT 2224871, A.SETI                                                                                  
             15               74         54876            252910                   2  
.... not listing every INSERT statement .....
--------------- ----------------               ----------------- -------------------                                                
                                                                                                                                    
sum                                                                                                                                 
                                                                                                                                    
           7367        165663154                       817821184            18600907                                                
                                                                                                                                    


there are 38 different INSERT statements. Each execution of an INSERT has a specific PROCESS_INSTANCE value -- thus an INSERT will only provide one PROCESS_INSTANCE for all the rows it inserts.

These operations happen concurrently between 1 to 3 sessions.

Tom Kyte
March 11, 2009 - 2:23 am UTC

...I have, earlier, provided the DELETE statement, the manner of usage and the
Index definitions.
....

and I don't page up/down/all around trying to re-find if the information was provided, this is a review/followup section - short, sweet, get tons of them every day. In between reading your snippets, I read dozens - if not hundreds - of other review/followups.

"The DELETE always deletes *all* the rows. "

and why can't we truncate?



and, this is a very short period of time, does it keep growing forever, or does it stop.


Not very helpful response

Hemant K Chitale, March 12, 2009 - 4:27 am UTC

> Why can't we truncate ?
Yes, I know about TRUNCATE. I also know about REBUILD. I also know about COALESCE.
This is Peoplesoft (ever heard of Peoplesoft ?) delivered code.
Yes, I have an SR with Peoplesoft and the response has been "I have searched Knowledge base and also if any existing bugs on theze kind of issues.There are no known issues reported for this table in PeopleSoft.
There are no specific inserts and deletes against this table."
which isn't helpful because I can see Peoplesoft delivered coded doing INSERTs and DELETEs.
All I need to know is why the index size keeps growing on and on. By more than 1GB over 20 days.

If you don't have the time to scroll up and down .... (I wonder if every asktom thread repeats information).
I know that a number of your supporters are going to jump on me for the title "Not very helpful response". That's just too bad. But if you offer to help someone, please put in the effort to help someone.
Tom Kyte
March 12, 2009 - 12:53 pm UTC

yes, I've heard of peoplesoft, did you know with peoplesoft you can sometimes write your own code, that not all code is delivered that a lot of code is developed - by you? did you know that? Do I know that you guys didn't write this code? do I? No, I do not.


Sorry you didn't find this back and forth useful. Please schedule a coalesce in the meantime - or consider unlocking the statistics.

... If you don't have the time to scroll up and down .... (I wonder if every asktom
thread repeats information).
....

if they want me to consider looking at *a REVIEW FOLLOWUP WHICH ISN"T SUPPOSED TO BE a new question* sure they do.


And you know what, if you KNOW there are deletes/inserts being done against this table, you'd be able to tell support the module that is involved and give them more information...


All I can suspect here (since I cannot reproduce myself) is that you have more concurrent "delete+inserts" happening at the same time than you believe you do - OR (just as likely) your analysis of how the table is used is flawed at some level, it doesn't happen as simply as you describe.

And no, I don't think it is horrible for me to ask you to reiterate something once or twice - you are very very close to the problem, you have all of the information. Did you hit page up and see how big this page is? How big all of the pages are? How hard it can be to scrape together the bits of data?

We'll just go ahead with REBUILDs

Hemant K Chitale, March 12, 2009 - 11:51 pm UTC

First, sorry I riled you. My apologies.

I can't reproduce the "index size leakage" using a dummy table and dummy data.

Some other references are at
http://tech.groups.yahoo.com/group/psftdba/message/2533

I know the dropping or rebuilding the indexes would "fix" the issue but I coulnd't find out why the indexes grow.


Tom Kyte
March 13, 2009 - 7:28 am UTC

suggest coalesces over rebuilds, rebuilds could lead easily to ora-8103's (object no longer exists) as it drops the old segment - coalesce - always online, in place operation.

unless and until we can reproduce the issue - it'll be hard to answer "why", that is what I was trying to get to (as well as suggesting things that would avoid it in the first place regardless). If we cannot simulate it using your description of what is happening - then something else is happening beyond what you are describing (we are missing part of the equation - part of the puzzle - the description of what is taking place is not 100% there yet, we have a missing fact or two). And if and when we discover that - we'll likely be able to answer "why"

Insert/Delete Cycles On Table With Unique And Non-Unique Index

Dave, January 18, 2010 - 1:07 pm UTC

We are using Oracle 10gR2 with a table that is having some unexpected index contention waits. The table is hash partitioned on a transaction ID field (64-way hash). There is a local primary key which includes this field, along with a couple of other fields. The transaction ID field is essentially a sequence managed by an external application and is alwasy increasing. There are constant single-row inserts into the table from multiple sessions - about 100-200 per second depending on time of day. We have to keep four days of data, and the primary key does not have a date field on which we can do range-partitioning, so we have to do deletes. We delete from one hash partition at a time, where the insert date is less than or a specific date about 4 days ago. To prevent this from locking a lot of rows, we put a clause to limit rownum to less than 5000 and do a commit afterward. We run this delete/commit repeatedly until all the records that need to be deleted are deleted. There is a local, non-unique index on the insert date field, which is used by the deletes. So, both the non-unique index and primary key index are right-growing indexes, and we regularly delete rows from the table that would be on the left side of the index.

This has been going fine for about 3 months now, but we are suddenly starting to see strange index contention waits - mostly related to the insert date index. Multiple inserts into the table all seem to wait for up to 16 seconds, and then at the same time they suddenly are able to insert the record and contiue. The wait events on the AWR show index contention, but it seems more like a locking operation. The rows with the waits are all on the same hash-partition of the table. The problem does not occur while the delete operations are occuring on the same table hash-partition, but are in the range of 5-10 minutes after. We delete every hour, so the link between the delete operation and the index contention is anecdotal at best, but seems to be a logical explanation.

Given the previous entries in this thread, it seems the local primary key index should be re-using the blocks but that the non-unique index on the insert date will not. Does this mean we probably only need to rebuild the local non-unique index on the insert date, or should be consider rebuilding both that and the local primary key index?
Tom Kyte
January 19, 2010 - 3:54 pm UTC

... To prevent this from locking a lot of rows, we put a clause
to limit rownum to less than 5000 and do a commit afterward. ...

I didn't get this, if you are DELETING them, why do you care if they are locked? What could be, what would be logically trying to modify data you are getting rid of? If there would be a contention issue - I'd say you have a big logic flaw. You are deleting active data. If the data isn't active, no one is modifying it, then who cares if you lock it? No one will be touching it.... I'm confused.


... Given the previous entries in this thread, it seems the local primary key index
should be re-using the blocks but that the non-unique index on the insert date
will not. ...

I'm not sure how to read that - it sounds like you are saying "this page tells us the unique index will reuse space but the non-unique will not". If that is what you meant, please point out where you got that from - it is not true.


I think your cause and effect observation might not be true - why does it seem logical to combine the two since they happen so far apart from each other in time?

do you have the AWR data in report form for us to look at?

script to reproduce growing/leaking PK index

Alex Koop, October 06, 2011 - 3:30 am UTC

In our production, tables which have a lot of DML operations are a real headache! There are some kind of message dispatching tables, which experience a lot of inserts and deletions, and at end of a day their indexes have hundreds of MB in size, although they contain only few thousands rows. We do a index rebuild on such indexes every night and only this way helps.
Finally I have a script, which helps to reproduce this problem (similar to the problem in PeopleSoft software described above):

CREATE TABLESPACE TEST_TBS DATAFILE
'test_tbs' SIZE 50000K AUTOEXTEND ON NEXT 1000K MAXSIZE UNLIMITED
LOGGING
ONLINE
PERMANENT
EXTENT MANAGEMENT LOCAL UNIFORM SIZE 128K
BLOCKSIZE 8K
SEGMENT SPACE MANAGEMENT AUTO
FLASHBACK ON;
/

CREATE TABLE t1 (t1_key NUMBER(11), t2_col NUMBER(11)) TABLESPACE TEST_TBS;
/

CREATE UNIQUE INDEX pk_t1 ON t1(t1_key) TABLESPACE TEST_TBS;
/

ALTER TABLE t1 ADD CONSTRAINT pk_t1 PRIMARY KEY (t1_key) USING INDEX pk_t1;
/

DECLARE
idx NUMBER (11);
bulk_size NUMBER(11);
BEGIN

EXECUTE IMMEDIATE 'TRUNCATE TABLE t1';

bulk_size := 1000;
idx := 0;

WHILE idx < 2000
LOOP
DELETE FROM t1;
COMMIT;

INSERT INTO t1 (t1_key)
(SELECT LEVEL + idx * bulk_size
FROM DUAL
CONNECT BY LEVEL < bulk_size);

COMMIT;

idx := idx + 1;
END LOOP;
END;
/

After execution of this script, the index PK_T1 is at least 5-10 MB big, although table T1 contains only 1000 rows. If I use 'SEGMENT SPACE MANAGEMENT MANUAL' option when creating tablespace, then this problem does not occur and index PK_T1 is only 64K.

INSERT grows

Racer I., October 08, 2018 - 6:39 am UTC

Hi,

I found this thread when searching for the problem described in my next entry and as there wasn't an answer to the last entry above I'll put my findings here. Namely that this does indeed happen (12.1, without special TS settings, which means AS(S)M, I think).
My guess : optimized index maintenance for BULK-DML (INSERT as SELECT) where Oracle builds a mini-index for just the new values then merges it into the existing structure. This seems to bypass the free-list, at least those blocks that would have to be unhooked from the index structure and gets the necessary blocks directly from new extents. For monotonously growing indexes the merge will just stitch the new part onto the end of the old, even if that was completely deleted. That seems to happen in the example above. Although previously I thought the mini-index+merge was only done for APPEND/DIRECT-path and left the index unusable but maybe Oracle has optimized this further? Cant't seem to find any documentation for this though.

If this is the only use-case of that index it will grow indefinitely and never reuse deleted blocks from the free-list.
For non-monotonously growing indexes the merging will fill existing blocks and the resulting splits will probably reuse free-list blocks.
For monotonously growing indexes with INSERTS from many concurrent sessions each will get a random selection of key values so all but the first insert will do merging instead of stitching. This will reuse free-list blocks, but possibly not all of them so the index will still grow from the constant use of new extents for the temporary mini-index.
Its possible this kicks in also for JDBC-Batching (preparedStatement.executeBatch) and PL/SQL-Batching (FORALL...INSERT) and even if the number of inserted rows is small or even just one row (not tested). In the latter case the space waste could be significant.

regards,
Connor McDonald
October 10, 2018 - 3:08 am UTC

OK, here's my hypothesis. It would not suprise me if the entry from October 06, 2011 indeed revealed some sort of anomalous behaviour. Over time, there's been some issues with ASSM in terms of "runaway" segment growth, but I've rarely seen them on more recent releases. So I repeated the script, scaled up to 9000 iterations on both ASSM and non-ASSM. So first with ASSM

SQL> CREATE TABLE t1 (t1_key NUMBER(11), t2_col NUMBER(11)) TABLESPACE largets;

Table created.

SQL> CREATE UNIQUE INDEX pk_t1 ON t1(t1_key) TABLESPACE largets;

Index created.

SQL> ALTER TABLE t1 ADD CONSTRAINT pk_t1 PRIMARY KEY (t1_key) USING INDEX pk_t1;

Table altered.

SQL>
SQL> set serverout on
SQL> DECLARE
  2     idx   NUMBER (11);
  3     bulk_size NUMBER(11);
  4     s number;
  5     t1 timestamp;
  6     t2 timestamp;
  7  BEGIN
  8
  9     EXECUTE IMMEDIATE 'TRUNCATE TABLE t1';
 10
 11     bulk_size := 1000;
 12     idx := 0;
 13
 14     WHILE idx < 9000
 15     LOOP
 16        DELETE FROM t1;
 17        COMMIT;
 18
 19        t1 := systimestamp;
 20        INSERT INTO t1 (t1_key)
 21           (SELECT LEVEL + idx * bulk_size
 22              FROM DUAL
 23           CONNECT BY LEVEL < bulk_size);
 24        t2 := systimestamp;
 25
 26        COMMIT;
 27
 28        idx := idx + 1;
 29        select bytes into s from user_segments where segment_name = 'PK_T1';
 30        dbms_output.put_line(lpad(idx,8)||' - '||lpad(s,12)||', elapsed='||(t2-t1));
 31     END LOOP;
 32  END;
 33  /
1 -        65536, elapsed=+000000000 00:00:00.008000000
2 -        65536, elapsed=+000000000 00:00:00.003000000
3 -       131072, elapsed=+000000000 00:00:00.002000000
4 -       131072, elapsed=+000000000 00:00:00.003000000
5 -       131072, elapsed=+000000000 00:00:00.002000000
6 -       131072, elapsed=+000000000 00:00:00.002000000
7 -       131072, elapsed=+000000000 00:00:00.003000000
8 -       131072, elapsed=+000000000 00:00:00.002000000
9 -       131072, elapsed=+000000000 00:00:00.003000000
10 -       131072, elapsed=+000000000 00:00:00.003000000
11 -       131072, elapsed=+000000000 00:00:00.002000000
...
...
175 -      2097152, elapsed=+000000000 00:00:00.002000000
176 -      2097152, elapsed=+000000000 00:00:00.002000000
177 -      2097152, elapsed=+000000000 00:00:00.002000000
178 -      2097152, elapsed=+000000000 00:00:00.004000000
179 -      2097152, elapsed=+000000000 00:00:00.004000000
180 -      2097152, elapsed=+000000000 00:00:00.003000000
181 -      2097152, elapsed=+000000000 00:00:00.002000000
182 -      2097152, elapsed=+000000000 00:00:00.002000000
183 -      2097152, elapsed=+000000000 00:00:00.002000000
184 -      2097152, elapsed=+000000000 00:00:00.002000000
185 -      2097152, elapsed=+000000000 00:00:00.003000000
186 -      2097152, elapsed=+000000000 00:00:00.002000000
187 -      2097152, elapsed=+000000000 00:00:00.002000000
...
...

8995 -      2097152, elapsed=+000000000 00:00:00.002000000
8996 -      2097152, elapsed=+000000000 00:00:00.002000000
8997 -      2097152, elapsed=+000000000 00:00:00.002000000
8998 -      2097152, elapsed=+000000000 00:00:00.002000000
8999 -      2097152, elapsed=+000000000 00:00:00.002000000
9000 -      2097152, elapsed=+000000000 00:00:00.003000000

PL/SQL procedure successfully completed.

SQL>


We grow to 2meg, but we never got any larger than that. And that is pretty much how I would expect ASSM to work, since thats almost a "design principle" for ASSM, namely, we're happy to be more loose with space in order to better serve concurrency/RAC etc. (The latter obviously not being tested in this scenario).

And on no-assm, we got a better storage result

SQL> CREATE TABLE t1 (t1_key NUMBER(11), t2_col NUMBER(11)) TABLESPACE no_assm;

Table created.

SQL> CREATE UNIQUE INDEX pk_t1 ON t1(t1_key) TABLESPACE no_assm;

Index created.

SQL> ALTER TABLE t1 ADD CONSTRAINT pk_t1 PRIMARY KEY (t1_key) USING INDEX pk_t1;

Table altered.

SQL>
SQL> set serverout on
SQL> DECLARE
  2     idx   NUMBER (11);
  3     bulk_size NUMBER(11);
  4     s number;
  5     t1 timestamp;
  6     t2 timestamp;
  7  BEGIN
  8
  9     EXECUTE IMMEDIATE 'TRUNCATE TABLE t1';
 10
 11     bulk_size := 1000;
 12     idx := 0;
 13
 14     WHILE idx < 9000
 15     LOOP
 16        DELETE FROM t1;
 17        COMMIT;
 18
 19        t1 := systimestamp;
 20        INSERT INTO t1 (t1_key)
 21           (SELECT LEVEL + idx * bulk_size
 22              FROM DUAL
 23           CONNECT BY LEVEL < bulk_size);
 24        t2 := systimestamp;
 25
 26        COMMIT;
 27
 28        idx := idx + 1;
 29        select bytes into s from user_segments where segment_name = 'PK_T1';
 30        dbms_output.put_line(lpad(idx,8)||' - '||lpad(s,12)||', elapsed='||(t2-t1));
 31     END LOOP;
 32  END;
 33  /
1 -      1048576, elapsed=+000000000 00:00:00.006000000
2 -      1048576, elapsed=+000000000 00:00:00.002000000
3 -      1048576, elapsed=+000000000 00:00:00.002000000
4 -      1048576, elapsed=+000000000 00:00:00.002000000
5 -      1048576, elapsed=+000000000 00:00:00.002000000
6 -      1048576, elapsed=+000000000 00:00:00.002000000
...
...

8994 -      1048576, elapsed=+000000000 00:00:00.002000000
8995 -      1048576, elapsed=+000000000 00:00:00.002000000
8996 -      1048576, elapsed=+000000000 00:00:00.002000000
8997 -      1048576, elapsed=+000000000 00:00:00.002000000
8998 -      1048576, elapsed=+000000000 00:00:00.004000000
8999 -      1048576, elapsed=+000000000 00:00:00.002000000
9000 -      1048576, elapsed=+000000000 00:00:00.002000000

PL/SQL procedure successfully completed.

SQL>


but my point remains - 9,000,000 rows "exercised" and our index segment is still on 2meg under ASSM. I don't see *never ending* growth.

DELETE woes

Racer I., October 08, 2018 - 6:40 am UTC

Hi,

Here I'll describe the problem that brought me to this thread.
scenario : self-implemented queueing table with monotonously growing index (DATE-column). Inserts on the right, (complete) deletes on the left.
problem : selecting from the left (oldest first processing) gets progressively slower, ever more buffer gets reported
reason : deleted index blocks are put on the free-list but are not unhooked from the structure until the block is reused
workaround : INDEX SHRINK SPACE COMPACT every hour
scope : just this one index (0.01% of the application) but fairly central, as all asynchronous processing uses this table

So far we'd assumed the reason the blocks aren't reused and unhooked in a timely manner was the workload pattern. There are several hundred different message-types, each with their own partition and local index which can show different workload patterns. Sometimes INSERT-bursts once day, only noise or nothing during the rest of the day (few opportunities to reuse free-list blocks). Sometimes a steady medium inflow (many opportunities). Now I begin to suspect the problem described in the previous entry contributes as well.

ideas : we could open enhancement requests with oracle
a) ALTER INDEX, ALTER SESSION or Hint to switch DELETE-index maintenance to IMMEDIATE (i.e. unhook completely deleted INDEX-blocks right away).
b) new ALTER INDEX command or extension of SHRINK SPACE to just remove completely empty blocks from the index structure instead of a full SHRINK. Maybe choose any combination of LEFT HAND, RIGHT HAND or INTERNAL. We would just use LEFT HAND after every DELETE. That should be a lot less system load and avoid all the bad things (see Richard Foote : rebuilding the truth).
c) Redesigning B-tree indexes in oracle (or add a new index type) without the double linking of leaf blocks which makes index maintenance twice as costly as it needs to be. Use the branch blocks for range scanning and propagate completely deleted leaf blocks to their branch block (set the reference to deleted or simply remove it altogether *).

*) branch blocks are always hot and this would make them even more so. Maybe allow 100 consistent versions instead of the usual <10?

What would be your take on this?

regards,

Connor McDonald
October 10, 2018 - 3:53 am UTC

I'm going to propose something different :-) This strikes me as something you could fix in the application.

Coming back to:

scenario : self-implemented queueing table with monotonously growing index (DATE-column). Inserts on the right, (complete) deletes on the left.
problem : selecting from the left (oldest first processing) gets progressively slower, ever more buffer gets reported


I started with this:

SQL> CREATE TABLE t1 (t1_key NUMBER(11), t2_col NUMBER(11)) TABLESPACE largets;

Table created.

SQL> CREATE UNIQUE INDEX pk_t1 ON t1(t1_key) TABLESPACE largets;

Index created.

SQL> ALTER TABLE t1 ADD CONSTRAINT pk_t1 PRIMARY KEY (t1_key) USING INDEX pk_t1;

Table altered.

SQL>
SQL> INSERT INTO t1 (t1_key)
  2  SELECT rownum FROM DUAL
  3  CONNECT BY LEVEL < 1000;

999 rows created.

SQL>
SQL> commit;

Commit complete.

SQL>
SQL> exec dbms_stats.gather_table_stats('','T1');

PL/SQL procedure successfully completed.

SQL>
SQL> set autotrace traceonly stat
SQL> select *
  2  from (
  3    select /*+ index_asc(t pk_t1) */ *
  4    from t1
  5    order by t1_key
  6  )
  7  where rownum <= 10;

10 rows selected.


Statistics
----------------------------------------------------------
         17  recursive calls
         16  db block gets
         13  consistent gets
          0  physical reads
       3152  redo size
        728  bytes sent via SQL*Net to client
        608  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         10  rows processed

SQL> /

10 rows selected.


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          5  consistent gets
          0  physical reads
          0  redo size
        728  bytes sent via SQL*Net to client
        608  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         10  rows processed

SQL> set autotrace off
SQL>
SQL>
SQL>


So this is our "baseline" so to speak. We want to perform approx 5 consistent gets when trying to get the first 10 rows from our table, no matter what activity occurs on the table.

Now I'll mimic the ongoing delete and inserts as per previous review

SQL> set serverout on
SQL> DECLARE
  2     idx   NUMBER (11);
  3     bulk_size NUMBER(11);
  4  BEGIN
  5     bulk_size := 1000;
  6     idx := 0;
  7
  8     WHILE idx < 5000
  9     LOOP
 10        DELETE FROM t1;
 11        COMMIT;
 12
 13        INSERT INTO t1 (t1_key)
 14           (SELECT LEVEL + idx * bulk_size
 15              FROM DUAL
 16           CONNECT BY LEVEL < bulk_size);
 17
 18        COMMIT;
 19
 20        idx := idx + 1;
 21     END LOOP;
 22  END;
 23  /

PL/SQL procedure successfully completed.


and now we can see we're into the buffer get territory you mentioned

SQL> set autotrace traceonly stat
SQL> select *
  2  from (
  3    select /*+ index_asc(t pk_t1) */ *
  4    from t1
  5    order by t1_key
  6  )
  7  where rownum <= 10;

10 rows selected.


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        118  consistent gets
          0  physical reads
          0  redo size
        758  bytes sent via SQL*Net to client
        608  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         10  rows processed


But here's my point. Obviously I'm just showing a query here, but I'm assuming the same person that does the querying is also the person doing the deleting, ie,

- read some rows
- "process" them
- delete them, because we're done

But that means we *know* the keys we're interested in. That gives us an easy direct entry point into the index for subsequent processing. Something like this:

SQL> col t1_key new_value lastprocessed
SQL>
SQL> set autotrace traceonly stat
SQL> select *
  2  from (
  3    select /*+ index_asc(t pk_t1) */ *
  4    from t1
  5    order by t1_key
  6  )
  7  where rownum <= 10;

10 rows selected.


Statistics
-------------------------------------------------------
          0  recursive calls
          0  db block gets
        118  consistent gets
          0  physical reads
          0  redo size
        758  bytes sent via SQL*Net to client
        608  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         10  rows processed

SQL> set autotrace off
SQL>
SQL> pro &&lastprocessed
4999010


I've "processed" my 10 rows, and I *know* that the last row I encountered was key = 4999010. I'll store that.

SQL> create table last_run ( key int );

Table created.

SQL> insert into last_run values (&&lastprocessed );

1 row created.

SQL> commit;

Commit complete.



So now, I adjust my query to get unprocessed rows to take into account

SQL> set autotrace traceonly stat
SQL> select *
  2  from (
  3    select /*+ index_asc(t pk_t1) */ *
  4    from t1
  5    where t1_key >= ( select key from last_run )
  6    order by t1_key
  7  )
  8  where rownum <= 10;

10 rows selected.


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         12  consistent gets
          0  physical reads
          0  redo size
        758  bytes sent via SQL*Net to client
        608  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         10  rows processed


and voila, we're "jumping" the deleted entries

INSERT grows II

Racer I., October 10, 2018 - 8:52 am UTC

Hi,

select block_size, extent_management, allocation_type, segment_space_management, bigfile
from user_tablespaces where tablespace_name = (select tablespace_name from user_tables where table_name = 'T1')

BLOCK_SIZE EXTENT_MANAGEMENT ALLOCATION_TYPE SEGMENT_SPACE_MANAGEMENT BIGFILE
8192 LOCAL SYSTEM AUTO YES

In our datatabase (tablespace settings above) I see this (segment size, and buffer gets for MIN()-select, 2000 loops, code at the end) :

0 - 0
1 - 65536 gets : 3
4 - 131072 gets : 4
16 - 196608 gets : 13
36 - 262144 gets : 20
64 - 327680 gets : 28
98 - 393216 gets : 35
100 - 393216 gets : 37
140 - 458752 gets : 43
190 - 524288 gets : 50
200 - 524288 gets : 50
248 - 589824 gets : 58
300 - 589824 gets : 64
313 - 655360 gets : 65
386 - 720896 gets : 73
400 - 720896 gets : 73
466 - 786432 gets : 80
500 - 786432 gets : 80
554 - 851968 gets : 88
600 - 851968 gets : 88
649 - 917504 gets : 95
700 - 917504 gets : 98
752 - 983040 gets : 103
800 - 983040 gets : 109
862 - 1048576 gets : 110
900 - 1048576 gets : 117
980 - 2097152 gets : 118
1000 - 2097152 gets : 118
1100 - 2097152 gets : 207
1200 - 2097152 gets : 243
1224 - 3145728 gets : 244
1300 - 3145728 gets : 244
1400 - 3145728 gets : 244
1500 - 3145728 gets : 304
1594 - 4194304 gets : 370
1600 - 4194304 gets : 370
1700 - 4194304 gets : 370
1800 - 4194304 gets : 449
1900 - 4194304 gets : 495
2000 - 4194304 gets : 495

The segment size and buffer gets continually rise. The first in the usual exponential instead of linear manner of steady growth. There might still be a plateau later (like in your example) but the rising buffer gets mean there is still little buffer reuse.

Without the deletes the segment grows to 27 MB by 1600 :

1506 - 25165824 gets : 3
1570 - 26214400 gets : 3
1633 - 27262976 gets : 3

The versions with FOR/FORALL for the insert are the same (just 200 loops tested). So its probably not mini-index stitching per se.

With bulk_size = 100 (300 loops) there is no growth:

0 - 0
1 - 65536 gets : 1
100 - 65536 gets : 1
200 - 65536 gets : 1
300 - 65536 gets : 1

Maybe 1000 index entries are just to big for a single block and that means an additional new block per loop?

With bulk_size = 10.000 (500 loops) the growth seems to level off faster :

0 - 0
1 - 196608 gets : 3
3 - 327680 gets : 14
6 - 393216 gets : 21
9 - 458752 gets : 29
13 - 524288 gets : 36
17 - 589824 gets : 41
22 - 655360 gets : 51
27 - 720896 gets : 56
33 - 786432 gets : 66
39 - 851968 gets : 71
46 - 917504 gets : 81
53 - 983040 gets : 86
61 - 1048576 gets : 96
69 - 2097152 gets : 104
100 - 2097152 gets : 123
200 - 2097152 gets : 124
300 - 2097152 gets : 124
400 - 2097152 gets : 124
500 - 2097152 gets : 124

New guess : As long as the segment is growing (for whatever reason) the new extent blocks are put on the free-list as you expect and there is simply no system which blocks get taken off (i.e. the recently deleted are not preferred).
There could be multiple free-lists. Maybe these effects feeds off each other?

---

DECLARE
   idx   NUMBER (11);
   bulk_size NUMBER(11);
   vsize number;
   voldsize number;
   pre    NUMBER := 0;
   post NUMBER;
   dummy NUMBER;
   ids DBMS_SQL.NUMBER_TABLE;    
BEGIN

   EXECUTE IMMEDIATE 'TRUNCATE TABLE t1';

   bulk_size := 10000;
   idx := 0;

   dbms_output.enable;
   voldsize := 0;
   dbms_output.put_line(lpad(idx, 8) || ' - ' || lpad(voldsize, 12));
   FOR i IN 1..bulk_size LOOP
     ids(i) := i;
   END LOOP;
   WHILE idx < 300
   LOOP
      DELETE FROM t1;
      COMMIT;

/*      FORALL i IN 1..(bulk_size - 1)
        INSERT INTO t1 (t1_key) VALUES ((ids(i)) + idx * bulk_size);
*/
/*      FOR i IN 2..bulk_size LOOP
        INSERT INTO t1 (t1_key) VALUES ((i - 1) + idx * bulk_size);
      END LOOP;
*/
      INSERT INTO t1 (t1_key)
         (SELECT LEVEL + idx * bulk_size
            FROM DUAL
         CONNECT BY LEVEL < bulk_size);

      COMMIT;
      
      idx := idx + 1;
      select bytes into vsize from user_segments where segment_name = 'PK_T1';
      if ((vsize > voldsize) OR (MOD(idx, 100) = 0)) THEN
        select value into pre from v$sesstat where statistic# = 132 and sid = userenv('sid');
        select MIN(t1_key) into dummy from t1;
        select value into post from v$sesstat where statistic# = 132 and sid = userenv('sid');
        voldsize := vsize;
        dbms_output.put_line(lpad(idx, 8) || ' - ' || lpad(voldsize, 12) || ' gets : ' || (post - pre));
        pre := post;
     end if;
   END LOOP;
END;
/

Connor McDonald
October 12, 2018 - 3:34 am UTC

OK... the "statistic# = 132" tells me something useful. You're on 12.1 ?

I just repeated my original code in 12.1 and the growth seems unabated.

So you can tell what my advice is going to be here :-)

Move to 12.2 or beyond :-) And look at perhaps my interim workaround of storing last known values until then.

DELETE woes II

Racer I., October 10, 2018 - 8:55 am UTC

Hi,

I've also been thinking about something like that, so the range scans can start close to the end of the empty blocks.
I was still hoping there would be some technical solution within oracle.
There are some complications :
The indexed date column is actually calculated (via trigger) and uses the creationtimestamp minus X hours, with X being the message types priority (up to 99).
So if the priority was increased (ad hoc) new rows could be older than the stored cutoff and missed. I guess deleting the cutoff in that scenario would work. There would also be some concurrency between the processes (there are multiple of these) over manipulating the stored cutoffs (which would also need to be held per message type).
I was also thinking about keeping the cutoff only in the session of the process. There are multiple instances and they live in a middle tier connection pool, so we might occasionally hit empty or old ones resulting in too long scans. The code could forget and refresh the cutoff every minute to cover the priority changes. Most scans would be short this way but the occasional long scan would drag the empty blocks into the SGA again and again.

regards,

Connor McDonald
October 12, 2018 - 3:35 am UTC

I was still hoping there would be some technical solution within oracle.


12.2+ will be better :-)

Delete woes (silver lining)

Racer I., October 12, 2018 - 5:58 am UTC

Hi,

Thanks for your insights. I think i've gotten a better idea of the (delete) problem now.
We currently use SHRINK SPACE (with HWM reset) during maintenance (every few weeks) but the hourly shrink is COMPACT. Without HWM reset this probably puts all the emptied blocks up to the HWM on the freelist. With their usage essentially random the chance of any subsequently deleted blocks being reused is tiny. The additional unnecessary extent growth in 12.1 makes the problem grow steadily between HWM resets (in addition to burst growth). 12.2 could help somewhat with that but will not solve the problem (like burst+noise).

So we will have to look into some cutoff-storage scheme.

Would you consider the general topic of empty leaf block handling ( https://richardfoote.wordpress.com/?s=empty+leaf&searchbutton=go%21 has some more scenarios) to be something worth of addressing with Oracle? Maybe drop a hint with a link to this thread at some meeting ;) Or mention it to Richard (strength in numbers)?

regards,

Connor McDonald
October 13, 2018 - 4:28 am UTC

Would you consider the general topic of empty leaf block handling to be something worth of addressing with Oracle?

I think I need to explain precisely where I sit in the Oracle hierarchy...Lets just say its quicker to start at the bottom of the organization chart than the top :-)

water to w(h)ine

Racer I., October 18, 2018 - 10:58 am UTC

Hi,

Ok. I actually opened an SR for this. Nothing to report there yet but I've been digging around more and found this :

https://support.oracle.com/epmos/faces/DocumentDisplay?_afrLoop=451423899421370&id=271855.1&_afrWindowMode=0&_adf.ctrl-state=dne1h6slr_21

In 11.2 onwards the QMON processes generally perform any required index maintenance for all the indexes and IOTs associated with a Queue Table automatically. The goal of the automated index/IOT cleanup is to make sure that dequeue doesn't slow down when there are lot of free index pages in the index segment. The QMON slave processes achieve this by freeing the empty index pages for both single and multi-consumer queues back to the segment free list but it should be noted that this process is not equivalent to an index rebuild or coalesce. The free index pages are identified during dequeue calls and there is no extra overhead/cost in identifying the empty pages. However, there are few scenarios like dequeue by message id or dequeue by condition in which the empty index pages may not be traversed by dequeue and thus cannot be freed back to the free list. So for normal dequeue using FIRST/NEXT_MESSAGE navigation options we will be able to identify and free the index pages back to the free list which will help maintain dequeue performance. The statistical information relating to the number of freed index or IOT pages can be obtained from the GV$PERSISTENT_QMN_CACHE view. In case we need to find out how many index pages were freed by the QMON slave processes we need to run following query:

Apparently Oracle has identified the problem for its own AQ system and actually established extra internal mechanisms to address it. Seems unfair ;).

regards,
Connor McDonald
October 20, 2018 - 4:15 pm UTC

Well....if you choose to build your own home grown AQ mechanism :-)

More to Explore

VLDB

If you are new to partitioning, check out Connor McDonald's introduction series here.

VLDB

Documentation set on VLDB and Partitioning.