Hi TOMs! I've noticed that from time to time our 12.1.0.2 Asynchronous Global Index Maintenance job (PMO_DEFERRED_GIDX_MAINT_JOB) gets hung up, and I've tracked it down to a long-running insert causing the job to wait on "enq: TX - row lock contention". The heavily partitioned table has a primary key; however, the newly inserted rows will never match any of the orphaned entries being cleaned up (and I would presume the index would know this), so my question is: is this expected behavior? My guess is that the TX lock is preventing the coalesce leaf-shuffle part, but I was hoping you could explain it in the way that you do? Thank you much!
I've put a simplified test case on LiveSQL (linked), but have pasted my results here since LiveSQL doesn't provide access to v$ views.
SQL> @index_contention_1row_unique.sql
SQL>
SQL> drop table mylist;
Table dropped.
SQL>
SQL> create table mylist (id number, n number)
2 partition by list (n)
3 (
4 partition one values (1),
5 partition two values (2)
6 );
Table created.
SQL>
SQL> REM alter table mylist add constraint mylist_pk primary key(id);
SQL> create unique index mylist_uniq on mylist(id);
Index created.
SQL>
SQL> begin
2
3 insert into mylist values (1,1);
4 insert into mylist values (2,2);
5 end;
6 /
PL/SQL procedure successfully completed.
SQL>
SQL> analyze index mylist_uniq validate structure;
Index analyzed.
SQL>
SQL> SELECT height, blocks, lf_blks, br_blks, del_lf_rows, btree_space,
2 pct_used FROM index_stats;
HEIGHT BLOCKS LF_BLKS BR_BLKS DEL_LF_ROWS BTREE_SPACE PCT_USED
---------- ---------- ---------- ---------- ----------- ----------- ----------
1 8 1 0 0 7996 1
1 row selected.
SQL>
SQL> select * from mylist;
ID N
---------- ----------
1 1
2 2
2 rows selected.
SQL>
SQL> alter table mylist drop partition two update indexes;
Table altered.
SQL>
SQL> analyze index mylist_uniq validate structure;
Index analyzed.
SQL>
SQL> SELECT height, blocks, lf_blks, br_blks, del_lf_rows, btree_space,
2 pct_used FROM index_stats;
HEIGHT BLOCKS LF_BLKS BR_BLKS DEL_LF_ROWS BTREE_SPACE PCT_USED
---------- ---------- ---------- ---------- ----------- ----------- ----------
1 8 1 0 1 7996 1
1 row selected.
SQL>
SQL> alter table mylist add partition two values (2);
Table altered.
SQL>
SQL> begin
2
3 insert into mylist values (3,2);
4
5 end;
6 /
PL/SQL procedure successfully completed.
SQL>
SQL> select * from mylist;
ID N
---------- ----------
1 1
3 2
2 rows selected.
SQL>
SQL> select * from v$lock where sid = sys_context('USERENV','SID');
ADDR KADDR SID TY ID1 ID2 LMODE REQUEST CTIME BLOCK CON_ID
---------------- ---------------- ---------- -- ---------- ---------- ---------- ---------- ---------- ---------- ----------
0000000090218188 0000000090218200 121 TO 86562 1 3 0 96423 0 3
000000009021B118 000000009021B190 121 AE 133 0 4 0 97510 0 3
00007FDD82F7E0F8 00007FDD82F7E160 121 TM 99654 0 3 0 0 0 3
00007FDD82F7E0F8 00007FDD82F7E160 121 TM 99658 0 3 0 0 0 3
000000008DF7C648 000000008DF7C6C8 121 TX 1310745 330 6 0 0 0 0
5 rows selected.
SQL>
SQL> prompt "Run: alter index mylist_uniq coalesce cleanup; from another session"
"Run: alter index mylist_uniq coalesce cleanup; from another session"
SQL>
SQL> prompt "Then hit enter..."
"Then hit enter..."
SQL> pause
SQL>
SQL> set lines 200
SQL> set pages 200
SQL>
SQL> prompt "WAITEVENT: "enq: TX - row lock contention" Reference Note (Doc ID 1966048.1)"
"WAITEVENT: "enq: TX - row lock contention" Reference Note (Doc ID 1966048.1)"
SQL>
SQL> SELECT
2 sid, seq#, state, seconds_in_wait,
3 'TX-'||lpad(ltrim(p2raw,'0'),8,'0')||'-'||lpad(ltrim(p3raw,'0'),8,'0') TX,
4 trunc(p2/65536) XIDUSN,
5 trunc(mod(p2,65536)) XIDSLOT,
6 p3 XIDSQN
7 FROM v$session_wait
8 WHERE event='enq: TX - row lock contention'
9 ;
SID SEQ# STATE SECONDS_IN_WAIT TX XIDUSN XIDSLOT XIDSQN
---------- ---------- ------------------- --------------- -------------------------------------------------------------------- ---------- ---------- ----------
100 167 WAITING 3 TX-00140019-0000014A 20 25 330
1 row selected.
SQL>
SQL> SELECT distinct w.tx, l.inst_id, l.sid, l.lmode, l.request
2 FROM
3 ( SELECT p2,p3,
4 'TX-'||lpad(ltrim(p2raw,'0'),8,'0')||'-'||lpad(ltrim(p3raw,'0'),8,'0') TX
5 FROM v$session_wait
6 WHERE event='enq: TX - row lock contention'
7 and state='WAITING'
8 ) W,
9 gv$lock L
10 WHERE l.type(+)='TX'
11 and l.id1(+)=w.p2
12 and l.id2(+)=w.p3
13 ORDER BY tx, lmode desc, request desc
14 ;
TX INST_ID SID LMODE REQUEST
-------------------------------------------------------------------- ---------- ---------- ---------- ----------
TX-00140019-0000014A 1 121 6 0
TX-00140019-0000014A 1 100 0 4
2 rows selected.
SQL>
SQL> SELECT row_wait_obj#, row_wait_file#, row_wait_block#, row_wait_row#
2 FROM v$session
3 WHERE event='enq: TX - row lock contention'
4 AND state='WAITING'
5 ;
ROW_WAIT_OBJ# ROW_WAIT_FILE# ROW_WAIT_BLOCK# ROW_WAIT_ROW#
------------- -------------- --------------- -------------
-1 0 0 0
1 row selected.
SQL>
SQL> prompt "What Does obj#=-1 Mean for Wait Events 'Enq: TX - Row Lock Contention' / 'enq: TX - allocate ITL entry' ? (Doc ID 1945821.1)"
"What Does obj#=-1 Mean for Wait Events 'Enq: TX - Row Lock Contention' / 'enq: TX - allocate ITL entry' ? (Doc ID 1945821.1)"
SQL>
Thanks for the great test case, though you don't need to recreate the partition to see this effect. Just dropping it produces a similar result:
create table mylist (id number, n number)
partition by list (n)
(
partition one values (1),
partition two values (2)
);
create unique index mylist_uniq on mylist(id);
insert into mylist values (1,1);
insert into mylist values (2,2);
commit;
select * from mylist;
alter table mylist drop partition two update indexes;
insert into mylist values (3,1);
-- in a different session, run:
alter index mylist_uniq coalesce cleanup;
-- This will be blocked by the insert above until you commit
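To round out the demo (a sketch of the follow-up steps, not output I've pasted here): committing in the first session releases the TX enqueue and lets the coalesce finish, after which the 12c ORPHANED_ENTRIES flag in the *_INDEXES views should report NO for the index.
-- Session 1: commit the insert of (3,1) to release the TX enqueue;
-- the blocked "coalesce cleanup" in session 2 can then complete.
commit;
-- Either session: confirm the index no longer carries orphaned entries.
select index_name, orphaned_entries
from   user_indexes
where  index_name = 'MYLIST_UNIQ';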
This is expected behavior. It comes about because unique indexes can reuse orphaned entries.
As George Eadon, one of our developers, explains:
"In unique indexes coalesce cleanup will always wait for row locks in the index. We are careful to do this in a deadlock-free way. This is required because without waiting we can not be sure we have cleaned up all of the orphaned rows. And if we can't be sure that we cleaned all of the orphaned rows, we cannot delete the orphaned entry metadata.
In your example, because we use async index maintenance, after the DROP PARTITION the index will still contain two rows:
- key <1> w/ rowid for partition ONE
- key <2> w/ rowid for partition TWO, which no longer exists
Now, suppose we insert (2, 1) (instead of (3, 1) in your example). This means key <2> will be updated with a rowid for partition ONE.
When coalesce cleanup comes through we find this uncommitted key <2>. If we do not wait for this uncommitted key we do not know if the change will commit or rollback (or rollback to intermediate savepoint and then commit.) So, if we were to skip over this key without waiting, and then the insert of (2, 1) rolled back, we would be left with key <2> pointing into partition TWO after the coalesce finishes."
When it comes to inserting a new value such as (3,1) in my example, the database doesn't know whether this is a genuinely new entry or a reused orphaned entry. So the coalesce has to wait until this commits or rolls back.
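Practically, this means the cleanup will stall behind any long-running transaction touching the unique index. So if the scheduled PMO_DEFERRED_GIDX_MAINT_JOB keeps landing on top of such inserts, one option is to trigger the cleanup yourself during a quiet window. A sketch, assuming the 12c DBMS_PART.CLEANUP_GIDX interface and appropriate privileges:
-- Clean up orphaned entries for the affected global indexes.
-- Like the scheduled job, this waits on uncommitted rows in
-- unique indexes, so run it when long inserts are not active.
exec dbms_part.cleanup_gidx
-- Or target a single index, as in the test case above:
alter index mylist_uniq coalesce cleanup;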
If you want to know more, you can read Richard Foote's article on async maintenance for unique indexes:
https://richardfoote.wordpress.com/2013/08/07/12c-asynchronous-global-index-maintenance-part-iii-re-makere-model/