Skip to Main Content
  • Questions
  • Process very slow on SELECT statement, V$SESSION shows latch: cache buffers chains, SQL_Address goes to 00

Breadcrumb

May 4th

Question and Answer

Connor McDonald

Thanks for the question, David.

Asked: November 26, 2015 - 9:08 pm UTC

Last updated: November 28, 2015 - 3:55 am UTC

Version: Oracle Database 11g Release 11.2.0.1.0 - 64bit Production

Viewed 1000+ times

You Asked

Dear Tom,

In a package a SELECT statement normally completes in 1 seconds or less, but sometimes rising to 10 or 20 seconds depending on the machine used.

Sometimes the process ends up waiting (maybe an hour) in the same statement.
In these cases at first the process appears in V$SESSION with an SQL_Address filled in, "Active Sessions" in SQL Developer shows the query correctly.
After some minutes the SQL_Address disappears, the process disappears from "Active Sessions".

Queries on V$Latch identify a latch, ...
select p1, p1raw, p1text from v$session where sid = 16;
... but it does not seem to exist ...
select gets, misses, sleeps, name
from v$latch
where addr = (select p1raw from v$session where sid = 16)
;
-- no rows selected

CPU usage is very high during this period, one core 99%.

Once this occurs the problem recurrs until the instance is restarted.
It occurs on Linux and Windows.

This occurs in test environments where this is the only process.

If the real process is semi-blocked in this fashion, running the test code below also blocks.
The latch codes listed from V$Latch for the two processes are different.

The anonymous block below has the SELECTs in question, copied from the package and adjusted to stand alone.
Instead of three separate queries this part of the process has been a UNION ALL of these queries and a CONNECT BY.
All versions have suffered this problem.

The code block is running inside a CURSOR picking out objects to be processed here from the same table, the source of the installationUid18Set in the code below.
Also, either side of the code are other queries over other tables that do not seem to be affected.

The central table ExtIndex has around a million rows, it manages an association between two different data sources, hence the CHAR and VARCHAR2 mixture.

I would be grateful for any help.

Thanks
Dave Marsh

-- ----------------------------------------------------------------------------------------------------
-- ----------------------------------------------------------------------------------------------------
-- Dave Marsh, 2015-11-26 12:58

-- ----------------------------------------------------------------------------------------------------
--

Declare

TYPE DescribedEquipmentRow IS RECORD (
equipmentUid ExtIndex.equipmentUid%type, -- equipmentUid of object - as used in ExtIndex
externalUid ExtIndex.externalUid%type, -- externalUid / SIT of object - as used in ExtIndex
tag ExtIndex.tag%type, -- tag of object - as used in ExtIndex
originalTag ExtIndex.originalPath%type, -- original SIT path of object - as used in ExtIndex
detailsTable ExtIndex.detailsTable%type, -- detailsTable - as ExtIndex
type ExtIndex.type%type, -- type - as ExtIndex
subType ExtIndex.subType%type, -- subType - as ExtIndex
depth ExtIndex.depth%type, -- depth of this entry
uidUp ExtIndex.uidUp%type, -- uidUp of this entry
sitSource ExtIndex.levelsSource%type, -- sitSource entries - to recognise existing generated panels
ownerUid ExtIndex.equipmentUid%type, -- Installation Uid - always the equipmentUid for installation for this equipment
-- except for the installation itself, when it is the externalUid of the installation
description ExtIndex.description%type, -- Description
levelsSource ExtIndex.levelsSource%type) -- levelsSource entries - to recognise exsiting SCADA panels
;
TYPE DescribedEquipmentTable IS TABLE OF DescribedEquipmentRow;
TYPE DescribedEquipmentCollection IS TABLE OF DescribedEquipmentRow INDEX BY VARCHAR2(18);

-- TYPE StringifiedVectorTable IS TABLE OF VARCHAR2(18);
installationUidSet EfaTools.StringifiedVectorTable;

usualDateFormat CONSTANT VARCHAR2(24) := 'YYYY-MM-DD HH24:MI:SS';

installationEquipmentsList DescribedEquipmentTable;
loadingList DescribedEquipmentTable;

-- TYPE UidCharList IS TABLE OF CHAR(18);
installationUid18Set UidCharList;
useGlobalDate DATE;

iRow INTEGER;
iOffSet INTEGER;

Begin

Dbms_Output.put_line('Go '||TO_CHAR(SYSDATE, usualDateFormat));

-- Get the test date in use
SELECT max(birthDate) INTO useGlobalDate
FROM ExtIndex
;

-- Collect some test installations
-- In the package this is a different but similar query (which does not seem to suffer from the hang)
-- running as a CURSOR SELECT and calling this process fro a 100 installations at a time
SELECT Uid_
BULK COLLECT INTO installationUid18Set
FROM ExtIndex
WHERE depth = 2
AND birthDate = useGlobalDate
AND deathDate IS NULL
AND substr(tag, 2, 1) = 'D'
AND substr(equipmentUid, 8, 1) = '7'
AND rowNum <= 100
;
Dbms_Output.put_line('Got '||TO_CHAR(installationUid18Set.count)||' Uids.');

installationUidSet := NEW EfaTools.StringifiedVectorTable();
installationUidSet.extend(installationUid18Set.count);
FOR iRow IN 1..installationUid18Set.count
LOOP
installationUidSet(iRow) := trim(installationUid18Set(iRow));
END LOOP;

-- Process may 'stall here'
SELECT *
BULK COLLECT INTO installationEquipmentsList
FROM (
-- Installations
SELECT son.equipmentUid, son.externalUid, son.tag,
DECODE(son.levelsSource, 0, son.originalPath, son.originalTag),
upper(son.detailsTable), son.type, son.subType, son.depth, son.uidUp,
son.sitSource,
son.equipmentUid AS ownerUid,
son.description, son.levelsSource
FROM ExtIndex son
WHERE son.equipmentUid IN (SELECT * FROM TABLE(CAST(installationUidSet AS EfaTools.StringifiedVectorTable)))
AND son.birthDate = useGlobalDate
AND son.deathDate IS NULL
)
;
Dbms_Output.put_line('Got '||TO_CHAR(installationEquipmentsList.count)||' Installations.');

-- Process may 'stall here'
SELECT *
BULK COLLECT INTO loadingList
FROM (
-- Panels
SELECT son.equipmentUid, son.externalUid, son.tag,
DECODE(son.levelsSource, 0, son.originalPath, son.originalTag),
upper(son.detailsTable), son.type, son.subType, son.depth, son.uidUp,
son.sitSource,
trim(son.uidUp) AS ownerUid,
son.description, son.levelsSource
FROM ExtIndex son
WHERE son.uidUp IN (SELECT * FROM TABLE(CAST(installationUid18Set AS UidCharList)))
AND son.birthDate = useGlobalDate
AND son.deathDate IS NULL
)
;
Dbms_Output.put_line('Got '||TO_CHAR(loadingList.count)||' Panels.');

iOffSet := installationEquipmentsList.count;
IF (iOffSet > 0) THEN
installationEquipmentsList.extend(loadingList.count);
END IF;
iRow := 0;
WHILE (iRow < loadingList.count)
LOOP
iRow := iRow +1;
installationEquipmentsList(iRow + iOffSet) := loadingList(iRow);
END LOOP;

-- Process may 'stall here'
SELECT *
BULK COLLECT INTO loadingList
FROM (
-- Equipment
SELECT son.equipmentUid, son.externalUid, son.tag,
DECODE(son.levelsSource, 0, son.originalPath, son.originalTag),
upper(son.detailsTable), son.type, son.subType, son.depth, son.uidUp,
son.sitSource,
trim(f.uidUp) AS ownerUid,
son.description, son.levelsSource
FROM ExtIndex son,
ExtIndex f
WHERE f.uidUp IN (SELECT * FROM TABLE(CAST(installationUid18Set AS UidCharList)))
AND f.birthDate = useGlobalDate
AND f.deathDate IS NULL
AND son.uidUp = f.uid_
AND son.deathDate IS NULL
AND son.birthDate = useGlobalDate
)
;
Dbms_Output.put_line('Got '||TO_CHAR(loadingList.count)||' Equipment.');

iOffSet := installationEquipmentsList.count;
IF (iOffSet > 0) THEN
installationEquipmentsList.extend(loadingList.count);
END IF;
iRow := 0;
WHILE (iRow < loadingList.count)
LOOP
iRow := iRow +1;
installationEquipmentsList(iRow + iOffSet) := loadingList(iRow);
END LOOP;

Dbms_Output.put_line('Got '||TO_CHAR(installationEquipmentsList.count)||' all of it.');

Dbms_Output.put_line('Done '||TO_CHAR(SYSDATE, usualDateFormat));

End;
/




and Connor said...

Some good information here

http://www.scaleabilities.co.uk/2009/11/09/the-oracle-wait-interface-is-useless-sometimes-pt/

http://blog.tanelpoder.com/2010/01/15/beyond-oracle-wait-interface-part-2/

(The first leads into the second).

But the most common reasons for a process just looking totally frozen (while burning CPU are).

a- endless loop in PLSQL
b- spinning on a latch due to contention
c- spinning on a latch due to bug

To see if its (a), add some diagnostics to your code to spit out some stuff to a table at regular intervals - if it doesn't, ie, its frozen "outside" of your PLSQL code, then its likely (b) or (c).

In that event, consider using something like 'truss' or 'strace' to check the system calls and see if its in a tightly bound loop of calls.

I know from personal experience that 11.2.0.1 had some issues with latches/mutexes that were really only addressed in 11.2.0.3+, so thats certainly a possiblity here.

Hope this helps

Rating

  (1 rating)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Reasons for process hang while using CPU

David Marsh, November 27, 2015 - 8:59 am UTC

Thank you for the reply. I am reading the articles suggested with interest.
Going over the three reasons given
a- endless loop in PLSQL
b- spinning on a latch due to contention
c- spinning on a latch due to bug

a has been ruled out with debug logging
b seems unlikely since the problem occurs when this is the only running process (I have since tried removing the outer cursor to ensure that the only active user query in the instance on a test machine is the stalled query).
I guess that leaves c ... presumably an Oracle bug.

Thanks again
Connor McDonald
November 28, 2015 - 3:55 am UTC

I hope you get a resolution. Certainly have a chat to Support - they can assist with more detailed diagnostics.

More to Explore

PL/SQL demos

Check out more PL/SQL tutorials on our LiveSQL tool.

PL/SQL docs

PL/SQL reference manual from the Oracle documentation library