You Asked
I could be going mad, but have got unexpected results from use of START WITH...CONNECT BY...PRIOR syntax, and wanted to confirm my understanding in this scenario.
We have a table with a hiearchical data structure, used to hold details of related transactions (parents and children).
Where the ORIGINAL, PARENT and REPLACEMENT columns are NULL, this is the top of the structure – i.e. REFNO 327147 in example below.
A child record can have either the PARENT or REPLACEMENT set, not both.
E.G.
REFNO ORIGINAL PARENT REPLACEMENT
327147
327171 327147 327147
327172 327147 327147
473116 327147 <NULL> 327171
473118 327147 <NULL> 327172
596518 327147 <NULL> 473118
598455 327147 596518
598456 327147 596518
598470 327147 596518
When I write a query to traverse the tree starting from a point where the REPLACEMENT column equals our candidate start points, if there are multiple start points this seems to cause a Cartesian product...
Reproducible with:
create table transactions(REFNO NUMBER(10),ORIGINAL NUMBER(10),PARENT NUMBER(10),REPLACEMENT NUMBER(10));
insert into transactions values(327147,NULL,NULL,NULL);
insert into transactions values(327171,327147,327147,NULL);
insert into transactions values(327172,327147,327147,NULL);
insert into transactions values(473116,327147,NULL,327171);
insert into transactions values(473118,327147,NULL,327172);
insert into transactions values(596518,327147,NULL,473118);
insert into transactions values(598455,327147,596518,NULL);
insert into transactions values(598456,327147,596518,NULL);
insert into transactions values(598470,327147,596518,NULL);
select t.*,CONNECT_BY_ISLEAF leaf_node /* Used to get lowest level leaf node */
from transactions t,(select '327171' start_candidate FROM dual UNION ALL select '327172' start_candidate FROM dual) candids
start with t.replacement = candids.start_candidate connect by NVL(t.replacement,t.parent) = prior t.refno
order by nvl(original,refno),nvl(replacement,parent),refno
This can be seen by tweaking query above to include a "WHERE 1=2" clause for one or other of the in-line view "select ... FROM dual"...
I.E.
i) when only 327171 is returned by in-line view, the query returns 1 record
ii) when only 327172 is returned by in-line view, the query returns 5 records
iii) when both are returned, the query returns 16 records !!!
It seems to be related to my use of NVL, but if I replace this with a CASE statement or a DECODE is still does the same thing.
So, I thought that the DBMS would start with each node specified in the use of "START WITH" clause just once, then separately traverse down the tree walk - i.e. start with 327171 and traverse down the tree structure returning that result set, THEN start with 327172 and traverse down the tree structure returning that result set.
What I seem to have is a munged together tree walk...
HELP !
and Tom said...
You have a cartesian join between T and CANDIDS (start with does NOT 'join', it selects rows from a join - you have a cartesian join between T and candids - you doubled all of the rows)
ops$tkyte%ORA10GR2> select rpad( '*', level, '*') hier, t.*,CONNECT_BY_ISLEAF leaf_node
2 from transactions t
3 start with t.replacement in (select '327171' start_candidate FROM dual UNION ALL select '327172' start_candidate FROM dual)
4 connect by NVL(t.replacement,t.parent) = prior t.refno
5 /
HIER REFNO ORIGINAL PARENT REPLACEMENT LEAF_NODE
---------- ---------- ---------- ---------- ----------- ----------
* 473116 327147 327171 1
* 473118 327147 327172 0
** 596518 327147 473118 0
*** 598455 327147 596518 1
*** 598456 327147 596518 1
*** 598470 327147 596518 1
6 rows selected.
is probably what you meant to do.
Is this answer out of date? If it is, please let us know via a Comment