You Asked
Hello, I have this problem I haven't been able to solve so far, though I suspect it should be using a hierarchical query.
I included sample/test data in the LiveSQL script I saved and shared, it includes the desired result the hypothetical query would return for each row, in column expected_master_group2, for convenience.
The requirements to determine the result:
We have this table for which each row has group1 and group2 values (never null or blank).
We want to find, using sql, for each of them the "master" (for lack of a better word) group2 value for group1, considering:
1-For a given group1 value we can have one or many group2 values.
2-For a given group2 value we can have one or many group1 values.
3-If a group1 has a single group2, and that group2 also has a this single group1 related, then our master group2 will be group2.
4-If a group1 has many group2s, but each of those group2s aren't linked to any other group1, then the min(group2) for group1 will be the master group2.
5-If a group1 has one or many group2s, if any of those group2s is/are linked to one or many other group1s, then we need to find those other linked group1s, find if they are themselves linked to different group2s we've already seen, and so on, until we have found all group2s we can possibly relate to our given group1. Then the master group2 will be the min(group2) from that list we've created by traversing all those relationships.
Example, to determine that for rows having group1=129653, the expected result to be retrieved by the query I'd like to build (and put in expected_master_group2) is '4190412':
-Both rows having group1=129653 have a group2='4190414'.
-group2='4190414' is also related to a row having group1=129654.
-There are 2 rows having group1=129654, one with group2='4190414' (which we've already "seen"), one with group2='4190413'.
-group2='4190413' is related to 3 rows, having respectively group1= 129653 (already seen), 129654 (already seen) and 129669.
-group1=129669 is related to group2s '4190413' (already seen), and '4190412'
-group2='4190412' is related to group1s 129668, 129670 and 129669 (already seen)
-group1s 129668 and 129670 having only group2='4190412', we have finished traversing the relationships.
-We have seen group2s '4190414', '4190413' and '4190412'.
-group2 '4190412' being the smallest of all the group2s we've seen, it is the result I'm looking for, for rows having group1=129653.
I hope I'm being clear and thorough enough in my explanations.
Thanks.
with LiveSQL Test Case:
and Chris said...
Yes. You can use recursive with to walk along the hierarchy. Search for group1 or group2 values for the next row that match the corresponding value in the current.
Several of the group2 values have many rows. So if all you do is the check above, you'll keep coming back to the same g2 value. Which means revisiting g1 values over and over and over... This explodes your hierarchy, making the query take f o r e v e r.
You can avoid this by keeping track of the previous g1/g2 values you visited. And don't go back if either of the next row's values match those from the source.
Put it all together and you get:
with s_data as (
select distinct * from sample_data
), tree ( g1, g2, g1_orig, g2_orig, rt, expect ) as (
select group1, group2, 0, '0', group1, expected_master_group2
from s_data
union all
select s.group1, s.group2, t.g1, t.g2, t.rt , expect
from tree t
join s_data s
on ( ( t.g1 = s.group1 ) or ( t.g2 = s.group2 ) )
and ( t.g1_orig <> s.group1 and t.g2_orig <> s.group2 )
)
select rt, expect, min ( g2 ),
case
when expect = min ( g2 ) then 'SUCCESS!'
else 'FAIL :('
end is_correct
from tree
group by rt, expect;
RT EXPECT MIN(G2) IS_CORRECT
129654 4190412 4190412 SUCCESS!
129659 4188693 4188693 SUCCESS!
129668 4190412 4190412 SUCCESS!
129676 4188693 4188693 SUCCESS!
129670 4190412 4190412 SUCCESS!
129660 4188693 4188693 SUCCESS!
129674 4188693 4188693 SUCCESS!
129653 4190412 4190412 SUCCESS!
129658 4188693 4188693 SUCCESS!
129655 4188693 4188693 SUCCESS!
129656 4188693 4188693 SUCCESS!
129678 4188693 4188693 SUCCESS!
129671 4188693 4188693 SUCCESS!
129672 4188693 4188693 SUCCESS!
129673 4188693 4188693 SUCCESS!
129669 4190412 4190412 SUCCESS!
129657 4188693 4188693 SUCCESS!
129677 4188693 4188693 SUCCESS!
129679 11219712 11219712 SUCCESS!
129675 4188693 4188693 SUCCESS!
HT to Kim Berg Hansen for his recursive subquery graph traversing which was the inspiration for this answer:
https://www.kibeha.dk/2013/01/recursive-subquery-graph-traversing.html
Rating
(1 rating)
Is this answer out of date? If it is, please let us know via a Comment