Skip to Main Content
  • Questions
  • Query Writing Differently (Relational Division)

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, Praveen.

Asked: April 20, 2017 - 6:12 am UTC

Last updated: April 26, 2017 - 4:49 am UTC

Version: 11.2

Viewed 1000+ times

You Asked

create table developers (name varchar2(30), skill varchar2(30));

create table projects   (name varchar2(30), skill varchar2(30));

insert into developers values ('SMITH','ORACLE');
insert into developers values ('SMITH','JAVA');
insert into developers values ('ADAMS','JAVA');
insert into developers values ('ADAMS','C++');

insert into projects values ('MISSION MARS','JAVA');
insert into projects values ('MISSION MARS','C++');
insert into projects values ('MISSION IMPOSSIBLE','JAVA');

--QUERY: list out projects and eligible developers, for example Mission Mars 
--  requires knowledge of Java and C++ both, so developers having both skills
--  like Adams is eligible for both the projects and both for MI-project.

-- I have written a query but looking for something better:

select d.name, p.name from developers d, projects p
    minus
    select tmp.d_name, tmp.name -- ineligible developers
        from
            (select * from projects, (select distinct name d_name from developers)) tmp
            left join developers d2 on (d2.skill = tmp.skill and d2.name = tmp.d_name)
        where d2.name is null;


and Chris said...

How do you define "better"? Faster? Easier to read?

Anyway, what you're attempting is relational division. Here's one way to do this so you only need to access each table once, so should be faster and easier to read :)

It cross joins the tables, then groups by developer, project and skill. You want the rows from this where the developer has all the skills for the project. The expression:

sum(case when p.skill = d.skill then 1 else 0 end)


Finds you the rows where the project and developer skills match. If the sum of this (by project and dev) = the number of skills the project needs, you have a winner:

create table developers (name varchar2(30), skill varchar2(30));

create table projects   (name varchar2(30), skill varchar2(30));

insert into developers values ('SMITH','ORACLE');
insert into developers values ('SMITH','JAVA');
insert into developers values ('ADAMS','JAVA');
insert into developers values ('ADAMS','C++');

insert into projects values ('MISSION MARS','JAVA');
insert into projects values ('MISSION MARS','C++');
insert into projects values ('MISSION IMPOSSIBLE','JAVA');

select pname, dname, sum(sm), min(ct) from (
  select d.name dname, p.name pname, p.skill, 
         count(*) ct, 
         sum(case when p.skill = d.skill then 1 else 0 end) sm
  from   developers d
  cross  join projects p
  group  by d.name, p.name, p.skill
)
group  by pname, dname
having sum(sm) = min(ct);

PNAME         DNAME  SUM(SM)  MIN(CT)  
MISSION MARS  ADAMS  2        2  


There are several other approaches you can take to implement relational division. Though arguably most are harder to comprehend.

Here's an article giving a thorough discussion of them (they use SQL Server, but I guess we can forgive them for that ;):

https://www.simple-talk.com/sql/learn-sql-server/high-performance-relational-division-in-sql-server/

Note the first few examples won't work in your case, because you have multiple projects. Go to the "Relational Division with Multiple Divisors" section for solutions that match your problem.

Rating

  (7 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

another

Gh.., April 21, 2017 - 3:29 pm UTC

select project, required, developer, skills, case when instr(skills,required) > 0 then 'Admitted' end as Result
from (
select name as project
, LISTAGG(skill, ';') WITHIN GROUP (ORDER BY skill) as required
from projects
group by name ) a
cross join
(select name as developer , LISTAGG(skill, ';') WITHIN GROUP (ORDER BY skill) as skills
from developers
group by name) b
;

moreover chris qry doent work?

Gh.., April 21, 2017 - 3:44 pm UTC

insert into developers values ('CHRIS','SQL');

insert into developers values ('GUSS','SQL');

select pname, dname, sum(sm), min(ct) from (
select d.name dname, p.name pname, p.skill,
count(*) ct,
sum(case when p.skill = d.skill then 1 else 0 end) sm
from developers d
cross join projects p
group by d.name, p.name, p.skill
)
group by pname, dname
having sum(sm) = min(ct);

PNAME DNAME SUM(SM) MIN(CT)
---------------------- ---------- ---------- ----------
MISSION MARS ADAMS 2 2
MISSION IMPOSSIBLE GUSS 1 1
MISSION IMPOSSIBLE CHRIS 1 1


select project, required, developer, skills, case when instr(skills,required) > 0 then 'Admitted' end as Result
from (
select name as project
, LISTAGG(skill, ';') WITHIN GROUP (ORDER BY skill) as required
from projects
group by name ) a
cross join
(select name as developer , LISTAGG(skill, ';') WITHIN GROUP (ORDER BY skill) as skills
from developers
group by name) b
where instr(skills,required) > 0
;

PROJECT REQUIRED DEVELOPER SKILLS RESULT
-------------------------------------- --
MISSION MARS C++;JAVA ADAMS C++;JAVA Admitted

only Adams matches
Chris has only SQL does not match for MISSION IMPOSSIBLE even though sum=min=1

Chris Saxon
April 21, 2017 - 6:24 pm UTC

Ack, I grouped by the wrong skill column! Switch p.skill for d.skill in the inner query and all should be good.

You can't compare skills using listagg in that way. If the developer has other skills that sort between those required, it isn't returned:

create table developers (name varchar2(30), skill varchar2(30));

create table projects   (name varchar2(30), skill varchar2(30));

insert into developers values ('SMITH','ORACLE');
insert into developers values ('SMITH','JAVA');
insert into developers values ('ADAMS','JAVA');
insert into developers values ('ADAMS','C++');
insert into developers values ('ADAMS','GO');
insert into developers values ('CHRIS', 'SQL');
insert into developers values ('CHRIS', 'ORACLE');

insert into projects values ('MISSION MARS','JAVA');
insert into projects values ('MISSION MARS','C++');
insert into projects values ('MISSION IMPOSSIBLE','JAVA');
insert into projects values ('MISSION SQL','SQL');
insert into projects values ('MISSION SQL','ORACLE');

select pname, dname, sum(sm), min(ct) from (
  select d.name dname, p.name pname, d.skill, 
         count(*) ct, 
         sum(case when p.skill = d.skill then 1 else 0 end) sm
  from   developers d
  cross  join projects p
  group  by d.name, p.name, d.skill
)
group  by pname, dname
having sum(sm) = min(ct);

PNAME   DNAME SUM(SM) MIN(CT)
MISSION SQL  CHRIS 2 2
MISSION MARS  ADAMS 2 2
MISSION IMPOSSIBLE SMITH 1 1
MISSION IMPOSSIBLE ADAMS 1 1

select project, required, developer, skills, case when instr(skills,required) > 0 then 'Admitted' end as Result 
from ( 
select name as project 
, LISTAGG(skill, ';') WITHIN GROUP (ORDER BY skill) as required 
from projects 
group by name ) a 
cross join 
(select name as developer , LISTAGG(skill, ';') WITHIN GROUP (ORDER BY skill) as skills 
from developers 
group by name) b 
where instr(skills,required) > 0 ;

PROJECT   REQUIRED DEVELOPER SKILLS  RESULT
MISSION IMPOSSIBLE JAVA  ADAMS  C++;GO;JAVA Admitted
MISSION IMPOSSIBLE JAVA  SMITH  JAVA;ORACLE Admitted
MISSION SQL  ORACLE;SQL CHRIS  ORACLE;SQL Admitted

Thank you Chris. Sql skill confirmed

Gh.., April 22, 2017 - 6:24 am UTC


Why CROSS JOIN?

Stew Ashton, April 22, 2017 - 4:23 pm UTC

Hi Chris,

Thanks to the OP and you for a very interesting question, answer and link. However, I couldn't figure out why a CROSS JOIN is necessary. It seemed to me that a LEFT OUTER JOIN from project to developer would get all the base rows necessary. So I tried this:
select p_name, d_name, needed_cnt, found_cnt
from (
  select p.name p_name, d.name d_name,
    count(distinct p.skill) over(partition by p.name) needed_cnt,
    count(distinct d.skill) over(partition by p.name, d.name) found_cnt,
    row_number() over(partition by p.name, d.name order by null) rn
  from projects p
  left join developers d on p.skill = d.skill
)
where found_cnt = needed_cnt and rn = 1;

P_NAME              D_NAME  NEEDED_CNT  FOUND_CNT
------------------- ------- ---------- ----------
MISSION IMPOSSIBLE  ADAMS            1          1
MISSION IMPOSSIBLE  SMITH            1          1
MISSION MARS        ADAMS            2          2
MISSION SQL         CHRIS            2          2

Best regards,Stew
Chris Saxon
April 24, 2017 - 5:15 pm UTC

The cross join enables you to find developers who have the skills you need and other skills too:

select pname, dname, sum(sm), min(ct), min(dc) from (
  select d.name dname, p.name pname, d.skill, 
         count(*) ct, 
         sum(case when p.skill = d.skill then 1 else 0 end) sm,
         count(distinct d.skill) over (partition by d.name) dc
  from   developers d
  cross  join projects p
  group  by d.name, p.name, d.skill
)
group  by pname, dname
having sum(sm) = min(ct);

PNAME               DNAME  SUM(SM)  MIN(CT)  MIN(DC)  
MISSION MARS        ADAMS  2        2        3        
MISSION IMPOSSIBLE  ADAMS  1        1        3        
MISSION SQL         CHRIS  2        2        2        
MISSION IMPOSSIBLE  SMITH  1        1        2 


Which could be useful in some situations. If you're only interested in whether devs have exactly the project skills the left join you propose is better.

Another Approach

Praveen Ray, April 24, 2017 - 7:17 am UTC

and you rated our response

 (4 ratings)

I doubt your above calculation - I don't remember ever provided <5 ratings :)

Meanwhile, I found this more interesting at least for readability ;)

SELECT dev.name, prj.name
FROM
(SELECT name, CAST( COLLECT( skill ) AS tbl_type ) dev_skill FROM developers GROUP BY name) dev,
(SELECT name, CAST( COLLECT( skill ) AS tbl_type ) prj_skill FROM projects GROUP BY name) prj
WHERE prj.prj_skill SUBMULTISET OF dev.dev_skill;


Chris Saxon
April 24, 2017 - 5:09 pm UTC

Nice, hadn't thought of submultiset. Not convinced it's easier to read though ;)

Extra developer skills?

Stew Ashton, April 25, 2017 - 6:17 am UTC

Chris,

If we want the extra developer skills, then I think a FULL JOIN on skill is enough.
select * from (
  select p.name p_name, d.name d_name,
  count(distinct p.skill) over(partition by p.name, d.name) found_cnt,
  count(distinct p.skill) over(partition by p.name) project_cnt,
  count(distinct d.skill) over(partition by d.name) dev_cnt,
  row_number() over(partition by p.name, d.name order by null) rn
  from projects p
  full join developers d on p.skill = d.skill
)
where found_cnt = project_cnt and found_cnt > 0 and rn = 1;

P_NAME              D_NAME   FOUND_CNT  PROJECT_CNT  DEV_CNT  RN
------------------- ------- ---------- ------------ -------- ---
MISSION IMPOSSIBLE  ADAMS            1            1        3   1
MISSION IMPOSSIBLE  SMITH            1            1        2   1
MISSION MARS        ADAMS            2            2        3   1
MISSION SQL         CHRIS            2            2        2   1

Chris Saxon
April 25, 2017 - 12:43 pm UTC

Good point, yep a full outer join is enough if you want to find the extra skills.

using Multiset Except

Rajeshwaran, April 25, 2017 - 2:34 pm UTC

I know, i am late in this game.

Here is the another variation of the theme using MULTISET EXCEPT.
( a developer can have many skills, just matching the project skill with the developer skills )

demo@ORA12C> select * from projects;

NAME                           SKILL
------------------------------ ------------
MISSION MARS                   JAVA
MISSION MARS                   C++
MISSION IMPOSSIBLE             JAVA
MISSION SQL                    SQL
MISSION SQL                    ORACLE

demo@ORA12C> select * from developers;

NAME                           SKILL
------------------------------ ------------
SMITH                          ORACLE
SMITH                          JAVA
ADAMS                          JAVA
ADAMS                          C++
ADAMS                          GO
CHRIS                          SQL
CHRIS                          ORACLE

7 rows selected.

demo@ORA12C>
demo@ORA12C> create or replace type array is table of varchar(30);
  2  /

Type created.

demo@ORA12C> select p.name as project_name,d.name as developers_name
  2  from
  3    (select name,
  4      cast( collect(skill) as array ) skillsd
  5    from developers
  6    group by name  ) d,
  7    (select name,
  8      cast( collect(skill) as array ) skillsp
  9    from projects
 10    group by name ) p
 11  where cardinality( skillsp multiset except skillsd ) = 0 ;

PROJECT_NAME                   DEVELOPERS_NAME
------------------------------ ------------------------------
MISSION IMPOSSIBLE             ADAMS
MISSION IMPOSSIBLE             SMITH
MISSION MARS                   ADAMS
MISSION SQL                    CHRIS

demo@ORA12C>

Connor McDonald
April 26, 2017 - 4:49 am UTC

As the saying goes ... "Better late than never" :-)