Skip to Main Content
  • Questions
  • how to equate substring of one table column to a value in a list from another table column as part of an exists?

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, Jess.

Asked: December 17, 2018 - 7:48 pm UTC

Last updated: December 19, 2018 - 2:30 am UTC

Version: 12c

Viewed 10K+ times! This question is

You Asked

Hello Oracle Masters,

I was looking at 'connect by level' ( https://asktom.oracle.com/pls/apex/asktom.search?tag=clever-sql ) hoping I could make use of it, but don't seem to be able to. Perhaps there is another way?

We have a table of products and a table of exclusions. 'Products' have a name
(of which only a substring form is usable as a 'code').
'Exclusions' also have the same code, but multiple codes are stored in the same column separated by slash.

For example:
Products:
name = 'ABC*1.2.3.XYZ' (substring will strip out the usable part for comparison)
name = 'KLM*5.A'

Exclusions:
code = 'ABC/KLM'
code = 'XYZ'
code = 'OPQ/RST/UVW/EFG'


Have a horrendously long and complex query that pulls up all the products except those being excluded (and a sister variant of only those excluded).
We only used to care about the first code in the list.
Summed up, query looks like so:

select <loads>
from <loads of tables, including <i>'products p'</i>
where <complex set of conditions>
and exists
     (select from exclusions e where <conditions>
      and ((e.code like '%/%' and subsr(e.code)  = substr(p.name))
            or
           (e.code = substr(p.name))
      )




Now the requirements have changed and we want the query to return records when the [substring] of product name matches ANY of the codes in the slash-separated list of exclusion codes...


Rewriting the whole query from scratch isn't really possible--the bit I provided above is just the inner-most wrapped alias; there are more levels of it being aliased, joined to something else, etc....
(It's also an 'exists' rather than part of the join because there is another flavor of it elsewhere in the code
that's 'not exists', and both should work the same).

That said, the subset of the preferences table with some basic where conditions (e.g., records for this year)
is about 10K records.


Some code to recreate the scenario:

create table products (key varchar2(5) not null, name varchar2(30) not null);
insert into products values ('AB47', 'AB345*Some name');
insert into products values ('AB47', 'AB345*Some name');
insert into products values ('AB88', 'AB345*Some name');
insert into products values ('AB47', 'AB345.A*Some name');
insert into products values ('AB22', 'AB999*Another name');

create table exclusions (code varchar2(30) not null);
insert into exclusions values ('AB345/AB999');
insert into exclusions values ('QD523');


select p.*
from products p
where p.key = 'AB47' -- AB22
and exists (select null from exclusions e where ((e.code like '%/%' and 
              substr(e.code,1,instr(e.code,'/')-1) = 
                                     substr(p.name,1,instr(p.name,'*')-1))
            or
           (e.code = substr(p.name,1,instr(p.name,'*')-1)))
      )
;


Right now key AB47 returns rows but AB22 doesn't because AB999... comes second in the exclusions list.
The query should produce the a not not null result for either key...

Could you please help out with how the query can be amended?
I tried regexp_substr, but couldn't work out how to use the substring of product name as a pattern... Similarly, the connect by level from the aforementioned page, not sure how to loop it in so that we're only selecting from exclusions once..

Just to make it more fun, the full query has another table that keeps codes in the same format as above, so needs parsing the same way...

That query above is essentially aliased to 'q', such that
select from 
(<query above>) q,
manufacturers
where manufacturers.a = q.a and manufacturers.b = q.b ...
and <any of the codes in a code string in b> = a.code



Even when I get the exists to kind of run, this certainly does not....

Thank you ever so much!


and Chris said...

So you want to return all the product rows where the start of the name is in the exclusion values? And the exclusion values could be a slash-separated list of values?

If so, the technique you've linked seems to be the way to go.

On 12c you can use lateral reference values from another table in a subquery. So for each row in exclusions, you can generate N rows for each separated value.

One way to do this is:

select regexp_substr ( code, '[^/]+', 1, r.l ) val
from   exclusions,
       lateral (
         select level l from dual
         connect by level <= 
           length ( code ) - length ( replace ( code, '/' ) ) + 1
       ) r;

VAL     
AB345   
AB999   
QD523   


All you need to do now is reference these in your exists:

with exclusion_rws as (
  select regexp_substr ( code, '[^/]+', 1, r.l ) val
  from   exclusions,
         lateral (
           select level l from dual
           connect by level <= 
             length ( code ) - length ( replace ( code, '/' ) ) + 1
         ) r
) 
select p.*
from   products p
where  exists (
   select * from exclusion_rws er
   where  er.val = substr(p.name,1,instr(p.name,'*')-1)
);

KEY    NAME                 
AB47   AB345*Some name      
AB47   AB345*Some name      
AB88   AB345*Some name      
AB22   AB999*Another name 


You have the codes as rows now, so you can join them to your manufacturers table too.

Rating

  (3 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Jess, December 18, 2018 - 7:04 pm UTC

Hi Chris,

Thank you for looking at this mess :)
I think there are a few things there that may not exactly work work..

We want EVERY code from exclusion names. The original query worked with substr because we only cared about the first code. Now we want the list of ALL the codes to compare to for every row.

I don't think refactoring is ideal to pre-parse like that in terms of performance (if it was done inside 'exists', we could further limit the rows in its join with other tables in the main body). But I think I understand how to work that into a not exists (it'd just be an inner select on that one).

Where you say "You have the codes as rows now, so you can join them to your manufacturers table too"... The manufacturer codes are also strings that need to be parsed, so that's the second bit that's not working for me: parsing and joining at the same time as it were (can't turn manufacturers to 'exists' as we're bubbling up result columns from it).

The 'manufacturers' table is too big to refactor and parse on its own (like a few million records)... Just tried with just the first codes and killed it after about 4 minutes.

Could you please help with the second bit

select from 
(<query above>) q,
manufacturers
where manufacturers.a = q.a and manufacturers.b = q.b ...
and <any of the codes in a code string in b> = a.code


Thank you as always!

Connor McDonald
December 19, 2018 - 2:30 am UTC

glad we could help

Jess, December 18, 2018 - 7:17 pm UTC

On second thought, I think I am getting somewhere :)
Will post if/when it's working :)

Jess, December 18, 2018 - 8:52 pm UTC

Hi Chris,

In the end, this worked exactly as you said it would, and I learned about 'lateral'. Given everything a through tests with assorted data sets, and seem to getting correct results all around.

Thank you again for your invaluable assistance!

More to Explore

SQL

The Oracle documentation contains a complete SQL reference.