Skip to Main Content


Question and Answer

Chris Saxon

Thanks for the question, Antonios.

Asked: December 06, 2019 - 3:25 pm UTC

Answered by: Chris Saxon - Last updated: December 11, 2019 - 11:44 am UTC

Category: SQL - Version:

Viewed 100+ times

You Asked

I have a large number of orders (200) involving around 2000 diferent products and need to group the in batches of 6 orders.
The task is to identify the best possible groups of orders so performance (human performance) can be maximized.

As a start I have a script working on the OE Schema.

This script easily identifies the first two orders with maximum familiarity.

As the query to produce the cross join is quite expensive I need to reuse as much as possible.

The folowing step would be to calculate the best possible matcing order to the accumulated pair of orders.

Is there soem way of using the recurring with clause to manage such a task.

Another solution whould be to build a bitmap for every order and with bitmap And / or /xor funtions calculate the matching factor. I have no idea if i could do something like that inside the database, but I guess it is something the database does know well how to do in order to use the bitmap indexes.

Can you help?

So, Trying to make things clearer i limit the number of orders to just 4 and products to 12 or so.

Form the result (last qurey) it is obvious that orders 2439 and 2391
are the best candidates to pick together as they have 3 products in common out of a maximum of 4
2439 2391 4 7 3

obviously the other two have nothing in common and are a very bad much, but with such a small number of orders that is expected.

The idea is to do the same for a lerga number of orders and products.

As the problem gets more complicated it becomes more dificult to handle.

My idea was to build an bitmap (short of an image) for each order and than use logical functions to try and calculate the matching factors.

Or maybe run a recurring query until i build the required number of matched orders( e.g. 6).

Is it still not readable?

with LiveSQL Test Case:

and we said...

I'm not clear on exactly what you're trying to do. Are you trying to find the orders which have the most products in common and group those together?

In which case the following may help:

- Load the products for each order into a nested table
- Use multiset intersect to find those with common products

Which looks something like:

create or replace type numbers_nt 
  is table of number(6);

with ords as (
  select oi.order_id, cast ( collect ( product_id ) as numbers_nt ) items
  from   oe.order_items oi
  group  by oi.order_id
), intersects as (
  select o1.order_id o1id, o2.order_id o2id,
         cardinality ( o1.items ) o1_item#,
         cardinality ( o2.items ) o2_item#,
         ( o1.items multiset intersect o2.items ) mset
  from   ords o1
  join   ords o2
  on     o1.order_id > o2.order_id
  and    ( o1.items multiset intersect o2.items ) is not empty
  select o1id,
         max ( o2id ) keep (
           dense_rank first 
           order by cardinality ( mset ) desc
         ) o2id,
         max ( cardinality ( mset ) ) card
  from   intersects i
  group  by o1id
  order  by card desc;

O1ID     O2ID      CARD   
   2380    2375       9 
   2399    2365       8 
   2422    2380       8 
   2382    2354       8 
   2429    2382       7 
   2412    2382       7 
   2420    2382       7 
   2419    2354       7 
   2362    2361       7 
   2392    2375       7

But this is still just a starting point. You need to group/filter these results as needed. It's likely you'll need to use recursive with to do this in pure SQL.

Precisely defining how the algorithm should work will help us get you closer to a complete solution.

More to Explore


Analytic SQL got you confused? Check out Connor McDonald's complete video course.