I have a large number of orders (200) involving around 2000 diferent products and need to group the in batches of 6 orders.
The task is to identify the best possible groups of orders so performance (human performance) can be maximized.
As a start I have a script working on the OE Schema.
This script easily identifies the first two orders with maximum familiarity.
As the query to produce the cross join is quite expensive I need to reuse as much as possible.
The folowing step would be to calculate the best possible matcing order to the accumulated pair of orders.
Is there soem way of using the recurring with clause to manage such a task.
Another solution whould be to build a bitmap for every order and with bitmap And / or /xor funtions calculate the matching factor. I have no idea if i could do something like that inside the database, but I guess it is something the database does know well how to do in order to use the bitmap indexes.
Can you help?
So, Trying to make things clearer i limit the number of orders to just 4 and products to 12 or so.
Form the result (last qurey) it is obvious that orders 2439 and 2391
are the best candidates to pick together as they have 3 products in common out of a maximum of 4
2439 2391 4 7 3
obviously the other two have nothing in common and are a very bad much, but with such a small number of orders that is expected.
The idea is to do the same for a lerga number of orders and products.
As the problem gets more complicated it becomes more dificult to handle.
My idea was to build an bitmap (short of an image) for each order and than use logical functions to try and calculate the matching factors.
Or maybe run a recurring query until i build the required number of matched orders( e.g. 6).
Is it still not readable?
I'm not clear on exactly what you're trying to do. Are you trying to find the orders which have the most products in common and group those together?
In which case the following may help:
- Load the products for each order into a nested table
- Use multiset intersect to find those with common products
Which looks something like:
create or replace type numbers_nt
is table of number(6);
/
with ords as (
select oi.order_id, cast ( collect ( product_id ) as numbers_nt ) items
from oe.order_items oi
group by oi.order_id
), intersects as (
select o1.order_id o1id, o2.order_id o2id,
cardinality ( o1.items ) o1_item#,
cardinality ( o2.items ) o2_item#,
( o1.items multiset intersect o2.items ) mset
from ords o1
join ords o2
on o1.order_id > o2.order_id
and ( o1.items multiset intersect o2.items ) is not empty
)
select o1id,
max ( o2id ) keep (
dense_rank first
order by cardinality ( mset ) desc
) o2id,
max ( cardinality ( mset ) ) card
from intersects i
group by o1id
order by card desc;
O1ID O2ID CARD
2380 2375 9
2399 2365 8
2422 2380 8
2382 2354 8
2429 2382 7
2412 2382 7
2420 2382 7
2419 2354 7
2362 2361 7
2392 2375 7
...
But this is still just a starting point. You need to group/filter these results as needed. It's likely you'll need to use recursive with to do this in pure SQL.
Precisely defining how the algorithm should work will help us get you closer to a complete solution.