I have 1000 cubes of weight between 360 and 430 grams. I need to choose a set of cubes which will have weigh 50.000 grams. Do you have any idea how to do it in SQL or PL/SQL?
CREATE TABLE CUBES
(
ID VARCHAR2(100 CHAR),
WEIGHT NUMBER(10,3)
);
BEGIN
FOR i IN 1..1000 LOOP
INSERT INTO cubes VALUES (i, round(dbms_random.value( 360, 430), 3));
END LOOP;
COMMIT;
END;
So you want to find all sets of rows where the sum of the weights is 50 thousand?
If so, a few things to bear in mind:
- There's no guarantee there is such a group in your data!
- This is a form of bin packing problem, which is NP-complete. Which means there are no known solutions that are both fast and optimal
If you're willing to take some shortcuts, you can use match_recognize to find close values quickly and easily. With one pass through the data, you can get reasonably close:
select *
from cubes match_recognize (
measures
sum ( weight ) as total,
match_number () as grp
pattern ( total+ )
define
total as sum ( weight ) <= 50000
)
order by total desc;
TOTAL GRP
49943.371 6
49929.287 3
49878.49 5
49785.969 4
49770.08 1
49763.964 2
49636.947 7
47082.75 8
Close, but 50+ short of the goal. Let's try and get closer. It'd be nice to see which cubes comprise this total too.
We can address both of these by adding these clauses:
all rows per match
after match skip to next row
This means we'll see every matched row in the output. And revisit each row many times, so it could be in many groups.
To find the cubes in the group with the total weight closest to 50,000 include:
max ( grp ) keep (
dense_rank first order by total desc
) over () max_grp
In a subquery, the filter the rows with the same group as this max in the outer query. Giving:
with rws as (
select grps.*,
max ( grp ) keep (
dense_rank first order by total desc
) over () max_grp
from cubes match_recognize (
measures
final sum ( weight ) as total,
match_number () as grp
all rows per match
after match skip to next row
pattern ( total+ )
define
total as sum ( weight ) <= 50000
) grps
)
select * from rws
where grp = max_grp;
TOTAL GRP ID WEIGHT MAX_GRP
49999.505 541 35 369.375 541
49999.505 541 36 367.181 541
...
49999.505 541 160 389.137 541
49999.505 541 161 373.928 541
127 rows selected.
Now we're much closer. And the query is still fast. But we're walking through the rows in the same order every time.
We can see if we can get (a bit) closer by first sorting the table by a random value. Then re-run the query a few times. For example:
with crand as (
select * from cubes
order by dbms_random.value
), rws as (
select ...
from crand match_recognize (
...
)
select * from rws
where grp = max_grp;
This is still quick. But there's no guarantee you'll get any groups with a total of 50k or the closest if there isn't one.
I don't know of a way to find the closest groups in a reasonable time frame. The brute force approach has factorial complexity - even with a small data set of 1,000 rows this is incredibly SLLLOOOOOOOOOOOOOOOOOOOOOOOOOOW!
Other may know some better algorithms you can use to get more accurate results.