Total weight selection of rows

Thanks for the question, Richard.

Asked: August 05, 2020 - 12:34 pm UTC

Answered by: Chris Saxon - Last updated: August 06, 2020 - 4:40 pm UTC

Category: SQL - Version: 19c


I have 1000 cubes, each weighing between 360 and 430 grams. I need to choose a set of cubes that will weigh 50,000 grams in total. Do you have any idea how to do this in SQL or PL/SQL?

```
CREATE TABLE cubes (
  id      VARCHAR2(100 CHAR),
  weight  NUMBER(10,3)
);

BEGIN
  FOR i IN 1..1000 LOOP
    INSERT INTO cubes VALUES (i, round(dbms_random.value(360, 430), 3));
  END LOOP;
  COMMIT;
END;
/
```

and we said...

So you want to find all sets of rows where the sum of the weights is 50 thousand?

If so, a few things to bear in mind:

- There's no guarantee there is such a group in your data!
- This is a form of the bin packing problem, which is NP-complete. This means there are no known algorithms that are both fast and optimal.
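To make the first point concrete: whether an exact 50,000 g subset exists at all is a subset-sum question. If you scale the weights to integer milligrams, you can check feasibility with a dynamic-programming pass. Here's a rough Python sketch (illustrative only, not part of the original answer; the function name and tiny data set are made up, and the approach is only practical for small inputs):

```python
# Hedged sketch: subset-sum feasibility check via dynamic programming.
# Weights have 3 decimal places, so scale grams -> integer milligrams
# to keep the running sums exact.

def subset_sum_exists(weights_g, target_g):
    """Return True if some subset of weights sums exactly to target."""
    scale = 1000  # grams -> milligrams
    weights = [round(w * scale) for w in weights_g]
    target = round(target_g * scale)
    reachable = {0}  # set of sums reachable with the rows seen so far
    for w in weights:
        reachable |= {s + w for s in reachable if s + w <= target}
    return target in reachable

# Tiny illustration: 1.5 + 2.25 = 3.75 is reachable; 4.0 is not.
print(subset_sum_exists([1.5, 2.25, 3.0], 3.75))  # True
print(subset_sum_exists([1.5, 2.25, 3.0], 4.0))   # False
```

With 1000 rows and a 50,000,000-milligram target the reachable-sum set becomes enormous, which is the NP-completeness biting.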

If you're willing to take some shortcuts, you can use match_recognize to find close values quickly and easily. With one pass through the data, you can get reasonably close:

```
select *
from   cubes match_recognize (
  measures
    sum ( weight ) as total,
    match_number () as grp
  pattern ( total+ )
  define
    total as sum ( weight ) <= 50000
)
order  by total desc;

TOTAL          GRP
49943.371        6
49929.287        3
49878.49         5
49785.969        4
49770.08         1
49763.964        2
49636.947        7
47082.75         8
```

Close, but 50+ grams short of the goal. Let's try to get closer. It'd also be nice to see which cubes make up this total.
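If it helps to see the logic outside SQL: the pattern above greedily accumulates rows into the current group while the running sum stays at or below 50,000, then starts a new group. A rough Python sketch of that one-pass grouping (illustrative; the function name and random test data are assumptions, not the original SQL):

```python
import random

def greedy_groups(weights, limit=50000.0):
    """Walk the rows once, adding each to the current group while the
    running total stays <= limit; otherwise start a new group. This
    mirrors pattern ( total+ ) with total as sum ( weight ) <= limit."""
    groups, current, total = [], [], 0.0
    for w in weights:
        if total + w <= limit:
            current.append(w)
            total += w
        else:
            groups.append((total, current))
            current, total = [w], w
    if current:
        groups.append((total, current))
    return groups

# Made-up data shaped like the question: 1000 cubes of 360-430 g.
random.seed(1)
weights = [round(random.uniform(360, 430), 3) for _ in range(1000)]
best_total, best_group = max(greedy_groups(weights))
print(round(best_total, 3), len(best_group))
```

Like the SQL, this gets reasonably close in one pass, but never exceeds the limit and has no guarantee of hitting it exactly.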

We can address both of these by adding a couple of clauses. First:

```
all rows per match
```

This means we'll see every matched row in the output. And we revisit each row many times, so it could be in many groups.

To find the cubes in the group with the total weight closest to 50,000 include:

```
max ( grp ) keep (
  dense_rank first order by total desc
) over () max_grp
```

Put this in a subquery, then filter the rows with the same group as this max in the outer query. Giving:

```
with rws as (
  select grps.*,
         max ( grp ) keep (
           dense_rank first order by total desc
         ) over () max_grp
  from   cubes match_recognize (
    measures
      final sum ( weight ) as total,
      match_number () as grp
    all rows per match
    pattern ( total+ )
    define
      total as sum ( weight ) <= 50000
  ) grps
)
select * from rws
where  grp = max_grp;

TOTAL         GRP  ID    WEIGHT   MAX_GRP
49999.505     541  35    369.375      541
49999.505     541  36    367.181      541
...
49999.505     541  160   389.137      541
49999.505     541  161   373.928      541

127 rows selected.
```

Now we're much closer. And the query is still fast. But we're walking through the rows in the same order every time.

Let's see if we can get (a bit) closer by first sorting the table by a random value, then re-running the query a few times. For example:

```
with crand as (
  select * from cubes
  order  by dbms_random.value
), rws as (
  select ...
  from   crand match_recognize (
    ...
  )
)
select * from rws
where  grp = max_grp;
```

This is still quick. But there's no guarantee you'll get a group totalling exactly 50k, or the closest possible group if there isn't one.

I don't know of a way to find the closest groups in a reasonable time frame. The brute force approach has exponential complexity - even with a small data set of 1,000 rows this is incredibly SLLLOOOOOOOOOOOOOOOOOOOOOOOOOOW!
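For a sense of scale, brute force means enumerating every subset and keeping the best total that doesn't exceed the target. With n rows there are 2^n subsets - 2^1000 for this table - so it's only usable for tiny inputs. An illustrative Python sketch (the function name and data are made up):

```python
from itertools import combinations

def closest_subset(weights, target):
    """Enumerate all 2**n subsets and return the largest total that
    does not exceed target. Only feasible for tiny n."""
    best = 0.0
    for r in range(1, len(weights) + 1):
        for combo in combinations(weights, r):
            s = sum(combo)
            if best < s <= target:
                best = s
    return best

# 5 rows -> 31 subsets; 20 rows -> ~1 million; 1000 rows -> hopeless.
small = [361.5, 402.25, 395.125, 428.0, 377.875]
print(closest_subset(small, 1200.0))  # 1191.75
```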

Others may know better algorithms you can use to get more accurate results.
