I would like to understand, how Oracle determines when a row is a duplicate. While well aware of the documented difference between UNION and UNION ALL, I would like to know if this behaviour is due to specification and is part of the relational algebra.
I have noticed a difference in the output of the following two statements, depending on the order of the statements, or rather, where the "UNION" is placed:
select 1 from dual union --all
select 1 from dual union all
select 1 from dual union all
select 1 from dual union all
select 1 from dual union all
select 1 from dual;
Yields 5 rows
select 1 from dual union all
select 1 from dual union all
select 1 from dual union all
select 1 from dual union --all
select 1 from dual union all
select 1 from dual;
Yields 2 rows.
Can you refer to the relational model's theory and explain this behaviour?
Union and union all have equal precedence. So neither takes priority over the other; the database processes them top-to-bottom.
So the first statement:
UNIONs 1 with 1, returning one row
It then UNION ALLs the remaining four rows, to give a total of five.
Whereas the second:
UNION ALLs the first for rows, returning 4 rows
UNIONs these with the next row, returning 1 row
UNION ALLs the final row in, giving a total of 2
You can use brackets to change the order of precedence as with other operations. The database processes everything inside the brackets first.
So this returns one row:
select 1 from dual union
( select 1 from dual union all
select 1 from dual union all
select 1 from dual union all
select 1 from dual union all
select 1 from dual
);
1
1
This is because the brackets force the database to combine the last five rows, then union these with the first => one row in the result.
Looking at the execution plans can also help you see where the database applies the union (all) operations:
set serveroutput off
select 1 from dual union all
select 1 from dual union all
select 1 from dual union all
select 1 from dual union
select 1 from dual union all
select 1 from dual;
select *
from table(dbms_xplan.display_cursor(null, null, 'BASIC'));
---------------------------------
| Id | Operation | Name |
---------------------------------
| 0 | SELECT STATEMENT | |
| 1 | UNION-ALL | |
| 2 | SORT UNIQUE | |
| 3 | UNION-ALL | |
| 4 | FAST DUAL | |
| 5 | FAST DUAL | |
| 6 | FAST DUAL | |
| 7 | FAST DUAL | |
| 8 | FAST DUAL | |
| 9 | FAST DUAL | |
---------------------------------
select 1 from dual union
select 1 from dual union all
select 1 from dual union all
select 1 from dual union all
select 1 from dual union all
select 1 from dual;
select *
from table(dbms_xplan.display_cursor(null, null, 'BASIC'));
---------------------------------
| Id | Operation | Name |
---------------------------------
| 0 | SELECT STATEMENT | |
| 1 | UNION-ALL | |
| 2 | SORT UNIQUE | |
| 3 | UNION-ALL | |
| 4 | FAST DUAL | |
| 5 | FAST DUAL | |
| 6 | FAST DUAL | |
| 7 | FAST DUAL | |
| 8 | FAST DUAL | |
| 9 | FAST DUAL | |
---------------------------------
select 1 from dual union
( select 1 from dual union all
select 1 from dual union all
select 1 from dual union all
select 1 from dual union all
select 1 from dual);
select *
from table(dbms_xplan.display_cursor(null, null, 'BASIC'));
---------------------------------
| Id | Operation | Name |
---------------------------------
| 0 | SELECT STATEMENT | |
| 1 | SORT UNIQUE | |
| 2 | UNION-ALL | |
| 3 | FAST DUAL | |
| 4 | FAST DUAL | |
| 5 | FAST DUAL | |
| 6 | FAST DUAL | |
| 7 | FAST DUAL | |
| 8 | FAST DUAL | |
---------------------------------