Thanks.
Steve, September 08, 2020 - 9:04 pm UTC
I hadn't considered the default windowing clause...
I don't include the order by myself because it never made sense to me.
However, I have seen it in code and just ignored it not realizing it made a difference until recently when I was troubleshooting an issue.
Now, I'll need to keep an eye on these situations for potential unexpected behavior.
Thanks.
September 09, 2020 - 2:46 am UTC
More examples here
I do see a reason for the order by
Jeff Seger, June 22, 2023 - 9:06 pm UTC
I was looking for a way to get the id of the last entry when sorted by date when the IDs are not guaranteed to be in the same order as the dates. so I needed
MAX(action_id) OVER(
PARTITION BY account
ORDER BY
action_date
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) as last_id_within_acct_by_date
Thanks for the answer!!
June 23, 2023 - 9:22 am UTC
Thanks for sharing - I'm not sure that does what you're looking for though! It returns the highest ACTION_ID per account.
If you want the (highest) ACTION_ID for the last ACTION_DATE, you need:
max ( action_id ) keep (
dense_rank last order by action_date
) over (
partition by account
)
For example, compare the difference:
select employee_id, hire_date,
max ( employee_id ) over (
order by hire_date
rows between unbounded preceding and unbounded following
) mx_id,
max ( employee_id ) keep (
dense_rank last order by hire_date
) over () mx_id_for_lastest_hired
from hr.employees
where department_id = 60
order by hire_date;
EMPLOYEE_ID HIRE_DATE MX_ID MX_ID_FOR_LASTEST_HIRED
----------- -------------------- ---------- -----------------------
105 25-JUN-2015 00:00:00 107 104
103 03-JAN-2016 00:00:00 107 104
106 05-FEB-2016 00:00:00 107 104
107 07-FEB-2017 00:00:00 107 104
104 21-MAY-2017 00:00:00 107 104
Default ORDER BY clause
mathguy, June 25, 2023 - 3:36 pm UTC
When considering a feature (like an analytic function, let's say analytic MAX(...) to be specific) with various options, I find it helpful to imagine that there's only one way that feature works, and to think of the options as special cases of a single global definition.
Some analytic functions require an ORDER BY clause, even when we aren't particularly interested in an ordering. For example ROW_NUMBER() does. If we don't care about the ordering, we can always say ORDER BY NULL, which means all rows (in a partition) are "equal". (Why a function like ROW_NUMBER() requires the ORDER BY to be present, instead of allowing the obvious default to be used when ORDER BY is missing, I never understood. We can - and do - write that in ourselves anyway, whenever needed.)
Analytic MAX does not require an ORDER BY clause, and it appears to behave differently when it only has an (optional) PARTITION BY clause without ORDER BY, vs. when it includes an ORDER BY clause.
If we don't like "different behavior depending on <whatever>" (I personally don't like that!), an alternative way to think about it is to imagine that analytic MAX always has an ORDER BY clause. If an explicit one is not present, the default is ORDER BY NULL. Then, a windowing clause always applies as well; as Chris explained, that is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. Since the ordering is by NULL, all rows are "equal"; and since the windowing clause is a RANGE condition, all rows are included in the window - no matter what row we are currently on.
At the abstract level, the "default" ordering (ORDER BY NULL) coupled with the default windowing clause means "all rows in the partition". This way we don't have to consider the behavior in the absence of ORDER BY as a special, different case.
June 26, 2023 - 12:46 pm UTC
Thanks for your thoughts
last id by date
Greg, June 27, 2023 - 7:56 pm UTC
@Jeff Seger,
if I understand what you're asking, you probably don't/can't use MAX efficiently, instead, use FIRST_VALUE:
with td as
( select 123 id, to_date('09/03/2020 09:00:00','MM/DD/YYYY HH24:MI:SS') dt from dual union all
select 127 id, to_date('09/03/2020 19:00:00','MM/DD/YYYY HH24:MI:SS') dt from dual union all
select 126 id, to_date('09/03/2021 09:00:00','MM/DD/YYYY HH24:MI:SS') dt from dual union all
select 125 id, to_date('09/03/2022 09:00:00','MM/DD/YYYY HH24:MI:SS') dt from dual union all
select 124 id, to_date('09/03/2023 09:00:00','MM/DD/YYYY HH24:MI:SS') dt from dual
)
select id, dt,
first_value(id) over (order by dt desc) latest_id
from td;
ID DT LATEST_ID
---------- -------------------- ----------
124 03-sep-2023 09:00:00 124
125 03-sep-2022 09:00:00 124
126 03-sep-2021 09:00:00 124
127 03-sep-2020 19:00:00 124
123 03-sep-2020 09:00:00 124
5 rows selected.
based on the date order, id 124 is the "last one" .. and that picks it out for each ...
June 28, 2023 - 12:57 pm UTC
Yep - though this is non-deterministic if two rows have the same last DT