Skip to Main Content

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, Iudith.

Asked: February 03, 2017 - 11:24 pm UTC

Last updated: February 06, 2017 - 2:50 pm UTC

Version: 12.2

Viewed 1000+ times

You Asked

Hello All,

Today, while commenting on a blog post, I found a weird behavior of the NTH_VALUE function,
and I wonder whether it is not a bug.

It's true, I performed my test on LiveSQL only, but I am not sure that this could be the cause.

It can be reproduced as follows:

create table tmp as
with x as (
select level r#
     , to_char( to_date(level,'J'),'Jsp') as nm
  from dual connect by level < 10
),
y as (
select 
       r#, nm
     , ROW_NUMBER() OVER(ORDER BY DBMS_RANDOM.VALUE()) AS rn
  from x
)
select * from y
/

Table created.


select r#, nm, rn
     , NTH_VALUE(nm, rn) OVER(order by r#)  shuffle
from tmp
/

R# NM RN SHUFFLE
-------------------------------
1 One 2  - 
2 Two 1 One
3 Three 8 One
4 Four 3 One
5 Five 9 One
6 Six 4 One
7 Seven 5 One
8 Eight 7 One
9 Nine 6 One

9 rows selected.



This query uses the default window of "range between unbounded preceding and current row",
which is not the one I intended, but, anyway, the result is buggy !


With more explicit windows, we get the following:

select r#, nm, rn
     , NTH_VALUE(nm, rn) from first
         OVER(order by r#
              range between unbounded preceding and current row)  shuffle1
     , NTH_VALUE(nm, rn) from first
         OVER(order by r#
              rows between unbounded preceding and current row)   shuffle2
     , NTH_VALUE(nm, rn) from last
         OVER(order by r#
              range between unbounded preceding and current row)  shuffle3
     , NTH_VALUE(nm, rn) from last
         OVER(order by r#
              rows between unbounded preceding and current row)   shuffle4
from tmp
/

R# NM RN SHUFFLE1 SHUFFLE2 SHUFFLE3 SHUFFLE4
--------------------------------------------------------------------------------
1 One 2  -    -    -    - 
2 Two 1 One  One  Two  Two
3 Three 8 One  One   -    - 
4 Four 3 One  One  Two  Two
5 Five 9 One  One   -    - 
6 Six 4 One  One  Three  Three
7 Seven 5 One  One  Three  Three
8 Eight 7 One  One  Two  Two
9 Nine 6 One  One  Four  Four

9 rows selected.




Here we see that the result columns SHUFFLE1 and SHUFFLE2 which use the (default) FROM FIRST clause
are wrong, while the result columns SHUFFLE3 and SHUFFLE4 which use FROM LAST are correct.


If we use a fix offset in the function, then we still see similar wrong results:

select r#, nm, rn
     , NTH_VALUE(nm, 5) from first
         OVER(order by r#
              range between unbounded preceding and current row)  shuffle1
     , NTH_VALUE(nm, 5) from first
         OVER(order by r#
              rows between unbounded preceding and current row)   shuffle2
     , NTH_VALUE(nm, 5) from last
         OVER(order by r#
              range between unbounded preceding and current row)  shuffle3
     , NTH_VALUE(nm, 5) from last
         OVER(order by r#
              rows between unbounded preceding and current row)   shuffle4
from tmp
/

R# NM RN SHUFFLE1 SHUFFLE2 SHUFFLE3 SHUFFLE4
--------------------------------------------------------------------------------
1 One 2  -   -    -    - 
2 Two 1  -    -    -    - 
3 Three 8  -    -    -    - 
4 Four 3  -    -    -    - 
5 Five 9 Five  Five  One  One
6 Six 4 Five  Five  Two  Two
7 Seven 5 Five  Five  Three  Three
8 Eight 7 Five  Five  Four  Four
9 Nine 6 Five  Five  Five  Five

9 rows selected.




Finally, the intended result was in fact one that specifies a window containing ALL rows,
and then ALL the results are correct:

select r#, nm, rn
     , NTH_VALUE(nm, rn) from first
         OVER(order by r#
              range between unbounded preceding and unbounded following)  shuffle1
     , NTH_VALUE(nm, rn) from first
         OVER(order by r#
              rows between unbounded preceding and unbounded following)   shuffle2
     , NTH_VALUE(nm, rn) from last
         OVER(order by r#
              range between unbounded preceding and unbounded following)  shuffle3
     , NTH_VALUE(nm, rn) from last
         OVER(order by r#
              rows between unbounded preceding and unbounded following)   shuffle4
from tmp
/

R# NM RN SHUFFLE1 SHUFFLE2 SHUFFLE3 SHUFFLE4
--------------------------------------------------------------------------------
1 One 2 Two  Two  Eight  Eight
2 Two 1 One  One  Nine  Nine
3 Three 8 Eight  Eight  Two  Two
4 Four 3 Three  Three  Seven  Seven
5 Five 9 Nine  Nine  One  One
6 Six 4 Four  Four  Six  Six
7 Seven 5 Five  Five  Five  Five
8 Eight 7 Seven  Seven  Three  Three
9 Nine 6 Six  Six  Four  Four

9 rows selected.




Thanks a lot in advance if you can look into this issue, in more versions/environments,
because it looks to me like a bug that maybe should be reported to Oracle.

Best Regards,
Iudith Mentzel

and Connor said...

I agree with you, because the docs state for the 'n' (ie, second column) in nth_value that:

"n can be a column, constant, bind variable, or an expression involving them."

I did the following to reproduce your output.

SQL> select r#, nm, rn
  2       , NTH_VALUE(nm,rn) OVER(order by r# range between unbounded preceding and current row)  shuffle
  3  from tmp
  4  /

        R# NM                 RN SHUFFLE
---------- ---------- ---------- ----------
         1 One                 1 One
         2 Two                 7 One
         3 Three               4 One
         4 Four                3 One
         5 Five                5 One
         6 Six                 2 One
         7 Seven               6 One
         8 Eight               9 One
         9 Nine                8 One

9 rows selected.


And then to see what would happen if those nth_value expressions has been calculated using a literal value instead of 'rn', I wrapped it within a function

SQL> create or replace
  2  function f(p_rn int,p_r# int) return varchar2 is
  3    x varchar2(10);
  4  begin
  5    execute immediate
  6    'select s
  7    from  (
  8      select r#, NTH_VALUE(nm,'||p_rn||') OVER(order by r# range between unbounded preceding and current row) s
  9      from   tmp
 10    )
 11    where r# = '||p_r# into x;
 12
 13    return x;
 14  end;
 15  /

Function created.

SQL> col dynsql format a10
SQL> select r#, nm, rn
  2       , NTH_VALUE(nm,rn) OVER(order by r# range between unbounded preceding and current row)  shuffle
  3       , f(rn,r#) dynsql
  4  from tmp
  5  /

        R# NM                 RN SHUFFLE    DYNSQL
---------- ---------- ---------- ---------- ----------
         1 One                 1 One        One
         2 Two                 7 One
         3 Three               4 One
         4 Four                3 One        Three
         5 Five                5 One        Five
         6 Six                 2 One        Two
         7 Seven               6 One        Six
         8 Eight               9 One
         9 Nine                8 One        Eight

9 rows selected.

SQL>
SQL>


and we can see something that looks more like we would have expected.

I'll log a bug for this - I see the same on 12.1 and 11.2, and a 10053 trace doesnt show any optimizer transformations going on, so possibly been there since nth_value came along.

Quick unrelated tip - if you're using dbms_random to build data, before you use it in your session, do something like:

exec dbms_random.seed(0)

Then your test case becomes repeatable in a consisten fashion for yourself, and also means we can replicate the data here.

Nice discovery !

Rating

  (1 rating)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Excellent :)

Iudith Mentzel, February 04, 2017 - 10:25 pm UTC

Hello Connor,

Great, thanks a lot :)

Your idea of testing with a PL/SQL function "opened my eyes"
to the fact that, contrary to what I thought, the results of
my test case that used a literal offset were in fact correct.

Thanks a lot for filing a bug.
My "two penny" contribution to making Oracle perfect :)

Best Regards,
Iudith Mentzel
Connor McDonald
February 06, 2017 - 2:50 pm UTC

Every little bit helps :-)

More to Explore

Analytics

Analytic SQL got you confused? Check out Connor McDonald's complete video course.