Skip to Main Content
  • Questions
  • Common table expression using sample() referenced more than once does not produce correct results.

Breadcrumb

May 4th

Question and Answer

Connor McDonald

Thanks for the question, lh.

Asked: April 03, 2020 - 6:21 am UTC

Last updated: April 06, 2020 - 1:21 am UTC

Version: 12.2

Viewed 1000+ times

You Asked

Hi

Here is a very much simplified example of real world situation.

 drop table lh_a; 
 create table lh_a as 
 (select level nro, cast(' ' as char(100)) dummy from dual connect by level < 1000000); 
 create unique index lh_a_ind on lh_a(nro) pctfree 0; 

 with 
 b as (select nro from lh_a sample (1)) 
 select * 
 from 
 b b1 
 left outer join b b2 on b2.nro = b1.nro; 


This does not manage to find all rows from cte b2.

 create table b_materialized as select nro from lh_a sample (1); 
 select * from b_materialized b1 
 left outer join b_materialized b2 on b2.nro = b1.nro; 

This does.

Is this expected?


lh

and Connor said...

Sorry - can you elaborate? This is why I get from my 12.2 instance

SQL> create table lh_a as
  2   (select level nro, cast(' ' as char(100)) dummy from dual connect by level < 1000000);

Table created.

mcdonac@db122
SQL> create unique index lh_a_ind on lh_a(nro) pctfree 0;

Index created.

mcdonac@db122
SQL> 
mcdonac@db122
SQL> set autotrace traceonly stat
mcdonac@db122
SQL> 
mcdonac@db122
SQL> with
  2   b as (select nro from lh_a sample (1))
  3  select *
  4  from b b1
  5   left outer join b b2 on b2.nro = b1.nro;

10003 rows selected.


Statistics
----------------------------------------------------------
  10  recursive calls
  10  db block gets
      10709  consistent gets
      17263  physical reads
       1044  redo size
     200115  bytes sent via SQL*Net to client
       7934  bytes received via SQL*Net from client
 668  SQL*Net roundtrips to/from client
   0  sorts (memory)
   0  sorts (disk)
      10003  rows processed

mcdonac@db122
SQL> 
mcdonac@db122
SQL> create table b_materialized as select nro from lh_a sample (1);

Table created.

mcdonac@db122
SQL>   select * from b_materialized b1
  2   left outer join b_materialized b2 on b2.nro = b1.nro;

10097 rows selected.


Statistics
----------------------------------------------------------
   5  recursive calls
  11  db block gets
 727  consistent gets
  16  physical reads
       1048  redo size
     241929  bytes sent via SQL*Net to client
       8011  bytes received via SQL*Net from client
 675  SQL*Net roundtrips to/from client
   0  sorts (memory)
   0  sorts (disk)
      10097  rows processed

mcdonac@db122
SQL> 


Samples aren't deterministic between runs, eg

SQL> select count(*) from lh_a sample (1);

  COUNT(*)
----------
      9993

mcdonac@db122
SQL> /

  COUNT(*)
----------
     10073

mcdonac@db122
SQL> /

  COUNT(*)
----------
     10103

mcdonac@db122
SQL> /

  COUNT(*)
----------
      9976



Rating

  (1 rating)

Is this answer out of date? If it is, please let us know via a Comment

Comments

lh, April 06, 2020 - 11:48 am UTC

Hi

Sorry about not being able to give good explanation of the suspected error.

In our environment (12.2) the version using CTE is not outputting values from joined cte b2 to all rows.

..
16376
16487
16488
16608
16627
16674 16674
16911
16963
17000
17198
17331
..

To me this looks like that because sample(1) is nondeterministic and result is cte is not materialized, the joined cte:s do have different datasets.

I tried to use materialize hint, but I couldn't get it to work.

This is a very simplified test case of real world case
(managed to get it eventually work by using random -function to get sample data instead of sample() clause )


lh

More to Explore

SQL

The Oracle documentation contains a complete SQL reference.